LX1.5 The Collection Sequence

3-4 hours · Module 1 · Free

Putting It Together: The Complete Ordered Collection Procedure

Operational Objective
The Procedure Execution Question: It is 03:00 and BASTION-NGE01 is confirmed compromised. You know how to run UAC, how to capture volatile data, how to snapshot a cloud disk, and how to hash evidence — but in what order? If you image the disk before capturing memory, you lose volatile state. If you collect logs before volatile data, the attacker's running processes may exit while you are copying auth.log. If you forget the cloud audit trail, you miss the attacker's API-level activity entirely. You need a single, ordered, end-to-end procedure that sequences every collection technique correctly — so that at 03:00 with adrenaline shaking your hands, you execute from a checklist instead of reconstructing the procedure from memory.
Deliverable: The complete eight-phase collection procedure (Pre-Collection → Memory → Volatile → Cloud API → Container → Persistent → Disk → Close) with environment-specific decision points, ready to print and deploy as your operational collection checklist.
⏱ Estimated completion: 40 minutes

The Master Collection Sequence

Every Linux evidence collection follows the same logical order: most volatile evidence first, least volatile last. The specific commands vary by environment, but the sequence does not. What follows is the complete procedure, ordered and annotated.

This is the procedure you will follow — or delegate to a junior analyst to follow — on every Linux incident. Print it. Laminate it. Keep it in your IR toolkit alongside UAC and your forensic workstation. When the call comes at 03:00 and your hands are shaking from adrenaline, you do not want to be reconstructing this from memory.

Phase A: Pre-Collection (2 minutes)

# On your forensic workstation — create the case directory
export CASE="IR-2026-0402"
export TARGET="BASTION-NGE01"
mkdir -p ~/cases/$CASE/$TARGET/{volatile,logs,config,disk,memory,timeline,notes}

# Start your collection log
cat > ~/cases/$CASE/$TARGET/notes/collection_log.md << LOG
# Evidence Collection Log
# Case: $CASE
# Target: $TARGET
# Investigator: $(whoami)
# Collection started: $(date -u +"%Y-%m-%dT%H:%M:%SZ")
# -----------------------------------------------
LOG

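# Verify your forensic workstation's time is NTP-synced
timedatectl status | grep "synchronized"

# --- Phase B commands: memory acquisition ---
# Transfer LiME to the compromised system
# LiME must be compiled for the target's exact kernel version, so query the
# target's kernel release (a bare $(uname -r) would report the workstation's)
scp "lime-$(ssh investigator@target uname -r).ko" investigator@target:/tmp/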

# On the target: load LiME and serve the dump on TCP 4444
# insmod does not return until the dump has been read, so run this in one
# terminal and the nc command below in a second terminal
ssh investigator@target "sudo insmod /tmp/lime-*.ko 'path=tcp:4444 format=lime'"

# On your workstation: receive the memory dump
nc target 4444 > ~/cases/$CASE/$TARGET/memory/memory.lime

# Hash immediately
sha256sum ~/cases/$CASE/$TARGET/memory/memory.lime > ~/cases/$CASE/$TARGET/memory/memory.lime.sha256

# Remove LiME from the target
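ssh investigator@target "sudo rmmod lime"

# --- Phase C commands, Option 1: UAC triage ---
# Transfer UAC to the target
scp -r uac/ investigator@target:/tmp/uac/

# Run UAC; the output archive is written to /tmp/uac-output/ on the target
# (UAC takes the destination directory as a positional argument)
ssh investigator@target "mkdir -p /tmp/uac-output && cd /tmp/uac && sudo ./uac -p ir_triage /tmp/uac-output/"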

# Copy UAC output to your workstation
scp -r investigator@target:/tmp/uac-output/ ~/cases/$CASE/$TARGET/volatile/uac/

# Hash the UAC output
find ~/cases/$CASE/$TARGET/volatile/uac/ -type f -exec sha256sum {} \; > ~/cases/$CASE/$TARGET/volatile/uac_hashes.sha256

# --- Phase C commands, Option 2: manual collection ---
# Execute the 7-step live response sequence from subsection 02
# Pipe all output to your workstation

ssh investigator@target "date -u" > ~/cases/$CASE/$TARGET/volatile/timestamp.txt
ssh investigator@target "who; echo '---'; w" > ~/cases/$CASE/$TARGET/volatile/sessions.txt
ssh investigator@target "ps auxf" > ~/cases/$CASE/$TARGET/volatile/processes.txt
# ss -p only shows process names for other users' sockets when run as root
ssh investigator@target "sudo ss -tlnp; echo '==='; sudo ss -tnp; echo '==='; cat /proc/net/tcp" > ~/cases/$CASE/$TARGET/volatile/network.txt
ssh investigator@target "sudo lsof +L1" > ~/cases/$CASE/$TARGET/volatile/deleted_open_files.txt
ssh investigator@target "lsmod" > ~/cases/$CASE/$TARGET/volatile/kernel_modules.txt
ssh investigator@target "ls -laR /tmp/ /dev/shm/" > ~/cases/$CASE/$TARGET/volatile/staging_areas.txt
ssh investigator@target "find / -mtime -1 -type f -not -path '/proc/*' -not -path '/sys/*' 2>/dev/null" > ~/cases/$CASE/$TARGET/volatile/recent_files.txt
ssh investigator@target "find / -perm -4000 -type f 2>/dev/null" > ~/cases/$CASE/$TARGET/volatile/suid_files.txt

# /proc deep collection — every process
ssh investigator@target 'for p in /proc/[0-9]*/; do echo "=== PID=$(basename $p) ==="; cat $p/cmdline 2>/dev/null | tr "\0" " "; echo; readlink -f $p/exe 2>/dev/null; echo "CWD: $(readlink -f $p/cwd 2>/dev/null)"; echo; done' > ~/cases/$CASE/$TARGET/volatile/proc_deep.txt

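# Hash all volatile evidence (skip the hash manifests themselves)
find ~/cases/$CASE/$TARGET/volatile/ -type f -not -name "*.sha256" -exec sha256sum {} \; > ~/cases/$CASE/$TARGET/volatile/hashes.sha256

# --- Phase C commands: attacker staging files ---
# Copy suspicious files from staging areas (example paths; use the ones you identified)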
scp investigator@target:/dev/shm/.cache/worker ~/cases/$CASE/$TARGET/volatile/recovered_binary
scp investigator@target:/tmp/.hidden/payload.sh ~/cases/$CASE/$TARGET/volatile/recovered_script

# Hash recovered files
sha256sum ~/cases/$CASE/$TARGET/volatile/recovered_* >> ~/cases/$CASE/$TARGET/volatile/hashes.sha256

# --- Phase D commands: cloud API collection ---
# AWS example; adapt for Azure/GCP using commands from subsection 03

# Disk snapshot (no-login collection)
aws ec2 create-snapshot --volume-id vol-0abc123 \
  --description "$CASE forensic snapshot $TARGET" \
  > ~/cases/$CASE/$TARGET/disk/snapshot_response.json

# CloudTrail events
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=ResourceName,AttributeValue=i-0abc123 \
  --start-time "2026-03-01T00:00:00Z" \
  --output json > ~/cases/$CASE/$TARGET/logs/cloudtrail.json

# Instance metadata (security group, IAM role, network config)
aws ec2 describe-instances --instance-ids i-0abc123 \
  --output json > ~/cases/$CASE/$TARGET/config/instance_metadata.json

# VPC Flow Logs (if available)
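# [Commands depend on flow log configuration]

# --- Phase E commands: container collection ---
# Set CONTAINER (or POD and NS) to the affected container, pod, and namespace first
# Docker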
docker inspect $CONTAINER > ~/cases/$CASE/$TARGET/config/container_inspect.json
docker logs $CONTAINER > ~/cases/$CASE/$TARGET/logs/container_stdout.txt 2>&1
docker diff $CONTAINER > ~/cases/$CASE/$TARGET/volatile/container_diff.txt
docker export $CONTAINER > ~/cases/$CASE/$TARGET/disk/container_filesystem.tar

# Kubernetes
kubectl describe pod $POD -n $NS > ~/cases/$CASE/$TARGET/config/pod_describe.txt
kubectl logs $POD -n $NS --all-containers > ~/cases/$CASE/$TARGET/logs/pod_logs.txt
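kubectl get events -n $NS --field-selector involvedObject.name=$POD > ~/cases/$CASE/$TARGET/logs/k8s_events.txt

# --- Phase F commands: persistent evidence collection ---
# scp runs as the investigator account; files readable only by root need the
# sudo tar approach used for the home directories below
# Log files: complete /var/log directory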
scp -r investigator@target:/var/log/ ~/cases/$CASE/$TARGET/logs/var_log/

# System configuration
scp -r investigator@target:/etc/ ~/cases/$CASE/$TARGET/config/etc/

# User artifacts — all home directories
ssh investigator@target "sudo tar czf /tmp/home_dirs.tar.gz /home/ /root/"
scp investigator@target:/tmp/home_dirs.tar.gz ~/cases/$CASE/$TARGET/config/
# The archive is owned by root, so removing it needs sudo
ssh investigator@target "sudo rm /tmp/home_dirs.tar.gz"

# Cron jobs — all users
scp -r investigator@target:/var/spool/cron/ ~/cases/$CASE/$TARGET/config/cron/

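# Systemd custom services
ssh investigator@target "ls /etc/systemd/system/*.service" > ~/cases/$CASE/$TARGET/config/custom_services.txt
# Create the destination first and quote the remote glob so the local shell leaves it alone
mkdir -p ~/cases/$CASE/$TARGET/config/systemd
scp "investigator@target:/etc/systemd/system/*.service" ~/cases/$CASE/$TARGET/config/systemd/ 2>/dev/null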

# Hash all persistent evidence
find ~/cases/$CASE/$TARGET/logs/ ~/cases/$CASE/$TARGET/config/ -type f -exec sha256sum {} \; > ~/cases/$CASE/$TARGET/persistent_hashes.sha256

# --- Phase G commands: disk image ---
# Live disk image over SSH (bare-metal or VM with SSH access)
ssh investigator@target "sudo dc3dd if=/dev/sda hash=sha256" | \
  dc3dd of=~/cases/$CASE/$TARGET/disk/disk.raw hash=sha256 log=~/cases/$CASE/$TARGET/disk/acquisition.log

# Cloud VM: use the snapshot created in Phase D
# Attach snapshot as volume to forensic VM, then image from there

# --- Phase H commands: post-collection ---
# Complete the collection log
echo "# Collection ended: $(date -u +"%Y-%m-%dT%H:%M:%SZ")" >> ~/cases/$CASE/$TARGET/notes/collection_log.md

# Record the evidence directory structure first, so the listing is covered by the manifest
tree ~/cases/$CASE/$TARGET/ > ~/cases/$CASE/$TARGET/notes/evidence_tree.txt

# Generate master hash manifest
find ~/cases/$CASE/$TARGET/ -type f -not -name "master_hashes.sha256" -exec sha256sum {} \; > ~/cases/$CASE/$TARGET/master_hashes.sha256

Before touching the compromised system, prepare your collection infrastructure.

Decision point 1: Can you access the system? If SSH access is available, proceed to Phase B. If the server is in a cloud environment and you have API access but not SSH, skip to Phase D (Cloud API Collection). If the system is a container, skip to Phase E (Container Collection). If the system is physical and you have no remote access, arrange physical access and skip to Phase G (Disk Image).

Phase B: Memory Acquisition (5–15 minutes)

Memory is the most volatile evidence. Acquire it before running any other commands on the system. Once you run ps, ss, or any other command, you have modified the memory state — processes are created, memory is allocated, caches are updated.

If LiME is not available (you do not have a pre-compiled module for the target kernel), skip memory acquisition and proceed to Phase C. Document that memory was not acquired and why. You can still perform a partial memory analysis using /proc/kcore if the kernel exposes it, but this is limited compared to a full LiME dump.
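The availability check described above can be scripted so the skip decision is made the same way every time. This is a minimal sketch, assuming your toolkit stores prebuilt modules as lime-<kernel-release>.ko in one directory; the function name `lime_module_for` and that directory layout are hypothetical, not part of LiME.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide whether Phase B can run for a given kernel release.
# Assumes prebuilt modules are stored as <module_dir>/lime-<kernel-release>.ko.
lime_module_for() {
  local release="$1" module_dir="$2"
  local candidate="$module_dir/lime-$release.ko"
  if [ -f "$candidate" ]; then
    # Print the path of the matching module for the caller to transfer
    printf '%s\n' "$candidate"
    return 0
  fi
  # No module for this kernel: caller should skip Phase B and document why
  return 1
}

# Example use (paths are placeholders):
# release=$(ssh investigator@target uname -r)
# if module=$(lime_module_for "$release" ~/toolkit/lime); then
#   echo "Phase B: using $module"
# else
#   echo "Phase B skipped: no LiME module for $release" >> collection_log.md
# fi
```

Checking for an exact release match, rather than a near match, reflects the requirement that the module be built for the target's exact kernel version.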

If this is a cloud VM and the cloud provider offers hypervisor-level memory acquisition (rare — most do not expose this to customers), use that instead of LiME. It does not require loading a kernel module on the compromised system.

Phase C: Live Volatile Collection (10–20 minutes)

This is the live response sequence from the previous subsection, executed in order. If UAC is available, run UAC with the ir_triage profile first, then supplement with manual commands for anything UAC does not capture. If UAC is not available, run the manual commands.

Option 1: UAC triage (recommended)

Option 2: Manual collection (if UAC unavailable)

After volatile collection — collect the attacker's staging files. If you identified suspicious files in /tmp or /dev/shm during the volatile collection, copy them to your workstation now — before they are deleted or the system is rebooted:

Phase D: Cloud API Collection (5–10 minutes)

For cloud VMs, collect the cloud-layer evidence that exists outside the VM.

Phase E: Container Collection (3–5 minutes)

For containers, speed is paramount. Collect before the container restarts.

Phase F: Persistent Evidence Collection (15–30 minutes)

After volatile evidence is secured, collect the persistent evidence — log files, configuration, and user artifacts. These survive reboots and are less time-sensitive, but should be collected before log rotation runs.

Phase G: Disk Image (30–120 minutes)

If a full disk image is required (legal proceedings, comprehensive forensic analysis, or you need to recover deleted files), acquire it last — it is the slowest step and the evidence it captures is the least volatile.

Phase H: Post-Collection (5 minutes)
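Once a master manifest exists, it can be re-checked whenever the evidence is copied or handed over. The sketch below assumes GNU coreutils `sha256sum` and a manifest in its standard output format; the function name `verify_manifest` is illustrative.

```shell
#!/usr/bin/env bash
# Illustrative helper: re-check every file listed in a sha256sum manifest.
verify_manifest() {
  local manifest="$1"
  # --check re-hashes each listed file; --quiet prints only failures.
  # The exit status is non-zero if any file is missing or has changed.
  sha256sum --check --quiet "$manifest"
}

# Example use (path is a placeholder):
# verify_manifest ~/cases/IR-2026-0402/BASTION-NGE01/master_hashes.sha256 \
#   && echo "All evidence files match the manifest"
```

Because the manifest records the paths it was generated with, run the check on the same workstation or after restoring the same directory layout.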

COMPLETE COLLECTION SEQUENCE: DECISION FLOW
A: PREP (case dir, log), 2 min
B: MEMORY (LiME dump), 5-15 min
C: VOLATILE (UAC / manual), 10-20 min
D: CLOUD (snapshot + API), 5-10 min
F: PERSIST (logs, config), 15-30 min
G: DISK (full image), 30-120 min
H: CLOSE (master hash), 5 min
Cloud VM? Skip B, use D for disk. Container? Use E instead of C+F+G.
Total: 70-200 min bare-metal, 25-50 min cloud.
Every phase produces hashed evidence files. Master manifest generated at close. Chain of custody documented throughout.
Figure LX1.5 — The complete eight-phase collection sequence from pre-collection through post-collection close. Environment determines which phases apply: cloud VMs use Phase D instead of G, containers use Phase E instead of C+F+G. Every phase produces hashed evidence files.

Adapting the Sequence

Time-critical / attacker active: Skip Phase B (memory) if LiME is not pre-compiled. Run Phase C with UAC ir_triage profile only. Skip Phase G (disk) initially. Get volatile evidence and begin analysis immediately. Return for comprehensive collection later.

Business-critical server / cannot go offline: Full sequence Phases A through F. Skip Phase G — do not image the disk while the server is serving production traffic. Use the cloud disk snapshot (Phase D) as your disk evidence instead.

Legal proceedings anticipated: Full sequence, all phases. Dual hashing (SHA256 + MD5). Witness present during collection. Evidence sealed in tamper-evident bags. Disk image with hardware write blocker if bare-metal.
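Dual hashing can be done by producing two manifests over the same file set. This is a sketch under the assumption that GNU `sha256sum`, `md5sum`, `find`, `sort`, and `xargs` are available on the forensic workstation; the function name `dual_hash_dir` and its arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write SHA256 and MD5 manifests for one evidence directory.
# Keep out_prefix OUTSIDE evidence_dir so the manifests do not hash themselves.
dual_hash_dir() {
  local evidence_dir="$1" out_prefix="$2"
  # Null-delimited, sorted file list keeps odd filenames intact and the order stable
  find "$evidence_dir" -type f -print0 | sort -z | xargs -0 -r sha256sum > "$out_prefix.sha256"
  find "$evidence_dir" -type f -print0 | sort -z | xargs -0 -r md5sum > "$out_prefix.md5"
}

# Example use (paths are placeholders):
# dual_hash_dir ~/cases/IR-2026-0402/BASTION-NGE01/volatile ~/cases/IR-2026-0402/volatile_dual
```

Both manifests can later be checked with `sha256sum --check` and `md5sum --check` respectively.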

Container environment: Phases A and E only. Container collection is self-contained — the docker export or kubectl cp commands capture the filesystem, and the container logs capture the application output. Add Phase D if the container runs on a cloud VM and you need the cloud audit trail.

Worked artifact — Collection sequence decision worksheet:

Complete this worksheet when the incident is declared to determine which phases apply. Pin it to the case directory.

Case: INC-2026-XXXX
Target system: [hostname]
Type: ☐ Bare-metal ☐ Cloud VM ☐ Container ☐ Mixed

Environment assessment:
- SSH access available: ☐ Yes ☐ No (alternative access: ___)
- Cloud API access available: ☐ Yes (provider: ___) ☐ No ☐ N/A
- LiME pre-compiled for target kernel: ☐ Yes (version: ___) ☐ No → skip Phase B
- UAC available on IR toolkit: ☐ Yes ☐ No → use manual collection in Phase C
- Attacker currently active: ☐ Yes → prioritize Phases C then B ☐ No ☐ Unknown
- Legal proceedings anticipated: ☐ Yes → elevated handling throughout ☐ No ☐ Uncertain → use elevated

Phases to execute (check all that apply):
- ☐ Phase A: Pre-Collection (all investigations)
- ☐ Phase B: Memory acquisition (bare-metal/VM with SSH + LiME available)
- ☐ Phase C: Live volatile collection (bare-metal/VM with SSH access)
- ☐ Phase D: Cloud API collection (cloud VMs: snapshot + audit trail)
- ☐ Phase E: Container collection (Docker/Kubernetes)
- ☐ Phase F: Persistent evidence (bare-metal/VM: logs, config, user artifacts)
- ☐ Phase G: Full disk image (legal proceedings, deleted file recovery needed)
- ☐ Phase H: Post-collection (all investigations)

Estimated total time: ___ minutes
Actual completion time: ___ minutes
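The worksheet logic can also be expressed as a small function, which is handy for recording the planned phases in the collection log. This is a simplified sketch of the worksheet rules, not a replacement for analyst judgment; the function name `select_phases` and its yes/no arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the worksheet.
# Arguments: system type (bare-metal | cloud-vm | container),
#            SSH available (yes|no), LiME module available (yes|no),
#            full disk image required (yes|no).
select_phases() {
  local type="$1" ssh="$2" lime="$3" full_disk="$4"
  local phases="A"                      # Phase A applies to all investigations
  case "$type" in
    container)
      phases="$phases E"                # containers use Phase E instead of C+F+G
      ;;
    bare-metal|cloud-vm)
      if [ "$ssh" = yes ]; then
        if [ "$lime" = yes ]; then phases="$phases B"; fi
        phases="$phases C"
      fi
      if [ "$type" = cloud-vm ]; then phases="$phases D"; fi
      if [ "$ssh" = yes ]; then phases="$phases F"; fi
      if [ "$full_disk" = yes ]; then phases="$phases G"; fi
      ;;
    *)
      echo "unknown system type: $type" >&2
      return 1
      ;;
  esac
  echo "$phases H"                      # Phase H applies to all investigations
}

# Example use:
# select_phases cloud-vm yes no no
```

The sketch ignores the "Mixed" type and the attacker-active reordering; handle those by hand on the worksheet.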

Troubleshooting: common collection sequence issues

You arrive on scene and do not know what environment the target system runs in. Ask the infrastructure team or check the asset inventory: is it bare-metal, a VM, or a container? If nobody knows, SSH in and check: systemd-detect-virt reports the virtualization technology (kvm, vmware, docker, lxc, none). cat /proc/1/cgroup shows container-specific cgroup paths (for example, entries containing docker or kubepods) inside many containers, although on cgroup v2 hosts it can show only 0::/ and should not be relied on alone. sudo dmidecode -s system-manufacturer reports the hardware or VM vendor. Identify the environment before choosing your collection phases.
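The output of systemd-detect-virt can be mapped onto the three environment types used in this procedure. The sketch below is an assumption-laden simplification: it treats a short list of known container identifiers as containers, "none" as bare-metal, and anything else as a VM. The function name `classify_environment` is hypothetical, and the result should be cross-checked against the other two commands.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a systemd-detect-virt result to a collection environment.
classify_environment() {
  case "$1" in
    none)
      echo "bare-metal" ;;
    docker|podman|lxc|lxc-libvirt|systemd-nspawn|openvz)
      echo "container" ;;
    "")
      echo "unknown" ;;      # command missing or produced no output
    *)
      echo "vm" ;;           # assumption: any other identifier is a hypervisor
  esac
}

# Example use (systemd-detect-virt exits non-zero when it prints "none"):
# virt=$(ssh investigator@target systemd-detect-virt || true)
# classify_environment "$virt"
```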

LiME is not compiled for the target kernel version and you cannot compile it. Skip Phase B (memory) and document the gap. Proceed directly to Phase C (volatile collection). The /proc deep reads in the live response sequence capture significant process-level information that partially compensates for the missing memory dump. If rootkit detection is critical, you will need to revisit memory acquisition after obtaining or compiling the correct LiME module — but do not delay the remaining collection while resolving the LiME issue.

The SSH session drops during volatile collection and you cannot reconnect. The attacker may have detected your session and modified the SSH configuration or firewall rules. Attempt alternative access: cloud serial console, out-of-band management (IPMI/iLO/iDRAC), or physical console. If you collected partial volatile data before disconnection, hash and preserve what you have. Document the disconnection time and the evidence gap. The evidence you collected before the drop is still valid — just incomplete.

Log rotation runs during your collection and rotates auth.log. The original auth.log is now auth.log.1, and the current auth.log contains only events after the rotation. Collect both: the rotated file (auth.log.1) contains the historical evidence, and the current file contains any new events. UAC collects all rotated log copies automatically. For manual collection, always use wildcards: scp target:/var/log/auth.log* evidence/logs/.

You need to collect from 5 systems simultaneously but you are the only analyst. Prioritize by volatility and attacker activity. Run Phase C (volatile) first on the system where the attacker is currently active. For the remaining systems, script the collection with a bash loop that runs UAC over SSH on each host in parallel (for host in srv1 srv2 srv3; do ssh $host "mkdir -p /tmp/uac-out && cd /tmp/uac && sudo ./uac -p ir_triage /tmp/uac-out/" & done; wait). Because each host performs its own collection, the parallel runs finish in roughly the time of a single run; the limit becomes network bandwidth when you copy the output back, not analyst time.
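A plain `& done; wait` loop hides which host failed. The sketch below is one way to run a per-host collection function in parallel and report failures individually; `run_on_hosts` and `collect_one` are hypothetical names, and the per-host command is left as a placeholder for your own UAC invocation. It assumes key-based SSH and non-interactive sudo, since background jobs cannot answer password prompts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run "<command> <host>" for every host in parallel
# and return the number of hosts whose collection failed (0 means all succeeded).
run_on_hosts() {
  local cmd="$1"; shift
  local -a hosts=("$@") pids=()
  local host i failed=0
  for host in "${hosts[@]}"; do
    "$cmd" "$host" &          # each collection runs as its own background job
    pids+=("$!")
  done
  for i in "${!pids[@]}"; do
    # wait <pid> returns that job's exit status, so failures map back to a host
    if ! wait "${pids[$i]}"; then
      echo "FAILED: ${hosts[$i]}" >&2
      failed=$((failed + 1))
    fi
  done
  return "$failed"
}

# Example use (replace the placeholder with your UAC command):
# collect_one() { ssh "$1" '<your UAC command>'; }
# run_on_hosts collect_one srv1 srv2 srv3 || echo "Re-run the failed hosts"
```

Record any failed hosts in the collection log so the evidence gap is documented.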

Beyond this investigation: LX16 (IR Readiness) automates this collection sequence through Ansible playbooks that collect evidence from multiple hosts simultaneously, reducing collection time from hours to minutes.

Myth: "You must follow every phase of the collection procedure in order, or the evidence is compromised."

Reality: The sequence represents the optimal order for maximum evidence preservation — most volatile first, least volatile last. But real incidents rarely allow the optimal path. If the attacker is actively exfiltrating data, you may jump directly to Phase C (volatile collection) to capture their activity, skipping Phase B (memory) entirely. If you only have cloud API access and no SSH, you skip Phases B, C, and F and rely on the disk snapshot and audit trail. The sequence is a framework, not a rigid script. What matters is that you document what you collected, what you skipped, and why. An investigation with Phases A, C, D, and H (skipping memory, persistent collection, and disk imaging) is a valid investigation if you document the decisions and their rationale.
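Documenting what was collected, what was skipped, and why is easier when every entry has the same shape. The following is a minimal sketch of such a log helper; the function name `log_phase`, the status words, and the pipe-separated layout are assumptions, not a format the course prescribes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append one timestamped decision to the collection log.
# Arguments: log file, phase letter, status (e.g. DONE or SKIPPED), optional reason.
log_phase() {
  local log_file="$1" phase="$2" status="$3" reason="${4:-}"
  printf '%s | Phase %s | %s | %s\n' \
    "$(date -u +"%Y-%m-%dT%H:%M:%SZ")" "$phase" "$status" "$reason" >> "$log_file"
}

# Example use (path is a placeholder):
# log_phase ~/cases/IR-2026-0402/BASTION-NGE01/notes/collection_log.md B SKIPPED "no LiME module for target kernel"
# log_phase ~/cases/IR-2026-0402/BASTION-NGE01/notes/collection_log.md C DONE "UAC ir_triage"
```

UTC timestamps match the format already used for the collection start and end lines.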

Try it yourself

Exercise

Create the case directory structure on your forensic workstation: mkdir -p ~/cases/TEST-001/TEST-VM/{volatile,logs,config,disk,memory,timeline,notes}. Run the Phase C manual collection against a test VM (not production). Time yourself. The first time takes 20–30 minutes. By the third practice run, you will complete it in 10–12 minutes. Speed matters in incident response — practice the procedure before you need it.

Beyond This Investigation

This collection sequence is the starting point for every investigation scenario in this course. LX4 through LX13 all begin with "evidence has been collected from the compromised system." The evidence they reference is the output of this procedure — the UAC triage data, the live response output, the log files, the disk image. When a scenario says "examine the process listing," it means the processes.txt or UAC live_response/process/ output from Phase C. When it says "analyze the authentication logs," it means the auth.log copies from Phase F.

Check your understanding:

1. Why is memory acquisition (Phase B) performed before live volatile collection (Phase C)?
2. The attacker is actively logged in and you see their session in who. Which phases do you prioritize, and which can you defer?
3. You are investigating a cloud VM but do not have SSH access, only AWS console access. Which phases can you still execute?
4. After completing the full collection sequence, how do you verify that no evidence files were modified during transfer to your forensic workstation?

Lab Exercise: LX01 — Evidence Collection

If you have the Linux IR Lab Pack installed, run the generator and practice the full evidence collection workflow: volatile state capture, log collection, filesystem timeline analysis, bash history mapping, auditd correlation, and chain of custody documentation. The generator creates ~3,500 lines of auth.log with attack indicators buried in 7 days of legitimate noise.

Verify: ./verification/Verify-LX01.sh [your-output-dir]

Decision point

You are investigating a Linux server and discover evidence of both a cryptominer (resource abuse) and an SSH key theft (lateral movement preparation). The cryptominer is consuming 95% CPU and impacting production. Which do you address first?

Address the lateral movement first. The cryptominer is visible, noisy, and contained to this server — it is causing performance impact but not spreading. The SSH key theft is silent, potentially already exploited, and may have given the attacker access to additional servers. Contain the lateral movement risk: rotate the stolen SSH keys, check the target servers for unauthorized access, and apply network restrictions. Then address the cryptominer: kill the process, remove the binary and persistence mechanisms. Prioritizing the noisy but contained threat over the silent but spreading threat is the most common Linux IR prioritization mistake.
