LX1.10 Collection Scripting and Automation
Collection Scripting: Automating the Evidence Capture
Why Automate Collection
Manual evidence collection — typing each command individually during an incident — is slow, error-prone, and inconsistent. At 03:00 with adrenaline running, you will forget a command, type a path wrong, or collect items out of order. A script runs the same commands in the same order every time, captures all output to files, generates hashes automatically, and completes in a fraction of the time.
The script does not replace understanding. You must know what each command does, what the output means, and when to deviate from the script (because every incident is different). But the script provides the baseline — the standard collection that ensures you never miss the fundamentals while you focus on the investigation-specific decisions.
The Live Response Collection Script
#!/bin/bash
# live-response.sh — Linux Live Response Collection Script
# Usage: sudo ./live-response.sh [output_directory]
# Run on the compromised system. Output to external media.
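# No -e here: many commands below are expected to fail on some systems
# (missing log files, absent tools), and the collection must continue past them.
set -uo pipefail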
CASE_DIR="${1:-/tmp/lr-$(hostname)-$(date +%Y%m%d-%H%M%S)}"
mkdir -p "$CASE_DIR"/{volatile,logs,config,hashes}
log() { echo "[$(date -u +%H:%M:%S)] $*" | tee -a "$CASE_DIR/collection.log"; }
log "=== LIVE RESPONSE COLLECTION ==="
log "Host: $(hostname)"
log "Kernel: $(uname -r)"
log "Investigator: ${SUDO_USER:-$(whoami)}"  # under sudo, whoami alone reports root
log "Start: $(date -u)"
log "Output: $CASE_DIR"
# Phase 1: Timestamp and system info
log "Collecting system info..."
date -u > "$CASE_DIR/volatile/timestamp.txt"
uname -a > "$CASE_DIR/volatile/uname.txt"
uptime > "$CASE_DIR/volatile/uptime.txt"
cat /etc/os-release > "$CASE_DIR/volatile/os-release.txt" 2>/dev/null
hostnamectl > "$CASE_DIR/volatile/hostnamectl.txt" 2>/dev/null
# Phase 2: Logged-in users
log "Collecting user sessions..."
who > "$CASE_DIR/volatile/who.txt"
w > "$CASE_DIR/volatile/w.txt"
last -50 > "$CASE_DIR/volatile/last-50.txt"
lastb -50 > "$CASE_DIR/volatile/lastb-50.txt" 2>/dev/null
# Phase 3: Running processes
log "Collecting process data..."
ps auxf > "$CASE_DIR/volatile/ps-auxf.txt"
ps -eo pid,ppid,user,stat,%cpu,%mem,vsz,rss,tty,start_time,time,comm,args \
    --sort=-%cpu > "$CASE_DIR/volatile/ps-detailed.txt"
# /proc deep read — bypasses rootkits
log "Collecting /proc data (rootkit-resistant)..."
for pid in /proc/[0-9]*/; do
    p=$(basename "$pid")
    {
        echo "=== PID $p ==="
        echo "CMDLINE: $(cat /proc/$p/cmdline 2>/dev/null | tr '\0' ' ')"
        echo "EXE: $(readlink -f /proc/$p/exe 2>/dev/null)"
        echo "CWD: $(readlink -f /proc/$p/cwd 2>/dev/null)"
        echo "USER: $(stat -c '%U' /proc/$p 2>/dev/null)"
        echo "---"
    } >> "$CASE_DIR/volatile/proc-deep.txt" 2>/dev/null
done
# Phase 4: Network state
log "Collecting network state..."
ss -tlnp > "$CASE_DIR/volatile/ss-tcp-listen.txt"
ss -tnp > "$CASE_DIR/volatile/ss-tcp-established.txt"
ss -ulnp > "$CASE_DIR/volatile/ss-udp-listen.txt"
cat /proc/net/tcp > "$CASE_DIR/volatile/proc-net-tcp.txt"
cat /proc/net/tcp6 > "$CASE_DIR/volatile/proc-net-tcp6.txt" 2>/dev/null
ip addr > "$CASE_DIR/volatile/ip-addr.txt"
ip route > "$CASE_DIR/volatile/ip-route.txt"
ip neigh > "$CASE_DIR/volatile/ip-neigh.txt"
cat /etc/resolv.conf > "$CASE_DIR/volatile/resolv.conf" 2>/dev/null
iptables-save > "$CASE_DIR/volatile/iptables.txt" 2>/dev/null
nft list ruleset > "$CASE_DIR/volatile/nftables.txt" 2>/dev/null
# Phase 5: Open files and deleted files
log "Collecting open files..."
lsof -i > "$CASE_DIR/volatile/lsof-network.txt" 2>/dev/null
lsof +L1 > "$CASE_DIR/volatile/lsof-deleted.txt" 2>/dev/null
lsof +D /tmp > "$CASE_DIR/volatile/lsof-tmp.txt" 2>/dev/null
lsof +D /dev/shm > "$CASE_DIR/volatile/lsof-devshm.txt" 2>/dev/null
# Phase 6: Kernel modules
log "Collecting kernel module data..."
lsmod > "$CASE_DIR/volatile/lsmod.txt"
dmesg > "$CASE_DIR/volatile/dmesg.txt" 2>/dev/null
# Phase 7: Volatile filesystem contents
log "Collecting staging area contents..."
ls -laR /tmp/ > "$CASE_DIR/volatile/tmp-listing.txt" 2>/dev/null
ls -laR /dev/shm/ > "$CASE_DIR/volatile/devshm-listing.txt" 2>/dev/null
find / -mtime -1 -type f -not -path "/proc/*" -not -path "/sys/*" \
    -not -path "/run/*" 2>/dev/null | head -500 \
    > "$CASE_DIR/volatile/recent-files-24h.txt"
find / -perm -4000 -type f 2>/dev/null \
    > "$CASE_DIR/volatile/suid-files.txt"
# Phase 8: Key log files
log "Collecting log files..."
cp -a /var/log/auth.log* "$CASE_DIR/logs/" 2>/dev/null
cp -a /var/log/secure* "$CASE_DIR/logs/" 2>/dev/null
cp -a /var/log/syslog* "$CASE_DIR/logs/" 2>/dev/null
cp -a /var/log/messages* "$CASE_DIR/logs/" 2>/dev/null
cp -a /var/log/kern.log* "$CASE_DIR/logs/" 2>/dev/null
cp -a /var/log/audit/audit.log* "$CASE_DIR/logs/" 2>/dev/null
cp -a /var/log/wtmp* "$CASE_DIR/logs/" 2>/dev/null
cp -a /var/log/btmp* "$CASE_DIR/logs/" 2>/dev/null
cp -a /var/log/lastlog "$CASE_DIR/logs/" 2>/dev/null
cp -ra /var/log/journal/ "$CASE_DIR/logs/journal/" 2>/dev/null
# Phase 9: Persistence locations
log "Collecting persistence artifacts..."
cp -a /etc/crontab "$CASE_DIR/config/" 2>/dev/null
cp -ra /etc/cron.d/ "$CASE_DIR/config/cron.d/" 2>/dev/null
cp -ra /var/spool/cron/ "$CASE_DIR/config/spool-cron/" 2>/dev/null
ls -la /etc/systemd/system/*.service > "$CASE_DIR/config/systemd-services.txt" 2>/dev/null
cp -a /etc/systemd/system/*.service "$CASE_DIR/config/" 2>/dev/null
cat /etc/ld.so.preload > "$CASE_DIR/config/ld-so-preload.txt" 2>/dev/null
cp -a /etc/passwd "$CASE_DIR/config/"
cp -a /etc/shadow "$CASE_DIR/config/" 2>/dev/null
cp -a /etc/sudoers "$CASE_DIR/config/" 2>/dev/null
cp -a /etc/ssh/sshd_config "$CASE_DIR/config/" 2>/dev/null
# Phase 10: User artifacts (all home directories)
log "Collecting user artifacts..."
for home in /home/* /root; do
    [ -d "$home" ] || continue   # skip the literal /home/* when /home is empty
    user=$(basename "$home")
    udir="$CASE_DIR/config/users/$user"
    mkdir -p "$udir"
    cp -a "$home/.bash_history" "$udir/" 2>/dev/null
    cp -a "$home/.ssh/authorized_keys" "$udir/" 2>/dev/null
    cp -a "$home/.ssh/known_hosts" "$udir/" 2>/dev/null
    stat "$home/.bash_history" > "$udir/bash_history_stat.txt" 2>/dev/null
done
# Generate hashes
log "Generating evidence hashes..."
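# Hash relative paths so the manifest still verifies after the evidence is copied
# elsewhere. collection.log is excluded because it is appended to after this step.
( cd "$CASE_DIR" && find . -type f -not -path "./hashes/*" -not -name collection.log \
    -exec sha256sum {} + ) > "$CASE_DIR/hashes/evidence_hashes.sha256"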
log "=== COLLECTION COMPLETE ==="
log "End: $(date -u)"
log "Evidence directory: $CASE_DIR"
log "Files collected: $(find "$CASE_DIR" -type f | wc -l)"
log "Total size: $(du -sh "$CASE_DIR" | cut -f1)"

# Deploy and run remotely via SSH
scp live-response.sh investigator@target:/tmp/
ssh investigator@target "sudo /tmp/live-response.sh /tmp/evidence"
# Copy results back
scp -r investigator@target:/tmp/evidence/ ~/cases/IR-2026-0402/BASTION-NGE01/
Myth: "An automated collection script is less forensically sound than manual collection because the investigator did not personally execute each command."
Reality: The opposite is true. A scripted collection executes the same commands in the same order every time, with automatic logging and hashing — eliminating the human errors (missed commands, wrong paths, forgotten hashes) that plague manual collection under pressure. Courts and forensic standards evaluate whether the collection was documented, repeatable, and complete — a script that logs every action and hashes every file meets all three criteria more reliably than manual execution. The investigator's expertise is applied to reviewing the script's output and making investigation-specific decisions, not to typing commands at 03:00 with shaking hands.
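One way to make "logs every action" literal is to route each collection command through a small wrapper that records the command, its exit status, and its error output. The sketch below is an illustration only: the run function and the errors.log file are hypothetical names that do not appear in the script above, and the demo writes to a throwaway directory unless CASE_DIR is already set.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script: run one collection
# command, save stdout to an evidence file, and record stderr and the exit
# status instead of discarding them.
CASE_DIR="${CASE_DIR:-$(mktemp -d)}"
mkdir -p "$CASE_DIR/volatile"

run() {
    local outfile="$1"; shift
    local rc=0
    # stdout becomes the evidence file; stderr is kept for later review
    "$@" > "$CASE_DIR/volatile/$outfile" 2>> "$CASE_DIR/errors.log" || rc=$?
    printf '[%s] rc=%d cmd=%s\n' "$(date -u +%H:%M:%S)" "$rc" "$*" \
        >> "$CASE_DIR/collection.log"
    return 0   # one failed command should not stop the collection
}

run uname.txt uname -a
run missing.txt cat /nonexistent/file   # the failure is logged, not hidden
```

With this pattern a reviewer can tell the difference between "the file did not exist" and "the command was never run", which a bare 2>/dev/null cannot show.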
Try it yourself
Copy the live response script above to your forensic workstation. Run it against a lab VM: ssh investigator@bastion-nge01 "sudo bash -s" < live-response.sh. Examine the output directory structure. Check the collection log for timing — how long did each phase take? Check the hash file — verify a few hashes manually with sha256sum. This script is now part of your IR toolkit.
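Rather than checking hashes one at a time, you can re-verify the whole manifest in one pass. This is a minimal sketch under one assumption: the manifest stores paths relative to the evidence directory. A manifest holding absolute paths from the target would need its prefix rewritten first. verify_evidence is a hypothetical helper name, and the demo builds its own throwaway evidence directory so it does not depend on a real collection.

```shell
#!/bin/bash
# verify_evidence DIR: re-check every hash in DIR/hashes/evidence_hashes.sha256.
# Assumes the manifest stores paths relative to DIR (an assumption of this sketch).
verify_evidence() {
    local dir="$1"
    ( cd "$dir" && sha256sum --check --quiet hashes/evidence_hashes.sha256 )
}

# Demo on a throwaway directory that mimics the collection layout.
demo=$(mktemp -d)
mkdir -p "$demo/volatile" "$demo/hashes"
echo "sample output" > "$demo/volatile/who.txt"
( cd "$demo" && find . -type f -not -path "./hashes/*" -exec sha256sum {} + ) \
    > "$demo/hashes/evidence_hashes.sha256"

verify_evidence "$demo" && echo "intact"

# Simulate a change after collection: verification now fails.
echo "tampered" >> "$demo/volatile/who.txt"
verify_evidence "$demo" || echo "modified since collection"
```

A non-zero exit status from sha256sum --check means at least one file no longer matches its recorded hash, so the function can be used directly in an if statement on the analysis workstation.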
Beyond This Investigation
The collection script from this subsection is the automated version of the manual procedures taught throughout LX1. Every scenario module (LX4–LX13) assumes evidence was collected using either this script, UAC, or the equivalent manual commands. Customize the script for your infrastructure: add distribution-specific paths (LX0.7), add environment-specific collection (cloud API calls from LX1.3, container commands from LX1.8), and add organization-specific evidence targets (application-specific log files, custom monitoring output).
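As a starting point for that customization, the sketch below shows one possible shape for an extra phase that collects site-specific files. Everything in it is an assumption for illustration: collect_if_exists, the logs/custom directory, errors.log, and the example paths are placeholders, not part of the script above.

```shell
#!/bin/bash
# Hypothetical extension phase: copy site-specific evidence only when it exists,
# and record what was skipped so gaps are visible in the collection log.
CASE_DIR="${CASE_DIR:-$(mktemp -d)}"
mkdir -p "$CASE_DIR/logs/custom"

collect_if_exists() {
    local src
    for src in "$@"; do
        if [ -e "$src" ]; then
            cp -a "$src" "$CASE_DIR/logs/custom/" 2>> "$CASE_DIR/errors.log" \
                && echo "collected $src" >> "$CASE_DIR/collection.log"
        else
            echo "skipped $src (not present)" >> "$CASE_DIR/collection.log"
        fi
    done
}

# Placeholder targets: replace with the paths that matter in your environment.
collect_if_exists /var/log/dpkg.log /var/log/dnf.log /opt/myapp/logs/app.log
```

Recording skipped paths keeps one script usable across distributions, because a missing Debian-only or RHEL-only file shows up as an explicit log entry rather than a silent gap.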
Check your understanding:
1. Why does the script redirect errors to /dev/null on many commands, and what risk does this create?
2. The script runs find / -mtime -1 to locate recently modified files. What evidence does this provide, and what are its limitations?
3. You need to run the collection script on 8 compromised servers. How would you modify the deployment approach to collect from all 8 efficiently?
4. The script collects /etc/shadow. Why is this forensically relevant, and what security consideration applies to handling this file?
Lab Exercise: Production DFIR Collection Script
The lab pack includes a 148-line production-ready collection script (artifacts/linux-dfir-collection.sh) that automates 12 collection phases: system info, user sessions, processes, network, open files, scheduled tasks, services, kernel modules, SSH artifacts, logs, history files, and package integrity. Compare it with the script you built in this subsection.
You are investigating a Linux server and discover evidence of both a cryptominer (resource abuse) and an SSH key theft (lateral movement preparation). The cryptominer is consuming 95% CPU and impacting production. Which do you address first?
Address the lateral movement first. The cryptominer is visible, noisy, and contained to this server: it is causing performance impact but not spreading. The SSH key theft is silent, potentially already exploited, and may have given the attacker access to additional servers. Contain the lateral movement risk: rotate the stolen SSH keys, check the target servers for unauthorized access, and apply network restrictions. Then address the cryptominer: kill the process, then remove the binary and its persistence mechanisms. Prioritizing the noisy but contained threat over the silent but spreading threat is a common prioritization mistake in Linux IR.