LX1.10 Collection Scripting and Automation

3-4 hours · Module 1 · Free

Collection Scripting: Automating the Evidence Capture

Learning objective: Build automated collection scripts that execute the evidence gathering procedures from this module — consistently, completely, and fast. Understand when to use automated scripts vs manual commands, how to handle errors during automated collection, and how to integrate collection scripts with your IR toolkit so they are ready to deploy instantly when an incident is confirmed.

Why Automate Collection

Manual evidence collection — typing each command individually during an incident — is slow, error-prone, and inconsistent. At 03:00 with adrenaline running, you will forget a command, type a path wrong, or collect items out of order. A script runs the same commands in the same order every time, captures all output to files, generates hashes automatically, and completes in a fraction of the time.

The script does not replace understanding. You must know what each command does, what the output means, and when to deviate from the script (because every incident is different). But the script provides the baseline — the standard collection that ensures you never miss the fundamentals while you focus on the investigation-specific decisions.

The Live Response Collection Script

This script executes the complete live response sequence from LX1.2, saves all output to a structured directory, and generates SHA256 hashes for every evidence file.

#!/bin/bash
# live-response.sh — Linux Live Response Collection Script
# Usage: sudo ./live-response.sh [output_directory]
# Run on the compromised system. Pass an output directory on external media;
# the /tmp default writes to the suspect disk and is a last resort.

# No -e here: a missing command or log file must not abort the rest of the collection.
set -uo pipefail

CASE_DIR="${1:-/tmp/lr-$(hostname)-$(date +%Y%m%d-%H%M%S)}"
mkdir -p "$CASE_DIR"/{volatile,logs,config,hashes}

log() { echo "[$(date -u +%H:%M:%S)] $*" | tee -a "$CASE_DIR/collection.log"; }

log "=== LIVE RESPONSE COLLECTION ==="
log "Host: $(hostname)"
log "Kernel: $(uname -r)"
# Under sudo, whoami reports root; SUDO_USER holds the account that invoked sudo.
log "Investigator: ${SUDO_USER:-$(whoami)}"
log "Start: $(date -u)"
log "Output: $CASE_DIR"

# Phase 1: Timestamp and system info
log "Collecting system info..."
date -u > "$CASE_DIR/volatile/timestamp.txt"
uname -a > "$CASE_DIR/volatile/uname.txt"
uptime > "$CASE_DIR/volatile/uptime.txt"
cat /etc/os-release > "$CASE_DIR/volatile/os-release.txt" 2>/dev/null
hostnamectl > "$CASE_DIR/volatile/hostnamectl.txt" 2>/dev/null

# Phase 2: Logged-in users
log "Collecting user sessions..."
who > "$CASE_DIR/volatile/who.txt"
w > "$CASE_DIR/volatile/w.txt"
last -50 > "$CASE_DIR/volatile/last-50.txt"
lastb -50 > "$CASE_DIR/volatile/lastb-50.txt" 2>/dev/null

# Phase 3: Running processes
log "Collecting process data..."
ps auxf > "$CASE_DIR/volatile/ps-auxf.txt"
ps -eo pid,ppid,user,stat,%cpu,%mem,vsz,rss,tty,start_time,time,comm,args \
  --sort=-%cpu > "$CASE_DIR/volatile/ps-detailed.txt"

# /proc deep read: sidesteps a trojanized ps binary (a kernel-level rootkit can still hide PIDs)
log "Collecting /proc data (rootkit-resistant)..."
for pid in /proc/[0-9]*/; do
  p=$(basename "$pid")
  {
    echo "=== PID $p ==="
    echo "CMDLINE: $(cat /proc/$p/cmdline 2>/dev/null | tr '\0' ' ')"
    echo "EXE: $(readlink -f /proc/$p/exe 2>/dev/null)"
    echo "CWD: $(readlink -f /proc/$p/cwd 2>/dev/null)"
    echo "USER: $(stat -c '%U' /proc/$p 2>/dev/null)"
    echo "---"
  } >> "$CASE_DIR/volatile/proc-deep.txt" 2>/dev/null
done

# Phase 4: Network state
log "Collecting network state..."
ss -tlnp > "$CASE_DIR/volatile/ss-tcp-listen.txt"
ss -tnp > "$CASE_DIR/volatile/ss-tcp-established.txt"
ss -ulnp > "$CASE_DIR/volatile/ss-udp-listen.txt"
cat /proc/net/tcp > "$CASE_DIR/volatile/proc-net-tcp.txt"
cat /proc/net/tcp6 > "$CASE_DIR/volatile/proc-net-tcp6.txt" 2>/dev/null
ip addr > "$CASE_DIR/volatile/ip-addr.txt"
ip route > "$CASE_DIR/volatile/ip-route.txt"
ip neigh > "$CASE_DIR/volatile/ip-neigh.txt"
cat /etc/resolv.conf > "$CASE_DIR/volatile/resolv.conf" 2>/dev/null
iptables-save > "$CASE_DIR/volatile/iptables.txt" 2>/dev/null
nft list ruleset > "$CASE_DIR/volatile/nftables.txt" 2>/dev/null

# Phase 5: Open files and deleted files
log "Collecting open files..."
lsof -i > "$CASE_DIR/volatile/lsof-network.txt" 2>/dev/null
lsof +L1 > "$CASE_DIR/volatile/lsof-deleted.txt" 2>/dev/null
lsof +D /tmp > "$CASE_DIR/volatile/lsof-tmp.txt" 2>/dev/null
lsof +D /dev/shm > "$CASE_DIR/volatile/lsof-devshm.txt" 2>/dev/null

# Phase 6: Kernel modules
log "Collecting kernel module data..."
lsmod > "$CASE_DIR/volatile/lsmod.txt"
dmesg > "$CASE_DIR/volatile/dmesg.txt" 2>/dev/null

# Phase 7: Volatile filesystem contents
log "Collecting staging area contents..."
ls -laR /tmp/ > "$CASE_DIR/volatile/tmp-listing.txt" 2>/dev/null
ls -laR /dev/shm/ > "$CASE_DIR/volatile/devshm-listing.txt" 2>/dev/null
find / -mtime -1 -type f -not -path "/proc/*" -not -path "/sys/*" \
  -not -path "/run/*" 2>/dev/null | head -500 \
  > "$CASE_DIR/volatile/recent-files-24h.txt"
find / -perm -4000 -type f 2>/dev/null \
  > "$CASE_DIR/volatile/suid-files.txt"

# Phase 8: Key log files
log "Collecting log files..."
cp -a /var/log/auth.log* "$CASE_DIR/logs/" 2>/dev/null
cp -a /var/log/secure* "$CASE_DIR/logs/" 2>/dev/null
cp -a /var/log/syslog* "$CASE_DIR/logs/" 2>/dev/null
cp -a /var/log/messages* "$CASE_DIR/logs/" 2>/dev/null
cp -a /var/log/kern.log* "$CASE_DIR/logs/" 2>/dev/null
cp -a /var/log/audit/audit.log* "$CASE_DIR/logs/" 2>/dev/null
cp -a /var/log/wtmp* "$CASE_DIR/logs/" 2>/dev/null
cp -a /var/log/btmp* "$CASE_DIR/logs/" 2>/dev/null
cp -a /var/log/lastlog "$CASE_DIR/logs/" 2>/dev/null
cp -ra /var/log/journal/ "$CASE_DIR/logs/journal/" 2>/dev/null

# Phase 9: Persistence locations
log "Collecting persistence artifacts..."
cp -a /etc/crontab "$CASE_DIR/config/" 2>/dev/null
cp -ra /etc/cron.d/ "$CASE_DIR/config/cron.d/" 2>/dev/null
cp -ra /var/spool/cron/ "$CASE_DIR/config/spool-cron/" 2>/dev/null
ls -la /etc/systemd/system/*.service > "$CASE_DIR/config/systemd-services.txt" 2>/dev/null
cp -a /etc/systemd/system/*.service "$CASE_DIR/config/" 2>/dev/null
cat /etc/ld.so.preload > "$CASE_DIR/config/ld-so-preload.txt" 2>/dev/null
cp -a /etc/passwd "$CASE_DIR/config/"
cp -a /etc/shadow "$CASE_DIR/config/" 2>/dev/null
cp -a /etc/sudoers "$CASE_DIR/config/" 2>/dev/null
cp -a /etc/ssh/sshd_config "$CASE_DIR/config/" 2>/dev/null

# Phase 10: User artifacts (all home directories)
log "Collecting user artifacts..."
for home in /home/* /root; do
  user=$(basename "$home")
  udir="$CASE_DIR/config/users/$user"
  mkdir -p "$udir"
  cp -a "$home/.bash_history" "$udir/" 2>/dev/null
  cp -a "$home/.ssh/authorized_keys" "$udir/" 2>/dev/null
  cp -a "$home/.ssh/known_hosts" "$udir/" 2>/dev/null
  stat "$home/.bash_history" > "$udir/bash_history_stat.txt" 2>/dev/null
done

# Generate hashes
log "Generating evidence hashes..."
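# collection.log is excluded: it is still appended to below, so its hash would never verify.
find "$CASE_DIR" -type f -not -path "*/hashes/*" -not -name collection.log \
  -exec sha256sum {} \; > "$CASE_DIR/hashes/evidence_hashes.sha256"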

log "=== COLLECTION COMPLETE ==="
log "End: $(date -u)"
log "Evidence directory: $CASE_DIR"
log "Files collected: $(find "$CASE_DIR" -type f | wc -l)"
log "Total size: $(du -sh "$CASE_DIR" | cut -f1)"

Deploying the Script

Store the collection script in three locations: on your forensic workstation (primary), on a USB drive in your IR kit (for physical access), and in a network share accessible from any server in your infrastructure (for rapid remote deployment).

# Deploy and run remotely via SSH
scp live-response.sh investigator@target:/tmp/
ssh investigator@target "sudo /tmp/live-response.sh /tmp/evidence"

# Copy results back
scp -r investigator@target:/tmp/evidence/ ~/cases/IR-2026-0402/BASTION-NGE01/
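
Once the results are back on the workstation, the hash manifest can confirm that nothing changed in transit. The manifest records absolute paths as they existed on the target, so it cannot be fed to sha256sum -c unchanged. The following is a minimal sketch, assuming GNU coreutils on the workstation; verify_evidence is a hypothetical helper name, not part of live-response.sh. It drops any collection.log entry, because the script keeps writing to that file after the manifest is generated.

```shell
# Sketch only: verify_evidence is a hypothetical helper, not part of live-response.sh.
# Usage: verify_evidence <local_copy_dir> <case_dir_used_on_target>
verify_evidence() {
  local local_dir="$1"
  local remote_dir="${2%/}"   # strip a trailing slash, e.g. /tmp/evidence/ -> /tmp/evidence

  if [ ! -r "$local_dir/hashes/evidence_hashes.sha256" ]; then
    echo "manifest not found under $local_dir/hashes" >&2
    return 2
  fi

  (
    cd "$local_dir" || exit 2
    # 1. Drop the collection.log entry (the log changes after hashing).
    # 2. Rewrite "<hash>  /tmp/evidence/x" to "<hash>  ./x".
    # 3. Let sha256sum re-hash the local copies and compare.
    sed -e '/\/collection\.log$/d' \
        -e "s#  ${remote_dir}/#  ./#" hashes/evidence_hashes.sha256 \
      | sha256sum -c --quiet -
  )
}
```

Call it with the local directory that now contains hashes/ and volatile/, plus the output directory that was passed to the script on the target, for example verify_evidence <local evidence directory> /tmp/evidence. A non-zero exit status means at least one file is missing or no longer matches its recorded hash.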

When to Deviate from the Script

The script captures the standard evidence set. Deviate when:

The triage decision framework (LX1.9) identifies a specific incident type that requires focused collection not covered by the standard script. Add the incident-specific commands to the end of the script or run them manually after the script completes.

The script encounters errors on specific commands (a command does not exist on this distribution, a directory does not exist, a permission is denied). The 2>/dev/null redirections discard the error messages from those commands, so collection.log will not show that they failed. After the run, check the output directory for empty or missing files and re-run the affected commands manually.

The attacker is actively deleting evidence while the script runs. If you observe evidence being destroyed (file sizes decreasing, log files being truncated), you may need to interrupt the script’s sequential execution and immediately capture the evidence under threat.
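
For the second case above, one option is to record failures rather than discard them. The following is a minimal sketch, not the course's script: run() and errors.log are hypothetical names introduced for illustration, and the CASE_DIR default exists only so the fragment runs on its own.

```shell
#!/bin/bash
# Sketch only: run() and errors.log are hypothetical additions to live-response.sh.
CASE_DIR="${CASE_DIR:-$(mktemp -d)}"
mkdir -p "$CASE_DIR/volatile"

log() { echo "[$(date -u +%H:%M:%S)] $*" | tee -a "$CASE_DIR/collection.log"; }

# run <output-file> <command> [args...]
# Saves stdout to the output file, appends stderr to errors.log, and writes a
# FAILED line to collection.log when the command exits non-zero.
run() {
  local out="$1"; shift
  local rc=0
  "$@" > "$out" 2>> "$CASE_DIR/errors.log" || rc=$?
  if [ "$rc" -ne 0 ]; then
    log "FAILED (exit $rc): $*"
  fi
  return 0   # never stop the collection because one command failed
}

# Example calls mirroring two lines of the script:
run "$CASE_DIR/volatile/who.txt" who
run "$CASE_DIR/volatile/nftables.txt" nft list ruleset
```

With a wrapper like this, a host without nft produces a FAILED (exit 127) entry in collection.log instead of a silently empty nftables.txt, and the original error text is kept in errors.log.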

Try it: Copy the live response script above to your forensic workstation. Run it against a lab VM: ssh investigator@bastion-nge01 "sudo bash -s" < live-response.sh. Examine the output directory structure. Check the collection log for timing — how long did each phase take? Check the hash file — verify a few hashes manually with sha256sum. This script is now part of your IR toolkit.
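
One way to approach the timing question is to subtract consecutive timestamps in collection.log. The following is a minimal sketch, assuming the log keeps the [HH:MM:SS] prefix written by the script's log() function; phase_durations is a hypothetical helper name. Resolution is one second, and each line is charged with the time until the next log line.

```shell
# Sketch only: phase_durations is a hypothetical helper.
# Usage: phase_durations <path-to-collection.log>
phase_durations() {
  awk '
    /^\[[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\] / {
      # Characters 2-3 = hours, 5-6 = minutes, 8-9 = seconds.
      t = substr($0, 2, 2) * 3600 + substr($0, 5, 2) * 60 + substr($0, 8, 2)
      if (have_prev) {
        d = t - prev_t
        if (d < 0) d += 86400   # the collection crossed midnight UTC
        printf "%5ds  %s\n", d, prev_msg
      }
      prev_t = t
      prev_msg = substr($0, 12)  # message text after "[HH:MM:SS] "
      have_prev = 1
    }
  ' "$1"
}
```

The final log line has no successor, so it is not printed. Header lines written within the same second show as 0s.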

Beyond This Investigation

The collection script from this subsection is the automated version of the manual procedures taught throughout LX1. Every scenario module (LX4–LX13) assumes evidence was collected using either this script, UAC, or the equivalent manual commands. Customize the script for your infrastructure: add distribution-specific paths (LX0.7), add environment-specific collection (cloud API calls from LX1.3, container commands from LX1.8), and add organization-specific evidence targets (application-specific log files, custom monitoring output).
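
Organization-specific targets can be kept in a single list so the core script stays unchanged. The following is a minimal sketch under stated assumptions: EXTRA_PATHS, collect_extra, and the extra/ subdirectory are hypothetical names, and the three paths are placeholders rather than recommendations. Paths are assumed to be absolute.

```shell
#!/bin/bash
# Sketch only: EXTRA_PATHS, collect_extra and extra/ are hypothetical additions.
CASE_DIR="${CASE_DIR:-$(mktemp -d)}"
mkdir -p "$CASE_DIR/extra"

# Placeholder targets: replace with the absolute paths that matter in your environment.
EXTRA_PATHS=(
  /var/log/nginx
  /var/log/apache2
  /opt/myapp/logs
)

collect_extra() {
  local src dest
  for src in "$@"; do
    [ -e "$src" ] || continue   # skip targets that do not exist on this host
    # Mirror the source path under extra/ so the origin of each file stays obvious.
    dest="$CASE_DIR/extra$(dirname "$src")"
    mkdir -p "$dest"
    cp -a "$src" "$dest/" 2>/dev/null \
      || echo "WARN: could not copy $src" >> "$CASE_DIR/collection.log"
  done
}

collect_extra "${EXTRA_PATHS[@]}"
```

If a block like this is added to the collection script, it belongs before the hash generation step so the extra files are included in the manifest.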

Check your understanding:

Why does the script redirect errors to /dev/null on many commands, and what risk does this create?
The script runs find / -mtime -1 to locate recently modified files. What evidence does this provide, and what are its limitations?
You need to run the collection script on 8 compromised servers. How would you modify the deployment approach to collect from all 8 efficiently?
The script collects /etc/shadow. Why is this forensically relevant, and what security consideration applies to handling this file?
