LX1.9 The Triage Decision Framework
Triage Decision Framework: What to Collect First and Why
The Triage Problem
In theory, you collect everything. In practice, you have constraints: the attacker may be active and you need to contain before they exfiltrate more data, the server is business-critical and must return to service within hours, the incident affects 15 servers and you are one investigator, or the container will restart in 3 minutes and you need to decide what to grab first.
Triage is the discipline of prioritizing evidence collection based on what is most volatile, most relevant to the investigation questions, and most at risk of destruction. An investigator who follows a rigid collection checklist regardless of circumstances collects evidence methodically but slowly. An investigator who triages effectively focuses on the evidence that answers the most critical questions first, then expands collection as time permits.
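The prioritization described above can be sketched as a small planning helper. This is a minimal illustration, not a prescribed method: the function names (`plan_items`, `plan_within_budget`), the volatility ranks, and the time estimates are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical collection plan: rank (1 = most volatile) | estimated seconds | label.
# Ranks and time estimates are illustrative assumptions, not measured values.
plan_items() {
  cat <<'EOF'
3|120|authentication logs
1|10|logged-in sessions and network connections
4|3600|full disk image
2|15|running process list
3|60|persistence locations (cron, systemd units)
EOF
}

# Print the items that fit into the given time budget (in seconds),
# taking the most volatile evidence first. Items that do not fit are skipped.
plan_within_budget() {
  local budget="$1"
  plan_items | sort -t'|' -k1,1n -k2,2n | awk -F'|' -v budget="$budget" '
    used + $2 <= budget { used += $2; print $1 ": " $3 " (" $2 "s)" }
  '
}

# Example: what can be collected if the container restarts in 3 minutes?
plan_within_budget 180
```

With a 180-second budget the sketch selects sessions and connections, the process list, and persistence locations, and defers the slower items. Expanding the budget expands the collection, which mirrors the "then expand as time permits" step.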
Decision Factor 1: Is the Attacker Currently Active?
# === RAPID TRIAGE SCRIPT — RUN THIS FIRST ON EACH SUSPECT HOST ===
# Step 1: Is the attacker currently active? (10 seconds)
echo "=== ACTIVE SESSIONS ==="
w # Who is logged in right now?
ss -tnp | grep ESTAB # Active network connections with process names
# Step 2: What processes are running? (10 seconds)
echo "=== SUSPICIOUS PROCESSES ==="
ps auxf | grep -vE '(root|www-data|syslog|daemon).*(/usr/|/lib/)' | head -20
# Look for: processes in /tmp, /dev/shm, or without full paths
# Step 3: Network connections to external IPs (10 seconds)
echo "=== OUTBOUND CONNECTIONS ==="
ss -tnp | awk '$5 !~ /^(127\.|10\.|172\.(1[6-9]|2[0-9]|3[01])\.|192\.168\.)/'
# Shows connections to non-RFC1918 addresses (potential C2)
# Step 4: Recently modified files in suspicious locations (15 seconds)
echo "=== RECENT SUSPICIOUS FILES ==="
find /tmp /dev/shm /var/tmp -type f -mtime -7 2>/dev/null
find /var/www -name "*.php" -mtime -7 2>/dev/null
# Step 5: Quick check for common persistence (15 seconds)
echo "=== PERSISTENCE INDICATORS ==="
ls -la /etc/cron.d/ /var/spool/cron/crontabs/ 2>/dev/null
systemctl list-units --type=service --state=running | grep -v systemd
# TOTAL: ~60 seconds per host
# DECISION: based on output, determine if attacker is active,
# what incident type this is, and which host to prioritize
Myth: "You must collect a full disk image from every compromised system, or the investigation is incomplete."
Reality: Full disk imaging is the gold standard for legal proceedings and deleted file recovery. For operational incident response — determining what happened, containing the threat, and restoring service — triage collection (UAC ir_triage or equivalent) is sufficient for 80% of investigations. The triage captures running processes, network state, authentication logs, persistence mechanisms, and key configuration files. Disk imaging adds unallocated space (deleted files) and comprehensive filesystem metadata (timeline generation). If triage answers the investigation questions, the disk image adds time cost without proportional evidence value. Collect based on what the investigation needs, not based on a theoretical maximum.
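The evidence categories a triage collection covers can be captured by hand when no collection tool is available. The following is a minimal sketch, not a replacement for UAC: the `collect` function, the `EVIDENCE_DIR` variable, and the directory naming are hypothetical, and it assumes GNU coreutils `sha256sum` is present on the host.

```shell
#!/usr/bin/env bash
# Minimal sketch: save each command's output to an evidence directory and
# record a SHA-256 hash for it. collect and EVIDENCE_DIR are hypothetical names.
EVIDENCE_DIR="${EVIDENCE_DIR:-./triage_$(uname -n)_$(date -u +%Y%m%dT%H%M%SZ)}"
mkdir -p "$EVIDENCE_DIR"

collect() {
  local name="$1"; shift
  local out="$EVIDENCE_DIR/$name.txt"
  # Keep stderr in the evidence file and note a non-zero exit status,
  # so a missing tool or permission error is visible later.
  "$@" >"$out" 2>&1 || echo "[exit status $?]" >>"$out"
  # Hash after the file is complete; the manifest uses relative file names.
  (cd "$EVIDENCE_DIR" && sha256sum "$name.txt" >>manifest.sha256)
}

# Most volatile first: sessions, sockets, processes, then persistence.
collect sessions  w
collect sockets   ss -tnp
collect processes ps auxf
collect crontabs  ls -la /etc/cron.d/ /var/spool/cron/crontabs/
collect services  systemctl list-units --type=service --state=running
```

Running `sha256sum -c manifest.sha256` inside the evidence directory later confirms the files have not changed since collection. Writing to local disk alters the system under investigation, so in practice the directory would usually point at mounted external storage or be streamed off the host.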
Try it yourself
Run a triage scenario. Set a timer for 10 minutes. Scenario: WEBSRV-NGE01 is showing suspicious outbound connections to an unknown IP on port 4444. The attacker may be active. You have SSH access. What do you collect in those 10 minutes? Write down your first 5 actions in order, with the exact commands. After the timer, review: did you capture the most volatile evidence first? Did you identify the attacker's process? Did you capture the C2 connection details? This exercise builds the muscle memory for real-time triage decisions.
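For the connection-capture part of a scenario like this one, a helper that separates private peers from external ones can be sketched as follows. The function names `is_private_ipv4` and `external_peers` are hypothetical, and the sketch handles IPv4 only, so IPv6 peers are passed through unfiltered.

```shell
#!/usr/bin/env bash
# Return 0 if the IPv4 address is loopback or in an RFC 1918 private range.
is_private_ipv4() {
  case "$1" in
    10.*|127.*|192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

# Read "ip:port" peers on stdin and print the ones that are not private.
external_peers() {
  local peer ip
  while IFS= read -r peer; do
    ip="${peer%:*}"            # strip the trailing :port
    is_private_ipv4 "$ip" || printf '%s\n' "$peer"
  done
}
```

One possible use is `ss -tn | awk 'NR > 1 && $1 == "ESTAB" {print $5}' | external_peers`, which lists established connections whose peer is outside the private ranges. A peer on port 4444 in that list would be the connection to record, along with its owning process from `ss -tnp`.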
Beyond This Investigation
The triage framework applies to every scenario module. LX4–LX13 each begin with an initial alert and a time-constrained collection phase. The triage decisions you make in that initial phase determine the quality of evidence available for the analysis that follows. Investigators who triage effectively have more evidence, better evidence, and faster investigation outcomes.
Check your understanding:
1. The attacker is currently active on a compromised server. You have SSH access and a pre-compiled LiME module. What are your first three collection actions?
2. You are the sole investigator for 8 compromised Linux servers. What collection strategy do you use, and why?
3. A web server compromise has been detected through an alert on outbound connections. You do not know the specific incident type yet. What is your initial evidence collection priority?
4. You have cloud console access but no SSH access to a compromised AWS EC2 instance. What evidence can you still collect, and what is the most critical gap?
You are investigating a Linux server and discover evidence of both a cryptominer (resource abuse) and an SSH key theft (lateral movement preparation). The cryptominer is consuming 95% CPU and impacting production. Which do you address first?
Address the lateral movement first. The cryptominer is visible, noisy, and contained to this server — it is causing performance impact but not spreading. The SSH key theft is silent, potentially already exploited, and may have given the attacker access to additional servers. Contain the lateral movement risk: rotate the stolen SSH keys, check the target servers for unauthorized access, and apply network restrictions. Then address the cryptominer: kill the process, remove the binary and persistence mechanisms. Prioritizing the noisy but contained threat over the silent but spreading threat is the most common Linux IR prioritization mistake.
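The "check the target servers for unauthorized access" step can start with a summary of successful public-key logins on each target. This is a rough sketch under stated assumptions: the function name `accepted_pubkey_logins` is hypothetical, and it assumes sshd writes its usual "Accepted publickey for USER from IP port N" message to a text log such as /var/log/auth.log (Debian and Ubuntu) or /var/log/secure (RHEL family).

```shell
#!/usr/bin/env bash
# Summarize successful SSH public-key logins in an auth log as
# "count user source_ip", most frequent first.
accepted_pubkey_logins() {
  local log="$1"
  awk '/sshd\[[0-9]+\]: Accepted publickey for / {
         # Locate the word "for": the user follows it, the IP is two fields later.
         for (i = 1; i <= NF; i++)
           if ($i == "for") { user = $(i + 1); ip = $(i + 3); break }
         print user, ip
       }' "$log" | sort | uniq -c | sort -rn
}
```

Calling `accepted_pubkey_logins /var/log/auth.log` on a target server gives a short list of user and source address pairs to compare against expected administrative access. Logins from the compromised host after the estimated time of key theft are the entries to examine first.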