TR1.1 The Order of Volatility
Figure TR1.1 — The updated order of volatility. Five tiers from seconds (CPU registers, active sessions) to years (archival media). Triage focuses on Tiers 1-3. Investigation accesses Tiers 4-5. The triage responder who captures Tier 1 evidence within 15 minutes preserves data that cannot be recovered by any other means.
RFC 3227: what still holds, what has changed
RFC 3227’s core principle is timeless: collect the most volatile evidence first because it has the shortest lifespan. The RFC’s specific hierarchy — registers, cache, routing tables, memory, temporary filesystems, disk, remote logging, archival media — was designed for standalone servers in 2002. The hierarchy did not account for cloud environments, containerised workloads, or distributed identity systems because none of these existed at the scale they do today.
What still holds: the principle of proceeding from most volatile to least volatile. The insistence on not shutting down systems before evidence collection. The requirement to use trusted tools (not tools from the potentially compromised system). The documentation requirement for every collection step.
What has changed: the evidence categories themselves. A modern incident involves cloud API logs (which have their own volatility — access tokens expire in 1 hour, audit log entries rotate after 30-90 days), container layers (which are destroyed on container restart, not just host reboot), and distributed authentication state (a session token that exists in Microsoft’s cloud infrastructure, not on any system the responder controls).
The updated hierarchy adds three categories that RFC 3227 could not have anticipated: cloud session and token state (Tier 1 — tokens expire on schedules the responder cannot control), container runtime state (Tier 2 — lost on container restart, not just host reboot), and cloud audit streams (Tier 3 — persisted by the cloud provider but with finite retention that the responder must understand).
Tier 1 — Seconds to minutes
This is the evidence that is actively changing RIGHT NOW. Every second of delay reduces the fidelity of this evidence.
CPU registers and cache. The classical RFC 3227 Tier 1. In practice, triage responders do not capture registers and cache directly — this is hardware-level forensics beyond triage scope. However, the principle applies: the evidence closest to the processor (the currently executing instruction, the data being computed) is the most volatile.
Active network connections. TCP sessions, established connections to C2 servers, active DNS resolutions. On Windows: Get-NetTCPConnection or netstat -ano. On Linux: ss -antp or netstat -antp (avoid the -l variants, which list only listening sockets and would miss established connections). These connections disappear when the process that established them terminates, when the connection times out, or when the system’s network state changes. An active C2 beacon that connects every 60 seconds is visible in the connection table right now — but if the beacon process is killed or the system is isolated, the connection entry vanishes from the connection table within seconds.
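When the system's own `ss` or `netstat` binaries cannot be trusted, the same data can be read straight from the kernel. A minimal Linux-only sketch, reading /proc/net/tcp directly (state 01 in the fourth column means ESTABLISHED; the /tmp output path is illustrative):

```shell
#!/bin/sh
# Sketch: enumerate established TCP connections from the kernel's own
# /proc/net/tcp table, bypassing possibly trojaned userland binaries.
OUT=/tmp/ir_netconns.txt
{
  echo "captured_utc=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  for f in /proc/net/tcp /proc/net/tcp6; do
    # Fields: sl local_address rem_address st ...; st==01 -> ESTABLISHED.
    # Addresses are hex (little-endian IP:port) -- decode during analysis.
    [ -r "$f" ] && awk 'FNR>1 && $4=="01" {print $2, $3}' "$f"
  done
} > "$OUT"
wc -l < "$OUT"   # line count: header plus one line per connection
```

The hex addresses are left undecoded on purpose: triage captures raw evidence fast, and decoding is an analysis-phase task.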
Live session tokens (cloud). An attacker’s active Entra ID access token is valid for approximately 1 hour. The refresh token is valid for up to 90 days. If the responder captures the sign-in log entry NOW (which shows the token’s IP, device, and conditional access evaluation), the investigation team has the session’s full context. If the responder waits until the token expires, the session becomes a historical log entry — still queryable but without the real-time session state that allows immediate revocation.
Running process state. The process ID, its command line, its parent process, its open handles, its loaded modules, its network connections. This is the snapshot that tells the triage responder: what is this process doing RIGHT NOW? On Windows: Get-Process -IncludeUserName | Select Id, ProcessName, UserName, Path. On Linux: ps auxf shows the full process tree. MemProcFS mounts a memory dump as a filesystem where /pid/NNNN/ contains each process’s state. This evidence is destroyed by process termination, system reboot, or the attacker killing their own process after achieving their objective.
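On Linux the same snapshot can come from /proc itself, which trusts the kernel rather than a possibly replaced `ps` binary. A minimal sketch (output path illustrative) recording PID, parent PID, and command line for every running process:

```shell
#!/bin/sh
# Sketch: native process snapshot from /proc -- PID, PPID, command line.
OUT=/tmp/ir_processes.txt
: > "$OUT"
for d in /proc/[0-9]*; do
  pid=${d#/proc/}
  # PPid from the status file gives the parent for tree reconstruction.
  ppid=$(awk '/^PPid:/ {print $2}' "$d/status" 2>/dev/null)
  # cmdline is NUL-separated; kernel threads have an empty cmdline.
  cmd=$(tr '\0' ' ' < "$d/cmdline" 2>/dev/null)
  printf '%s\t%s\t%s\n' "$pid" "${ppid:-?}" "${cmd:-[kernel thread]}" >> "$OUT"
done
grep -c . "$OUT"   # number of processes captured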
Tier 2 — Minutes to hours
This evidence persists longer than Tier 1 but degrades as the system continues operating.
Full RAM contents. The complete memory dump — not just the running processes but every data structure, cached credential, decrypted payload, and network buffer in memory. On Windows: WinPMem, DumpIt, or Magnet RAM Capture produce a full memory dump. KAPE’s volatile collection target includes memory. On Linux: LiME (Linux Memory Extractor) loads as a kernel module and dumps physical memory to a file.
Memory is Tier 2 (not Tier 1) because the full dump takes time — 2-10 minutes depending on RAM size and dump tool speed. The Tier 1 process listing can be captured in seconds; the full memory dump requires minutes. Begin the memory dump as early as possible in the triage workflow, but do not delay Tier 1 captures waiting for it to complete.
Process table and kernel state. The complete list of all processes, their relationships, loaded kernel modules, mounted filesystems, and system configuration. On Windows: Sysinternals Process Explorer captures the full process tree with loaded modules. On Linux: /proc/ contains per-process state that persists as long as the process runs. Loaded kernel modules (lsmod on Linux) reveal rootkits that load as kernel modules to hide attacker processes.
Container runtime state. A running Docker container’s filesystem layers, environment variables, mounted volumes, and network configuration. docker inspect CONTAINER_ID captures this state. docker diff CONTAINER_ID shows which files the attacker modified in the container’s writable layer. On Kubernetes: kubectl describe pod POD_NAME captures the pod’s current configuration, including service account tokens and mounted secrets. Container state is destroyed when the container is restarted — and many orchestrators restart containers automatically on failure, meaning the attacker’s modifications may be wiped before the responder reaches the system.
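A sketch of capturing that state before an orchestrator restart wipes it. The output path is illustrative, and the script deliberately records the gap when Docker is absent, so the missing evidence is documented rather than silently skipped:

```shell
#!/bin/sh
# Sketch: preserve running-container state (config, env, writable-layer
# changes) before a restart destroys it. Falls back to a documented note
# when Docker is not installed.
OUT=/tmp/ir_containers.txt
: > "$OUT"
if command -v docker >/dev/null 2>&1; then
  docker ps -q | while read -r id; do
    echo "== container $id ==" >> "$OUT"
    docker inspect "$id" >> "$OUT"   # mounts, env vars, network config
    docker diff "$id" >> "$OUT"      # files changed in the writable layer
  done
else
  echo "docker not available at $(date -u +%FT%TZ); container state not captured" >> "$OUT"
fi
cat "$OUT"
```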
Routing and ARP cache. The system’s network routing table and ARP cache. The routing table shows how traffic is directed — an attacker who added routes to redirect traffic has modified this table. The ARP cache shows recent network communications at the link layer. Both are overwritten as network activity continues.
Tier 3 — Hours to days
Evidence that persists on disk or in cloud storage but has finite retention or rotation schedules.
Event logs (Windows). The Security, System, Application, PowerShell, and Sysmon event logs on Windows. These are stored on disk and survive reboots — but they rotate based on maximum log size. The default Security log maximum is 20 MB, which on a busy server may hold 24-48 hours of events. An attacker who clears the Security log (Event ID 1102) destroys this evidence — though Sentinel may have already ingested the events before clearing.
System logs (Linux). auth.log, syslog, kern.log, daemon.log. Rotation schedules vary: auth.log typically rotates weekly with 4 retained copies. An attacker who truncates auth.log (> /var/log/auth.log) eliminates authentication history. journald provides persistent logging if configured with Storage=persistent, but many default configurations use Storage=volatile (log to memory only — lost on reboot).
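The journald storage mode is worth checking during triage, since it decides whether the journal survives a reboot. A sketch that parses the Storage= setting — a sample config is written to /tmp here so the example is self-contained; in the field, point CONF at /etc/systemd/journald.conf:

```shell
#!/bin/sh
# Sketch: determine whether journald logging survives a reboot.
CONF=/tmp/sample_journald.conf
printf '[Journal]\n#Storage=auto\nStorage=volatile\n' > "$CONF"   # sample config
# Last uncommented Storage= line wins; absent means "auto" (persistent
# only if /var/log/journal exists on disk).
mode=$(awk -F= '/^Storage=/ {v=$2} END {print (v=="" ? "auto" : v)}' "$CONF")
echo "journald storage mode: $mode"
if [ "$mode" = "volatile" ]; then
  echo "WARNING: journal is memory-only and will be lost on reboot"
fi
```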
Cloud audit streams. Entra ID sign-in logs (7 days on the free tier, 30 days with P1/P2 natively, up to 2 years if streamed to Sentinel). Microsoft 365 audit logs (90 days standard, 1 year with E5, 10 years with Audit Premium). These are not volatile in the traditional sense — Microsoft persists them — but they have finite retention. An incident discovered after the retention period has no cloud audit evidence. The triage responder’s job is to ensure these logs are preserved: snapshot the relevant entries to a case file so the investigation has the data regardless of future retention expiry.
Prefetch and temporary files (Windows). Prefetch files record which executables ran and when — limited to 1,024 entries on Windows 10/11 before overwriting. Temporary files in %TEMP%, %APPDATA%, and the recycle bin may contain attacker tools, staged data, or intermediate outputs. These persist until overwritten by new temporary files or manually cleared.
The collection sequence for triage
The triage responder does not have time to methodically collect every tier in sequence. The practical approach:
First 2 minutes: Tier 1 captures using native commands. Process list (tasklist /v on Windows, ps auxf on Linux), network connections (netstat -ano / ss -antp), logged-in users (query user / w). In the cloud: run the 5-query triage pack from TR2.1 (SigninLogs, AuditLogs quick checks). These commands complete in seconds and provide the triage scorecard data for Q1-Q3.
Minutes 2-7: Tier 2 capture. Initiate the memory dump (WinPMem / LiME). While the dump runs, capture additional Tier 1/2 data: scheduled tasks, autorun entries, loaded modules. In the cloud: snapshot the full sign-in log and audit log for the affected user (last 48 hours) to the case folder.
Minutes 7-15: Complete the triage scorecard while the memory dump finishes. If classification is TP or probable TP, execute containment per the Triage Trinity. If the dump completes, verify the output file size matches expected RAM (a truncated dump indicates collection failure).
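The dump-size verification can be scripted. A sketch comparing the dump file against MemTotal — the dump path is illustrative, and a 1 MiB stand-in file is created here so the example runs without a real acquisition tool:

```shell
#!/bin/sh
# Sketch: sanity-check a completed memory dump against physical RAM size.
# A dump far smaller than MemTotal indicates a truncated collection.
DUMP=/tmp/ir_memdump.raw
dd if=/dev/zero of="$DUMP" bs=1024 count=1024 2>/dev/null   # 1 MiB stand-in
ram_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
dump_kb=$(( $(wc -c < "$DUMP") / 1024 ))
echo "RAM: ${ram_kb} KiB, dump: ${dump_kb} KiB"
# Flag dumps below ~90% of RAM (some reserved regions are legitimately skipped)
if [ "$dump_kb" -lt $(( ram_kb * 90 / 100 )) ]; then
  echo "WARNING: dump appears truncated -- retry, or preserve as a partial dump"
fi
```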
After minute 15: Tier 3 collection begins. KAPE collection on Windows (event logs, prefetch, registry hives, browser history). Log snapshots on Linux (auth.log, crontab, systemd). These persist long enough that the investigation team can collect them later if the triage responder runs out of time — but capturing them during triage is preferable because the attacker may clear logs during the delay.
Worked artifact: Triage evidence collection timeline
Minute 0-2 (Tier 1):
- tasklist /v > C:\IR\processes.txt
- netstat -ano > C:\IR\netstat.txt
- query user > C:\IR\sessions.txt
- KQL: SigninLogs triage query for affected user

Minute 2-7 (Tier 2):
- Start WinPMem: winpmem.exe C:\IR\memdump.raw
- schtasks /query /fo csv > C:\IR\tasks.csv
- autorunsc -a * -c > C:\IR\autoruns.csv
- KQL: AuditLogs snapshot (48h) for affected user

Minute 7-15 (Scorecard + containment):
- Complete 8-question scorecard
- If TP: execute containment
- Verify memory dump completion

Minute 15-30 (Tier 3):
- KAPE collection: kape.exe --tsource C: --tdest C:\IR\KAPE --target !SANS_Triage
- Cloud: snapshot OfficeActivity for affected user

Minute 30-60 (Report):
- Produce 15-minute triage report
- Document all evidence collected with timestamps and file hashes
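The final report step calls for timestamps and file hashes on every collected item. A sketch of that manifest on Linux — the /tmp/IR case directory and sample evidence files are illustrative stand-ins for real collection output:

```shell
#!/bin/sh
# Sketch: evidence manifest -- one line per collected file with UTC
# timestamp and SHA-256 hash, so integrity can be verified later.
IR=/tmp/IR; mkdir -p "$IR"
echo "pid list placeholder" > "$IR/processes.txt"   # stand-in evidence
echo "netstat placeholder"  > "$IR/netstat.txt"     # stand-in evidence
MANIFEST="$IR/manifest.txt"
: > "$MANIFEST"
for f in "$IR"/*.txt; do
  [ "$f" = "$MANIFEST" ] && continue   # do not hash the manifest itself
  printf '%s  %s  %s\n' "$(date -u +%FT%TZ)" \
    "$(sha256sum "$f" | cut -d' ' -f1)" "$f" >> "$MANIFEST"
done
cat "$MANIFEST"
```

Hashing at collection time, not later, is what lets the investigation team prove the evidence was not altered in the interval.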
Tier 4 — Days to weeks
Evidence that persists on disk and survives reboots but may be overwritten, deleted, or expire over longer time horizons.
Disk artifacts (Windows). The NTFS Master File Table ($MFT) records every file and directory including deleted entries — the MFT entry persists until the disk space is reallocated. Registry hives (SYSTEM, SOFTWARE, SAM, SECURITY, NTUSER.DAT) record configuration state, autorun entries, USB device history, network profiles, and user activity. Amcache records application execution history with SHA1 hashes — even for executables that have been deleted. Shimcache (AppCompatCache) records application compatibility data for executables that were accessed (not necessarily executed). SRUM (System Resource Usage Monitor) records per-application resource consumption over 30-60 days. Browser history and download records persist until the user or an automated cleanup process removes them.
These artifacts are collected by KAPE’s !SANS_Triage target (TR3.5) and analysed with Eric Zimmerman’s Tools during the investigation phase. For triage, Tier 4 artifacts provide the HISTORICAL context that Tier 1-3 captures do not: which executables ran last week (Amcache), which USB devices connected this month (registry), and which websites the user visited before the compromise (browser history). KAPE collects Tier 4 artifacts in 3-5 minutes alongside the Tier 3 event logs — there is no reason to skip Tier 4 collection during triage if KAPE is available.
Disk artifacts (Linux). The equivalent Linux Tier 4 artifacts are less structured than Windows: file timestamps (access, modify, change — but many Linux distributions mount with noatime by default, reducing access time utility), package installation logs (/var/log/dpkg.log on Debian, /var/log/yum.log on RHEL), wtmp/btmp (historical login records beyond the current auth.log), and application-specific logs in /var/log/ that rotate on weekly or monthly schedules.
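The three Linux file timestamps are read with `stat`, and the noatime caveat can be checked on the spot. A minimal sketch (the demo file path is illustrative):

```shell
#!/bin/sh
# Sketch: read the access/modify/change timestamps Tier 4 analysis relies
# on, then check mount options to see whether atime is trustworthy here.
F=/tmp/ir_timestamp_demo
echo "sample" > "$F"
stat -c 'atime=%x' "$F"   # last access (unreliable on noatime mounts)
stat -c 'mtime=%y' "$F"   # last content modification
stat -c 'ctime=%z' "$F"   # last metadata (inode) change
# Mounts with noatime stop updating access times entirely:
grep -w noatime /proc/mounts || echo "no noatime mounts found (atime may be usable)"
```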
Cloud sign-in and audit logs. Entra ID sign-in logs are retained for 30 days natively (P1/P2 licence) and up to 2 years if streamed to Sentinel or a storage account. Microsoft 365 unified audit logs are retained for 90 days (standard) or 1 year (E5) or 10 years (Audit Premium add-on). These retention periods define the maximum lookback for the investigation team — evidence older than the retention period is permanently lost unless it was previously exported or ingested into a SIEM with longer retention.
Sentinel ingested data. Data ingested into a Sentinel Log Analytics workspace is retained per the workspace’s retention configuration — default 90 days for interactive queries, with up to 12 years of archived data accessible via search jobs. The triage responder’s cloud evidence snapshots (TR1.2) ensure the investigation team has the data regardless of workspace retention changes.
Tier 5 — Weeks to years
The most persistent evidence category — data that is designed for long-term storage and is resistant to casual destruction.
Forensic images. A bit-for-bit disk image captured during the investigation phase preserves the complete state of the drive at the time of imaging. Forensic images are stored in the evidence archive indefinitely — they are the gold standard for legal proceedings because they capture everything, including deleted files, unallocated space, and file slack.
Backup data. Organisational backups (daily, weekly, monthly) may contain copies of files that have since been modified or deleted on the production system. Backup data from before the compromise provides the baseline for determining what the attacker changed — comparing the pre-compromise backup against the current state reveals every modification. Backup retention policies determine how far back the investigation can reach — an organisation with 30-day backup retention has a 30-day lookback window. An organisation with annual backups can look back 12 months.
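The backup-versus-current comparison reduces to a recursive diff. A sketch using two illustrative stand-in directories in place of a restored backup and a live filesystem export:

```shell
#!/bin/sh
# Sketch: compare a pre-compromise backup against current state to list
# exactly what changed -- modified files and attacker-added files.
BACKUP=/tmp/ir_backup; CURRENT=/tmp/ir_current
mkdir -p "$BACKUP" "$CURRENT"
echo "original config" > "$BACKUP/app.conf"; echo "original config" > "$CURRENT/app.conf"
echo "clean"           > "$BACKUP/cron.txt"; echo "clean + evil job" > "$CURRENT/cron.txt"
echo "dropped tool"    > "$CURRENT/t.exe"    # file the attacker added
# -r recurses, -q reports names only: "differ" = modified, "Only in" = added/removed
diff -rq "$BACKUP" "$CURRENT" > /tmp/ir_changes.txt
cat /tmp/ir_changes.txt
```

Unchanged files (app.conf above) produce no output, which is the point: the diff surfaces only the attacker's footprint.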
SIEM long-term retention. Sentinel’s archive tier retains data for up to 12 years. Other SIEMs (Splunk, Elastic) have configurable retention based on storage allocation. Long-term SIEM data enables the investigation team to search for indicators across months or years of historical data — identifying whether the attacker accessed the environment before the detected incident (a common finding: the attacker had access for months before the alert that triggered the investigation).
Physical media and archival storage. Tape backups, cold storage drives, and offline archives represent the least volatile evidence category. Physical media is resistant to remote destruction (the attacker cannot delete a tape stored in a vault) but may be slow to access and require specialised hardware.
The triage responder’s relationship with Tiers 4 and 5
The triage responder focuses on Tiers 1-3 because these contain the time-sensitive evidence that disappears fastest. Tiers 4 and 5 are the investigation team’s domain — they have the tools, time, and expertise to analyse disk forensic artifacts, review backup comparisons, and search long-term SIEM archives. However, the triage responder’s preservation actions PROTECT Tier 4 evidence: isolating an endpoint (rather than reimaging it) preserves the disk artifacts for the investigation team. Capturing a KAPE collection during triage collects Tier 4 artifacts alongside Tier 3 logs, saving the investigation team the re-collection effort.
The principle: the triage responder collects Tiers 1-3 actively (running commands, dumping memory, exporting logs). The triage responder protects Tiers 4-5 passively (by not reimaging, not powering off, not overwriting disk data with new installations). The investigation team collects and analyses Tiers 4-5 during the investigation phase.
Try it: map your evidence to the volatility tiers
For your environment, identify one evidence source in each tier. Tier 1: what is the most volatile data you would need to capture in the first 2 minutes? Tier 2: what requires a dedicated tool to capture (memory dump, container state)? Tier 3: what persists on disk but has a rotation schedule you should know? For each, note: can you capture it with native tools, or do you need to pre-stage a tool? If a tool is required, is it already deployed on your systems, or would you need to transfer it during the incident? The answers determine your triage readiness.
The myth: Because cloud logs are stored by Microsoft, they are always available. There is no urgency to capture cloud evidence during triage — it will be there when the investigation team gets to it.
The reality: Cloud evidence is persisted by Microsoft, but with finite retention. Entra ID sign-in logs: 30 days on P1/P2 natively. Microsoft 365 audit logs: 90 days standard. If the incident is discovered 45 days after initial access, the sign-in evidence for the first 15 days has already been purged from native retention. Sentinel extends this if logs were ingested, but only for the tables and time ranges configured in the workspace. The triage responder’s snapshot — copying the relevant log entries to the case folder during triage — ensures the investigation has the data regardless of future retention expiry. Cloud evidence is less volatile than memory, but it is not permanent.
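The retention arithmetic in the example is simple enough to script as a readiness check. A sketch using the numbers from the text:

```shell
#!/bin/sh
# Sketch: given a discovery delay and a native retention window, compute
# how many days of sign-in evidence have already been purged.
discovery_delay_days=45     # incident found 45 days after initial access
native_retention_days=30    # Entra ID P1/P2 sign-in log retention
purged=$(( discovery_delay_days - native_retention_days ))
[ "$purged" -lt 0 ] && purged=0
echo "Days of sign-in evidence already purged: $purged"   # prints 15
```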
Troubleshooting
“I do not have WinPMem or LiME pre-staged — can I still collect Tier 2 evidence?” You can collect most Tier 2 evidence with native tools — process lists, network state, routing tables, and loaded modules are all accessible via PowerShell (Windows) or /proc (Linux). The full memory dump is the one Tier 2 artifact that requires a dedicated tool. If no memory acquisition tool is available, capture everything else in Tier 2 and document that the memory dump was not collected. The investigation team will work with the available evidence. Pre-staging memory acquisition tools is covered in TR9.4 (the “go bag” concept).
“The memory dump file is smaller than the system’s physical RAM.” The dump was interrupted or the tool encountered an error. Common causes: insufficient disk space on the output drive, access denied on specific memory regions (secure kernel), or the dump tool was terminated by an antivirus product. Retry with a different output location (USB drive, network share). If the partial dump is all you can capture, preserve it — a partial dump is better than no dump.
You're reading the free modules of this course
The full course continues with advanced topics, production detection rules, worked investigation scenarios, and deployable artifacts. Premium subscribers get access to all courses.