Why Artifacts Matter More Than Tools
Figure WF0.1 — Three failure modes of tool-dependent forensic analysis. Parser errors produce wrong data. Version-specific behaviors produce misinterpreted data. Anti-forensic evasion produces manipulated data that tools report without question. Artifact-level understanding is the only defense against all three.
The tool is not the evidence
A forensic tool is a parser. It reads a binary structure, interprets the fields according to a specification, and outputs human-readable results — typically as a CSV, a table, or a formatted report. MFTECmd reads the Master File Table and outputs file records with timestamps, paths, sizes, and attributes. PECmd reads Prefetch files and outputs execution timestamps, run counts, and referenced files. AmcacheParser reads the Amcache.hve registry hive and outputs program execution records with SHA1 hashes and file paths.
The tool is not the evidence. The artifact is the evidence. The MFT record is the evidence. The Prefetch file is the evidence. The Amcache entry is the evidence. The tool is an intermediary that translates binary data into human-readable form. When the translation is accurate, the tool saves hours of manual analysis. When the translation is inaccurate — because the parser encountered an edge case it wasn't designed to handle, because the artifact format changed in a Windows update the parser hasn't accounted for, or because an attacker manipulated the artifact in a way the parser doesn't detect — the tool produces incorrect output that looks exactly like correct output. There is no warning. There is no error flag. The CSV row looks like every other row. The examiner incorporates it into their findings because they have no reason to doubt it.
This distinction matters in exactly four professional contexts: court testimony, regulatory notification, insurance claims, and internal investigations with HR or legal consequences. In each context, the examiner's findings will be examined by someone whose job is to find weaknesses. A defense attorney's expert witness. A regulator's technical reviewer. An insurance company's forensic assessor. Opposing counsel in an employment dispute. Each of these adversarial reviewers has the ability and motivation to examine the raw artifact and compare it to the examiner's interpretation. If the examiner's interpretation was based entirely on tool output, and the tool output was wrong, the finding collapses — and it takes every other finding in the report with it, because the examiner's methodology is now in question.
Three failure modes of tool-dependent analysis
The first failure mode is parser errors — the tool misinterprets the binary structure of the artifact. This happens most commonly with edge cases: MFT records with unusual attribute ordering, Prefetch files from uncommon Windows versions, registry values with non-standard encoding, Event Log records that span chunk boundaries. The parser was written to handle the common case correctly, and it does. But NTFS is a specification with decades of accumulated complexity, and forensic artifacts can exist in states that the parser's developer never encountered in testing. When the parser encounters these states, it either produces incorrect output silently or skips the record entirely.
Consider an MFT record with multiple $FILE_NAME attributes — one for the long filename and one for the short (8.3) filename. The common case has two $FN attributes: the long name first, the short name second. A parser that assumes this ordering and extracts the "primary" filename from the first $FN attribute works correctly in 99% of cases. But NTFS does not guarantee this ordering. In certain conditions — particularly after disk repair operations, file system migration, or certain backup/restore sequences — the short name can appear before the long name. A parser that assumes ordering extracts the 8.3 filename as the primary name: IMPORT~1.XLS instead of Important_Financial_Data_Q3_2026.xlsx. The examiner's report states the subject accessed a file named IMPORT~1.XLS. The defense's expert opens the same MFT record in a hex editor, identifies both $FN attributes, and correctly identifies the long filename. The examiner's credibility is damaged — not because the conclusion was wrong (the file was accessed), but because the evidence was presented incorrectly.
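The ordering pitfall above can be made concrete. The sketch below builds two synthetic $FILE_NAME content blobs (everything zeroed except the name-length and namespace bytes at offsets 0x40 and 0x41, so this is illustrative only, not a full $FN parser) and selects the primary name by namespace byte rather than by attribute order — the correct behavior regardless of which $FN appears first.

```python
# Namespace byte values in $FILE_NAME content (offset 0x41):
# 0 = POSIX, 1 = Win32 (long name), 2 = DOS (8.3 short name), 3 = Win32 & DOS
NS_DOS = 2

def build_fn_content(name: str, namespace: int) -> bytes:
    """Build a minimal synthetic $FN content blob (fixed 0x42-byte header
    zeroed except the name length and namespace bytes)."""
    fixed = bytearray(0x42)
    fixed[0x40] = len(name)      # name length in UTF-16 characters
    fixed[0x41] = namespace      # namespace byte
    return bytes(fixed) + name.encode("utf-16-le")

def primary_filename(fn_contents: list) -> str:
    """Pick the long filename by namespace, never by attribute order."""
    fallback = None
    for content in fn_contents:
        length = content[0x40]
        name = content[0x42:0x42 + length * 2].decode("utf-16-le")
        if content[0x41] != NS_DOS:   # any non-DOS namespace wins
            return name
        fallback = name               # keep 8.3 name only as a last resort
    return fallback

# Short name listed FIRST -- the ordering a naive parser mishandles
attrs = [
    build_fn_content("IMPORT~1.XLS", 2),
    build_fn_content("Important_Financial_Data_Q3_2026.xlsx", 1),
]
print(primary_filename(attrs))  # Important_Financial_Data_Q3_2026.xlsx
```

A parser that instead returned `fn_contents[0]`'s name would report IMPORT~1.XLS here — exactly the failure described above.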
The second failure mode is version-specific behaviors — the artifact's meaning changes across Windows versions, and the examiner applies the wrong interpretation. The most significant example is the Shimcache (Application Compatibility Cache). On Windows XP and Windows 7, a program's presence in the Shimcache was widely cited as evidence of execution. On Windows 10 and Windows 11, the Shimcache records programs that are in the execution path but may not have actually executed — the cache is populated during the executable compatibility lookup, which occurs before execution. A Shimcache entry on Windows 10 proves the operating system evaluated the executable for compatibility. It does not prove the executable ran. An examiner who states "the Shimcache proves this program executed" on a Windows 10 system is making a factually incorrect claim that will be challenged by any competent opposing expert.
The third failure mode is anti-forensic evasion — the attacker modifies the artifact before the examiner collects it, and the tool reports the modified data as if it were genuine. Timestomping is the canonical example: the attacker uses a tool like SetMACE or PowerShell's Set-ItemProperty to modify the $STANDARD_INFORMATION timestamps on a malicious executable, setting the creation time to match other files in the same directory. The MFT parser dutifully reports the modified timestamps. The examiner builds a timeline that shows the executable was created six months ago, consistent with a legitimate installation. The actual creation time — preserved in the $FILE_NAME attribute, which timestomping tools typically do not modify — was two hours ago.
A tool-dependent examiner reads the MFT output CSV, sees the timestamps, and builds their timeline around them. An artifact-aware examiner compares the $SI timestamps against the $FN timestamps, notices the discrepancy, correlates with the USN Journal (which records the file creation at the real time), and identifies the timestomping. The tool reported the same data in both cases. The difference is what the examiner knows to look for.
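The $SI-versus-$FN comparison can be expressed as a simple check on raw FILETIME tick values. This is a sketch of the heuristic only (tick values below are synthetic, and the tolerance is an assumed parameter, not a standard): because $FN timestamps are set by the NTFS driver and are not reachable from user mode, an $SI creation time far earlier than the $FN creation time is the classic timestomp indicator.

```python
# FILETIME ticks are 100-ns units; 10_000_000 ticks == 1 second
TICKS_PER_SECOND = 10_000_000

def timestomp_indicator(si_created: int, fn_created: int,
                        tolerance_s: int = 2) -> bool:
    """Flag when the $SI creation time predates the $FN creation time
    by more than the tolerance -- $SI was likely rolled back."""
    return fn_created - si_created > tolerance_s * TICKS_PER_SECOND

# Synthetic example: $SI rolled back six months, $FN keeps the real time
real_creation = 116_444_736_000_000_000   # hypothetical tick value
rolled_back = real_creation - 180 * 86_400 * TICKS_PER_SECOND

print(timestomp_indicator(rolled_back, real_creation))    # True
print(timestomp_indicator(real_creation, real_creation))  # False
```

In a real investigation this check is one signal among several; the USN Journal correlation described above provides the independent corroboration.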
When artifact knowledge is mandatory
Artifact-level understanding is not required for every investigation. Routine SOC triage — is this alert a true positive, does it require escalation, what's the initial scope — can be performed effectively with tool output alone. The volume and speed demands of SOC work make raw artifact analysis impractical for every alert. The triage analyst's job is to classify and escalate, not to produce court-defensible findings.
Artifact-level understanding becomes mandatory when the investigation's findings will face adversarial scrutiny. Four contexts trigger this requirement.
Court testimony. When the examiner testifies as an expert witness, opposing counsel will probe the methodology. "How do you know this timestamp is accurate?" "Could this artifact have been modified?" "Have you verified your tool's output against the raw data?" The examiner who answers "my tool reported this value" has provided a conclusion based on an intermediary's interpretation. The examiner who answers "I verified this value against the raw MFT record at offset 0x38 of attribute type 0x10, and I confirmed it through cross-correlation with the USN Journal entry at this USN offset" has provided a conclusion based on direct evidence examination.
Regulatory notification. GDPR Article 33 requires notification "without undue delay" with "the nature of the personal data breach" including "the categories and approximate number of data subjects concerned." The examiner's assessment of what data was accessed — based on filesystem artifacts — directly determines the notification scope. An incorrect assessment (over-reporting or under-reporting) has regulatory consequences. The artifact analysis must be accurate.
Insurance claims. Cyber insurance claims require documented evidence of the incident, the scope of impact, and the remediation performed. The insurer's forensic assessor will review the examiner's methodology and evidence. Findings that rely solely on tool output without raw verification can be challenged, potentially affecting claim resolution.
Internal investigations with consequences. When an employee faces termination, disciplinary action, or legal proceedings based on forensic findings, those findings must be defensible. Employment tribunals, arbitration proceedings, and internal grievance processes can all examine the quality of the forensic evidence. The standard may be lower than criminal court, but the methodology must still be sound.
The raw-first principle
This course follows a raw-first principle: every artifact is first examined at the binary level — in a hex editor, with the field structure annotated — before any tool output is shown. This is not because tools are bad. It is because understanding what the tool is parsing is a prerequisite for trusting what the tool reports.
When you examine an MFT record in hex and identify the $STANDARD_INFORMATION attribute at offset 0x38, you understand that the four timestamps starting at that offset are 8-byte Windows FILETIME values representing 100-nanosecond intervals since January 1, 1601. You understand that the first timestamp is creation time, the second is modification time, the third is entry modification time (MFT record change), and the fourth is access time. You understand that these timestamps are in the $SI attribute, which is modifiable by user-mode applications — including timestomping tools. When MFTECmd reports these timestamps in its CSV output, you know exactly what it parsed and from where. If the timestamp looks suspicious, you know to check the $FILE_NAME attribute's timestamps for comparison, because $FN timestamps are set at file creation and modified only by the NTFS driver — not by user-mode applications.
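The FILETIME format described above decodes in a few lines. This sketch converts a raw 8-byte little-endian value into a UTC datetime, using the well-known Unix-epoch tick count as a sanity check:

```python
import struct
from datetime import datetime, timedelta, timezone

def parse_filetime(raw: bytes) -> datetime:
    """Decode an 8-byte little-endian Windows FILETIME:
    100-nanosecond intervals since 1601-01-01 UTC."""
    ticks, = struct.unpack("<Q", raw)
    return datetime(1601, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=ticks // 10)

# 116444736000000000 ticks is the Unix epoch, 1970-01-01 00:00:00 UTC
raw = struct.pack("<Q", 116444736000000000)
print(parse_filetime(raw))  # 1970-01-01 00:00:00+00:00
```

This is exactly the conversion MFTECmd performs for each of the four $SI timestamps before writing them to its CSV.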
Without this understanding, the MFTECmd output is a black box. The CSV contains timestamps, and you trust them because the tool is reputable. The raw-first approach converts that black box into a transparent process where every output value traces to a specific location in the raw artifact.
A colleague presents their investigation findings based entirely on EZ Tools output — MFTECmd CSVs, PECmd summaries, and AmcacheParser results. They haven't examined any raw artifact data. The findings support a termination decision for an employee. Your role is peer review.
Your options:
(A) The findings are well-supported by multiple EZ Tools outputs — the tools are reliable and widely accepted in the forensic community. Accept the findings.
(B) Request that the examiner verify at least the critical findings against raw data. Tool output is a starting point, not a conclusion. Specific concerns: MFTECmd occasionally misparses $FILE_NAME timestamps on entries with multiple $FN attributes. AmcacheParser may miss entries in non-standard Amcache structures. PECmd's timestamp parsing has known edge cases with certain Prefetch versions. For a termination decision, at least the key timestamps and the primary execution evidence should be verified against raw hex or a second parsing tool. If the findings hold under raw verification, the conclusion is stronger. If they don't, you've prevented an incorrect termination.
The correct approach is B. Tool output is efficient for triage and initial analysis. For high-stakes findings (termination, legal proceedings, insurance claims), raw verification of critical data points is the professional standard.
Try It — Examine Your First MFT Record in Hex
This exercise introduces the raw-first approach using a single MFT record. You don't need a forensic image — you can extract a single MFT record from your analysis workstation.
Step 1: Extract the MFT. Open an elevated PowerShell prompt on your Windows 11 VM. Use KAPE to extract the $MFT:
.\KAPE\kape.exe --tsource C: --tdest C:\Evidence\MFT --target MFT --vhdx MFT_Extract
Alternatively, use RawCopy or FTK Imager to extract the $MFT file. The file will be several hundred megabytes.
Step 2: Open in a hex editor. Open the extracted $MFT in HxD. Navigate to offset 0x00000000. You should see the signature bytes 46 49 4C 45 — ASCII "FILE". This is the header of the first MFT record (record 0, the $MFT file itself).
Step 3: Identify the record header.
- Offset 0x00-0x03: Signature "FILE" (46 49 4C 45)
- Offset 0x04-0x05: Fixup array offset
- Offset 0x06-0x07: Fixup array entry count
- Offset 0x08-0x0F: $LogFile sequence number
- Offset 0x10-0x11: Sequence number (increments when the MFT entry is reused)
- Offset 0x12-0x13: Hard link count
- Offset 0x14-0x15: First attribute offset (where attribute data begins)
- Offset 0x16-0x17: Flags (0x01 = in use, 0x02 = directory)
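The same header fields you just read by eye can be extracted programmatically. This sketch parses a synthetic 1024-byte record (only the signature, first-attribute offset, and flags are populated; everything else is zeroed for illustration) using the offsets listed in Step 3:

```python
import struct

def parse_mft_header(record: bytes) -> dict:
    """Parse the MFT record header fields at the offsets listed in Step 3."""
    if record[0:4] != b"FILE":
        raise ValueError("not a FILE record")
    return {
        "fixup_offset":   struct.unpack_from("<H", record, 0x04)[0],
        "fixup_count":    struct.unpack_from("<H", record, 0x06)[0],
        "lsn":            struct.unpack_from("<Q", record, 0x08)[0],
        "sequence":       struct.unpack_from("<H", record, 0x10)[0],
        "hard_links":     struct.unpack_from("<H", record, 0x12)[0],
        "first_attr_off": struct.unpack_from("<H", record, 0x14)[0],
        "flags":          struct.unpack_from("<H", record, 0x16)[0],
    }

# Synthetic record: signature, first attribute at 0x38, in-use flag set
rec = bytearray(1024)
rec[0:4] = b"FILE"
struct.pack_into("<H", rec, 0x14, 0x38)
struct.pack_into("<H", rec, 0x16, 0x01)

hdr = parse_mft_header(bytes(rec))
print(hdr["first_attr_off"], hdr["flags"])  # 56 1
```

Note that a production parser must also apply the fixup array before trusting the last two bytes of each sector — a detail omitted here for brevity.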
Step 4: Find the first attribute. Read the value at offset 0x14-0x15 (little-endian). This tells you where the first attribute starts. Navigate to that offset within the record. The first 4 bytes are the attribute type: 10 00 00 00 = $STANDARD_INFORMATION (type 0x10).
Step 5: Locate the timestamps. Within the $STANDARD_INFORMATION attribute, after the attribute header, the first four 8-byte values are the MACE timestamps: Created, Modified, Entry Modified (MFT record change), Accessed. Each is a Windows FILETIME — a 64-bit value representing 100-nanosecond intervals since January 1, 1601 UTC.
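Steps 4 and 5 together can be sketched as a single walk from the record header to the four raw MACE tick values. The record below is synthetic (the $SI content offset of 0x18 and tick values 1-4 are illustrative assumptions, not real data); the walk itself follows the offsets described above, plus the standard resident-attribute content offset at attribute offset 0x14:

```python
import struct

def si_timestamps(record: bytes) -> tuple:
    """Walk from the record header to the $SI MACE timestamps (raw FILETIME ticks)."""
    first_attr = struct.unpack_from("<H", record, 0x14)[0]      # Step 4
    attr_type = struct.unpack_from("<I", record, first_attr)[0]
    if attr_type != 0x10:
        raise ValueError("first attribute is not $STANDARD_INFORMATION")
    # Resident attribute header: content offset lives at attribute + 0x14
    content_off = struct.unpack_from("<H", record, first_attr + 0x14)[0]
    base = first_attr + content_off
    # Created, Modified, Entry Modified, Accessed -- four 8-byte FILETIMEs
    return struct.unpack_from("<4Q", record, base)

# Synthetic record: $SI at 0x38, content at +0x18, tick values 1..4
rec = bytearray(1024)
rec[0:4] = b"FILE"
struct.pack_into("<H", rec, 0x14, 0x38)          # first attribute offset
struct.pack_into("<I", rec, 0x38, 0x10)          # attribute type $SI
struct.pack_into("<H", rec, 0x38 + 0x14, 0x18)   # content offset
struct.pack_into("<4Q", rec, 0x38 + 0x18, 1, 2, 3, 4)

print(si_timestamps(bytes(rec)))  # (1, 2, 3, 4)
```

This is the same traversal you just performed manually in the hex editor, which is the point of the exercise: every CSV timestamp traces back through exactly these offsets.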
What you've just done: You traced the exact location of the timestamps that MFTECmd reports in its CSV output. When MFTECmd reports a creation timestamp for a file, it read this exact offset in this exact attribute. You now understand the provenance of that data point.
The myth: Forensic tools like KAPE and EZ Tools are industry-standard and widely accepted. Reports based on their output are automatically defensible because the tools themselves have been validated by the forensic community.
The reality: The tools are widely accepted. The tools are excellent. The tools are also software — and software has bugs, version-specific behaviors, and edge cases. Acceptance of a tool does not transfer to acceptance of a specific output value from that tool on a specific piece of evidence. The Daubert standard (US) and corresponding standards in other jurisdictions evaluate the examiner's methodology, not the tool's reputation. An examiner who states "I ran MFTECmd and it reported this value" has described a methodology of running a tool. An examiner who states "MFTECmd reported this value, which I verified against the raw MFT record at this offset, and which is corroborated by the USN Journal entry at this USN number" has described a methodology of analysis with verification. The second methodology survives scrutiny. The first methodology survives only until opposing counsel asks "did you verify the tool's output?"
What this course builds
This course builds three capabilities that tool-dependent analysis cannot provide.
The first is artifact interpretation — the ability to read a forensic artifact at the binary level and understand what each field means, what created it, and what it proves. This is not about memorizing hex offsets. It is about understanding the data structure well enough that when you see a tool's output, you know what the tool parsed and whether the interpretation is correct for your specific evidence and OS version.
The second is anti-forensic detection — the ability to identify when an artifact has been manipulated, deleted, or fabricated. Tools report what they find. They do not report what is suspicious about what they find. A timestomped file, a cleared event log, a deleted Prefetch folder — each leaves traces that artifact-aware analysis can detect and tool-dependent analysis cannot.
The third is multi-artifact correlation — the ability to corroborate findings across independent artifact sources. A single artifact source provides a data point. Two independent sources providing consistent data points provide evidence. Three or more independent sources provide strong evidence. Correlation is not just about building a more complete timeline — it is about building a more defensible one, where every key finding has independent corroboration.
Troubleshooting
"I've been doing DFIR for years with tools and never had a finding challenged." That may be true, and it may reflect that your findings were accurate — tools are correct most of the time. It may also reflect that your findings were never subjected to adversarial scrutiny. The shift from internal triage to legal proceedings, insurance claims, or regulatory notification changes the standard. The question is not "has my methodology been challenged?" but "would my methodology survive challenge?"
"Raw analysis takes too long for real investigations." You don't examine every artifact at the raw level. You use tools for bulk processing and raw analysis for critical findings — the ones that determine scope, attribution, or timeline. A typical investigation might process 10,000 MFT records with MFTECmd and raw-verify 15 of them: the malware executable, the staging directory, the exfiltration path, and the critical timeline entries. The raw verification adds minutes, not hours. The defensibility it provides is permanent.
"The tools I use are well-maintained and frequently updated." They are. Eric Zimmerman's tools are among the best-maintained forensic parsers available. That doesn't eliminate edge cases — it reduces them. The changelog for any EZ Tools release includes bug fixes that were only identified because users compared tool output against raw data. Those bug fixes are evidence that the tools are not infallible — and that the forensic community's standard practice includes raw verification.
You've built the foundations of artifact-level forensic analysis.
WF0 gave you the taxonomy, NTFS architecture, and the five-step methodology. WF1 took you inside the MFT at the binary level — every attribute, every timestamp, every edge case. From here, every artifact category gets the same raw-first treatment.
- WF2–WF10: every major Windows artifact decoded at binary level — USN Journal, Prefetch, Amcache, Shimcache, ShellBags, LNK, Jump Lists, SRUM, Event Logs, and the Registry hives
- INC-NE-2026-0915 (WF13) — Insider data exfiltration capstone. Work the complete investigation from USB history to OneDrive exfiltration evidence
- INC-NE-2026-1022 (WF14) — Ransomware capstone. Three-host triage (FIN01 → IT03 → FS01) across the 72-hour attack chain
- The lab pack — 25+ realistic evidence files in 10 formats, simulated KAPE triage pre-populated, both capstones deployable to your own VM
- Anti-forensic detection methodology — defeat timestomping, log clearing, and Prefetch deletion with cross-artifact correlation