
Advanced MFT Edge Cases

14 hours · Module 1 · Free

Operational Objective

The standard MFT record structure covered in WF1.1 through WF1.10 handles the vast majority of files on a production NTFS volume. But every NTFS volume contains files that deviate from the standard pattern — compressed files that store less data on disk than their logical size, encrypted files whose $DATA content is unreadable without the user's key, sparse files with large logical sizes but minimal actual data, hard links that share a single MFT record across multiple directory entries, junction points and symbolic links that redirect filesystem paths, and extension records that span a file's attributes across multiple MFT entries when one 1,024-byte record is insufficient. Encountering these edge cases without understanding them leads to incorrect conclusions: reporting a compressed file's allocated size as its actual size, claiming an encrypted file's content is corrupted when it's simply encrypted, or misinterpreting a hard link as two separate files. Additionally, the emergence of ReFS (Resilient File System) on Windows Server and certain Windows 11 configurations means examiners may encounter non-NTFS evidence where MFT analysis doesn't apply. This subsection prepares you for the edge cases that every examiner eventually encounters.

Deliverable: Understanding of NTFS edge cases that affect MFT analysis (compressed files, EFS-encrypted files, sparse files, hard links, junction points, symbolic links, extension records, and the absence of an MFT on ReFS volumes), with the ability to identify each case in MFT output and adjust the analysis methodology accordingly.

Estimated completion: 35 minutes

Figure WF1.12 — NTFS edge cases that affect MFT forensic analysis. Each case has a specific identifier in MFTECmd output and requires adjusted analysis methodology. Failure to recognize these cases leads to incorrect file size reporting, content misinterpretation, or evidence double-counting.

Compressed files

NTFS compression stores file data in compressed form on disk while presenting the uncompressed data to applications through the filesystem API. The compression operates on 16-cluster compression units — NTFS compresses each 16-cluster block independently, and if the compressed result is smaller than 16 clusters, the saved space is represented as sparse runs in the data run list.

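In the MFT record, compressed files are identified by the flag 0x0800 in the $SI attribute's file attribute flags field. The space the $DATA attribute occupies on disk, which parsing tools generally report as the allocated size, is smaller than the real size (the opposite of the normal relationship, where allocated size is equal to or slightly larger than real size due to cluster rounding). When reading raw hex, note that this on-disk figure is held in a separate compressed-size field at offset 0x40 of the non-resident attribute header; the allocated-size field at 0x28 holds the logical size rounded up to a whole compression unit. The data run list contains sparse runs, which carry a length but no offset field (the offset-size nibble of the run header byte is 0). These runs represent the saved space within compressed units.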
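The run-list layout described above can be sketched in a few lines of Python. This is a minimal illustration rather than a full parser: `decode_data_runs` is a hypothetical helper name, the input is assumed to be the raw run-list bytes already cut out of a non-resident $DATA attribute, and the sample bytes are invented for the example.

```python
def decode_data_runs(buf: bytes):
    """Decode an NTFS data run list into (start_lcn, length, is_sparse) tuples.

    Each run starts with a header byte: the low nibble is the size of the
    length field, the high nibble is the size of the offset field. An offset
    size of 0 marks a sparse run (no clusters on disk).
    """
    runs = []
    pos = 0
    prev_lcn = 0
    while pos < len(buf) and buf[pos] != 0x00:   # 0x00 terminates the list
        header = buf[pos]
        len_size = header & 0x0F
        off_size = header >> 4
        pos += 1
        length = int.from_bytes(buf[pos:pos + len_size], "little")
        pos += len_size
        if off_size == 0:
            runs.append((None, length, True))    # sparse: nothing stored
        else:
            # Offsets are signed and relative to the previous run's start.
            delta = int.from_bytes(buf[pos:pos + off_size], "little", signed=True)
            pos += off_size
            prev_lcn += delta
            runs.append((prev_lcn, length, False))
    return runs


# Invented example: 4 clusters stored at LCN 4096, then a 12-cluster sparse run.
sample = bytes.fromhex("21 04 00 10 01 0c 00")
print(decode_data_runs(sample))   # [(4096, 4, False), (None, 12, True)]
```

In this sample the two runs add up to 16 clusters, which is the pattern a single compressed unit would be expected to show: a few stored clusters followed by a sparse remainder.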

Forensic implications: when recovering a compressed file's content, you must decompress the data after extracting it from the cluster ranges. Raw cluster extraction produces compressed data that is not directly interpretable. MFTECmd correctly reports both the allocated size and real size for compressed files, but examiners who work directly with raw data must account for the compression.

Compressed files are relatively uncommon on modern Windows installations — Windows 10/11 does not compress files by default, and NTFS compression is generally discouraged due to performance impact. However, older systems, servers with disk space constraints, and systems where an administrator enabled compression on specific folders may contain compressed files.

Compliance Myth: "NTFS compression makes deleted file recovery impossible because the compressed data is unusable in fragments"

NTFS compression operates on 16-cluster compression units independently. Each unit is self-contained — if one compression unit's clusters are intact, that unit can be decompressed regardless of whether adjacent units have been overwritten. Partial recovery of compressed files is possible at the compression unit boundary (64 KB with 4 KB clusters). This is coarser than uncompressed file recovery (where each cluster is independent), but it is not impossible. The practical challenge is identifying which clusters belong to which compression unit, which requires the data run list from the MFT record to map the sparse gaps.
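To make the unit-boundary reasoning concrete, the sketch below walks a decoded run list in 16-cluster steps and labels each compression unit. It is an illustration under stated assumptions: `classify_units` is a hypothetical helper, the input is a simplified list of `(length_in_clusters, is_sparse)` pairs in file order, and the labels follow the usual reading that a fully allocated unit is stored uncompressed, a partly sparse unit is compressed, and a fully sparse unit holds no data.

```python
UNIT_CLUSTERS = 16  # compression unit size discussed in this section


def classify_units(runs, unit=UNIT_CLUSTERS):
    """Label each compression unit from (length_in_clusters, is_sparse) runs.

    Returns a list of (unit_index, stored_clusters, kind).
    """
    # Expand to one flag per cluster. Simple and fine for a demo,
    # but not suitable for very large files.
    sparse_flags = []
    for length, is_sparse in runs:
        sparse_flags.extend([is_sparse] * length)

    result = []
    for start in range(0, len(sparse_flags), unit):
        chunk = sparse_flags[start:start + unit]
        stored = chunk.count(False)
        if stored == len(chunk):
            kind = "uncompressed"
        elif stored == 0:
            kind = "sparse"
        else:
            kind = "compressed"
        result.append((start // unit, stored, kind))
    return result


# Invented run list: unit 0 keeps 4 of 16 clusters, unit 1 is stored in full,
# unit 2 is entirely sparse.
runs = [(4, False), (12, True), (16, False), (16, True)]
for unit_index, stored, kind in classify_units(runs):
    print(unit_index, stored, kind)

# With 4 KB clusters, one unit covers 16 * 4096 = 65536 bytes of file data.
print(UNIT_CLUSTERS * 4096)
```

Each labelled unit can then be checked on its own: if the stored clusters of a compressed unit are still intact, that unit is a candidate for decompression even when neighbouring units are gone.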

EFS-encrypted files

The Encrypting File System (EFS) encrypts file content at the NTFS level. The file's $DATA attribute contains ciphertext rather than plaintext — the data is encrypted with a file encryption key (FEK), which is itself encrypted with the user's public key and stored in a $LOGGED_UTILITY_STREAM attribute (type 0x100) on the MFT record.
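When working from raw hex, one way to confirm that a record carries the 0x100 attribute is to walk the attribute headers and list their type codes. The sketch below assumes a single MFT record that has already had its fixup (update sequence) values applied; `attribute_types` is a hypothetical helper name, and the walk relies only on the type and length fields at the start of each attribute.

```python
import struct

END_MARKER = 0xFFFFFFFF


def attribute_types(record: bytes):
    """Return the attribute type codes found in one MFT record, in order.

    Assumes fixups are already applied. The offset of the first attribute
    is read from the record header at 0x14.
    """
    offset = struct.unpack_from("<H", record, 0x14)[0]
    types = []
    while offset + 8 <= len(record):
        attr_type, attr_len = struct.unpack_from("<II", record, offset)
        if attr_type == END_MARKER or attr_len == 0:
            break                      # end of attributes, or malformed record
        types.append(attr_type)
        offset += attr_len
    return types


def has_efs_stream(record: bytes) -> bool:
    """True if the record holds a $LOGGED_UTILITY_STREAM (0x100) attribute."""
    return 0x100 in attribute_types(record)
```

The same walk also surfaces 0xC0 ($REPARSE_POINT) and 0x20 ($ATTRIBUTE_LIST), so it doubles as a quick check for the other edge cases in this subsection.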

In MFTECmd output, EFS-encrypted files are identified by the flag 0x4000 in the $SI flags. The filename, timestamps, parent directory, and file size are all readable — encryption only affects the $DATA content, not the metadata. This means MFT-based timeline construction and deleted file metadata recovery work normally on encrypted files. The limitation is content recovery: extracting the $DATA content produces encrypted bytes that require the user's decryption key.
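The flag values used in this subsection can be tested with a simple bit mask. The sketch below is illustrative: `edge_case_flags` is a hypothetical helper, and it assumes you have the $SI flags as an integer (for example, read from raw hex). The 0x0400 reparse point bit is included for convenience and is the standard Windows attribute constant rather than a value stated in the text above.

```python
# Bits relevant to the edge cases in this subsection.
SI_EDGE_FLAGS = {
    0x0200: "sparse",
    0x0400: "reparse_point",   # standard Windows constant, added for convenience
    0x0800: "compressed",
    0x4000: "encrypted",
}


def edge_case_flags(si_flags: int):
    """Return the names of the edge-case bits set in a $SI flags value."""
    return [name for bit, name in SI_EDGE_FLAGS.items() if si_flags & bit]


print(edge_case_flags(0x4020))   # ['encrypted']  (0x0020 is the Archive bit)
print(edge_case_flags(0x0820))   # ['compressed']
print(edge_case_flags(0x0020))   # []
```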

For forensic analysis of EFS-encrypted files, there are three approaches: recover the user's EFS certificate and private key from their Windows profile (they are held in the user's certificate store and can be exported if you have the user's password), use a domain EFS recovery agent certificate (if the organization configured one), or analyze the file metadata without accessing the content. In many investigations, the metadata alone (filename, timestamps, parent directory, file size) is sufficient: you can show that the file existed, and when it was created and last modified, without needing to read its content.

Hard links and their forensic impact

Hard links create multiple directory entries pointing to the same MFT record. The MFT record has a hard link count greater than 1 (header offset 0x12), and contains multiple $FILE_NAME attributes — one for each hard link. Each $FN attribute has a different parent directory reference and potentially a different filename, but they all share the same $DATA attribute.
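Reading the link count from a raw record takes one `struct` call. The sketch below is a minimal illustration that assumes a 1,024-byte record beginning with the `FILE` signature; `read_record_header` is a hypothetical helper name, and only a handful of header fields are decoded.

```python
import struct


def read_record_header(record: bytes):
    """Decode a few MFT record header fields relevant to hard links."""
    if record[0:4] != b"FILE":
        raise ValueError("not a FILE record")
    # 0x10: sequence number, 0x12: hard link count (both 16-bit little-endian)
    sequence, link_count = struct.unpack_from("<HH", record, 0x10)
    # 0x16: record flags (0x0001 = in use, 0x0002 = directory)
    flags = struct.unpack_from("<H", record, 0x16)[0]
    return {
        "sequence": sequence,
        "hard_links": link_count,
        "in_use": bool(flags & 0x0001),
        "is_directory": bool(flags & 0x0002),
    }
```

A `hard_links` value above 1 is the cue to expect one $FILE_NAME attribute per directory entry. A single name can also be stored twice, as a long name plus a DOS 8.3 short name, so the $FN count alone is not a reliable link count.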

In MFTECmd output, hard-linked files appear as multiple rows with the same EntryNumber and SequenceNumber but different FileName and ParentPath values. The key forensic consideration is not to count hard links as separate files: they are different names for the same data. If a file appears in both C:\Windows\System32\ and C:\Windows\WinSxS\ with the same entry number and sequence number, it is one file with two names, not two files.

Hard links are commonly used by Windows itself: most System32 files are hard links to the copies held in the Windows Component Store (WinSxS). Files in System32 and SysWOW64 are not hard-linked to each other, because they are separate 64-bit and 32-bit binaries with their own MFT records. Hard links are also used by some installation frameworks and occasionally by attackers who create hard links to legitimate system files as a persistence mechanism.

Decision point

Your MFTECmd output shows an executable svchost-helper.exe in C:\ProgramData\Updates\ with EntryNumber 45,231 and sequence 3. You also find a row for maintenance.exe in C:\Windows\Temp\ with the same EntryNumber 45,231 and sequence 3, and a hard link count of 2.

Your options: (A) These are two different malicious executables in two different directories — investigate both as separate attacker tools. (B) These are the same file with two names (hard link). The file has one $DATA attribute containing one executable. The attacker created a hard link to give the executable a second name in a second location — possibly as a persistence mechanism or to confuse analysis. Investigate as one file with two directory entries.

The correct approach is B. The identical EntryNumber and SequenceNumber confirm this is a single MFT record with two $FILE_NAME attributes. The hard link count of 2 confirms two directory entries. Analyze the file once (content, hash, behavioral analysis) and document both locations and names. In your report, note: "The executable exists as a hard link with two directory entries: C:\ProgramData\Updates\svchost-helper.exe and C:\Windows\Temp\maintenance.exe. Both names reference MFT entry 45,231, sequence 3. The hard link may serve as a persistence mechanism — deleting one name does not remove the file because the other name maintains the link."
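The same check can be scripted against a CSV export. The sketch below groups rows by entry and sequence number and reports any record with more than one path. It assumes the export has `EntryNumber`, `SequenceNumber`, `ParentPath`, `FileName`, and `IsAds` columns; verify the column names against your own output. `find_hard_links` is a hypothetical helper, and the sample rows are invented to mirror the scenario above.

```python
import csv
import io
from collections import defaultdict


def find_hard_links(csv_text: str):
    """Map (entry, sequence) to its sorted paths, keeping multi-name records only."""
    names = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # ADS rows share the entry number but are not extra names for the file.
        if row.get("IsAds", "").lower() == "true":
            continue
        key = (int(row["EntryNumber"]), int(row["SequenceNumber"]))
        names[key].add(row["ParentPath"].rstrip("\\") + "\\" + row["FileName"])
    return {key: sorted(paths) for key, paths in names.items() if len(paths) > 1}


# Invented rows mirroring the decision point.
SAMPLE = "\n".join([
    "EntryNumber,SequenceNumber,ParentPath,FileName,IsAds",
    r"45231,3,.\ProgramData\Updates,svchost-helper.exe,False",
    r"45231,3,.\Windows\Temp,maintenance.exe,False",
    r"50110,1,.\Users\Public,notes.txt,False",
])

for (entry, seq), paths in find_hard_links(SAMPLE).items():
    print(entry, seq, paths)
```

Grouping on both entry and sequence number matters: a reused entry number with a different sequence is a different file, not another name for the same one.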

Junction points and symbolic links

Junctions and symbolic links are NTFS reparse points — special directory or file entries that redirect filesystem access to a different location. They are identified in the MFT by the presence of a $REPARSE_POINT attribute (type 0xC0) containing the target path.

Junctions are directory-only redirects that must resolve to a local path. The target can be on another local volume, but not on a network share. C:\Users\Default User is a junction pointing to C:\Users\Default. Junctions are transparent to applications: accessing the junction is the same as accessing the target. Windows uses junctions extensively for backward compatibility paths.

Symbolic links can be files or directories and can target relative paths and remote (UNC) paths as well as local ones. They function similarly to junctions but are more flexible. Symbolic links were introduced in Windows Vista and are used less frequently than junctions.

Forensic implications: when you encounter a junction or symlink in an investigation, the files "inside" the junction don't actually exist at that path — they exist at the target path. An examiner who reports finding suspicious files in a junction directory is actually reporting files at the target location. MFTECmd reports reparse points in its output — check the ReparseTarget column to identify the real file location.
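If you need to tell a junction from a symbolic link in raw data, the reparse tag at the start of the $REPARSE_POINT content distinguishes them. The sketch below assumes the content follows the documented Windows reparse data buffer layout (tag, data length, reserved field, then name offsets and a UTF-16 path buffer). `parse_reparse` is a hypothetical helper and the demo buffer is built by hand, not taken from evidence.

```python
import struct

TAG_MOUNT_POINT = 0xA0000003   # junction
TAG_SYMLINK = 0xA000000C       # symbolic link


def parse_reparse(content: bytes):
    """Return (kind, substitute_name) for junction/symlink reparse data."""
    tag, _data_len = struct.unpack_from("<IH", content, 0)
    if tag not in (TAG_MOUNT_POINT, TAG_SYMLINK):
        return ("other", None)
    # After the 8-byte header: substitute name offset/length, print name offset/length.
    sub_off, sub_len, _print_off, _print_len = struct.unpack_from("<HHHH", content, 8)
    # Symlink buffers carry an extra 4-byte flags field before the path buffer.
    path_start = 16 + (4 if tag == TAG_SYMLINK else 0)
    raw = content[path_start + sub_off:path_start + sub_off + sub_len]
    kind = "junction" if tag == TAG_MOUNT_POINT else "symlink"
    return (kind, raw.decode("utf-16-le"))


# Hand-built junction buffer for demonstration only.
target = r"\??\C:\Users\Default".encode("utf-16-le")
body = struct.pack("<HHHH", 0, len(target), len(target) + 2, 0) + target + b"\x00\x00"
demo = struct.pack("<IHH", TAG_MOUNT_POINT, len(body), 0) + body
print(parse_reparse(demo))
```

The substitute name is normally in NT path form (prefixed with `\??\`), so strip the prefix before comparing it with paths in your timeline.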

Extension records

When a file has more attributes than fit in a single 1,024-byte MFT record (many Alternate Data Streams, very long filenames, a heavily fragmented $DATA attribute with a long data run list, complex security descriptors, or numerous hard links), NTFS creates extension records: additional MFT entries that hold the overflow attributes. The base record gains an $ATTRIBUTE_LIST attribute (type 0x20) that records which MFT entry holds each attribute. The base record's header at offset 0x20 (base record reference) is zero, and the extension record's base record reference points back to the base entry.

MFTECmd handles extension records automatically — it consolidates the base and extension attributes into a single output. When working with raw hex, check the base record reference at offset 0x20: if non-zero, the record you're examining is an extension. Navigate to the base record (identified by the entry number in the base reference) to find the primary attributes ($SI, first $FN, primary $DATA).
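The base record reference is an 8-byte file reference: the low 48 bits hold the entry number and the high 16 bits hold the sequence number. A minimal sketch of the check described above, with `base_reference` as a hypothetical helper name:

```python
import struct


def base_reference(record: bytes):
    """Return (entry, sequence) of the base record, or None for a base record."""
    raw = struct.unpack_from("<Q", record, 0x20)[0]
    if raw == 0:
        return None                      # this record is itself a base record
    entry = raw & 0xFFFFFFFFFFFF         # low 48 bits: MFT entry number
    sequence = raw >> 48                 # high 16 bits: sequence number
    return (entry, sequence)
```

To jump to the base record in a raw $MFT file, multiply the returned entry number by the record size (1,024 bytes on a typical volume) to get its byte offset.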

Extension records are uncommon. Most files fit comfortably in a single MFT record. Files most likely to have extensions: files with many ADS, heavily fragmented files (path length itself is irrelevant, because an MFT record stores only the filename and a parent reference, not the full path), and files with complex access control lists. In forensic analysis, extension records rarely cause issues because MFTECmd handles them transparently, but recognizing them in raw hex prevents confusion when an MFT record appears to lack expected attributes.

ReFS — a different filesystem entirely

ReFS (Resilient File System) is Microsoft's successor to NTFS, designed for data integrity and resilience. ReFS is used on Windows Server Storage Spaces Direct volumes and is available as a formatting option on some Windows 11 configurations. ReFS does not use an MFT — it uses B+ tree metadata structures that are fundamentally different from NTFS.

If you encounter a ReFS volume in an investigation, MFT analysis does not apply. The forensic tooling for ReFS is significantly less mature than NTFS tooling — as of 2026, no equivalent of MFTECmd exists for ReFS metadata. Analysis of ReFS volumes typically requires X-Ways Forensics or EnCase, which have limited but growing ReFS support.

The key forensic considerations for ReFS: no $FILE_NAME timestamps (removing the primary timestomping detection method), no $I30 slack (removing directory-level deleted file recovery), and no resident data recovery (ReFS uses a different small-file storage mechanism). The loss of these forensic capabilities makes ReFS evidence significantly harder to analyze than NTFS evidence.

In practice, you will rarely encounter ReFS on workstation evidence. It is primarily used on server storage volumes. When you do encounter it, document the filesystem type in your report and note the limitations of the analysis compared to NTFS evidence.

Try it: Identify edge cases in MFTECmd output

Load your MFTECmd output in Timeline Explorer and look for each edge case:

1. Compressed files: Filter for entries where AllocatedSize is significantly less than RealSize. Check if the $SI flags include the compressed flag.
2. Hard links: Sort by EntryNumber and look for consecutive rows with the same entry number but different filenames. The hard link count in MFTECmd output confirms the number of names.
3. Junction points and symbolic links: Look in the ReparseTarget column for populated entries. Common examples: C:\Users\Default User and C:\Documents and Settings (junctions), and C:\Users\All Users (a directory symbolic link to C:\ProgramData).
4. ADS: Look for entries with the same EntryNumber but an ADS name in the output. Zone.Identifier is the most common ADS.
5. Very large files: Sort by RealSize descending. Check whether the largest files are sparse (AllocatedSize much smaller than RealSize) or genuinely large.

For each edge case found, note the entry number, the case type, and how it would affect your analysis if you encountered it during an investigation. This builds recognition of edge cases before they cause incorrect findings.
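If you prefer to script part of this exercise, the sketch below collects reparse points and ADS rows from a CSV export. It assumes `EntryNumber`, `FileName`, `ReparseTarget`, and `IsAds` columns; check these against your own export before relying on it. `triage_edge_cases` is a hypothetical helper and the sample rows are invented.

```python
import csv
import io


def triage_edge_cases(csv_text: str):
    """Collect reparse points and ADS rows from an MFT CSV export."""
    report = {"reparse": [], "ads": []}
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = int(row["EntryNumber"])
        target = row.get("ReparseTarget", "").strip()
        if target:
            report["reparse"].append((entry, row["FileName"], target))
        if row.get("IsAds", "").lower() == "true":
            report["ads"].append((entry, row["FileName"]))
    return report


# Invented rows for demonstration.
SAMPLE_ROWS = "\n".join([
    "EntryNumber,FileName,ReparseTarget,IsAds",
    r"512,Default User,C:\Users\Default,False",
    r"9001,report.docx,,False",
    r"9001,report.docx:Zone.Identifier,,True",
])

for case, rows in triage_edge_cases(SAMPLE_ROWS).items():
    print(case, rows)
```

The output gives you the entry numbers to record for each case type, which you can then inspect individually in Timeline Explorer.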

You are imaging a Windows Server 2022 file server for a ransomware investigation. The server has two volumes: a 500 GB C: drive formatted as NTFS and a 16 TB D: drive formatted as ReFS (Storage Spaces). The ransomware encrypted files on both volumes. How does this affect your forensic analysis plan?

(A) ReFS and NTFS use the same MFT structure, so the analysis is identical for both volumes. Extract and parse the MFT from each.

(B) The C: drive (NTFS) supports full MFT analysis: extract the MFT, parse with MFTECmd, build the encryption timeline, detect timestomping, and recover deleted attacker tools. The D: drive (ReFS) does not have an MFT, so none of the MFT-based techniques apply. For the D: drive, you're limited to: $LogFile equivalent analysis (ReFS has its own transaction log), Event Logs on the C: drive that reference D: drive access, and whatever metadata ReFS forensic tools can extract (limited in 2026). Document this limitation: "The D: drive is formatted as ReFS. MFT-based analysis including timestomping detection, $I30 slack recovery, and deleted file MFT recovery is not available for this volume. The encryption timeline for D: drive files is constructed from Event Logs and the ReFS metadata that forensic tools can parse."

(C) ReFS is not supported by any forensic tool, so skip the D: drive entirely and focus on the C: drive.

(D) Convert the ReFS volume to NTFS before analysis to enable standard forensic techniques. This can be done non-destructively using Microsoft's conversion utility.

Operational Artifact — NTFS Edge Case Recognition Guide

This subsection provides the identification markers and forensic implications for each NTFS edge case: compressed files (flag 0x0800, allocated less than real), EFS encryption (flag 0x4000, unreadable content), sparse files (flag 0x0200, large logical size), hard links (link count greater than 1, multiple $FN attributes), junctions and symlinks (reparse attribute present), and extension records (non-zero base reference). Additionally, ReFS volumes require an entirely different analysis approach. When beginning MFT analysis on new evidence, scan for these edge cases first: identify compressed, encrypted, and hard-linked files before building your timeline to avoid incorrect size reporting, content misinterpretation, or evidence double-counting. Include edge case identification in your methodology documentation to demonstrate awareness of NTFS complexity.

You've built the foundations of artifact-level forensic analysis.

WF0 gave you the taxonomy, NTFS architecture, and the five-step methodology. WF1 took you inside the MFT at the binary level — every attribute, every timestamp, every edge case. From here, every artifact category gets the same raw-first treatment.

WF2–WF10: every major Windows artifact decoded at binary level — USN Journal, Prefetch, Amcache, Shimcache, ShellBags, LNK, Jump Lists, SRUM, Event Logs, and the Registry hives
INC-NE-2026-0915 (WF13) — Insider data exfiltration capstone. Work the complete investigation from USB history to OneDrive exfiltration evidence
INC-NE-2026-1022 (WF14) — Ransomware capstone. Three-host triage (FIN01 → IT03 → FS01) across the 72-hour attack chain
The lab pack — 25+ realistic evidence files in 10 formats, simulated KAPE triage pre-populated, both capstones deployable to your own VM
Anti-forensic detection methodology — defeat timestomping, log clearing, and Prefetch deletion with cross-artifact correlation
