
LX1.7 Disk Imaging and Verification

3-4 hours · Module 1 · Free

Disk Imaging: Capturing the Complete Filesystem State

Operational Objective
The Complete Capture Question: UAC collected the critical log files and volatile data from BASTION-NGE01, but the investigation now requires deleted file recovery — the attacker removed their web shell from /var/www/html/ before you collected evidence. The file's data blocks are still on disk in unallocated space, but they are invisible to any live command. You need a bit-for-bit copy of the entire disk — including unallocated space, the journal, and the partition table — to recover what the attacker deleted. A wrong imaging approach (wrong device, no hash, no write protection) produces evidence that cannot be verified or may be challenged in proceedings.
Deliverable: The complete disk imaging workflow using dc3dd — device identification, partition layout analysis, acquisition with simultaneous hashing, split imaging for large disks, forensic mounting for analysis, and cloud snapshot equivalents. An acquisition log template that documents the imaging process for chain of custody.
⏱ Estimated completion: 35 minutes

When You Need a Disk Image

A disk image captures the complete state of the filesystem — every file (including deleted files whose data blocks have not been overwritten), every directory, every inode, the journal, the superblock, and the unallocated space. Live response and UAC triage collect specific artifacts; a disk image captures everything. You need a disk image when:

The investigation requires deleted file recovery — files the attacker deleted from the filesystem may have their data blocks intact in unallocated space. The UAC triage cannot collect deleted files because they are not visible to userspace commands. A disk image preserves the unallocated space where deleted file data resides, enabling recovery with extundelete, photorec, and Sleuth Kit.

# Identify block devices and partitions
lsblk -f

# Example output:
# NAME                  FSTYPE      MOUNTPOINT
# sda
# ├─sda1                ext4        /boot
# ├─sda2                crypto_LUKS
# │ └─sda2_crypt        LVM2_member
# │   ├─vg0-root        ext4        /
# │   ├─vg0-swap        swap        [SWAP]
# │   └─vg0-home        ext4        /home
# └─sda3

# For a full disk image (captures everything including partition table):
# Target: /dev/sda

# For individual partition/LV images:
# Target: /dev/mapper/vg0-root (root filesystem)
# Target: /dev/mapper/vg0-home (home directories)
# Target: /dev/sda1 (boot partition)

# Full disk image, local output.
# The destination must be on separate storage, never on the disk being imaged.
sudo dc3dd if=/dev/sda \
  hash=sha256 \
  log="$HOME/cases/IR-2026-0402/disk/acquisition.log" \
  of="$HOME/cases/IR-2026-0402/disk/bastion-nge01-sda.raw"

# Full disk image, remote output (pipe over SSH).
# Note: log=/tmp/acquisition.log writes a file on the evidence system; record that in your notes.
sudo dc3dd if=/dev/sda hash=sha256 \
  log=/tmp/acquisition.log | \
  ssh forensics@workstation \
  "dc3dd of=/cases/IR-2026-0402/disk/bastion-nge01-sda.raw hash=sha256 log=/cases/IR-2026-0402/disk/acquisition-recv.log"

# Split image (for large disks, media size limits).
# ofs= takes BASE.FMT; the .000 suffix produces segments .000, .001, .002, ...
sudo dc3dd if=/dev/sda \
  hash=sha256 \
  log="$HOME/cases/IR-2026-0402/disk/acquisition.log" \
  ofs="$HOME/cases/IR-2026-0402/disk/bastion-nge01-sda.raw.000" \
  ofsz=4G

# Individual logical volume
sudo dc3dd if=/dev/mapper/vg0-root \
  hash=sha256 \
  log="$HOME/cases/IR-2026-0402/disk/acquisition-root.log" \
  of="$HOME/cases/IR-2026-0402/disk/bastion-nge01-root.raw"

# AWS: create volume from snapshot, attach to forensic VM
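aws ec2 create-volume \
  --snapshot-id snap-0abc123 \
  --availability-zone eu-west-2a \
  --volume-type gp3 \
  --tag-specifications "ResourceType=volume,Tags=[{Key=Case,Value=IR-2026-0402}]"

# Attach to forensic VM (the instance must be in the same availability zone as the volume)
aws ec2 attach-volume \
  --volume-id vol-0xyz789 \
  --instance-id i-forensicvm \
  --device /dev/xvdf

# On the forensic VM: confirm the device name, then mount read-only.
# On Nitro-based instances the volume may appear as /dev/nvme1n1 instead of /dev/xvdf.
lsblk
# For ext3/ext4, consider adding noload: a dirty journal is otherwise replayed
# (written to the volume) even when ro is given.
sudo mount -o ro,noexec,nosuid /dev/xvdf1 /mnt/evidence

# Determine partition offsets in the image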
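mmls bastion-nge01-sda.raw
# Output shows partition start sectors and the sector size

# Mount a specific partition (e.g., root at sector 2048, 512-byte sectors)
sudo mount -o ro,noexec,nosuid,loop,offset=$((2048*512)) \
  bastion-nge01-sda.raw /mnt/evidence

# For LVM-based images: set up a read-only loop device first.
# --find picks a free loop device (loop0 is often already in use); --show prints its name.
LOOPDEV=$(sudo losetup --find --show -rP bastion-nge01-sda.raw)
# If the physical volume sits inside LUKS (as in the lsblk example), unlock it read-only first:
# sudo cryptsetup open --readonly "${LOOPDEV}p2" sda2_crypt
sudo vgchange -ay vg0
# The logical volumes appear as /dev/mapper/vg0-root etc.
sudo mount -o ro,noexec,nosuid /dev/mapper/vg0-root /mnt/evidence

# When done: clean up
sudo umount /mnt/evidence
sudo vgchange -an vg0
# sudo cryptsetup close sda2_crypt   # only if it was opened above
sudo losetup -d "$LOOPDEV"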

The investigation requires comprehensive timeline generation — plaso generates the most complete timeline from a full disk image because it can parse every artifact type (filesystem metadata, log files, application data, browser history, package manager databases). A triage collection provides only the artifacts that UAC explicitly collects.

Legal proceedings require evidence integrity — a full disk image with hash verification is the standard evidence format for court proceedings. Triage collections are accepted but a full image provides stronger evidence integrity because it captures the complete system state, not a selected subset.

Identifying the Target Device

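Before imaging, identify the disk layout. Production Linux servers commonly use LVM (Logical Volume Manager) with one or more physical volumes, volume groups, and logical volumes. For filesystem analysis, the usual imaging target is the logical volume containing the root filesystem. Image the whole physical device (/dev/sda) when you also need the partition table, the boot partition, and any space outside the logical volumes.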

If the disk is encrypted with LUKS, the image must be acquired from the decrypted device (/dev/mapper/sda2_crypt or the LVM logical volumes on top of it), not from the encrypted block device (/dev/sda2). An image of the encrypted device is useless without the decryption key. During a live response where the encrypted volume is already unlocked and mounted, image from the decrypted device. If the system has been powered off and the encrypted volume is locked, you need the LUKS passphrase or key file to unlock it before imaging.
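
One quick way to confirm which layer you are about to image is to look for the LUKS header magic at the start of the device. The sketch below assumes the standard six-byte magic (the ASCII string LUKS followed by the bytes 0xBA 0xBE). The name is_luks is a hypothetical helper, not part of cryptsetup, and reading a real block device requires root.

```shell
# Succeed if a device or image file starts with the LUKS header magic
# ("LUKS" followed by bytes 0xBA 0xBE). Hypothetical helper for illustration.
# Usage: is_luks <device_or_image>
is_luks() (
  # Read the first 6 bytes and render them as a plain hex string.
  magic=$(head -c 6 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n')
  [ "$magic" = "4c554b53babe" ]
)
```

On the example layout, the check would be expected to succeed for /dev/sda2 and fail for /dev/mapper/sda2_crypt. Where cryptsetup is installed, cryptsetup isLuks performs the same kind of test.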

DISK IMAGING: WHEN AND HOW
Is the system a cloud VM?
  YES: Cloud API snapshot. Point-in-time, zero VM modification. Best integrity (atomic capture).
  NO (bare-metal): Is the system powered on?
    YES: Live dc3dd over SSH. System keeps running. Minor inconsistency risk.
    NO: Offline dc3dd + hardware write blocker. Best for legal proceedings.
All methods require: SHA256 hash at acquisition, acquisition log with timestamps, read-only mount for analysis.
Cloud snapshot → attach as volume → mount ro,noexec,nosuid | dc3dd image → losetup → mount ro,noexec,nosuid
Figure LX1.7 — Disk imaging decision tree by environment. Cloud VMs use API snapshots (atomically consistent, zero VM modification). Bare-metal systems use dc3dd live or offline. All methods require hashing at acquisition and read-only mounting for analysis.

Acquisition with dc3dd

dc3dd is the forensic standard for Linux disk imaging. It is a patched version of dd that adds simultaneous hashing, logging, progress reporting, split output, and error handling.

The hash=sha256 parameter computes the SHA256 hash during acquisition — the hash appears in the log file when imaging completes. This is the acquisition hash — the reference value you will verify every time you access the image for analysis.
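
Re-verification before each analysis session can be scripted. The following is a minimal sketch that assumes you copy the acquisition hash out of the dc3dd log by hand and pass it as the first argument; verify_image_hash is a hypothetical helper name, and sha256sum is assumed to be available on the forensic workstation.

```shell
# Compare an image file's SHA256 with the acquisition hash recorded at imaging time.
# Hypothetical helper for illustration.
# Usage: verify_image_hash <expected_sha256> <image_file>
verify_image_hash() (
  # Normalize the expected value to lowercase, as sha256sum prints lowercase hex.
  expected=$(printf '%s' "$1" | tr 'A-F' 'a-f')
  actual=$(sha256sum "$2" | awk '{print $1}')
  if [ "$actual" = "$expected" ]; then
    echo "MATCH $actual"
  else
    echo "MISMATCH expected=$expected actual=$actual"
    exit 1
  fi
)
```

A call such as verify_image_hash "$ACQ_HASH" bastion-nge01-sda.raw returns a non-zero status on mismatch, so it can gate the mount step in a wrapper script.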

The dual-hash approach for legal proceedings: dc3dd if=/dev/sda hash=sha256 hash=md5 computes both SHA256 and MD5 simultaneously. Two independent hashes provide stronger integrity evidence.

The acquisition log records: source device, destination file, hash values, bytes transferred, transfer rate, start time, end time, and any errors encountered. Preserve this log as part of the evidence chain.
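
When dc3dd is unavailable and you fall back to plain dd, no log is written for you. As a rough sketch, a small wrapper can record the command line, UTC start and end times, and the exit status around any acquisition command. The name run_logged and the log field labels are assumptions chosen for illustration, not a dc3dd format.

```shell
# Run a command and append its command line, UTC start/end times, and exit
# status to a log file. Hypothetical helper for illustration.
# Usage: run_logged <log_file> <command> [args...]
run_logged() (
  log=$1
  shift
  printf 'Command: %s\n' "$*" >> "$log"
  printf 'Start (UTC): %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" >> "$log"
  status=0
  "$@" || status=$?
  printf 'End (UTC): %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" >> "$log"
  printf 'Exit status: %s\n' "$status" >> "$log"
  exit "$status"
)
```

The wrapper passes the command's exit status through, so a failed acquisition still fails the calling script while leaving a complete record in the log.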

Cloud Disk Snapshots as Image Equivalent

For cloud VMs, a disk snapshot through the provider's API is the functional equivalent of a disk image — but faster, easier, and forensically cleaner because it does not require any access to the running VM.

The snapshot captures the complete state of the virtual disk at the moment of snapshot creation. Unlike a live dc3dd image (which acquires blocks sequentially over minutes while the system continues writing), a cloud snapshot is a point-in-time capture — every block reflects the same instant. This is actually better evidence integrity than a live disk image.
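
The snapshot itself is created through the provider's API. As a sketch for AWS, the hypothetical helper below builds the aws ec2 create-snapshot call for a given volume and case ID. By default it only prints the command so it can be reviewed and copied into the acquisition log; setting RUN=1 executes it. The helper name, description text, and tag key are illustrative assumptions.

```shell
# Print (default) or run the AWS CLI call that snapshots an EBS volume for a case.
# Hypothetical helper for illustration. The printed form is for review only:
# arguments containing spaces are not re-quoted.
# Usage: snapshot_cmd <volume_id> <case_id>     (set RUN=1 to execute)
snapshot_cmd() (
  volume_id=$1
  case_id=$2
  set -- aws ec2 create-snapshot \
    --volume-id "$volume_id" \
    --description "Forensic snapshot $case_id" \
    --tag-specifications "ResourceType=snapshot,Tags=[{Key=Case,Value=$case_id}]" \
    --query SnapshotId --output text
  if [ "${RUN:-0}" = 1 ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
)
```

Once the call returns a snapshot ID, aws ec2 wait snapshot-completed --snapshot-ids followed by that ID blocks until the snapshot has finished.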

After snapshot creation, create a volume from the snapshot, attach it to your forensic analysis VM, and mount it read-only:

The ro,noexec,nosuid mount options ensure: read-only (no writes that modify the evidence), no execution (prevents accidental execution of attacker binaries), and no SUID bit interpretation (prevents SUID binaries from escalating privileges on your forensic workstation).
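
It is worth confirming that the options actually took effect before starting analysis. The sketch below checks a comma-separated option string, such as the one printed by findmnt -no OPTIONS for the mount point, for all three required options. has_forensic_opts is a hypothetical helper name.

```shell
# Succeed only if a comma-separated mount option string contains ro, noexec,
# and nosuid as whole options. Hypothetical helper for illustration.
# Usage: has_forensic_opts "$(findmnt -no OPTIONS /mnt/evidence)"
has_forensic_opts() (
  for required in ro noexec nosuid; do
    # Wrap the string in commas so "ro" does not match inside "errors=remount-ro".
    case ",$1," in
      *",$required,"*) ;;
      *) echo "missing mount option: $required" >&2; exit 1 ;;
    esac
  done
)
```

Matching on whole comma-delimited options avoids a false pass from option values that merely contain the letters ro.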

Forensic Mounting of Disk Images

When analyzing a raw disk image on your forensic workstation, mount it read-only to prevent any modification:
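
Mounting a partition inside a whole-disk image needs its byte offset, which is the start sector multiplied by the sector size. mmls prints start sectors zero-padded (for example 0000002048), and shell arithmetic treats a leading zero as octal, so the value has to be stripped before multiplying. A minimal sketch, with sector_to_offset as a hypothetical helper name and 512 bytes as the assumed default sector size:

```shell
# Convert a (possibly zero-padded) start sector into a byte offset.
# Hypothetical helper for illustration.
# Usage: sector_to_offset <start_sector> [sector_size_bytes]
sector_to_offset() (
  # Strip leading zeros so the shell does not parse the number as octal.
  start=$(printf '%s' "$1" | sed 's/^0*//')
  size=${2:-512}
  echo $(( ${start:-0} * $size ))
)
```

The result can be passed straight to the offset= mount option, for example offset=$(sector_to_offset 0000002048).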

Worked artifact — Disk acquisition log template:

Complete this log during every disk imaging operation. It becomes part of the chain of custody record alongside the LX1.4 evidence log.

Case: INC-2026-XXXX
Target system: [hostname]
Investigator: [name]

Source device:
- Device path: ___ (e.g., /dev/sda, /dev/mapper/vg0-root)
- Device type: ☐ Physical disk ☐ LVM logical volume ☐ Cloud snapshot volume
- Encryption: ☐ None ☐ LUKS (decrypted for imaging) ☐ Cloud-managed
- Size: ___ GB
- Partition layout command: lsblk -f output attached

Acquisition:
- Tool: ☐ dc3dd ☐ dd (dc3dd unavailable) ☐ Cloud API snapshot
- Command used: ___
- Output format: ☐ Raw (.raw) ☐ Split raw (segment size: ___) ☐ Cloud snapshot
- Hash algorithm(s): ☐ SHA256 ☐ SHA256 + MD5 (legal standard)
- Start time (UTC): ___ End time: ___ Duration: ___
- Bytes transferred: ___ Transfer rate: ___
- Errors: ☐ None ☐ Yes (details: ___)

Verification:
- Acquisition hash (from dc3dd log): ___
- Verification hash (sha256sum on image file): ___
- Match: ☐ Yes ☐ No (investigate discrepancy)

Storage:
- Image stored at: ___
- Encrypted: ☐ Yes ☐ No
- Access restricted to: ___

Decision points: choosing the imaging approach

Investigation requires deleted file recovery: A block-level image is required. UAC triage and other file-level collection capture the filesystem as the OS sees it. Unallocated space, where deleted file data resides, is only accessible through a raw disk image or a cloud snapshot of the underlying block device.

Cloud VM, investigation does not require deleted file recovery: Cloud API snapshot is sufficient and preferred. It is faster, forensically cleaner (point-in-time atomic capture), and does not require SSH access. Mount the snapshot as a volume on your forensic VM for analysis.

Bare-metal server, system must stay online: Live imaging with dc3dd over SSH. The system continues running during imaging. Accept the minor inconsistency risk (files modified during sequential block acquisition). For most investigations, this inconsistency is insignificant.

Legal proceedings anticipated, bare-metal server: Power the system off (after collecting volatile evidence in Phases B-F). Image with dc3dd through a hardware write blocker. This is the strongest evidence integrity position — consistent filesystem state, no write contamination, hardware-enforced write protection. The cost is losing any volatile evidence not already collected and taking the system offline.

LUKS-encrypted disk, system powered off: You cannot image without the decryption key. The encrypted blocks are forensically useless without decryption. Obtain the LUKS passphrase from the system administrator, key escrow, or key management system before attempting to image.

Troubleshooting: common disk imaging issues

dc3dd reports errors during acquisition (bad sectors, I/O errors). Failing disks produce read errors on damaged sectors. dc3dd continues past errors by default and logs them — the acquisition still completes, with zeroed data substituted for unreadable sectors. Document the error count and affected sectors in the acquisition log. If the disk is physically failing, prioritize: image the most investigatively valuable partitions first (root filesystem, /var/log) before the disk fails completely.

Disk image hash does not match when verified on the forensic workstation. First identify which two hashes you are comparing. dc3dd hashes the data as it reads it, so the acquisition hash in the log describes exactly the bytes written to the image. If you imaged a live system and later re-hash the source device, a mismatch is expected because the source kept changing after it was read. A mismatch between the acquisition hash and sha256sum of the image file is not expected: it points to corruption in transfer or storage, an incomplete copy, or hashing a single segment of a split image instead of the concatenated segments. The acquisition hash is the integrity anchor; re-verify the image file against it before each analysis session.

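LVM logical volumes are not visible after setting up the loop device. Run sudo vgscan to scan for volume groups, then sudo vgchange -ay to activate them. If the image's physical volume sits inside LUKS, unlock it first; the volume group is not visible until then. If the volume group name conflicts with an existing VG on your forensic workstation (both named vg0), rename your workstation's VG first (vgrename accepts a VG UUID, as shown by vgs -o vg_name,vg_uuid, when the name is ambiguous) or analyze on a workstation that does not use LVM. The --partial and --activationmode partial options are intended for volume groups with missing physical volumes and do not resolve a name conflict.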

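Cloud snapshot is taking hours and the investigation cannot wait. Cloud snapshots are incremental, but the first snapshot of a large disk copies every used block and can take 30-60 minutes or longer. The captured state is fixed at the moment the snapshot is initiated, so later writes on the source VM do not affect it and other response work can proceed in the meantime. Whether a volume can be created before the copy finishes depends on the provider; AWS, for example, requires the snapshot to reach the completed state first. Once the volume exists, attach it to your forensic VM and begin analysis; the provider typically loads blocks from snapshot storage in the background.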

Disk image is too large for the forensic workstation's storage. Use dc3dd's split output (ofs= together with ofsz=4G) to create segments that can be spread across several volumes or media. Alternatively, image only the specific logical volumes you need (root, /var/log, /home) rather than the entire physical disk. For cloud VMs, you can analyze the snapshot by mounting it as a volume without creating a local image at all.
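
A split image is verified as one logical stream: concatenate the segments in order and hash the result, then compare against the acquisition hash. The sketch below assumes fixed-width numeric suffixes (such as .000, .001), so that the shell's lexical glob order matches segment order. hash_split_image is a hypothetical helper name.

```shell
# Print the SHA256 of a split image, hashed as one stream across all segments.
# Hypothetical helper for illustration.
# Usage: hash_split_image <segment_prefix>   (e.g. image.raw. for image.raw.000, image.raw.001)
hash_split_image() (
  # The glob expands in lexical order, which equals numeric order for
  # fixed-width suffixes.
  cat "$1"* | sha256sum | awk '{print $1}'
)
```

Hashing a single segment with sha256sum will never match the acquisition hash, which is a common source of false mismatch reports.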

Beyond this investigation: LX7 (Persistence Mechanisms) investigates rootkit persistence on disk images acquired with dc3dd, tracing kernel modules, boot configurations, and modified system binaries through forensic disk analysis.

Myth: "A disk image of a live system is just as good as a disk image from a powered-off system."

Reality: A live disk image is acquired while the system continues writing — files may be in an inconsistent state because they were being modified during acquisition. The journal may contain transactions that were in progress. The filesystem metadata may not match the file contents for recently modified files. A powered-off disk image captures a consistent state. For most investigations, a live image is sufficient. For legal proceedings where the defense may challenge evidence integrity, a powered-off image is stronger — but it requires taking the system offline and loses all volatile evidence.

Try it yourself


Create a small test disk image on your forensic workstation.

1. Create a 1GB virtual disk: dd if=/dev/zero of=/tmp/test-disk.raw bs=1M count=1024
2. Format it: mkfs.ext4 /tmp/test-disk.raw
3. Mount it (sudo mkdir -p /mnt/test && sudo mount -o loop /tmp/test-disk.raw /mnt/test), create some test files, then unmount it (sudo umount /mnt/test).
4. Image it with dc3dd: dc3dd if=/tmp/test-disk.raw hash=sha256 of=/tmp/test-disk-image.raw log=/tmp/acquisition.log
5. Verify the hash: compare the hash in the log with the output of sha256sum /tmp/test-disk-image.raw
6. Mount the image read-only and verify the test files are present: sudo mount -o ro,loop /tmp/test-disk-image.raw /mnt/test && ls /mnt/test

Beyond This Investigation

Disk images are the evidence source for LX2 (Filesystem Forensics) — inode analysis, deleted file recovery, and timeline generation all work against disk images or cloud snapshots. The imaging techniques in this subsection produce the raw material that LX2's analysis tools consume.

Check your understanding:

1. The target server uses LUKS full disk encryption. The system is powered off. What do you need before you can acquire a forensically useful disk image?
2. Why is a cloud disk snapshot actually better evidence integrity than a live dc3dd image of the running VM?
3. What mount options should you use when mounting a disk image for analysis, and what does each option prevent?
4. The acquisition log shows dc3dd transferred 500GB but the hash does not match when you verify the image on your workstation. What are the possible causes?

Decision point

You are investigating a Linux server and discover evidence of both a cryptominer (resource abuse) and an SSH key theft (lateral movement preparation). The cryptominer is consuming 95% CPU and impacting production. Which do you address first?

Address the lateral movement first. The cryptominer is visible, noisy, and contained to this server — it is causing performance impact but not spreading. The SSH key theft is silent, potentially already exploited, and may have given the attacker access to additional servers. Contain the lateral movement risk: rotate the stolen SSH keys, check the target servers for unauthorized access, and apply network restrictions. Then address the cryptominer: kill the process, remove the binary and persistence mechanisms. Prioritizing the noisy but contained threat over the silent but spreading threat is the most common Linux IR prioritization mistake.
