
DE0.3 The Detection Coverage, The Gap, The Illusion

8-10 hours · Module 0 · Free
What you already know

Sections 1 and 2 defined detection engineering and explained why organizations need it. This section makes the case concrete with a number. You'll understand how detection coverage is actually calculated, why rule count tells you nothing about what your rules can detect, and what the three layers of detection failure look like in a real environment.

The number nobody wants to hear

Twenty-three analytics rules. That is the entire detection library of a typical mid-size organization: a manufacturing company running M365 E5, Microsoft Sentinel, and Defender XDR across endpoints, email, identity, and cloud applications. A managed SOC partner provides 24/7 monitoring.

They passed their ISO 27001 surveillance audit three months ago. Their CISO reviews the incident dashboard weekly. By every operational metric their security program looks functional.

Twenty-three rules. Twelve are Microsoft-provided templates, enabled during the initial Sentinel deployment two years ago, never tuned, never reviewed. Eleven are custom rules the CISO wrote herself — password spray detection, suspicious inbox rules, a handful of identity anomalies. Basic threshold logic. Minimal entity mapping. No testing against the actual techniques they claim to detect.

Map those 23 rules to the MITRE ATT&CK techniques relevant to a manufacturing company running M365 E5, and the number that comes back is 10.3%. Fifteen of 145 relevant techniques have at least one detection rule. The other 89.7% — 130 techniques — produce telemetry in the workspace but trigger no alerts, create no incidents, and generate no response.

The telemetry is collected. The data sits in the workspace. Nobody examines it for attack patterns.

That 10.3% is not unusual. It is typical.

Estimated time: 40 minutes.

Scenario

Your CISO asks: "We have 35 active analytics rules. Are we protected?" You check: 12 are templates enabled during deployment, 8 are custom rules the previous security engineer wrote, 15 are Microsoft Security pass-throughs. Nobody has mapped them to ATT&CK. What do you actually know about the organization's detection capability from these numbers?

What detection coverage actually means

Detection coverage is not a count of analytics rules.

An organization with 100 rules that all detect variants of the same technique — different thresholds for brute force, different time windows for password spray, different exclusion lists for failed authentication — has 100 rules and coverage for one technique. An organization with 15 rules that each target a different technique across different ATT&CK tactics has fewer rules and vastly better coverage.
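A toy inventory makes the difference visible. The sketch below uses an inline datatable with made-up rule names and technique mappings (all hypothetical, not taken from any real workspace) and compares the rule count with the number of distinct techniques those rules cover.

```kql
// Hypothetical rule inventory: names and ATT&CK mappings are illustrative only
let RuleInventory = datatable(RuleName: string, Technique: string) [
    "Brute force: 5 failures in 5m",   "T1110",
    "Brute force: 20 failures in 1h",  "T1110",
    "Brute force: excluding VPN",      "T1110",
    "Inbox forwarding rule created",   "T1114.003",
    "New OAuth app consent",           "T1528"
];
RuleInventory
| summarize RuleCount = count(), TechniquesCovered = dcount(Technique)
```

Five rules, three techniques. Adding a sixth brute force variant raises RuleCount and leaves TechniquesCovered unchanged.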

The distinction matters because rule count is the metric most organizations report. "We have 45 analytics rules" appears in security reviews, board reports, and managed SOC dashboards. It tells you nothing about what those 45 rules can detect. Five of them might detect the same technique with different thresholds.

Thirty might produce so many false positives that analysts auto-close them without investigation. Ten might query data sources that stopped ingesting six months ago. Rule count measures effort — someone configured rules. Coverage measures capability — the rules detect these specific techniques.
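One way to check for the last of those failure modes, rules that query tables which have gone quiet, is to look at ingestion recency. This is a minimal sketch, assuming the standard Log Analytics Usage table is populated in your workspace. The 7-day silence cutoff is an assumption; adjust it to your ingestion patterns.

```kql
// Tables that ingested data at some point in the last 90 days
// but have received nothing in the last 7 days
Usage
| where TimeGenerated > ago(90d)
| summarize LastIngested = max(TimeGenerated) by DataType
| where LastIngested < ago(7d)
| order by LastIngested asc
```

A table that has been silent for longer than the lookback window will not appear in the output at all, so compare the result against the list of tables your active rules actually reference.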

Coverage is measured against the techniques relevant to your environment — not the entire ATT&CK matrix. The full matrix contains over 200 techniques across 14 tactics. Many are irrelevant to your technology stack, industry, and threat landscape.

A healthcare organization doesn't need detection for industrial control system techniques. A cloud-native SaaS company doesn't need detection for physical access or removable media techniques. A financial services firm with no Linux infrastructure doesn't need detection for Linux-specific credential access techniques.

Your relevant technique set is determined by three factors. Your industry's threat actors and their documented techniques — Mandiant's M-Trends, CrowdStrike's threat reports, and CISA advisories identify which threat groups target which industries and which techniques they use.

Your technology stack and the techniques it enables — an M365 E5 environment with MDE, Defender for Office 365, and Sentinel has a specific set of attack techniques that work against it. Your crown jewels and the attack paths that reach them — engineering IP stored on SharePoint, financial data accessible through Exchange, identity infrastructure in Entra ID.

The coverage formula: count the distinct ATT&CK techniques with at least one active detection rule. Divide by the number of techniques in your relevant set. That percentage is your detection coverage. It is the single most important metric for evaluating a detection program. When the content modules begin, you'll run the KQL query that calculates it for your own workspace.

The concept, expressed in KQL against the SecurityAlert table, is straightforward:

KQL
// Coverage: distinct ATT&CK techniques with at least one alert in 90 days
SecurityAlert
| where TimeGenerated > ago(90d)
| where Status != "Dismissed"
// Techniques holds the alert's ATT&CK technique IDs as a delimited string
| mv-expand Technique = split(Techniques, ",") to typeof(string)
| extend Technique = trim(@'[\s\[\]"]+', Technique)
| where isnotempty(Technique)
| summarize RuleCount = dcount(AlertName) by Technique
| summarize CoveredTechniques = count()

This counts techniques that have produced at least one alert. It's a starting point — the full coverage query in DE2 maps your active rule configurations (not just fired alerts) to ATT&CK techniques, giving you coverage even for rules that target rare techniques that haven't occurred in 90 days.

NE's result from this query:

Expected Output
CoveredTechniques
─────────────────
15

Fifteen techniques. NE has 145 relevant techniques in their filtered ATT&CK set. That's 10.3% coverage. Twenty-three active rules covering 15 techniques — some techniques have multiple rules, most have none. The number feels small because it is small. The question the rest of this course answers is: which of the remaining 130 techniques do you build for first, and how?
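The arithmetic behind that percentage, as a quick check you can run in any KQL query window. The two input values are the figures quoted above; nothing here reads from a workspace table.

```kql
// Worked example using the figures from the text: 15 covered, 145 relevant
let CoveredTechniques = 15;
let RelevantTechniques = 145;
print CoveragePercent = round(100.0 * CoveredTechniques / RelevantTechniques, 1),
      UncoveredTechniques = RelevantTechniques - CoveredTechniques
```

This returns a CoveragePercent of 10.3 and an UncoveredTechniques count of 130. Swap in your own counts once you have a filtered technique set.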

For comparison, here's what a community detection rule looks like in Sigma — the vendor-agnostic format that detection engineers share across platforms. This is the Sigma rule for password spray that NE's template rule is based on:

Sigma
title: Password Spray Detection via Azure AD Sign-In Logs
id: 4e4d35c9-7b3d-4a17-9e7e-1b6d0c3f8a2e
status: stable
description: Detects multiple failed sign-ins from a single IP
  against many distinct user accounts — credential spray pattern
author: Sigma Community
date: 2025/03/15
tags:
  - attack.credential_access
  - attack.t1110.003
logsource:
  product: azure
  service: signinlogs
detection:
  selection:
    ResultType: '50126'
  timeframe: 10m
  condition: selection | count(TargetUserPrincipalName) by IpAddress > 10
level: high

Sigma is platform-agnostic — the same rule compiles to KQL for Sentinel, SPL for Splunk, and Lucene for Elastic. Detection engineers use Sigma rules as a starting point, then customize the thresholds, exclusions, and entity mapping for their specific environment. NE's template password spray rule is a compiled version of this Sigma rule — deployed without customization, which is why it fires on the shared VPN infrastructure where 15 users authenticate through the same IP every morning.
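To make the "compiled" form concrete, here is a hand-written approximation of the KQL such a rule might translate to. It is not the output of any specific Sigma converter, and the mapping of the Sigma fields to the SigninLogs columns UserPrincipalName and IPAddress is an assumption.

```kql
// Hand-written approximation of the Sigma rule above; not actual converter output
SigninLogs
| where ResultType == "50126"          // invalid username or password
| summarize TargetedAccounts = dcount(UserPrincipalName)
    by IPAddress, bin(TimeGenerated, 10m)
| where TargetedAccounts > 10
```

Nothing in this query knows which IP addresses are shared egress points. Any address that fronts more than 10 users with failed sign-ins in a 10-minute window crosses the threshold, whether it belongs to an attacker or to your own VPN.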

Three layers of detection failure

[Figure: three layers of detection failure. Layer 1, no rule exists: the technique executes and telemetry is recorded, but no rule queries it. Layer 2, rule misses the variant: a rule exists but targets the wrong signal. Layer 3, alert drowns in noise: the rule fires correctly, but 47 other alerts compete for analyst attention.]

Figure DE0.3 — Detection failure operates in three layers that compound. The majority of techniques (Layer 1) have no rule. Rules that exist may target the wrong signal (Layer 2). Rules that fire correctly may drown in false positive noise (Layer 3).

The 10.3% number raises the question: if Microsoft provides analytics rule templates built by their threat intelligence team, why don't they provide adequate coverage?

The templates are competently built. The problem is what happens after you enable them. Detection failure operates in three layers, each harder to identify than the last, and the layers compound.

Layer 1 — No rule exists

The attack technique executes and no analytics rule queries for it. The telemetry sits in the workspace — DeviceProcessEvents captured the process execution, AuditLogs captured the directory change, EmailEvents captured the phishing delivery — but no rule examines that data for the attack pattern. This is the primary gap: 130 of 145 relevant techniques have no detection rule at all.

The evidence is collected and never examined.

Layer 1 is the simplest to understand and the simplest to close. You write a rule. The technique moves from "invisible" to "monitored." Every rule you build in DE3-DE8 closes a Layer 1 gap.

Layer 2 — A rule exists but misses the variant

The impossible travel rule uses a geographic distance calculation between two sign-in locations. It fires when two sign-ins from the same user occur in different cities within a time window that makes physical travel impossible. This works against an attacker in Eastern Europe accessing an account based in the UK.

It does not work against an attacker using a residential proxy IP in the same country as the targeted account. A proxy 120 miles from headquarters is well within the threshold.

A typical mid-size organization has 100+ employees who connect via VPN from residential ISPs. Field engineers fly between sites three times a week, appearing in two cities 200 miles apart within 30 minutes. The impossible travel rule fires on these legitimate patterns constantly. Analysts mark the alerts false positive.

The rule fires again. Analysts stop investigating impossible travel alerts entirely. The rule exists in the active rule count. The detection is operationally dead.

Layer 2 is harder to close because it requires understanding both the attack variant and the legitimate traffic that produces similar telemetry. The impossible travel rule fails not because the KQL is wrong but because the detection logic targets the wrong signal.

Geographic distance is a weak indicator when attackers use in-country proxies. Device fingerprint divergence within a session is a strong indicator — and it doesn't produce false positives against VPN users because their device fingerprint stays consistent even when their IP changes. Replacing one signal with another requires a detection engineer who understands both the attack technique and the environment's legitimate patterns.
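A rough sketch of what querying for that signal could look like. It assumes the operating system and browser values in the SigninLogs DeviceDetail column are an acceptable stand-in for a device fingerprint, and it approximates "a session" with a one-hour window per user. Both are simplifying assumptions, not the course's method.

```kql
// Sketch: users whose successful sign-ins show more than one OS/browser
// combination inside a one-hour window. Window and fingerprint are assumptions.
SigninLogs
| where TimeGenerated > ago(1d)
| where ResultType == "0"                        // successful sign-ins only
| extend Device = parse_json(tostring(DeviceDetail))
| extend Fingerprint = strcat(tostring(Device.operatingSystem), "|", tostring(Device.browser))
| summarize Fingerprints = make_set(Fingerprint), SourceIPs = make_set(IPAddress)
    by UserPrincipalName, bin(TimeGenerated, 1h)
| where array_length(Fingerprints) > 1
```

A VPN user whose IP changes keeps one fingerprint and does not match. A user who signs in from a phone and a laptop in the same hour does match, so a production version would need a stronger fingerprint or a known-device baseline.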

Layer 3 — A rule detects correctly but the alert drowns

The rule fires on genuine attacker activity. The alert appears in the incident queue. But the analyst sees 47 alerts that morning — 38 are false positives from untuned templates, 4 are benign positives (correct behavior that's authorized), and 5 need investigation.

The genuine alert sits at position 23, classified medium severity because the rule lacks the enrichment and context that would elevate it. The analyst reaches it four hours later. The attacker has already established persistence, moved laterally, and started collecting data.

Layer 3 is the most insidious because the detection technically works. The rule fires. The alert is created.

The metric says "detection succeeded." But the alert is operationally invisible because it competes with noise for finite analyst attention. A rule that fires correctly at a volume analysts can't process is not a detection — it's noise with occasional signal. The tuning methodology in DE9 addresses Layer 3 by systematically reducing false positive rates until the signal-to-noise ratio makes every alert worth investigating.
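In the 47-alert morning described above, 38 of 47 is roughly 81% false positives. To see where that noise comes from in your own workspace, one option is to group closed incidents by title and count how many were classified as false positives. This sketch assumes analysts set the classification when closing incidents and that the incident title is a reasonable proxy for the originating rule.

```kql
// Closed incidents per title over 30 days, with the false-positive share
SecurityIncident
| where TimeGenerated > ago(30d)
| summarize arg_max(TimeGenerated, Status, Classification, Title) by IncidentNumber  // latest state per incident
| where Status == "Closed"
| summarize Total = count(),
            FalsePositives = countif(Classification == "FalsePositive")
    by Title
| extend FalsePositivePercent = round(100.0 * FalsePositives / Total, 1)
| order by FalsePositives desc
```

The titles at the top of this list are the first candidates for tuning.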

These three layers compound. Layer 1 leaves most techniques unmonitored. Layer 2 makes the monitored techniques unreliable. Layer 3 buries the reliable detections in noise. The result is a detection program that looks operational — rules active, alerts flowing, incidents created — and fails to detect the attacks that matter.

What we see in 90% of environments (and why it fails)

The managed SOC reports "comprehensive monitoring" because alerts flow from Sentinel. The CISO reports "detection capability" because analytics rules are active. The ISO auditor checks the "SIEM deployed" box because the control exists on paper. Nobody has mapped the active rules to the techniques relevant to the organization's threat landscape. The coverage percentage has never been calculated.

When it is calculated for the first time, the reaction is consistent: disbelief, then concern, then urgency. A CISO who believed their 35 rules provided reasonable coverage discovers they cover 9% of relevant techniques. A SOC lead who reported "24/7 monitoring" discovers that the monitoring covers one-tenth of the attack surface. An IT director who believed "we'd know if we were compromised" discovers that 91% of the techniques an attacker would use against their environment produce zero alerts.

The gap between perceived coverage and actual coverage is almost always larger than anyone expected. That gap is the business case for detection engineering — and it's measurable from day one.

The cost of the gap

The detection gap has a dollar value. Multiple industry data sources converge on the same finding: organizations that detect breaches through their own detection capabilities have substantially lower breach costs and shorter dwell times than organizations where the attacker announces themselves through a ransomware note or a third party reports the breach.

The mechanism is straightforward. An attacker detected at initial access — within the first hours — has established a foothold and possibly one lateral movement step. The blast radius is contained. The remediation is a single compromised account, a single endpoint, a single persistence mechanism. The IR engagement is days, not weeks.

An attacker who operates undetected for the industry median dwell time — which Mandiant's M-Trends reports measure at 10 days for internally detected incidents and considerably longer for externally reported ones — has established multiple persistence mechanisms across different layers (identity, endpoint, cloud application), escalated privileges, identified crown jewels through internal reconnaissance, staged data for exfiltration, and potentially already exfiltrated sensitive material.

The blast radius is the environment. Remediation means rebuilding infrastructure, resetting all credentials, auditing every system the attacker could have touched, notifying regulators and affected parties, and conducting a forensic investigation that may span months.

At 10.3% coverage, an attacker using any technique in the 89.7% blind spot generates zero alerts.

The attack is not detected until it produces visible business impact — encrypted file shares, a fraudulent wire transfer, a customer data notification from law enforcement, or a threat actor post on a dark web forum. At that point, the containment conversation starts with "how bad is it?" rather than "we caught it early."

Detection engineering reduces dwell time by expanding coverage to the techniques that matter. Not all 200+ ATT&CK techniques — the subset that your threat landscape analysis identifies as high-probability, high-impact, and data-available.

That prioritized approach is what turns a 10% coverage baseline into a 48% or 65% coverage program. You'll learn the prioritization methodology in DE2 (Threat Modeling). The gap quantification starts in the next section.

Why template rules are a starting point, not a strategy

Microsoft's analytics rule templates are built for the broadest possible customer base. A template for suspicious PowerShell execution uses thresholds calibrated for an average enterprise — not for your development team that runs PowerShell automation 3,000 times per day.

A template for impossible travel uses distance calculations that don't account for your VPN topology. A template for suspicious inbox rules checks for forwarding but not for the folder-move rules that sophisticated attackers use instead.

The templates work at scale for Microsoft because they reduce median dwell time across millions of tenants. Any detection is better than no detection, and templates provide baseline coverage for commodity attacks that hit everyone.

But the templates fail for your specific organization because your organization is not the median. Your legitimate traffic patterns, your VPN architecture, your admin behaviors, your LOLBin usage in production scripts — these are specific to your environment and invisible to a template designed for millions of different environments.

Consider what "tuning" a template actually means. You enable the template. It fires 40 times in the first week: 38 false positives, 2 legitimate detections.

You raise the threshold to reduce false positives. Now it fires 5 times per week: 3 false positives and 2 legitimate detections. But 3 actual attacks fall below the new threshold and never fire. You've traded false positives for missed detections. The template's one-size-fits-all logic forces this trade-off because it uses generic signals (thresholds, counts, time windows) rather than environment-specific signals (device fingerprints, behavioral baselines, session correlation).
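The trade-off can be reproduced with a handful of rows. The data below is entirely hypothetical: five sources with a failure count and a ground-truth label, evaluated at two thresholds.

```kql
// Hypothetical week of events: failure count per source and ground-truth label
let Events = datatable(Source: string, FailureCount: int, IsAttack: bool) [
    "kiosk-01",   12, false,
    "vpn-egress", 18, false,
    "svc-legacy", 30, false,
    "spray-a",    14, true,
    "spray-b",    40, true
];
Events
| extend FiresAt10 = FailureCount > 10, FiresAt25 = FailureCount > 25
| summarize
    FalsePositivesAt10 = countif(FiresAt10 and not(IsAttack)),
    MissedAttacksAt10  = countif(not(FiresAt10) and IsAttack),
    FalsePositivesAt25 = countif(FiresAt25 and not(IsAttack)),
    MissedAttacksAt25  = countif(not(FiresAt25) and IsAttack)
```

At a threshold of 10 the rule produces 3 false positives and misses nothing. At 25 it produces 1 false positive and misses 1 attack. A single count cannot separate the slow spray from the noisy kiosk.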

A custom detection rule designed by a detection engineer who understands the environment can largely avoid this trade-off.

Instead of "more than 10 failed authentications from a single IP in 10 minutes" (which catches brute force but also catches users with expired passwords), the engineer writes "more than 10 failed authentications from a single IP against more than 5 distinct user accounts in 10 minutes, where the IP is not in the corporate or VPN range" — a hypothesis that targets the specific pattern (distributed spray) rather than the generic indicator (failed auth count). The threshold doesn't need to be raised because the logic is precise enough to avoid the false positives the template produces.
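Expressed as a query, that hypothesis might look like the following sketch. The CIDR ranges are placeholders taken from the RFC 5737 documentation ranges, not real corporate or VPN egress ranges, and the thresholds are the ones quoted in the text.

```kql
// Sketch of the customized logic. TrustedRanges holds placeholder
// documentation ranges; replace them with your own egress ranges.
let TrustedRanges = dynamic(["192.0.2.0/24", "198.51.100.0/24"]);
SigninLogs
| where ResultType == "50126"                    // invalid username or password
// coalesce keeps rows where the range check returns null (for example IPv6 sources)
| where not(coalesce(ipv4_is_in_any_range(IPAddress, TrustedRanges), false))
| summarize FailedAttempts = count(),
            TargetedAccounts = dcount(UserPrincipalName)
    by IPAddress, bin(TimeGenerated, 10m)
| where FailedAttempts > 10 and TargetedAccounts > 5
```

A single user retrying an expired password produces many failures against one account and never satisfies the TargetedAccounts condition. Shared VPN egress is excluded before any counting happens.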

Detection engineering exists because this precision is environment-specific. The rule that works at one organization produces 200 false positives per day at a software company with different authentication patterns.

The rule tuned for a cloud-only startup misses attacks at a manufacturing company with hybrid infrastructure and physical access vectors. Only someone who understands the specific environment can build detection rules that work in the specific environment. That someone is the detection engineer.

Detection Engineering Principle

Rule count measures effort. Coverage percentage measures capability. An organization with 100 rules targeting the same 5 techniques has less detection capability than an organization with 20 rules targeting 20 different techniques. The metric that matters is distinct techniques covered, not total rules deployed.

Next

Section 4 maps the Microsoft detection surface — the five data source families where attack telemetry lives. You need to understand what data your workspace contains before you can build detection rules that query it.
