DE0.11 Creating a Detection Engineering Baseline

8-10 hours · Module 0 · Free

What you already know

You understand the discipline (Sections 1-2), the gap (Section 3), the detection surface (Section 4), the metrics (Section 5), an attack chain in detail (Section 6), the six chains (Section 7), the ATT&CK framework (Section 8), the lifecycle (Section 9), and the tools (Section 10). This section brings it together — you'll establish your detection engineering baseline, the starting point that every subsequent module improves.

What the baseline is and why it matters

The detection engineering baseline is a snapshot of your detection program's current state — measured, documented, and dated. It captures what your detection library can do right now, before you build a single new rule.

The baseline is the "before" in the before-and-after comparison that the capstone (DE11) produces as the board report. Without the baseline, you can't demonstrate improvement. With it, every new rule you deploy is measurable progress.

The baseline is not a judgment. A coverage percentage of 8% is not a failing grade — it's a starting point. Most organizations are in that range, and the organizations that know their number are already ahead of those that don't. The baseline is the first act of detection engineering: measuring the current state so you can plan the improvement systematically.

Estimated time: 40 minutes.

Scenario

Your CISO asks you to build a detection engineering program. In your first week, you build and deploy five new detection rules. In the capstone three months later, you present the board report. The CISO asks: "What was the improvement?" If you didn't record the baseline before you started, you have no "before" for the before-and-after comparison. The rules may be excellent — but you cannot demonstrate it.

The seven components of your baseline

Your detection engineering baseline includes seven measurements. Together they describe the complete detection program state — what you can detect, how fast, how accurately, with what tools, against what threats.

1. Active rule count and type distribution

How many analytics rules are active in your Sentinel workspace? How many are scheduled, NRT, anomaly, Fusion, and Microsoft Security pass-through? How many are vendor templates and how many are custom? How many have ATT&CK technique mappings?

This is the raw inventory. It tells you the size and composition of the detection library.

A workspace with 35 active rules — 20 templates, 10 custom, 5 Microsoft Security — has a different improvement path than a workspace with 8 custom rules and nothing else. The template rules provide a starting base that custom engineering improves. The custom rules indicate that someone has already done some detection engineering work, and understanding what they built informs what to build next.

Also record how many rules have ATT&CK technique mappings configured. Sentinel supports mapping rules to MITRE ATT&CK techniques and tactics in the rule configuration. Rules without these mappings can't be counted in the coverage calculation — they might detect relevant techniques, but without the mapping, the coverage KQL query won't find them.

Many vendor templates ship with ATT&CK mappings. Most custom rules do not. The audit in Module 1 adds mappings to unmapped rules.

2. ATT&CK coverage percentage

The metric from Section 3 and Section 5. Distinct ATT&CK techniques with at least one active, healthy detection rule, divided by the relevant technique set for your environment. This is the strategic number — it answers "what proportion of the attacks that matter can we detect?"

Record the number and the date. You'll recalculate it monthly.

The first calculation often requires defining your relevant technique set — if you haven't done threat modeling yet, use the full Enterprise ATT&CK matrix as the denominator (approximately 200 techniques). The threat modeling module (DE2) will refine the denominator to your specific environment, which typically increases the percentage (smaller denominator, same numerator) but makes the number more meaningful.

If your rules don't have ATT&CK mappings, this number is unmeasurable. Record that — establishing coverage visibility is the first improvement.
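To make the denominator effect concrete, here is a minimal KQL sketch of the coverage arithmetic. It assumes the reference figures used later in this section (15 covered techniques, 145 relevant techniques) and the approximate full-matrix size of 200 mentioned above. The variable names are illustrative only, not part of any Sentinel schema.

```kql
// Coverage arithmetic only: no workspace tables are queried.
let CoveredTechniques = 15;     // distinct techniques with an active, healthy rule
let RelevantTechniques = 145;   // denominator after threat modeling (reference figure)
let FullMatrixTechniques = 200; // approximate Enterprise ATT&CK technique count
print
    CoverageRelevantPct   = round(100.0 * CoveredTechniques / RelevantTechniques, 1),
    CoverageFullMatrixPct = round(100.0 * CoveredTechniques / FullMatrixTechniques, 1)
```

With these inputs the query returns 10.3 against the refined technique set and 7.5 against the full matrix: the same numerator over a smaller denominator gives a higher, more meaningful percentage.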

3. Mean time to detect (MTTD)

Record MTTD as the median detection latency across your active rules: how many minutes pass between an attack technique executing and the detection rule firing an alert? Despite the "mean" in the name, the median is more useful than the average, because a few hourly rules skew the average badly. Also record the P95 (the 95th percentile), which exposes your slowest rules.

If your workspace doesn't have enough alert data to calculate MTTD (fewer than 20 alerts in 30 days), record "insufficient data." MTTD becomes calculable after you deploy rules that fire regularly.
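One way to approximate the median and P95 is from the SecurityAlert table. The sketch below is not the course's official MTTD query. It assumes that StartTime approximates when the underlying activity began and that the earliest TimeGenerated for an alert approximates when the rule fired. Validate both assumptions against a few known alerts in your own workspace before trusting the numbers.

```kql
// Approximate detection latency from alert metadata (last 30 days).
// Assumption: StartTime ~ activity start, first TimeGenerated ~ alert creation.
SecurityAlert
| where TimeGenerated > ago(30d)
| where isnotnull(StartTime)
| summarize arg_min(TimeGenerated, StartTime) by SystemAlertId  // first record per alert
| extend LatencyMin = datetime_diff('minute', TimeGenerated, StartTime)
| where LatencyMin >= 0                                         // drop clock-skew artifacts
| summarize Alerts = count(),
            MedianMin = percentile(LatencyMin, 50),
            P95Min = percentile(LatencyMin, 95)
// Apply the section's rule: fewer than 20 alerts in 30 days is not enough data.
| extend MTTD = iff(Alerts < 20, "insufficient data",
                    strcat(tostring(MedianMin), " min median, ", tostring(P95Min), " min P95"))
```

If the MTTD column reads "insufficient data", record exactly that in your baseline rather than a number.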

4. False positive rate

The percentage of closed incidents classified as false positive. This requires that your SOC classifies incidents when closing them — True Positive, False Positive, Benign Positive, or Undetermined. If most incidents are closed as Undetermined, the FP rate is unmeasurable. Record that — establishing incident classification discipline is a prerequisite for measuring detection quality.

A false positive rate above 40% indicates that the detection library is producing more noise than signal. A rate below 20% indicates well-tuned rules. Between 20% and 40% is the improvement zone, where the monthly tuning cadence in DE9 systematically drives the rate downward.
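The rate and its band can be sketched in KQL against the SecurityIncident table. This is a hedged example: it assumes your SOC sets the Classification field on closure and that the stored values are "FalsePositive" and "Undetermined" (check the values in your workspace). The band labels are shorthand for the thresholds above, not a Sentinel feature.

```kql
// False positive rate over incidents closed in the last 30 days.
SecurityIncident
| where TimeGenerated > ago(30d)
| summarize arg_max(TimeGenerated, Status, Classification) by IncidentNumber  // latest state per incident
| where Status == "Closed"
| summarize Closed = count(),
            FalsePositives = countif(Classification == "FalsePositive"),
            Undetermined = countif(Classification == "Undetermined" or isempty(Classification))
| extend FPRatePct = round(100.0 * FalsePositives / Closed, 1)
| extend Band = case(
    Closed == 0,    "no closed incidents",
    FPRatePct > 40, "more noise than signal",
    FPRatePct < 20, "well tuned",
                    "improvement zone")
```

If Undetermined accounts for most of Closed, treat the FP rate as unmeasurable and record that instead of the percentage.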

5. Rule health score

The percentage of active rules that are operationally healthy — firing periodically, producing a manageable volume of alerts, and contributing to investigations. Dormant rules (haven't fired in 60+ days), noisy rules (fire 5+ times per day consistently), and broken rules (query tables that have no data) don't count as healthy even though they appear in the active rule count.

Rule health tells you what proportion of your detection library is actually working. A workspace with 35 active rules and 12 healthy ones has a 34% health score — the detection program is not 35 rules, it's 12 functional rules and 23 rules in various states of failure.
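A first pass at rule health can be sketched from alert history. The query below is an approximation under stated assumptions: it groups by AlertName as a stand-in for the rule, includes alerts from every product in SecurityAlert, and reads "5+ times per day consistently" as an average of five or more alerts per day over the 60-day window. Rules that never fired do not appear in the result at all, so compare the output against the rule inventory from component 1 to find dormant and broken rules.

```kql
// Per-rule firing profile over the last 60 days.
SecurityAlert
| where TimeGenerated > ago(60d)
| summarize Alerts = dcount(SystemAlertId),   // dcount is an estimate; fine for triage
            LastFired = max(TimeGenerated)
    by AlertName
| extend AvgPerDay = round(Alerts / 60.0, 2)
| extend Health = iff(AvgPerDay >= 5, "noisy", "firing")
| order by AvgPerDay desc
```

Rules marked "firing" are only candidates for healthy. Whether their alerts contribute to investigations is a judgment the query cannot make.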

6. Data source coverage

Which of the five data source families have data flowing into your workspace? Identity, endpoint, email, cloud apps, infrastructure. Each family enables specific detection domains. Missing families are detection blind spots — not potential future gaps, but current blind spots where attack techniques execute without generating queryable telemetry.

Record which tables have data, how much (events per day), and when the last event arrived. A table that shows "last event: 3 weeks ago" has a disconnected data source — the connector failed or was disabled, and the detection rules that query it are silently broken.

A quick data source inventory query checks all five families in one pass:

KQL

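// Data source family health check (last 30 days)
// isfuzzy=true skips tables that do not exist in the workspace instead of failing;
// a table missing from the output is itself a gap to record.
let lookback = 30d;
union isfuzzy=true
 (SigninLogs | where TimeGenerated > ago(lookback) | summarize LastEvent=max(TimeGenerated), Events=count() | extend Table="SigninLogs", Family="Identity"),
 (AuditLogs | where TimeGenerated > ago(lookback) | summarize LastEvent=max(TimeGenerated), Events=count() | extend Table="AuditLogs", Family="Identity"),
 (DeviceProcessEvents | where TimeGenerated > ago(lookback) | summarize LastEvent=max(TimeGenerated), Events=count() | extend Table="DeviceProcessEvents", Family="Endpoint"),
 (EmailEvents | where TimeGenerated > ago(lookback) | summarize LastEvent=max(TimeGenerated), Events=count() | extend Table="EmailEvents", Family="Email"),
 (CloudAppEvents | where TimeGenerated > ago(lookback) | summarize LastEvent=max(TimeGenerated), Events=count() | extend Table="CloudAppEvents", Family="Cloud Apps"),
 (SecurityEvent | where TimeGenerated > ago(lookback) | summarize LastEvent=max(TimeGenerated), Events=count() | extend Table="SecurityEvent", Family="Infrastructure")
// Keep the raw count next to the daily average so both can go in the baseline.
| project Family, Table, LastEvent, Events, DailyAvg = round(Events / 30.0, 1)
| order by Family asc, Table asc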

Tables returning zero events or a LastEvent older than 7 days indicate a disconnected or misconfigured connector. You'll run this query in your workspace during Module 1.

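NE's data source health check, run with the table list adapted to NE's connectors (OfficeActivity for cloud apps; CommonSecurityLog and Syslog for infrastructure):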

Expected Output

Table                    Family       LastEvent             Events
───────────────────────  ───────────  ────────────────────  ──────
SigninLogs               Identity     2026-05-18T09:47Z     482,910
AuditLogs                Identity     2026-05-18T09:46Z     12,847
DeviceProcessEvents      Endpoint     2026-05-18T09:47Z     1,247,003
EmailEvents              Email        2026-05-18T09:45Z     89,423
OfficeActivity           Cloud Apps   2026-05-18T09:42Z     34,891
CommonSecurityLog        Infra        2026-05-18T09:44Z     567,211
Syslog                   Infra        2026-05-15T22:14Z     8,943   ⚠

All five families have data — except the Syslog connector stopped 3 days ago (the RHEL agent issue identified in Section 5). Six of seven tables are healthy. One is silently broken. This output is the data source component of your baseline.

Export the complete baseline as a single artifact. This PowerShell snippet records the measurements, entered by hand from the steps above, in one dated JSON file:

PowerShell

# Compile your detection engineering baseline.
# The values below are the reference organization's numbers; replace them with your own.
$baseline = [ordered]@{
    GeneratedAt        = (Get-Date -Format "yyyy-MM-dd HH:mm")
    ActiveRules        = 23
    TemplateRules      = 12
    CustomRules        = 11
    RulesWithATTCK     = 15
    CoveragePercent    = 10.3
    RelevantTechniques = 145
    CoveredTechniques  = 15
    MTTD_Median_Min    = 12
    FP_Rate_Percent    = 43
    RuleHealth_Percent = 35
    HealthyRules       = 8
    DataFamilies       = "5 of 5 (Syslog degraded)"
}
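# Explicit UTF-8 avoids the UTF-16 default that Out-File uses in Windows PowerShell 5.1.
$baseline | ConvertTo-Json | Out-File -FilePath ".\baseline-$(Get-Date -Format 'yyyy-MM-dd').json" -Encoding utf8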
Write-Host "Baseline saved. Compare against this in the capstone."

7. Comparison against a reference

Compare your measurements against the reference organization's baseline:

Metric	Reference baseline	Your baseline
Active rules	23 (12 template, 11 custom)
ATT&CK coverage	10.3% (15 of 145)
MTTD (median)	12 minutes
FP rate	43%
Rule health	35% (8 of 23)
Data families	5 of 5

If you're working through this course without a production Sentinel workspace (using only the developer tenant you'll set up in Module 1), record the developer tenant's baseline. It will start at zero across all metrics.

That's the expected starting point — the developer tenant has no rules, no data, and no history. By the capstone, it will have 71 rules, six attack chains of evidence data, measurable coverage, and a functional detection-as-code pipeline. The improvement from zero to a structured program is the transformation the course delivers.

Detection Engineering Principle

The baseline is the first act of detection engineering. Measure before you build. Record the numbers even when they are bad — especially when they are bad. A coverage percentage of 8% is not a failing grade. It is the starting point that makes every subsequent improvement measurable, demonstrable, and fundable.

What you have after Module 0

You have six things that prepare you for the operational modules.

Understanding of the discipline

Detection engineering defined as an engineering discipline, positioned alongside SOC operations, threat hunting, and incident response in the security operations cycle. The six-stage lifecycle (hypothesize, design, build, test, deploy, tune). The adversarial mindset that evaluates every rule from the attacker's perspective.

The engineering practices — version control, code review, automated testing — that distinguish a detection program from ad-hoc rule writing.

Understanding of the gap

Detection coverage quantified against ATT&CK. Three layers of detection failure identified — no rule, wrong signal, drowned alert. Template limitations understood — why vendor-provided rules are a starting point, not a detection strategy.

The cost of the gap measured in dwell time and blast radius.

A complete attack chain walkthrough

CHAIN-HARVEST dissected phase by phase — phishing delivery, AiTM token capture, inbox rule persistence, mailbox reconnaissance, BEC wire fraud. Every phase mapped to specific tables, specific fields, and specific detection signals. The seven rules that catch the chain mapped to specific modules.

Five lessons about detection failure that apply to every rule you'll build.

The six attack chains

CHAIN-HARVEST, CHAIN-MESH, CHAIN-ENDPOINT, CHAIN-PRIVILEGE, CHAIN-DRIFT, CHAIN-FACTORY. Each chain modeled as a realistic campaign with four to seven ATT&CK techniques, cross-family detection opportunities, and mapping to detection modules. Your own environment's relevance to each chain assessed.

The measurement framework

Four metrics defined and calibrated with industry benchmarks: coverage, MTTD, FP rate, and rule health. The reference baseline established. Your own baseline recorded or identified as unmeasurable (with the specific steps needed to make it measurable).

The tools

Microsoft Sentinel analytics rules. Defender XDR Advanced Hunting. KQL as the implementation language.

The lab pack's evidence data, rule templates, and program templates. ATT&CK as the organizing framework. The Navigator as the coverage visualization tool.

Atomic Red Team for testing. The detection-as-code pipeline for deployment. Community resources for ongoing development.

Module 1 teaches detection rule architecture in Microsoft Sentinel — how analytics rules execute, how entity mapping works, how alert grouping affects SOC workflow, how to evaluate any rule's quality, and the rule specification template that every production rule starts from. You'll audit your existing rules against the quality criteria, add ATT&CK mappings where they're missing, and build and deploy your first detection rule through the complete lifecycle.

Go to DE1, Detection Rule Architecture in Microsoft Sentinel, to continue.
