TH1.15 Quality Assurance: Peer Review and Hunt Validation
The error you do not catch
A hunt for inbox rule manipulation queries CloudAppEvents for New-InboxRule operations. The query returns zero results. The analyst concludes: "No malicious inbox rules in the last 30 days." The hunt record is filed. The detection rule is deployed.
But the attacker used the Graph API to create the inbox rule, not the Outlook client. The Graph API creation path produces a different operation name in CloudAppEvents — or may appear only in MicrosoftGraphActivityLogs. The query was correct for one creation method and blind to another. The conclusion was a false negative. The detection rule has the same blind spot.
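Closing that blind spot means covering both creation paths in the same hunt. The snippet below is a minimal sketch, not NE's production query: the ActionType values and the messageRules URI segment are assumptions that should be verified against your own tenant's telemetry before reuse.
// Sketch: inbox rule creation via the Exchange/Outlook path
// ActionType values are assumptions -- confirm against live CloudAppEvents data
CloudAppEvents
| where Timestamp > ago(30d)
| where ActionType in ("New-InboxRule", "Set-InboxRule", "UpdateInboxRules")
| project Timestamp, ActionType, AccountDisplayName, IPAddress, RawEventData
// Sketch: inbox rule creation via the Graph API path
// The messageRules URI filter is an assumption -- validate in MicrosoftGraphActivityLogs
MicrosoftGraphActivityLogs
| where TimeGenerated > ago(30d)
| where RequestMethod == "POST"
| where RequestUri has "messageRules"
| project TimeGenerated, UserId, AppId, RequestUri, ResponseStatusCode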
// Detection rule validation: does the new rule cover
// what the hunt query covered?
// Compare the hunt query results against the rule's results
// over the same time window
// Hunt query result count (from hunt record): ____
// Run the detection rule query manually over the hunt's time window:
// If the rule returns fewer results than the hunt query,
// the rule's filters or exclusions may be too aggressive
// If the rule returns MORE results, the rule is less precise
// and may need additional exclusions from the FP analysis
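One way to run that comparison is to count both result sets over the same window in a single query. This is a minimal sketch assuming the hunt targeted CloudAppEvents; the dates, the hunt logic, and the example exclusion are placeholders to replace with the values from the hunt record and the deployed rule.
// Hypothetical window -- substitute the dates from the hunt record
let windowStart = datetime(2024-01-01);
let windowEnd = datetime(2024-01-31);
// The hunt query as executed (placeholder logic)
let HuntResults = CloudAppEvents
    | where Timestamp between (windowStart .. windowEnd)
    | where ActionType == "New-InboxRule";
// The detection rule query, including its exclusions (placeholder exclusion shown)
let RuleResults = CloudAppEvents
    | where Timestamp between (windowStart .. windowEnd)
    | where ActionType == "New-InboxRule"
    | where not(AccountDisplayName has "svc-");
union
    (HuntResults | summarize Results = count() | extend Source = "Hunt query"),
    (RuleResults | summarize Results = count() | extend Source = "Detection rule")
// Fewer rule results than hunt results -> exclusions may be too aggressive
// More rule results -> the rule is broader than the hunt and may need tuning
Writing the comparison as a single query keeps both counts in one result set, which is easier to paste into the hunt record.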
Try it yourself
Exercise: Review your first hunt record
If you completed the exercises from TH1.1 through TH1.7, you have a hunt record. Walk through the peer review checklist below against your own record.
For each checklist item, answer honestly: did the hunt do this? If any item is unchecked, consider whether the gap could have affected the conclusion. If it could have, add a note to the hunt record documenting the gap and its potential impact.
This self-review exercise builds the QA habit that becomes automatic after a few campaigns.
Peer review in practice
Hunt peer review follows the same principle as code review: a second analyst reviews the hunt methodology, queries, and conclusions before the findings are published or acted upon. The reviewer checks:
- Does the hypothesis match the data source queried?
- Do the KQL queries correctly implement the detection logic described in the hypothesis?
- Are the findings supported by the query results, or is the conclusion a stretch?
- Were alternative explanations considered?
At NE, every hunt producing a positive finding goes through peer review before escalation. This prevents two failure modes: false confidence (the hunter sees what they expected to find rather than what the data shows) and missed context (the reviewer may know about a legitimate business process that explains the anomalous activity).
Peer review also builds team capability through knowledge transfer. The reviewer learns the hunter's analytical approach — the specific KQL patterns, the data source selection logic, the interpretation methodology. Over time, the team develops a shared analytical vocabulary and a library of validated patterns that any member can apply. This knowledge distribution is a resilience mechanism: when the team's primary hunter is unavailable, other analysts can execute hunts using the same validated methodology.
The myth: QA adds overhead that reduces the number of hunts completed. The priority is volume — more hunts means more coverage improvement.
The reality: A hunt with a methodology error — a missed data source, a contaminated baseline, an overly narrow scope — produces a false negative that is worse than no hunt at all. The false negative creates documented (but incorrect) assurance that a technique was searched for and not found. Future analysts may deprioritize that technique based on the false negative, allowing the compromise to persist. Twenty minutes of QA per hunt prevents errors that undermine the program's credibility and operational value. The priority is not volume — it is accuracy. Ten accurate hunts per year outperform twenty flawed hunts.
Extend this process
As the hunting program matures, the peer review process naturally evolves. Initial reviews focus on methodology compliance (did you follow the Hunt Cycle?). Mature reviews focus on analytical quality (did you interpret the data correctly? Did you miss a correlation? Could you have enriched further?). The highest-level reviews focus on strategic questions (are we hunting the right techniques? Is the backlog prioritized correctly? Are we improving coverage in the areas that matter most?). TH15 (Phase 3) covers mature review practices for established hunting programs.
References Used in This Subsection
- Course cross-references: TH1.2 (scope — technique variant coverage), TH1.7 (hunt record template), TH1.6 (detection rule conversion), TH15 (mature review practices)
NE environmental considerations
NE's detection environment includes specific factors that influence how hunt-derived detection rules operate:
Device diversity: 768 P2 corporate workstations with full Defender for Endpoint telemetry, 58 P1 manufacturing workstations with basic cloud-delivered protection, and 3 RHEL rendering servers with Syslog-only coverage. Rules targeting DeviceProcessEvents operate with full fidelity on P2 devices but may have reduced visibility on P1 devices. Manufacturing workstations in Sheffield and Sunderland represent a detection gap for endpoint-level detections.
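Before trusting an endpoint-focused hunt conclusion, it is worth confirming which devices actually contribute DeviceProcessEvents telemetry. The following is a rough sketch of that check, assuming a 7-day lookback and the standard Defender for Endpoint tables; grouping by site would additionally require NE's actual device naming scheme. Syslog-only servers will not appear in DeviceInfo at all and have to be tracked separately.
// Which onboarded devices produced process telemetry in the last 7 days?
DeviceInfo
| where Timestamp > ago(7d)
| summarize arg_max(Timestamp, OSPlatform, OnboardingStatus) by DeviceId, DeviceName
| join kind=leftouter (
    DeviceProcessEvents
    | where Timestamp > ago(7d)
    | summarize ProcessEvents = count() by DeviceId
) on DeviceId
| extend HasProcessTelemetry = coalesce(ProcessEvents, 0) > 0
| summarize Devices = count(), WithProcessTelemetry = countif(HasProcessTelemetry) by OSPlatform
// Devices without process telemetry mark where an endpoint hunt's
// "no results" conclusion cannot be trusted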
You have time for one hunt this quarter. Do you hunt for the threat in the latest advisory or for the gap in your ATT&CK coverage matrix?
Hunt the coverage gap. Advisories describe threats that are CURRENT but may not target NE. Coverage gaps describe techniques that COULD target NE and would succeed undetected. The coverage gap hunt produces a detection rule (closing the gap permanently). The advisory-driven hunt produces a point-in-time assessment (confirming the specific threat is not present today). Both are valuable — but the coverage gap hunt has a longer-lasting impact because it produces a permanent detection improvement.
You understand the detection gap and the hunt cycle.
TH0 showed you what detection rules fundamentally cannot catch. TH1 gave you the hypothesis-driven methodology that closes that gap. Now you run the hunts.
- 10 complete hunt campaigns — from hypothesis through KQL execution through finding disposition, each campaign based on a real TTP
- 70 production hunt queries — every one mapped to MITRE ATT&CK and tested against realistic telemetry
- Advanced KQL for hunting — UEBA composite risk scoring, retroactive IOC sweeps, and hunt management metrics
- Hypothesis-Driven Hunt Toolkit lab pack — 30 days of realistic M365 and endpoint telemetry with multiple attack patterns seeded in
- TH16 — Scaling hunts across a team — the operating model for a production hunt program