TH1.15 Quality Assurance: Peer Review and Hunt Validation
The error you do not catch
A hunt for inbox rule manipulation queries CloudAppEvents for New-InboxRule operations. The query returns zero results. The analyst concludes: “No malicious inbox rules in the last 30 days.” The hunt record is filed. The detection rule is deployed.
But the attacker used the Graph API to create the inbox rule, not the Outlook client. The Graph API creation path produces a different operation name in CloudAppEvents — or may appear only in MicrosoftGraphActivityLogs. The query was correct for one creation method and blind to another. The conclusion was a false negative. The detection rule has the same blind spot.
A 5-minute peer review would have caught this. “Did you check all inbox rule creation paths — Outlook, OWA, PowerShell, EWS, and Graph API?” The question identifies the gap. The analyst runs an additional query. The conclusion changes.
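The reviewer's question amounts to a coverage check: for each way the technique can be executed, which telemetry source would record it, and did the hunt query that source? A minimal sketch of that check is below. The path-to-table mapping is illustrative only — the table names echo the example above, but the actual operation names and tables for each creation path must be verified in your own tenant.

```python
# Illustrative sketch of a scope-coverage check for a hunt. The mapping of
# creation paths to telemetry tables is an ASSUMPTION for this example, not
# an authoritative reference — verify each path in your own environment.
CREATION_PATHS = {
    "Outlook client": "CloudAppEvents",
    "OWA": "CloudAppEvents",
    "Exchange PowerShell": "CloudAppEvents",
    "EWS": "CloudAppEvents",
    "Graph API": "MicrosoftGraphActivityLogs",
}

def blind_spots(tables_queried: set) -> list:
    """Return creation paths whose telemetry source the hunt never queried."""
    return [path for path, table in CREATION_PATHS.items()
            if table not in tables_queried]

# The hunt in the example queried only CloudAppEvents:
print(blind_spots({"CloudAppEvents"}))  # → ['Graph API']
```

Running the check before the hunt starts turns the reviewer's question into a repeatable step rather than a matter of memory.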
Three review points
Review point 1: Before the hunt — hypothesis and scope review. Before the first query runs, a second analyst (or the SOC lead) reviews the hypothesis and scope definition for 5 minutes.
Review questions: Is the hypothesis testable with the scoped data sources? Are all relevant data tables included? Is the time window appropriate for the technique? Is the population correctly defined? Are there technique variants that the scope should cover but does not?
This review catches the inbox rule example above — the reviewer asks about creation paths before the hunt starts, not after it concludes.
Review point 2: Before closing — hunt record review. Before the hunt is marked complete, a second analyst reviews the hunt record for completeness and methodology adherence.
Review questions: Were all four query funnel steps executed (orientation, indicator, enrichment, pivot)? Were false positives analyzed and documented? Were exclusions justified with specific evidence? Is the conclusion supported by the analysis? If the conclusion is “refuted” (no finding), does the scope cover the technique adequately — or could the technique have been present but invisible due to scope gaps?
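Several of these questions can be pre-screened mechanically before the human review. The sketch below assumes a hunt record stored as a dictionary; the field names (`funnel_steps`, `exclusions`, `scope_gaps`) are hypothetical and would need to match however your team structures its hunt record template from TH1.7.

```python
# Minimal pre-screen for review point 2, assuming a hunt record stored as a
# dict. Field names are HYPOTHETICAL examples, not a standard schema.
REQUIRED_FUNNEL_STEPS = {"orientation", "indicator", "enrichment", "pivot"}

def record_review_findings(record: dict) -> list:
    """Return a list of completeness problems a reviewer should look at."""
    findings = []
    missing = REQUIRED_FUNNEL_STEPS - set(record.get("funnel_steps", []))
    if missing:
        findings.append("funnel steps not executed: %s" % sorted(missing))
    for exc in record.get("exclusions", []):
        if not exc.get("evidence"):
            findings.append("exclusion without evidence: %s" % exc["name"])
    if record.get("conclusion") == "refuted" and record.get("scope_gaps"):
        findings.append("refuted conclusion despite documented scope gaps")
    return findings
```

An empty result does not mean the hunt is sound — it means the record is complete enough for the human review to focus on the analysis itself.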
Review point 3: Before deploying — detection rule review. Before the hunt-derived detection rule goes into production, a second analyst reviews the rule for correctness, exclusion appropriateness, and threshold calibration.
Review questions: Do the rule’s time window and run frequency avoid coverage gaps? Are the exclusions from the FP analysis documented and justified? Could any exclusion be exploited by an attacker (e.g., excluding an IP range that an attacker could route through)? Is the entity mapping correct? Is the severity appropriate?
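The exploitable-exclusion question can be partly automated. One concrete heuristic: flag any excluded IP range broad enough that an attacker could plausibly route traffic through it. The sketch below uses Python's standard `ipaddress` module; the /24 cutoff is an illustrative assumption to tune for your environment, not a standard.

```python
import ipaddress

# Sketch of one automatable rule-review check: flag IP-range exclusions that
# are broad enough for an attacker to route through. The /24 threshold is an
# ASSUMPTION for illustration — tune it to your environment.
MAX_EXCLUSION_PREFIX = 24  # prefixes shorter than this cover too many hosts

def risky_ip_exclusions(exclusions: list) -> list:
    """Return CIDR exclusions wider than the allowed prefix length."""
    flagged = []
    for cidr in exclusions:
        net = ipaddress.ip_network(cidr, strict=False)
        if net.prefixlen < MAX_EXCLUSION_PREFIX:
            flagged.append(cidr)
    return flagged

# A single-host exclusion passes; a /22 (1,024 addresses) is flagged:
print(risky_ip_exclusions(["10.0.0.5/32", "198.51.100.0/22"]))
# → ['198.51.100.0/22']
```

A flagged exclusion is not automatically wrong — it is a prompt for the reviewer to ask whether the justification covers the entire range.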
The peer review checklist
Figure TH1.15 — Three review points for hunt quality assurance. Total time: approximately 20 minutes per hunt. The investment prevents false negatives that undermine the entire program.
For solo hunters
If your team has only one analyst who hunts, peer review is not available. Three adaptations:
Self-review with a checklist. Before closing a hunt, walk through the checklist below. Check each item honestly. The checklist compensates for the absence of a second perspective.
Time-delayed review. Complete the hunt. Wait 24 hours. Re-read the hunt record with fresh eyes. The overnight gap provides cognitive distance that catches errors you missed during the hunt session.
Periodic batch review. Every 3 months, review the last quarter’s hunt records as a batch. Look for patterns — are you consistently missing certain data sources? Are your exclusions getting more permissive over time? Are your conclusions consistently reaching one outcome (always refuted, never confirmed)? Patterns in your own work reveal systematic biases that individual self-review misses.
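The outcome-skew pattern in particular is easy to quantify. The sketch below counts conclusions across a quarter's records and flags the case where nearly every hunt reaches the same outcome; the 90% threshold is an illustrative choice, not an established standard.

```python
from collections import Counter

# Rough batch-review sketch: flag the pattern where a quarter's hunts almost
# all reach the same conclusion. The 0.9 threshold is an ASSUMPTION chosen
# for illustration.
def outcome_skew(conclusions: list, threshold: float = 0.9):
    """Return a warning string if one outcome dominates, else None."""
    outcome, n = Counter(conclusions).most_common(1)[0]
    if n / len(conclusions) >= threshold:
        return ("%d/%d hunts concluded '%s': check for systematic bias"
                % (n, len(conclusions), outcome))
    return None

# Eleven refuted hunts out of twelve trips the check:
print(outcome_skew(["refuted"] * 11 + ["confirmed"]))
```

A skewed distribution is not proof of bias — a mature environment may legitimately refute most hypotheses — but it is exactly the kind of pattern a solo hunter will not notice one hunt at a time.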
Try it yourself
Exercise: Review your first hunt record
If you completed the exercises from TH1.1 through TH1.7, you have a hunt record. Walk through the peer review checklist below against your own record.
For each checklist item, answer honestly: did the hunt do this? If any item is unchecked, consider whether the gap could have affected the conclusion. If it could have, add a note to the hunt record documenting the gap and its potential impact.
This self-review exercise builds the QA habit that becomes automatic after a few campaigns.
The myth: QA adds overhead that reduces the number of hunts completed. The priority is volume — more hunts means more coverage improvement.
The reality: A hunt with a methodology error — a missed data source, a contaminated baseline, an overly narrow scope — produces a false negative that is worse than no hunt at all. The false negative creates documented (but incorrect) assurance that a technique was searched for and not found. Future analysts may deprioritize that technique based on the false negative, allowing the compromise to persist. Twenty minutes of QA per hunt prevents errors that undermine the program’s credibility and operational value. The priority is not volume — it is accuracy. Ten accurate hunts per year outperform twenty flawed hunts.
Extend this process
As the hunting program matures, the peer review process naturally evolves. Initial reviews focus on methodology compliance (did you follow the Hunt Cycle?). Mature reviews focus on analytical quality (did you interpret the data correctly? Did you miss a correlation? Could you have enriched further?). The highest-level reviews focus on strategic questions (are we hunting the right techniques? Is the backlog prioritized correctly? Are we improving coverage in the areas that matter most?). TH15 (Phase 3) covers mature review practices for established hunting programs.
References Used in This Subsection
- Course cross-references: TH1.2 (scope — technique variant coverage), TH1.7 (hunt record template), TH1.6 (detection rule conversion), TH15 (mature review practices)