11.11 Building a Hunting Programme
Introduction
Required role: Microsoft Sentinel Reader (minimum for hunting queries). Sentinel Contributor for bookmark and hunt management.
Individual hunts are valuable. A hunting programme — recurring, structured, tracked, and integrated with detection engineering — is transformative. It systematically reduces the risk of undetected threats, builds institutional knowledge, and creates a feedback loop that continuously improves both hunting and detection capabilities.
Programme components
Cadence. How often do you hunt? Solo operators: 4 hours per fortnight (roughly four two-hour hunts per month). Small teams (2-3 analysts): 8 hours per week (4-6 hunts per month). Dedicated hunting team: full-time continuous hunting.
Hypothesis backlog. Maintain a prioritised list of hunting hypotheses — generated from threat intelligence, MITRE coverage gaps, incident findings, and UEBA anomalies. When a hunting session starts, the analyst picks the highest-priority hypothesis from the backlog rather than inventing one on the spot.
Query library. Store hunting queries in the Sentinel Hunting blade (custom queries) and in Git (for version control and sharing). Organise by MITRE tactic and technique. Tag with data source requirements. Each query should have a description explaining: what it hunts for, what a positive result means, and recommended follow-up actions.
Hunt log. The documented record of all hunts (subsection 11.8). Tracks: what was hunted, when, by whom, and what was found. Enables accountability, coverage tracking, and institutional memory.
Detection integration. Every confirmed hunting finding produces an analytics rule. The hunting programme feeds the detection engineering lifecycle (Module 10.11). Over time, the most productive hunting queries migrate from manual hunts to automated rules — the programme systematically converts unknown threats into known detections.
The hunting cadence for a solo SOC operator
Most readers of this course operate as solo or near-solo SOC analysts. Building a hunting programme alongside incident response, detection engineering, and operational duties requires ruthless time management.
Fortnightly allocation (4 hours, split across two weeks):
Week 1 (2 hours): Execute one hypothesis hunt from the backlog. Follow the full cycle: hypothesise → query → analyse → document → close. If a finding requires a new analytics rule, draft the rule and schedule deployment for the next detection review.
Week 2 (2 hours): Execute one IOC-based hunt (from the latest threat intelligence advisory affecting your industry). Search all relevant tables for the reported IOCs. Document results. If positive, promote to incident.
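The week-2 IOC sweep can be sketched as a single KQL union across the tables most likely to record the indicators. The IP values below are documentation placeholders; substitute the indicators from the advisory you are working:

```kusto
// IOC sweep across sign-in and Office activity logs.
// The IPs below are placeholders — replace them with the advisory's indicators.
let iocIPs = dynamic(["203.0.113.10", "198.51.100.25"]);
let lookback = 30d;
union isfuzzy=true
    (SigninLogs
     | where TimeGenerated > ago(lookback) and IPAddress in (iocIPs)
     | project TimeGenerated, SourceTable = "SigninLogs",
               Entity = UserPrincipalName, Indicator = IPAddress),
    (OfficeActivity
     | where TimeGenerated > ago(lookback) and ClientIP in (iocIPs)
     | project TimeGenerated, SourceTable = "OfficeActivity",
               Entity = UserId, Indicator = ClientIP)
| sort by TimeGenerated desc
```

Extend the union with any other connected tables that carry network indicators (firewall logs, DNS logs); `isfuzzy=true` keeps the query functional even if one of the tables is not present in the workspace.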
Monthly rhythm:
Week 1: Hypothesis hunt (from backlog).
Week 2: IOC hunt (from TI advisory).
Week 3: MITRE ATT&CK gap hunt (pick the highest-priority uncovered technique).
Week 4: UEBA review hunt (investigate the top 5 entities by investigation priority score).
This rotation ensures: every hunting approach is exercised monthly, MITRE coverage systematically improves, and UEBA insights are acted upon rather than ignored.
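The Week 4 UEBA review can start from a query like the following, assuming UEBA is enabled and the BehaviorAnalytics table is populated (the 14-day window is illustrative):

```kusto
// Top 5 users by UEBA investigation priority over the past fortnight —
// the starting point for the Week 4 UEBA review hunt.
BehaviorAnalytics
| where TimeGenerated > ago(14d)
| where InvestigationPriority > 0
| summarize MaxPriority = max(InvestigationPriority), Anomalies = count()
    by UserName
| top 5 by MaxPriority
```

Each of the five entities returned becomes a mini-hunt: review the underlying anomalies and decide whether the behaviour is a genuine threat or explainable legitimate activity.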
Measuring hunting programme effectiveness
Coverage metrics. Percentage of high-priority MITRE ATT&CK techniques hunted in the last quarter. Target: 100% of Priority 1 techniques, 75% of Priority 2. Track with the ATT&CK hunting coverage tracker from subsection 11.9.
Productivity metrics. Hunts completed per month. Bookmarks created per hunt. Incidents created from hunts. Analytics rules created from hunts. These measure the tangible output of the hunting programme.
Quality metrics. Threat confirmation rate (hunts that found real threats ÷ total hunts). Target: 10-25%. Below 10% suggests hypotheses are too vague or data coverage is insufficient. Above 25% suggests you are only hunting for easy-to-find threats (hunt harder).
Time metrics. Average hours per hunt. Time from hunt finding to analytics rule deployment. These measure efficiency — the hunting programme should become more efficient over time as the query library grows and the analyst gains experience.
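If the hunt log is kept in the workspace, these metrics can be computed with one query. The sketch below assumes a custom table named HuntLog_CL with Outcome_s, RuleCreated_s, and HoursSpent_d columns — adapt the table and column names to however your hunt log is actually stored:

```kusto
// Quarterly hunting metrics from an assumed custom hunt-log table.
// HuntLog_CL and its columns are hypothetical — substitute your own schema.
HuntLog_CL
| where TimeGenerated > ago(90d)
| summarize HuntsCompleted = count(),
            ThreatsConfirmed = countif(Outcome_s == "ThreatConfirmed"),
            RulesCreated = countif(isnotempty(RuleCreated_s)),
            AvgHoursPerHunt = avg(HoursSpent_d)
| extend ConfirmationRatePct = round(100.0 * ThreatsConfirmed / HuntsCompleted, 1)
```

The same query, pointed at different time windows, produces the quarter-over-quarter comparison used in the programme review below.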
Hunting programme maturity model
Level 1 — Ad-hoc. Hunting happens when an analyst has spare time or when triggered by an incident. No structure, no tracking, no measurement. Better than no hunting, but produces inconsistent results.
Level 2 — Structured. Regular cadence established (fortnightly or monthly). Hypothesis backlog maintained. Hunts documented in a hunt log. Findings sometimes converted to analytics rules. The solo operator model from this subsection operates at Level 2.
Level 3 — Managed. Dedicated hunting time in analyst schedules (not “if you have time”). MITRE ATT&CK coverage tracking. Every successful hunt produces an analytics rule. Hunting programme metrics reported monthly. Query library actively maintained and shared.
Level 4 — Optimised. Hunting programme integrated with detection engineering lifecycle. UEBA and threat intelligence feeds drive the hypothesis backlog automatically. Notebooks deployed for advanced analysis. Hunt-to-rule conversion time under 1 week. Cross-team intelligence sharing established.
Level 5 — Intelligence-driven. Dedicated hunting team (or dedicated hunting time for senior analysts). Custom threat models based on the organisation’s specific threat landscape. Proactive development of hunting methodologies for emerging threats. Hunt findings published to ISACs and the wider community. Organisation contributes to the collective defence.
Most organisations should reach Level 2 within 3 months and Level 3 within 6 months. Levels 4-5 require dedicated resources and mature detection engineering — typically 12+ months into the Sentinel deployment.
The hunting programme kickstart guide
For analysts starting a hunting programme from zero, follow this 30-day kickstart plan.
Week 1: Foundation. Enable UEBA (if not already). Install ASIM parsers. Import the six hunting query patterns from subsection 11.3 as custom queries in the Hunting blade. Tag each with its MITRE technique. Create the hunt log template from subsection 11.8.
Week 2: First hunt. Pick the highest-priority hypothesis from the backlog — for example: “Has anyone in my environment been targeted by AiTM phishing in the last 30 days?” Run the cross-table correlation query from subsection 11.3. Document results in the hunt log. If findings exist, create bookmarks.
Week 3: IOC hunt. Find the most recent CISA or Microsoft threat advisory relevant to your industry. Extract the IOCs. Run the IOC-driven hunt from subsection 11.1. Document results.
Week 4: MITRE gap hunt. Open the MITRE ATT&CK blade. Pick one uncovered technique from the Initial Access tactic. Write and run a hunting query. Document results. If the hunt was productive, create an analytics rule.
End of Month 1: You have completed 3 hunts, built a hunt log with 3 records, imported 6+ hunting queries, and (ideally) created 1 new analytics rule. The hunting programme is operational. Set the recurring cadence from this subsection and continue.
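As an illustration of the Week 4 gap hunt, a first query for T1078 (Valid Accounts, an Initial Access technique) might look for sign-ins that Entra ID flagged as risky but that nevertheless succeeded. The technique choice and the 14-day window are illustrative:

```kusto
// Illustrative gap hunt for T1078 (Valid Accounts, Initial Access):
// successful sign-ins that carried a medium or high risk assessment.
SigninLogs
| where TimeGenerated > ago(14d)
| where ResultType == "0"                        // successful sign-in
| where RiskLevelDuringSignIn in ("medium", "high")
| project TimeGenerated, UserPrincipalName, IPAddress, Location,
          RiskLevelDuringSignIn, AppDisplayName
| sort by TimeGenerated desc
```

Any hit warrants a closer look at the session: a risky-but-successful sign-in is exactly the pattern a valid-account compromise leaves behind.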
Quarterly programme review
Every quarter, step back and assess the programme as a whole.
Coverage review. How many MITRE ATT&CK techniques have been hunted in the last quarter? How many have analytics rules? How many have neither? Update the coverage tracker.
Hypothesis backlog review. Are the hypotheses in the backlog still relevant? Has new threat intelligence introduced hypotheses that should be prioritised? Remove stale hypotheses (the vulnerability was patched, the campaign ended).
Query library review. Are all hunting queries still functional? Do they reference populated tables? Have any tables been renamed or restructured? Archive non-functional queries. Add queries for new data sources connected since the last review.
Metrics review. Calculate: hunts completed, threat confirmation rate, rules created, average time per hunt. Compare to the previous quarter. Are the metrics improving, stable, or degrading? If degrading: is the hypothesis quality declining, is the data coverage insufficient, or is the analyst overloaded?
Programme adjustment. Based on the review: adjust the cadence if needed, reprioritise the hypothesis backlog, identify training needs (e.g., notebooks for advanced analysis), and set goals for the next quarter.
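Part of the query library review can be automated. The Usage table records ingestion per table, so hunting queries that reference tables absent from this output are candidates for archiving:

```kusto
// Tables that have ingested data in the last 7 days. Hunting queries that
// reference tables missing from this list are likely non-functional.
Usage
| where TimeGenerated > ago(7d)
| summarize VolumeMB = sum(Quantity), LastSeen = max(TimeGenerated) by DataType
| sort by VolumeMB desc
```

Run this at the start of the quarterly review and compare the DataType list against the tables referenced in the query library.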
Solo operator: combining hunting with incident response
The biggest challenge for a solo operator is context-switching between hunting and incident response. An incident arrives mid-hunt — do you drop the hunt, or finish the current query?
The 15-minute rule. If you are mid-hunt and an incident arrives: if the incident is P1/P2 (High/Critical severity), drop the hunt immediately. Bookmark your current position (literally — create a bookmark of the last query result you were reviewing), note where you stopped in the hunt log, and switch to the incident. If the incident is P3/P4 (Medium/Low), finish your current hunting step (up to 15 minutes), bookmark your position, and switch.
The hunt resume protocol. When you return to the hunt after an incident: review your last bookmark and hunt log note. Re-run the last query to refresh context. Continue from where you stopped. This avoids re-doing work or losing the thread of the investigation.
Combining hunting and incident response productively. The best hunting hypotheses come from incidents. After closing an incident, spend 30 minutes adding follow-up hypotheses to the hunting backlog: “The AiTM attacker used IP 203.0.113.47. Hunt: are there other accounts with sign-ins from the same ASN?” “The BEC attacker created inbox rules with financial keywords. Hunt: do any other accounts have rules with similar keywords?” These incident-derived hypotheses are the highest-value hunts because they are grounded in confirmed attacker behaviour in your environment.
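The inbox-rule hypothesis above can be expressed directly in KQL. The keyword list and 30-day window below are illustrative — substitute the exact values observed in your incident:

```kusto
// Hunt: other accounts that created or modified inbox rules matching the
// attacker's financial-keyword pattern. Keywords are illustrative placeholders.
let financialKeywords = dynamic(["invoice", "payment", "wire", "remittance", "bank"]);
OfficeActivity
| where TimeGenerated > ago(30d)
| where Operation in ("New-InboxRule", "Set-InboxRule")
| extend RuleParameters = tostring(Parameters)
| where RuleParameters has_any (financialKeywords)
| project TimeGenerated, UserId, Operation, ClientIP, RuleParameters
| sort by TimeGenerated desc
```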
This query — derived from the M12 BEC investigation — hunts for other accounts that may have the same attacker persistence mechanism. It runs in 30 seconds and could discover a compromise that has been silently operating for weeks.
Hunting programme ROI calculation
Management will ask: “Is this worth the analyst’s time?” Prepare the answer before they ask.
Cost of hunting: Hours per month × analyst hourly cost. For a solo operator at 4 hours/fortnight: ~8 hours/month. At £50/hour (fully loaded cost): £400/month.
Value of hunting: Each confirmed finding that prevents a security incident has a calculable value. The average BEC incident costs £65,000 (IC3 data). The average ransomware incident costs £100,000+ (recovery, downtime, reputation). If the hunting programme prevents one incident per year, the ROI is: (£65,000 saved) ÷ (£4,800 annual cost) = 13.5x return.
The intangible value: Hunting builds institutional knowledge about your threat landscape. It systematically closes detection gaps. It produces analytics rules that run permanently — each hunt makes the automated detection better. These compounding benefits are difficult to quantify but are the primary long-term value of the programme.
Present this to management: “The hunting programme costs £400/month in analyst time. It produces [X] analytics rules per month, has confirmed [Y] threats this quarter, and has improved MITRE ATT&CK coverage from [Z1]% to [Z2]%. One prevented BEC incident pays for more than 13 years of hunting.”
Integrating hunting with the detection engineering lifecycle
The hunting programme and the detection engineering programme (Module 10.11) form a virtuous cycle.
Hunting → Detection: A hunt finds a novel technique. The hunter creates an analytics rule. Future occurrences are detected automatically. The technique moves from “unknown” to “detected.”
Detection → Hunting: A detection gap is identified in the MITRE ATT&CK coverage analysis. The gap is added to the hunting hypothesis backlog. A hunt validates whether the technique is present in the environment. If present, an analytics rule is created. If absent, the negative finding is documented and the technique is periodically re-hunted.
Incident → Hunting: An incident reveals a compromised account. The investigation is contained and closed. But: did the attacker compromise other accounts? Did they establish persistence mechanisms the investigation missed? A follow-up hunt answers these questions — extending the investigation scope beyond the specific incident.
UEBA → Hunting: UEBA flags entities with elevated investigation priority scores. The hunting programme reviews these entities — investigating whether the anomalous behaviour represents a genuine threat or legitimate activity.
Over months and years, this cycle transforms the security operation. The analytics rule library grows. The hunting query library grows. The MITRE ATT&CK coverage improves. The mean time to detect decreases. The mean time to respond decreases. The attackers’ window of undetected operation shrinks — from months (before hunting) to days or hours (with an active programme).
Module 7 built the workspace. Module 8 connected the data. Module 10 built the automated detection layer. Module 11 built the proactive hunting layer. Together, these four modules provide: a configured workspace (M7), comprehensive data coverage (M8), automated detection for known threats (M10), proactive hunting for unknown threats (M11), automated response via playbooks (M10.7), operational dashboards via workbooks (M10.9), and continuous improvement via the detection engineering lifecycle (M10.11) fed by hunting findings (M11). This is a fully operational, continuously improving security operations capability built on Microsoft Sentinel.
Try it yourself
Draft a one-page hunting programme plan for your environment. Include: cadence (how often), hypothesis backlog (initial 5 hypotheses with priority), query library (starting with the queries from subsection 11.3), hunt log format (from subsection 11.8), and detection integration (how hunt findings feed into the Module 10.11 detection review). This plan is the operational document that turns ad-hoc hunting into a structured programme.
What you should observe
A complete hunting programme plan on one page: cadence matches your available time, the hypothesis backlog reflects your environment's threat profile, the query library leverages the patterns from this module, and the detection integration connects hunting to the monthly detection review. This plan is your roadmap for the first 3 months of structured hunting.
Knowledge check
NIST CSF: DE.AE-1 (Baseline of operations established), PR.DS-1 (Data-at-rest is protected). ISO 27001: A.8.15 (Logging), A.8.16 (Monitoring activities). SOC 2: CC7.2 (Monitor system components). The hunting programme practices in this subsection contribute to the monitoring controls that auditors verify.
Check your understanding
1. What is the recommended hunting cadence for a solo SOC operator?