TH1.4 Analysis: Separating Signal from Noise
The query found 28 anomalies. Now what?
Your collection queries narrowed 350,000 sign-in events to 28 users with authentication from IPs outside their 30-day baseline. You enriched with MFA registration data and reduced to 3 users with both a new IP and a new authentication method in the same window. Those 3 users are your analysis population.
Three is a manageable number. But you cannot escalate all three to IR and declare them compromised. Some — possibly all — have legitimate explanations. A user who traveled to a conference last week and registered a new phone as their MFA device produces exactly the same signal as an AiTM attacker who registered a new MFA method from a stolen session. The data looks identical. The context is different.
// Temporal correlation: was a phishing email delivered before the anomalous sign-in?
let suspectUser = "j.morrison@northgateeng.com";
let anomalyTime = datetime(2026-03-28T14:32:00Z);
EmailEvents
| where TimeGenerated between ((anomalyTime - 48h) .. anomalyTime)
| where RecipientEmailAddress == suspectUser
| where DeliveryAction == "Delivered"
| where ThreatTypes has_any ("Phish", "Malware") or ConfidenceLevel == "High"
| project TimeGenerated, Subject, SenderFromAddress, ThreatTypes, DeliveryAction
// If a phishing email was delivered within 48h before the
// anomalous sign-in, the correlation significantly elevates
// the likelihood that the sign-in is a compromised session

// Geographic enrichment: full sign-in history from this IP
let suspectIP = "203.0.113.47";
SigninLogs
| where TimeGenerated > ago(90d)
| where IPAddress == suspectIP
| summarize
Users = make_set(UserPrincipalName, 20),
UserCount = dcount(UserPrincipalName),
FirstSeen = min(TimeGenerated),
LastSeen = max(TimeGenerated)
// If multiple users authenticate from the same anomalous IP,
// it is likely infrastructure (VPN, proxy, corporate egress)
// If only one user authenticates from it, the signal is stronger

Try it yourself
Exercise: Analyze a hunt result using the five dimensions
Take the results from the TH1.3 exercise (users with new IPs and new MFA registrations). For each user in your result set, enrich across all five dimensions:
User: Check Entra ID for role, department, manager. Is this person expected to travel or use unusual devices?
Temporal: When did the anomalous sign-in occur relative to business hours? Was a phishing email delivered in the 48 hours before?
Geographic: Where is the new IP? Is it a known VPN egress? A residential proxy? A cloud hosting provider?
Behavioral: What did the user do from the new IP? Normal work activity or post-compromise indicators (inbox rules, app consent, file downloads)?
Correlated: How many dimensions show anomalies? One? Two? Three or more?
Record the confidence level (high, medium, low, no finding) and the decision (escalate, investigate further, document and close). This is the analysis section of your hunt record.
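The behavioral dimension can be checked with a query along these lines. This is a sketch, not part of the exercise solution: it assumes the CloudAppEvents table is available in your workspace and that the inbox-rule ActionType values shown match your tenant's audit schema (app consent events live in the Entra ID audit logs, not shown here).

```kql
// Behavioral enrichment sketch: post-compromise inbox-rule activity
// (table availability and ActionType values are assumptions — verify
// against your own workspace schema before relying on this)
let suspectUser = "j.morrison@northgateeng.com";
let suspectIP = "203.0.113.47";
CloudAppEvents
| where TimeGenerated > ago(7d)
| where AccountDisplayName == suspectUser or IPAddress == suspectIP
| where ActionType in ("New-InboxRule", "Set-InboxRule")
| project TimeGenerated, ActionType, AccountDisplayName, IPAddress
| order by TimeGenerated asc
// Inbox rules created from the anomalous IP shortly after the
// anomalous sign-in are a classic post-compromise indicator (T1564.008)
```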
The myth: If hunting finds something unusual, it must be escalated. Failing to escalate is a security failure.
The reality: Anomalies are not findings. They are indicators that require analysis. A hunt that escalates every anomaly without enrichment overwhelms the IR team with false positives and erodes trust in the hunting program. The analysis step exists to separate anomalies (raw signal) from findings (enriched, contextualized evidence). Only high-confidence findings — supported by correlated evidence across multiple enrichment dimensions — warrant IR escalation. Medium-confidence results warrant further investigation within the hunt. Low-confidence results warrant documentation and closure. The quality of a hunting program is measured by the precision of its escalations, not the volume.
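The dimension-counting logic behind this confidence model can be expressed directly in KQL. The sketch below assumes a hypothetical `DimensionHits` table — one row per user with boolean flags you would materialize from the enrichment queries above; the column names are illustrative, not a real schema.

```kql
// Confidence scoring sketch — DimensionHits and its columns are
// hypothetical placeholders for your own materialized enrichment results
DimensionHits
| extend AnomalousDimensions =
    toint(NewIP) + toint(NewMfaMethod) +
    toint(PhishDelivered) + toint(PostCompromiseActivity)
| extend Confidence = case(
    AnomalousDimensions >= 3, "High - escalate to IR",
    AnomalousDimensions == 2, "Medium - investigate further",
    AnomalousDimensions == 1, "Low - document and close",
    "No finding")
| project UserPrincipalName, AnomalousDimensions, Confidence
```

The thresholds mirror the model above: single-dimension signals stay inside the hunt; multi-dimension correlations become findings.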
Extend this framework
The five enrichment dimensions described here apply to identity-based hunts — identity compromise, cloud persistence, privilege escalation. Endpoint hunts (TH9, TH10, TH12) use adapted dimensions: process context (parent process, command line, execution frequency), file context (file path, creation time, digital signature), network context (destination IP, port, protocol, frequency), and device context (device role, patch level, user population). The confidence model is the same — single-dimension signals are indicators, multi-dimension correlations are findings — but the dimensions change to match the data. Each campaign module defines the enrichment dimensions relevant to its technique.
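As a taste of the adapted dimensions, the process-context dimension for an endpoint hunt might look like the sketch below — assuming the Defender for Endpoint DeviceProcessEvents table is ingested into your workspace, with powershell.exe as an arbitrary example binary.

```kql
// Process-context sketch for endpoint hunts: how often does each
// parent process launch powershell.exe? Rare parents are stronger
// signals than common ones (the frequency analog of the IP baseline)
DeviceProcessEvents
| where TimeGenerated > ago(30d)
| where FileName =~ "powershell.exe"
| summarize
    Executions = count(),
    Devices = dcount(DeviceName)
    by InitiatingProcessFileName
| order by Executions asc
```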
References Used in This Subsection
- MITRE ATT&CK Techniques referenced: T1557.001 (Adversary-in-the-Middle), T1078 (Valid Accounts), T1564.008 (Email Hiding Rules)
- Course cross-references: TH0.7 (value of negative findings), TH4 (identity compromise campaign — full implementation of this analysis framework)
You have time for one hunt this quarter. Do you hunt for the threat in the latest advisory or for the gap in your ATT&CK coverage matrix?
Hunt the coverage gap. Advisories describe threats that are CURRENT but may not target NE. Coverage gaps describe techniques that COULD target NE and would succeed undetected. The coverage gap hunt produces a detection rule (closing the gap permanently). The advisory-driven hunt produces a point-in-time assessment (confirming the specific threat is not present today). Both are valuable — but the coverage gap hunt has a longer-lasting impact because it produces a permanent detection improvement.
You understand the detection gap and the hunt cycle.
TH0 showed you what detection rules fundamentally cannot catch. TH1 gave you the hypothesis-driven methodology that closes that gap. Now you run the hunts.
- 10 complete hunt campaigns — from hypothesis through KQL execution through finding disposition, each campaign based on a real TTP
- 70 production hunt queries — every one mapped to MITRE ATT&CK and tested against realistic telemetry
- Advanced KQL for hunting — UEBA composite risk scoring, retroactive IOC sweeps, and hunt management metrics
- Hypothesis-Driven Hunt Toolkit lab pack — 30 days of realistic M365 and endpoint telemetry with multiple attack patterns seeded in
- TH16 — Scaling hunts across a team — the operating model for a production hunt program