TR0.2 The Triage Decision

3-4 hours · Module 0 · Free
What you already know

Section 0.1 established why the first 60 minutes matter: evidence decays, attackers advance, and triage speed determines whether you contain or recover. This section covers the decision framework itself: the four possible classifications, the cost of getting each one wrong, and the confidence threshold that separates "close" from "escalate."

Scenario

Tom picks up an impossible travel alert for j.morrison — authenticated from London at 09:00 and Singapore at 09:05. He's seen this alert before. j.morrison uses a cloud VPN service that routes through Singapore, and this exact alert fires every Monday morning. Tom closes it as a false positive. This Monday, j.morrison is not using the VPN. An attacker replaying a stolen session token from a Singapore residential proxy has just been given free access to j.morrison's account because the 200th alert looked identical to the previous 199.

The four triage outcomes

Every alert resolves to one of four classifications. The triage responder's job is to determine which one within 15 minutes using data, not intuition.

FOUR TRIAGE OUTCOMES: THE DECISION MATRIX

TRUE POSITIVE: Alert real. Attacker active. Preserve → contain → escalate. Confidence: >70%.

FALSE POSITIVE: Alert wrong. Legitimate activity. Document reason → close → tune. Confidence: >90%.

BENIGN TRUE POSITIVE: Alert correct, activity authorized. Document authorization → close. Requires verification.

INDETERMINATE: Cannot classify. Data insufficient. Treat as probable TP → escalate. Default: escalate.

Note the confidence asymmetry: closing as FP requires >90% confidence. Escalating as TP requires only >70%.

Figure TR0.2 — Four triage outcomes. The indeterminate category defaults to escalation — when you can't classify, treat it as a probable true positive until evidence proves otherwise.

True positive (TP) — the alert detected genuine malicious activity. The CHAIN-HARVEST spray alert at NE was a true positive: 24 accounts sprayed, one succeeded, attacker active within minutes. The triage response: preserve volatile evidence, execute containment, produce the triage report, hand off to investigation.

False positive (FP) — the alert fired on legitimate activity that matched the detection rule's criteria. A sign-in from a hosting provider IP triggered the "suspicious IP" rule, but the user was connected to a legitimate VPN. The triage response: document the specific reason (user confirmed VPN use, IP belongs to known provider), close the alert, submit a tuning request to detection engineering.

Benign true positive (BTP) — the alert correctly detected the activity, and the activity genuinely occurred, but it was authorized. An IT administrator running Mimikatz during an authorized penetration test triggers the credential access rule. The rule is correct — Mimikatz ran. The activity is not malicious — it was planned. Document the authorization reference, close, and consider a testing-window exclusion.

Indeterminate: the triage responder cannot classify with available data. The sign-in came from an unfamiliar IP, and the user hasn't responded to verification. Or the process looks suspicious, but the file hash is unknown. The indeterminate classification is not a failure of triage; it's the acknowledgment that available data is insufficient. Treat it as a probable TP: preserve evidence, escalate with context, continue data collection.

At NE, approximately 20% of cloud identity alerts classify as indeterminate on first pass. Of those, 75% resolve to FP after user verification. The remaining 25% are confirmed TP, meaning the indeterminate workflow correctly identified genuine ambiguity that required additional data.
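The NE percentages above compound, and working them through shows how much real activity sits in the indeterminate bucket. A minimal sketch of that arithmetic, assuming a hypothetical volume of 1,000 cloud identity alerts (the volume and the function name are illustrative, not NE figures):

```python
def indeterminate_breakdown(total_alerts: int) -> dict[str, int]:
    """Apply the first-pass rates quoted in the text to an alert volume."""
    indeterminate = total_alerts * 20 // 100     # 20% classify as indeterminate
    resolved_fp = indeterminate * 75 // 100      # 75% of those resolve to FP
    confirmed_tp = indeterminate - resolved_fp   # the remaining 25% are confirmed TP
    return {
        "indeterminate": indeterminate,
        "resolved_fp": resolved_fp,
        "confirmed_tp": confirmed_tp,
    }


# ASSUMPTION: 1,000 alerts is a round number chosen for illustration only
breakdown = indeterminate_breakdown(1000)
print(breakdown)
```

At these rates, 5% of all cloud identity alerts are true positives that could not be confirmed on first pass. Those are the alerts a reflexive close would drop.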

The cost asymmetry that drives every decision

The triage decision is not symmetric. Getting it wrong in one direction has an entirely different cost than getting it wrong in the other.

A false negative — classifying a genuine incident as a false positive — is catastrophic. The alert was the detection system's one notification, and the SOC dismissed it. The attacker continues operating undetected. The next detection opportunity may not come until after the attacker has achieved their objective. At NE, if the CHAIN-HARVEST spray alert had been closed as "normal authentication noise," the BEC wire transfer would have succeeded with no interference.

A false escalation — classifying a benign alert as a true positive — wastes 4–20 analyst hours, may disrupt business operations if containment was executed prematurely, and erodes trust in the SOC's judgment. After multiple false escalations, the IR team becomes skeptical and response speed degrades because they assume the next escalation is another false alarm.

The asymmetry is stark: a missed breach costs weeks of investigation, regulatory fines, and potentially irreversible damage. A false escalation costs hours. This asymmetry is why the confidence threshold for closing as FP (>90%) is higher than the threshold for escalating as TP (>70%). When uncertain, escalate. The cost of a 5-hour wasted investigation is orders of magnitude lower than the cost of a missed breach.
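The asymmetry can be made concrete with a rough expected-cost comparison. The sketch below is a toy model, not a method from this course. It takes the 5-hour wasted investigation from the text and assumes a hypothetical 400 analyst-hours for a missed breach, then compares the expected cost of closing an alert against escalating it for a given probability that the alert is real.

```python
def expected_cost_of_closing(p_real: float, missed_breach_hours: float) -> float:
    """Closing only costs something if the alert was real (a false negative)."""
    return p_real * missed_breach_hours


def expected_cost_of_escalating(p_real: float, wasted_hours: float) -> float:
    """Escalating only wastes effort if the alert was benign (a false escalation)."""
    return (1.0 - p_real) * wasted_hours


WASTED_HOURS = 5.0           # from the text: a 5-hour wasted investigation
MISSED_BREACH_HOURS = 400.0  # ASSUMPTION: illustrative figure, not from the course

# An analyst who is 90% sure the alert is benign still leaves a 10% chance it is real
p_real = 0.10
close_cost = expected_cost_of_closing(p_real, MISSED_BREACH_HOURS)
escalate_cost = expected_cost_of_escalating(p_real, WASTED_HOURS)
print(f"close: {close_cost:.1f} h expected, escalate: {escalate_cost:.1f} h expected")
```

Under these assumed costs, closing at 90% confidence carries roughly nine times the expected cost of escalating. The model deliberately ignores the trust erosion and business disruption that false escalations cause, so treat it as an illustration of direction, not a way to derive the thresholds.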

KQL — Measure your FP-to-TP ratio
// What percentage of your incidents are closed as FP vs TP?
// High FP rate = detection rules need tuning
// Zero FP rate = analysts may be over-escalating
SecurityIncident
| where CreatedTime > ago(30d)
// SecurityIncident logs a new row on every incident update;
// keep only the latest row per incident so each one is counted once
| summarize arg_max(LastModifiedTime, *) by IncidentNumber
| where Status == "Closed"
| summarize
    TruePositives = countif(Classification == "TruePositive"),
    FalsePositives = countif(Classification == "FalsePositive"),
    BenignPositives = countif(Classification == "BenignPositive"),
    Undetermined = countif(Classification == "Undetermined"),
    Total = count()
// Rates are percentages of all closed incidents in the 30-day window
| extend FPRate = round(FalsePositives * 100.0 / Total, 1)
| extend TPRate = round(TruePositives * 100.0 / Total, 1)

The healthy range for a mature SOC is 15–30% TP rate. Below 15% means your detection rules produce too much noise — analysts spend more time closing false positives than investigating real incidents. Above 40% means either your environment is under sustained attack (unlikely over 30 days) or your analysts are only escalating alerts they're absolutely certain about — which means ambiguous alerts are being closed as FP, and some of those are missed breaches.
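The bands above can be applied directly to the `TruePositives` and `Total` columns the query returns. A minimal sketch, with the function name and return strings as illustrative assumptions. The text does not describe the 30–40% band, so the sketch reports it as uncharacterized instead of guessing.

```python
def assess_tp_rate(true_positives: int, total_closed: int) -> str:
    """Place a 30-day TP rate into the bands described in the text."""
    if total_closed <= 0:
        return "no closed incidents to assess"
    rate = 100.0 * true_positives / total_closed
    if rate < 15:
        return "below range: detection rules produce too much noise"
    if rate <= 30:
        return "healthy range"
    if rate <= 40:
        # ASSUMPTION: the text leaves 30-40% undefined, so it is flagged, not judged
        return "above healthy range: band not characterized in the text"
    return "high: sustained attack, or ambiguous alerts are being closed as FP"


# Hypothetical query output: 22 true positives out of 100 closed incidents
print(assess_tp_rate(22, 100))
```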

The confidence threshold

Triage does not require certainty. It requires sufficient confidence to act. Four levels guide the classification.

Confirmed (>95% confidence). The evidence leaves no reasonable doubt. The KQL output shows j.morrison authenticated from a Tor exit node 3 seconds after the legitimate user authenticated from the Bristol office. Two simultaneous sessions from two countries indicate a compromised account. Classification: TP, no ambiguity.

Probable (70–95% confidence). The evidence strongly suggests malicious activity, but an alternative explanation exists. A user's account shows a sign-in from an unfamiliar foreign IP at 03:00. The user hasn't confirmed whether they were traveling. Classification: probable TP. Begin evidence preservation and escalate — don't wait for user confirmation, because by the time they reply the attacker may have achieved their objective.

Possible (40–70% confidence). The evidence is ambiguous. A new OAuth application was granted permissions, the publisher is unverified, but the application name matches a legitimate business tool the user may have been evaluating. Classification: indeterminate. Treat as probable TP: investigate the application, verify with the user, and preserve the consent logs.

Suspected (<40% confidence). A weak signal that may be noise. Elevated failed sign-ins from a single IP, but the volume is below the spray threshold and the failures target users with similar usernames — likely auto-complete errors. Probably FP. But document the observation, check the IP against threat intelligence, and monitor for escalation in the next 24 hours. Weak signals that recur become investigations.

The scorecard in Section 0.8 formalizes this confidence assessment. Each of the 8 questions contributes to the confidence score. The threshold for escalation is "probable": anything at 70% or above triggers the preserve-and-contain sequence. Between 40% and 70%, the alert is indeterminate: preserve evidence, escalate with context, and continue data collection. Below 40%, you close with documented reasoning and a 24-hour review flag.
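The four levels can be read as a lookup from a confidence score to a response. The sketch below is one way to encode them, not the Section 0.8 scorecard itself. The function name is hypothetical, and the handling of exact boundary values (95 and 40) is an assumption, since the text only states that 70% or above counts as probable.

```python
def triage_action(confidence_pct: float) -> tuple[str, str]:
    """Map a confidence score (0-100) to the level and response described in the text."""
    if not 0 <= confidence_pct <= 100:
        raise ValueError("confidence_pct must be between 0 and 100")
    if confidence_pct > 95:
        return ("Confirmed", "classify as TP; preserve evidence and contain")
    if confidence_pct >= 70:
        # The text puts the escalation threshold at 70% or above
        return ("Probable", "probable TP; preserve evidence and escalate now")
    if confidence_pct >= 40:
        # ASSUMPTION: exactly 40% is treated as Possible
        return ("Possible", "indeterminate; treat as probable TP and verify")
    return ("Suspected", "probably FP; document, check threat intel, review in 24 hours")


for score in (97, 82, 55, 20):
    level, action = triage_action(score)
    print(f"{score}% -> {level}: {action}")
```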

Why the decision is harder in practice

Three factors make triage decisions more difficult than the framework suggests.

Alert fatigue. An analyst who triages 200 alerts per shift develops pattern-matching reflexes. "Impossible travel for j.morrison" becomes "close: this fires every Monday." The 200th impossible travel alert, the one that's genuinely an attacker replaying a stolen token, gets the same reflexive closure as the previous 199. Alert fatigue doesn't make analysts lazy. It makes them efficient with a dangerous side effect: the efficiency that handles 195 false positives correctly also handles 5 true positives incorrectly.

The triage scorecard in Section 0.8 counteracts this by requiring data-level verification for every alert above a severity threshold, preventing reflexive closure. "Is the source IP in the trusted IP list?" is a data check. "This alert always fires for this user" is a pattern assumption. The scorecard forces the former.

Ambiguous data. A sign-in from a hosting provider IP could be an attacker using a VPS, a user on a cloud VPN, or a legitimate application authenticating on the user's behalf. The same data point supports three conclusions. The triage responder must gather additional data — does the IP appear in the user's 30-day sign-in history? Does the user-agent match known devices? Was the authentication interactive or non-interactive? Each additional data point narrows the possibilities.

The scorecard structures this data gathering into a repeatable sequence, so the analyst checks the same evidence sources in the same order for every alert of the same type.
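The three questions for an ambiguous sign-in are plain data checks, which makes them easy to run the same way every time. A minimal sketch, assuming the caller has already pulled the user's previous 30 days of sign-ins. The `SignIn` record and its field names are hypothetical and do not mirror any particular log schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignIn:
    ip: str
    user_agent: str
    interactive: bool


def ambiguity_checks(alert: SignIn, history: list[SignIn]) -> dict[str, bool]:
    """Answer the three data questions from the text for one alerting sign-in.

    `history` is assumed to hold the user's sign-ins from the previous 30 days.
    """
    return {
        "ip_in_30_day_history": any(h.ip == alert.ip for h in history),
        "user_agent_known": any(h.user_agent == alert.user_agent for h in history),
        "interactive": alert.interactive,
    }


# Hypothetical records using documentation-reserved IP ranges
history = [
    SignIn("198.51.100.10", "Edge/Windows", True),
    SignIn("198.51.100.10", "Outlook/iOS", True),
]
alert = SignIn("203.0.113.77", "python-requests", False)
result = ambiguity_checks(alert, history)
print(result)
```

No single answer classifies the alert, but each one narrows the set of explanations that still fit, and recording the answers gives the triage entry its evidence.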

Time pressure. The 15-minute target creates urgency. The analyst knows that every minute deliberating is a minute the attacker (if real) uses to advance. This pressure pushes toward quick decisions, and quick decisions based on incomplete data trend toward the analyst's default bias, whether that's "probably fine" or "probably bad." Individual bias varies by personality and experience: analysts who have been burned by a missed breach tend to over-escalate, while analysts who have faced criticism for false escalations tend to under-escalate. Neither default is correct for every alert.

The triage scorecard replaces individual bias with a structured sequence of 8 checks that completes in 10–15 minutes regardless of alert type. The structure produces consistent outcomes regardless of which analyst runs it.

Closing indeterminate alerts as "probable FP — monitoring"

The most dangerous triage outcome is not a confident wrong answer — it's the non-answer. "Probable FP — will monitor" closes the alert, removes it from the queue, and relies on the analyst remembering to check back. They won't. The queue refills, the shift changes, and the "monitoring" alert disappears into the noise. If you can't classify an alert within 15 minutes, escalate it as indeterminate with the evidence you've gathered. Escalation is a decision. "Monitoring" is the absence of one.

Documenting the decision

Every triage decision requires documentation — not just the classification, but the reasoning. A triage entry that says "FP — closed" provides no value. A triage entry that records the specific evidence checked, the data points that supported the classification, and the alternative explanations considered is defensible, reviewable, and instructive for the team.

Worked Example — Triage Classification Entry

Alert: DE4-002 — AiTM Token Replay Detected

Time: 2026-02-27 08:14 UTC

Entity: j.morrison@northgate-eng.co.uk

Source IP: 185.220.101.42 (Tor exit node, Romania)

Classification: TRUE POSITIVE — Confirmed (>95%)

Evidence: (1) IP is known Tor exit node. (2) IP not in j.morrison's 30-day history. (3) Simultaneous session from Bristol office IP — two countries, one user. (4) Auth method: token replay, not interactive. (5) No change tickets or authorized testing.

Actions: Session revocation at 08:19. SigninLogs + AuditLogs snapshot saved. IR notified 08:22. Triage report delivered 08:26.

This documentation serves three purposes. First, it protects the analyst — if a classification is later questioned, the documented reasoning shows what evidence was available and what process was followed. A triage entry that records "checked 30-day sign-in history, no prior occurrence of this IP, user-agent mismatch from known devices" is defensible even if the classification later turns out to be wrong. A triage entry that records "FP — closed" is not defensible regardless of whether the classification was correct.

Second, it enables quality review. The triage quality process in TR13 examines closed alerts to identify misclassification patterns. Without documented reasoning, you can measure only that an analyst closed 180 alerts last week. With documented reasoning, you can measure that 12 of those 180 closures skipped the IP verification step, and that 3 of those 12 were subsequently reclassified as true positives during retrospective review.

Third, it trains the team. New analysts learn triage judgment by reading the documented reasoning of experienced analysts. The documented entry for j.morrison above teaches a pattern: Tor exit node + simultaneous legitimate session + non-interactive authentication = confirmed AiTM session hijack. That pattern is transferable to every future AiTM alert the analyst encounters. Without the documentation, the new analyst learns only that "j.morrison was a TP" — useful as a data point, useless as a training artifact.
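A structured record makes the difference between a defensible entry and "FP, closed" something a review process can check mechanically. The sketch below is illustrative only. The `TriageEntry` class and its rule that an entry needs at least one recorded evidence item are assumptions, not a format the course prescribes.

```python
from dataclasses import dataclass, field


@dataclass
class TriageEntry:
    alert: str
    entity: str
    classification: str
    evidence: list[str] = field(default_factory=list)
    actions: list[str] = field(default_factory=list)

    def is_defensible(self) -> bool:
        """ASSUMPTION: an entry is reviewable only if it records some evidence."""
        return any(item.strip() for item in self.evidence)


# The worked example above, expressed as a record
worked_example = TriageEntry(
    alert="DE4-002 AiTM Token Replay Detected",
    entity="j.morrison@northgate-eng.co.uk",
    classification="TRUE POSITIVE, Confirmed (>95%)",
    evidence=[
        "IP is known Tor exit node",
        "IP not in j.morrison's 30-day history",
        "Simultaneous session from Bristol office IP",
        "Auth method: token replay, not interactive",
        "No change tickets or authorized testing",
    ],
    actions=["Session revocation 08:19", "IR notified 08:22", "Triage report 08:26"],
)

# The entry the text warns against: a classification with no reasoning
bare_closure = TriageEntry(alert="(any)", entity="(any)", classification="FP, closed")

print(worked_example.is_defensible(), bare_closure.is_defensible())
```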

Investigation Principle

When uncertain, escalate. The cost asymmetry between a missed breach and a false escalation is not close. A false escalation wastes hours. A missed breach costs weeks, regulatory fines, and damage that may be permanent. The indeterminate classification exists to formalize this: if the data is insufficient to close with confidence, preserve evidence and escalate with what you have.

Next

Section 0.3 applies this decision framework across three environments — M365/cloud, Windows endpoints, and Linux infrastructure. The classification logic is the same. The evidence locations, the queries you run, and the containment actions you take are different for each. You'll see why single-environment triage produces incomplete assessments and how a unified methodology handles cross-environment incidents.
