TR0.2 The Triage Decision
Figure TR0.2 — Four triage outcomes. Every alert resolves to one of these four classifications. The indeterminate category is the trap — it must be treated as a probable true positive until evidence proves otherwise.
The four outcomes
Every alert resolves to one of four classifications. The triage responder’s job is to determine which one — and to do so within 15 minutes.
True positive (TP). The alert detected genuine malicious activity. An attacker is present in the environment, and the alert correctly identified one of their actions. The CHAIN-HARVEST spray alert at NE was a true positive: 24 accounts were sprayed, one succeeded, and the attacker was active in the environment within minutes. The triage response: preserve volatile evidence, execute containment, produce the triage report, and hand off to the investigation team.
False positive (FP). The alert fired on legitimate activity that matched the detection rule’s criteria. A sign-in from a hosting provider IP triggered the “suspicious IP” detection, but the user was connected to a legitimate VPN service. The activity is not malicious — the rule needs tuning. The triage response: document the specific reason this is a false positive (user confirmed VPN use, IP belongs to known provider), close the alert, and submit a tuning request to the detection engineering team.
Benign true positive (BTP). The alert correctly detected the activity described in the detection rule, and the activity genuinely occurred, but it was authorised. An IT administrator running Mimikatz during a penetration test triggers the credential access detection rule. The rule is correct — Mimikatz was executed. The activity is not malicious — it was authorised. The triage response: document that the activity was authorised (reference the change ticket or penetration test authorisation), close the alert, and consider whether an exclusion should be added for future authorised testing windows.
Indeterminate. The triage responder cannot classify the alert with the available information. The sign-in came from an unfamiliar IP, but the user has not responded to the verification request. The process on the Windows endpoint looks suspicious, but the file hash is not in any threat intelligence database. The SSH session originated from an internal IP that could be a legitimate jump host or a compromised workstation. The triage response: treat this as a probable true positive. Begin evidence preservation, escalate with the information gathered so far, and continue data collection. Indeterminate alerts that default to “close — probably fine” are how breaches are missed.
The indeterminate classification workflow. Indeterminate alerts (scorecard 8-14 but evidence is ambiguous) require a specific workflow that differs from both FP closure and TP escalation. The indeterminate workflow: (1) preserve all available evidence — treat the alert as a probable TP for evidence purposes, (2) initiate user verification — contact the user or their manager via phone to verify the anomalous activity, (3) monitor the account for 24-48 hours — add the user to a temporary Sentinel watchlist that alerts on any new anomalous activity, (4) re-classify after verification — if the user confirms the activity was legitimate, reclassify as FP with verified documentation. If the user denies the activity, reclassify as confirmed TP and escalate. If the user cannot be reached within 24 hours, default to probable TP and escalate with the unresolved verification noted.
At NE, approximately 20% of cloud identity alerts classify as indeterminate on first triage. Of these, 75% resolve to FP after user verification (the user was travelling, using a new device, or testing a new application). The remaining 25% escalate to confirmed TP — meaning the indeterminate classification correctly identified genuine ambiguity that required additional data to resolve. The watchlist monitoring during the 24-48 hour window caught 2 incidents in Q1 2026 where the initial alert was ambiguous but the attacker’s SUBSEQUENT activity (inbox rule creation, OAuth consent) was definitive.
The cost of each error type
The triage decision is not symmetric. Getting it wrong in one direction has a different cost than getting it wrong in the other.
False negative (missed true positive) — the missed breach. The analyst classifies a genuine incident as a false positive and closes the alert. The attacker continues operating undetected. At NE, if the CHAIN-HARVEST spray alert had been closed as “normal authentication noise,” the attacker would have had unlimited time to complete the BEC attack chain without interference. The cost: data breach, financial fraud, regulatory notification, reputational damage, and an IR engagement that starts days or weeks later with degraded evidence.
False negatives are catastrophic because they are invisible. The alert was the detection system’s one chance to notify the SOC, and the SOC dismissed it. The next detection opportunity may not come until the attacker triggers a different rule — which may happen after they have already achieved their objective.
False positive escalation (unnecessary IR mobilisation). The analyst classifies a benign alert as a true positive and mobilises the IR team. The investigation finds no compromise. The cost: wasted analyst hours (typically 4-20 hours for a full investigation), disrupted business operations (if containment was executed — accounts disabled, systems isolated), and eroded trust in the SOC’s judgment. After multiple false escalations, the IR team becomes sceptical of the SOC’s triage accuracy, and response speed degrades because the team assumes the next escalation is another false alarm.
The asymmetry. The cost of a missed breach vastly exceeds the cost of a false escalation. A false escalation wastes hours. A missed breach costs weeks of investigation, potential regulatory fines, and reputational damage that may be permanent. This asymmetry drives the fundamental triage principle: when uncertain, treat as a probable true positive. The indeterminate category exists specifically for this purpose — it is not a failure of triage. It is the acknowledgment that the available data is insufficient for a definitive classification, and the correct action is to preserve, escalate, and continue investigating rather than close and hope.
The confidence threshold
Triage does not require certainty. It requires sufficient confidence to act. The confidence scale:
Confirmed (>95% confidence). The evidence leaves no reasonable doubt. The KQL query shows j.morrison authenticated from a Tor exit node 3 seconds after the legitimate user authenticated from the Bristol office. This cannot be a VPN — two simultaneous sessions from two countries is a compromised account. Classification: TP. No ambiguity.
Probable (70-95% confidence). The evidence strongly suggests malicious activity, but an alternative explanation exists. A user’s account shows a sign-in from a Nigerian IP at 03:00, but the user has not confirmed whether they were travelling. Probable TP. Begin evidence preservation and escalate. Do not wait for user confirmation — by the time the user responds to the email, the attacker may have achieved their objective.
Possible (40-70% confidence). The evidence is ambiguous. A new OAuth application was granted permissions by a user, and the application publisher is unverified, but the application name matches a legitimate business tool the user may have been evaluating. Classification: indeterminate. Treat as probable TP. Investigate the application, check with the user, preserve the consent logs.
Suspected (<40% confidence). A weak signal that may be noise. Elevated failed sign-in count from a single IP, but the volume (8 failures in 1 hour) is below the spray detection threshold and the failures are spread across 3 users with similar usernames (auto-complete errors). Likely FP. But document the observation, check the IP against threat intelligence, and monitor for escalation in the next 24 hours. Do not ignore — log it as a weak signal and move on.
The triage scorecard in TR0.6 formalises this confidence assessment. Each of the 8 questions contributes to the confidence score. The threshold for escalation is “probable” — anything at 70% or above triggers the preserve-and-contain sequence.
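The confidence bands and the 70% escalation threshold can be expressed as a small decision function. This is a minimal illustrative sketch, assuming a 0.0-1.0 confidence score from the scorecard; the function names are not part of any NE tooling.

```python
def classify_confidence(score: float) -> str:
    """Map a triage confidence score (0.0-1.0) to the bands above.

    Thresholds mirror the scale in this subsection; the API is
    illustrative, not a real scorecard implementation.
    """
    if score > 0.95:
        return "confirmed"   # definitive TP, no ambiguity
    if score >= 0.70:
        return "probable"    # escalate: preserve-and-contain
    if score >= 0.40:
        return "possible"    # indeterminate: treat as probable TP
    return "suspected"       # likely FP: log as a weak signal


def should_escalate(score: float) -> bool:
    """The escalation threshold is 'probable' — 70% or above."""
    return score >= 0.70
```

Note that a "possible" classification still triggers the indeterminate workflow, not a closure — only "suspected" scores drop to log-and-monitor.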
Why the decision is harder in practice
Three factors make triage decisions more difficult than the framework suggests.
Alert fatigue. An analyst who triages 200 alerts per shift develops pattern-matching reflexes. “Impossible travel alert for j.morrison” becomes “close — j.morrison uses a VPN, this fires every week.” The 201st impossible travel alert for j.morrison — the one that is genuinely an attacker replaying a stolen token from Romania — gets the same reflexive closure. Alert fatigue does not make analysts lazy. It makes them efficient with a dangerous side effect: the efficiency that handles 195 false positives correctly also handles 5 true positives incorrectly.
The countermeasure: the triage scorecard forces the analyst to answer 8 specific questions for every alert classified above a minimum severity threshold. The questions prevent reflexive closure because they require data-level verification, not alert-level pattern matching. “Is the source IP in the trusted IP list?” is a data check. “This alert always fires for this user” is a pattern assumption.
Ambiguous data. Real security data is messy. A sign-in from an IP address that resolves to a hosting provider could be an attacker using a VPS, a user connected to a cloud-based VPN, or a legitimate cloud application authenticating on the user’s behalf. The same data point supports three different conclusions. The triage responder must gather additional data points until the balance tips in one direction. The triage scorecard provides the specific additional checks: does the IP appear in the user’s 30-day history? Does the user-agent match the user’s known devices? Was the authentication interactive (human) or non-interactive (application)?
Time pressure. The 15-minute triage target creates urgency. The analyst knows that every minute spent deliberating is a minute the attacker (if this is real) uses to progress. This pressure pushes toward quick decisions — and quick decisions based on incomplete data trend toward the analyst’s default bias (either “probably fine” or “probably bad,” depending on the analyst’s experience and risk tolerance). The triage scorecard mitigates this by providing a structured sequence of 8 checks that completes in 10-15 minutes regardless of the alert type. The structure replaces the analyst’s individual bias with a consistent, repeatable process.
The triage report that captures the decision
Every triage decision must be documented — not just the classification, but the reasoning. A triage report that says “FP — closed” provides no value. A triage report that says “FP — source IP 198.51.100.55 belongs to NordVPN exit node, user j.morrison confirmed VPN usage via Slack message at 09:14, IP appears in user’s 30-day sign-in history 12 times” provides complete accountability. If the classification is later proven wrong (the IP was not a VPN — it was an attacker using the same VPN provider), the documented reasoning shows what the analyst checked and what data was available at the time of the decision.
This documentation discipline serves three purposes. First, it protects the analyst: if a missed incident is traced back to a closed alert, the documented reasoning shows whether the analyst followed the triage process correctly with the data available (defensible) or skipped steps (not defensible). Second, it enables retrospective analysis: the triage quality review in TR14 examines closed alerts to identify patterns in misclassification. Third, it trains the team: new analysts learn triage judgment by reading the documented reasoning of experienced analysts, understanding WHY a specific combination of data points led to a specific classification.
Worked artifact: Triage classification entry
Alert: DE4-002 — AiTM Token Replay Detected
Time: 2026-02-27 08:14 UTC
Entity: j.morrison@northgateeng.com
Source IP: 185.220.101.42 (Tor exit node, geo: Romania)
Classification: TRUE POSITIVE — Confirmed
Confidence: >95%
Reasoning: (1) IP is a known Tor exit node — not a VPN or hosting provider. (2) IP does NOT appear in j.morrison’s 30-day sign-in history. (3) j.morrison’s legitimate session is active from 198.51.100.10 (Bristol office egress) at the same timestamp — two simultaneous sessions from two countries. (4) Authentication method: token replay (not interactive) — consistent with AiTM session hijack. (5) No change tickets or authorised testing in progress.
Immediate actions: Session revocation initiated at 08:19.
Evidence preservation: SigninLogs and AuditLogs snapshot for j.morrison (last 48h) saved to IR case folder.
Escalation: IR team notified at 08:22. Severity: HIGH. Triage report delivered at 08:26.
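A classification entry like this can be captured as structured data, which makes the "no reasoning, no value" rule enforceable. This is an illustrative sketch only — the field names mirror the worked artifact above and are not a real NE or Sentinel schema.

```python
from dataclasses import dataclass, field


@dataclass
class TriageReport:
    # Field names follow the worked artifact; schema is hypothetical.
    alert_id: str
    entity: str
    source_ip: str
    classification: str            # TP / FP / BTP / Indeterminate
    confidence: str                # confirmed / probable / possible / suspected
    reasoning: list[str] = field(default_factory=list)
    immediate_actions: list[str] = field(default_factory=list)

    def is_defensible(self) -> bool:
        """A closure with no documented reasoning provides no value."""
        return len(self.reasoning) > 0


report = TriageReport(
    alert_id="DE4-002",
    entity="j.morrison@northgateeng.com",
    source_ip="185.220.101.42",
    classification="TP",
    confidence="confirmed",
    reasoning=[
        "IP is a known Tor exit node",
        "IP absent from 30-day sign-in history",
        "Simultaneous legitimate session from Bristol office",
    ],
)
```

A report object with an empty reasoning list is the structured equivalent of "FP — closed": it should fail review before it can be filed.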
Try it: classify these three alerts
Alert A: Impossible travel — user s.patel authenticated from London at 09:00 and from Singapore at 09:05. s.patel is a software developer based in London. No VPN usage documented. No travel authorisation in the system.
Alert B: Suspicious inbox rule created — user m.chen created a rule forwarding emails containing “invoice” or “payment” to an external email address (gmail.com). m.chen works in the finance department.
Alert C: Anomalous process — svchost.exe spawned powershell.exe on SRV-NGE-BRS005 at 02:14. The PowerShell command line contains a Base64-encoded string.
For each alert, determine: TP, FP, BTP, or Indeterminate? What additional data would you check? What is your confidence level? What immediate action would you take?
Alert A is almost certainly a TP: 5-minute impossible travel between London and Singapore with no VPN and no travel authorisation. Confidence: >90%. Begin session revocation and evidence preservation immediately.
Alert B is a probable TP: inbox rule forwarding financial keywords to an external address is a textbook BEC persistence indicator. However, check whether m.chen regularly forwards emails to a personal account for work-from-home purposes (BTP if authorised). Confidence: 75-85%. Preserve mailbox audit logs and check the rule creation timestamp against recent sign-in anomalies.
Alert C is a confirmed TP: svchost.exe spawning PowerShell with Base64-encoded commands at 02:14 on a server is a classic fileless attack indicator. No legitimate Windows service operation encodes PowerShell commands in Base64 at 2 AM. Confidence: >95%. Network-isolate the server via Defender for Endpoint, capture memory before the process terminates, and escalate immediately.
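The impossible-travel reasoning in Alert A can be made quantitative: compute the great-circle distance between the two sign-in locations and the speed required to cover it in the observed interval. A rough sketch, using approximate city coordinates (an assumption — real triage would use the geo data attached to the sign-in events):

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def required_speed_kmh(distance_km: float, minutes: float) -> float:
    """Speed needed to cover the distance in the observed interval."""
    return distance_km / (minutes / 60)


# Alert A: London (51.51, -0.13) to Singapore (1.35, 103.82) in 5 minutes.
distance = great_circle_km(51.51, -0.13, 1.35, 103.82)
speed = required_speed_kmh(distance, 5)
# An airliner cruises at roughly 900 km/h; a required speed orders of
# magnitude above that means one of the sign-ins did not come from where
# it claims — VPN, proxy, or token replay.
```

The speed check does not classify the alert on its own (a VPN produces the same pattern), but it converts "impossible travel" from an intuition into a number the triage report can cite.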
IOC-driven, IOA-driven, and TTP-driven triage
Three methodologies for approaching triage — the triage responder should understand all three because different alert types benefit from different approaches:
IOC-driven triage (Indicator of Compromise). The alert provides a specific, observable indicator: a known malicious IP address, a file hash matching malware, a domain name on a threat intelligence blocklist. The triage process: check whether the IOC is present in the environment (search logs for the IP, scan endpoints for the hash, query DNS for the domain). IOC-driven triage is FAST but BRITTLE — the attacker changes the IP, recompiles the binary (new hash), or registers a new domain, and the IOC is no longer detectable. IOC-driven triage answers: “Is this specific known-bad indicator present?”
IOA-driven triage (Indicator of Attack). The alert describes a behaviour pattern rather than a specific artifact: “a process spawned from a web server,” “PowerShell downloaded a file from an external URL,” “an account authenticated from two countries within 30 minutes.” The triage process: assess whether the behaviour is legitimate in context (does the web server normally spawn shell processes? Does this user normally travel between countries?). IOA-driven triage is SLOWER than IOC-driven but MORE RESILIENT — the attacker can change their tools and infrastructure but the behavioural pattern remains consistent. The triage scorecard from TR0.6 is fundamentally an IOA assessment tool: it evaluates behaviours (anomalous location, unusual timing, unexpected access pattern) rather than specific indicators. IOA-driven triage answers: “Does this behaviour indicate an attack regardless of the specific tools used?”
TTP-driven triage (Tactics, Techniques, and Procedures). The alert is mapped to a MITRE ATT&CK technique, and the triage process assesses whether the observed technique is part of a known attack pattern. For example: “T1566.001 Spearphishing Attachment detected” triggers a triage that checks for the DOWNSTREAM techniques an attacker would use after successful phishing (T1078 Valid Accounts for credential use, T1098 Account Manipulation for persistence, T1114 Email Collection for BEC preparation). TTP-driven triage follows the attacker’s EXPECTED path through the kill chain — checking not just the triggering alert but the techniques that typically follow it. TTP-driven triage answers: “If this technique succeeded, what would the attacker do next — and is there evidence of those next steps?”
At NE, Rachel trains analysts to apply all three approaches in sequence: IOC check first (30 seconds — is this IP/hash/domain known-bad?), IOA assessment second (2-3 minutes — is this behaviour anomalous in context?), TTP chain check third (3-5 minutes — if this technique succeeded, what downstream evidence should I look for?). The combined approach maximises both speed (IOC catches known threats instantly) and depth (TTP catches novel threats that IOC misses).
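The IOC-then-IOA-then-TTP sequence can be sketched as a pipeline that accumulates findings from each stage. All parameter names and data shapes below are illustrative — a real implementation would query threat intelligence feeds and Sentinel tables rather than in-memory sets.

```python
def triage(alert, ioc_blocklist, baseline_ips, downstream_evidence):
    """Apply the three approaches in sequence and collect findings.

    alert: dict with 'source_ip' and optional 'downstream_techniques'.
    Shapes are hypothetical, for illustration only.
    """
    findings = []

    # 1. IOC check (~30 seconds): is the indicator known-bad?
    if alert["source_ip"] in ioc_blocklist:
        findings.append("IOC: source IP on blocklist")

    # 2. IOA assessment (~2-3 minutes): is the behaviour anomalous in context?
    if alert["source_ip"] not in baseline_ips:
        findings.append("IOA: IP absent from 30-day baseline")

    # 3. TTP chain check (~3-5 minutes): evidence of the expected next techniques?
    for technique in alert.get("downstream_techniques", []):
        if technique in downstream_evidence:
            findings.append(f"TTP: downstream evidence of {technique}")

    return findings


findings = triage(
    {"source_ip": "185.220.101.42", "downstream_techniques": ["T1114"]},
    ioc_blocklist={"185.220.101.42"},
    baseline_ips={"198.51.100.10"},
    downstream_evidence={"T1114"},
)
# All three stages fire here: blocklisted IP, off-baseline IP, and
# downstream evidence of T1114 Email Collection.
```

The ordering matters: the cheap IOC check runs first so known threats short-circuit quickly, while the slower behavioural and chain checks catch what the blocklist misses.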
The false positive cost model
False positives are not free. Every FP consumes triage time that could be spent on genuine threats. The FP cost model quantifies this impact and justifies tuning investments.
The direct cost: At NE, Rachel’s average triage time per FP is 8 minutes (read alert context, run initial queries, classify as FP, document, close). With 12-15 FPs per day, the daily FP cost is 96-120 minutes of analyst time — approximately 2 hours per shift. Over a month, FP triage consumes 40-50 hours of analyst time, roughly equivalent to one analyst-week.
The indirect cost: FP fatigue degrades the analyst’s attention to genuine alerts. After classifying 10 consecutive FPs, the analyst’s cognitive engagement decreases — they develop a pattern of “probably another FP” that biases them toward closing the 11th alert quickly. If the 11th alert is a TP that superficially resembles the previous 10 FPs, the analyst may misclassify it. This is the alert fatigue problem that the structured scorecard from TR0.6 addresses: the scorecard forces the analyst to score every alert against the same 8 questions, regardless of how many FPs preceded it.
The opportunity cost: The 2 hours per shift consumed by FP triage is 2 hours NOT spent on threat hunting, detection tuning, or proactive security improvements. At NE, Rachel estimated that reducing the FP rate by 30% (achievable through 3 months of systematic detection tuning) would free 30-40 minutes per shift — enough time for one proactive threat hunt per day. The detection tuning feedback loop is covered in Detection Engineering, but the triage responder initiates the loop by documenting FP patterns in the incident comments.
The SOC-wide calculation: If your SOC processes 50 alerts per day with a 60% FP rate, 30 alerts are FPs consuming 240 minutes (4 hours) of analyst time. If tuning reduces the FP rate to 30%, only 15 alerts are FPs consuming 120 minutes (2 hours). The 2-hour saving per day equals 40 hours per month — the equivalent of adding half an analyst to the team without hiring anyone. This calculation is the business case for detection tuning, and the triage responder produces the data that drives it by consistently documenting FP classifications with the conditions that caused the FP.
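The SOC-wide calculation above reduces to simple arithmetic, sketched below. The 8-minute per-FP figure and the 20 working days per month are assumptions taken from this subsection's worked numbers.

```python
def monthly_fp_hours(alerts_per_day: int, fp_rate: float,
                     minutes_per_fp: float = 8, working_days: int = 20) -> float:
    """Analyst hours consumed by FP triage per month.

    Defaults mirror the NE figures in this subsection (8 min per FP,
    20 working days); this is an illustrative model, not a standard.
    """
    fps_per_day = alerts_per_day * fp_rate
    return fps_per_day * minutes_per_fp * working_days / 60


before = monthly_fp_hours(50, 0.60)  # 30 FPs/day -> 240 min/day -> 80 h/month
after = monthly_fp_hours(50, 0.30)   # 15 FPs/day -> 120 min/day -> 40 h/month
saving = before - after              # 40 hours/month freed by tuning
```

Plugging in your own SOC's alert volume and FP rate turns this into the business case the subsection describes: the saving is expressed in analyst-hours, which management can compare directly against hiring.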
The Benign True Positive in practice
The BTP classification deserves special attention because it is the most nuanced of the four classifications. A BTP occurs when the alert correctly detected the activity (the detection worked), but the activity is authorised or expected (no threat). Examples from the NE environment:
IT admin lateral movement: Phil’s IT team uses RDP to manage servers daily. Defender for Identity detects this as lateral movement — and it IS lateral movement, from a technical definition. But it is authorised lateral movement by a known admin. Classification: BTP. The detection is correct (RDP from a workstation to a server IS lateral movement). The activity is authorised (Phil’s team has approval to manage servers via RDP). The triage action: close with documentation. The detection engineering action: add Phil’s admin workstations to the lateral movement watchlist exception, or create a suppression for the specific rule + source workstation combination.
Scheduled vulnerability scan: NE’s quarterly vulnerability scan triggers multiple Defender for Endpoint alerts: port scanning, brute force attempts against test accounts, exploitation attempts against known CVEs. Every alert is technically correct — the scanner IS performing these activities. But the scanner is authorised via a change ticket with scheduled dates. Classification: BTP for all alerts during the scan window. Triage action: verify the scan is authorised (check the change ticket), close all related alerts with the ticket reference. The BTP classification acknowledges that the detection works (the SOC would want to know if an UNAUTHORISED entity performed the same scan) while avoiding a TP escalation for authorised activity.
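The scan-window verification step above is mechanical enough to sketch: given the alert timestamp and the authorised change windows, check whether the activity falls inside one. The data shapes and the ticket number are hypothetical — a real check would query the change-ticket system.

```python
from datetime import datetime


def in_authorised_window(alert_time: datetime, change_windows) -> bool:
    """True if the alert falls inside an authorised change window.

    change_windows: iterable of (start, end, ticket_ref) tuples —
    an illustrative shape, not a real change-management API.
    """
    return any(start <= alert_time <= end for start, end, _ticket in change_windows)


# Hypothetical quarterly scan window, authorised under ticket CHG-1042.
windows = [(datetime(2026, 2, 27, 1, 0), datetime(2026, 2, 27, 6, 0), "CHG-1042")]

# An alert at 02:14 during the window is a candidate BTP: the detection
# is correct, but the activity is authorised. The analyst still verifies
# the ticket and closes with the ticket reference in the documentation.
in_authorised_window(datetime(2026, 2, 27, 2, 14), windows)  # True
```

The check is a candidate filter, not an auto-close: the source of the activity must also match the authorised scanner, or an attacker operating during a scan window would ride the exclusion.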
The myth: SOAR playbooks and AI-driven triage will eliminate the need for human classification decisions. Automation handles the volume; the analyst just reviews the automation’s output.
The reality: Automation handles predictable alert patterns with high accuracy. Alerts that match known-benign signatures (trusted IPs, scheduled maintenance, authorised testing) can be auto-closed with documented reasoning. But the alerts that matter — the ambiguous ones, the novel ones, the ones where the attacker deliberately mimics legitimate activity — require human judgment. The AiTM attack at NE used a Tor exit node, which automated triage would flag. But a sophisticated attacker using a residential proxy in the same city as the user produces a sign-in that no automated rule can classify — only a human analyst who checks the device fingerprint, the authentication method, and the post-authentication behavior can determine whether it is legitimate. Automation reduces the volume of alerts requiring human triage. It does not eliminate the need for human triage on the alerts that matter most.
Troubleshooting
“I classified an alert as FP and later discovered it was a TP — what went wrong?” Review your triage documentation. Which of the 8 scorecard questions did you check? Was the data available to make the correct classification, or was critical data missing (log ingestion delay, incomplete threat intelligence, user unreachable for verification)? If the data was available and you missed it, the scorecard process needs reinforcement. If the data was unavailable, the issue is data pipeline or coverage, not triage methodology.
“My team disagrees on classifications — the same alert gets different answers from different analysts.” This indicates the triage criteria are ambiguous. Run a calibration exercise: present 10 historical alerts to all analysts independently, compare classifications, and discuss disagreements. The disagreements reveal where the scoring criteria need sharpening. NE runs this exercise monthly — Rachel presents 10 anonymised alerts, each analyst classifies independently, and the team discusses any split decisions.
“Management pressures us to close alerts faster, which increases our FP rate.” Present the cost model from this subsection. A 5% increase in FP closure rate (faster triage) that results in 1 missed true positive per quarter costs more than the analyst hours saved by faster closure. Frame triage accuracy as a business risk metric, not a productivity metric. The 15-minute triage target from this course is fast AND accurate — it achieves speed through structured methodology, not through skipping steps.