TR0.9 The Triage Scorecard

· Module 0 · Free
Operational Objective
The Intuition Problem: Experienced analysts triage by pattern recognition — they see an alert, match it against their mental library of previous alerts, and classify it in seconds. This works until the alert does not match a known pattern. Novel attacks, subtle indicators, and cross-environment incidents defeat pattern recognition because the analyst has no pattern to match. The triage scorecard replaces intuition with a structured sequence of 8 questions that produce a defensible classification for ANY alert — familiar or novel — within 15 minutes. The same 8 questions apply regardless of environment (cloud, Windows, Linux) or alert type (identity, endpoint, email, network). The data sources that answer each question differ per environment, but the decision logic is identical.
Deliverable: The complete triage scorecard with 8 questions, the scoring methodology, the classification thresholds, and the escalation triggers. This is the core operational artifact of the course.
Estimated completion: 30 minutes
THE 8-QUESTION TRIAGE SCORECARD

Q1: Evidence of compromise beyond the alert? YES = +3 points. Additional IOCs confirm the alert is not isolated noise.
Q2: Scope beyond a single entity? YES = +3 points. Multiple users/devices = higher severity, wider impact.
Q3: Active or historical? ACTIVE = +3 points. Attacker currently operating = preservation urgent.
Q4: Sensitive data at risk? YES = +2 points. PII, financial, IP, credentials = regulatory trigger.
Q5: Containment urgency? HIGH = +2 points. Active exfil or encryption = contain before preserve.
Q6: Business impact if true? HIGH = +2 points. Production system, executive, financial process.
Q7: Regulatory notification trigger? YES = +2 points. GDPR, NIS2, DORA = time-critical obligations.
Q8: Confidence in classification? LOW confidence + HIGH score = escalate immediately, do not close.

SCORE: 0-7 = Likely FP (close with docs) | 8-14 = Probable TP (preserve + escalate) | 15-20 = Confirmed TP (contain + preserve + IR)
Q8 override: LOW confidence on any score ≥ 8 = escalate regardless. Do not close uncertain alerts with high scores.

Figure TR0.9 — The 8-question triage scorecard. Each question contributes points. The total score maps to a classification: FP (close), probable TP (preserve and escalate), or confirmed TP (full triage response). Question 8 provides the confidence override — low confidence on a high-scoring alert always escalates.

The 8 questions

Each question targets a specific dimension of the alert. The questions are ordered by diagnostic value: the first three questions (compromise evidence, scope, active/historical) most strongly differentiate true positives from false positives. The remaining five questions refine the severity and urgency.

Q1: Is there evidence of compromise beyond the initial alert?

The most powerful triage question. A single alert in isolation could be noise. A single alert accompanied by corroborating evidence from a different data source is almost certainly a true positive. At NE, the CHAIN-HARVEST AiTM alert (DE4-002) was a single alert. But the 5-minute triage query revealed: the source IP was a Tor exit node, the same user had a simultaneous legitimate session from Bristol, and the authentication method was token replay rather than interactive. Three corroborating data points from SigninLogs — each independently suspicious, together definitive.

Cloud data sources: SigninLogs (additional anomalous sign-ins from the same IP or user), AuditLogs (configuration changes within 60 minutes of the alert), OfficeActivity (data access or email operations).

Windows data sources: DeviceProcessEvents (suspicious child processes, additional malicious executions), DeviceLogonEvents (lateral movement from the affected device), DeviceFileEvents (file writes to suspicious locations).

Linux data sources: auth.log (additional SSH sessions, privilege escalation), crontab (new scheduled tasks), systemd (new services), /tmp or /dev/shm (dropped files).

Score: YES (+3) / NO (0) / CANNOT DETERMINE (+1, and note the gap).

Q2: Is the scope beyond a single entity?

An alert affecting one user or one device is an incident. An alert pattern affecting 24 users is a campaign. The scope question changes the severity classification, the containment approach, and the urgency. The CHAIN-HARVEST spray targeted 24 accounts — scope beyond a single entity. The triage response included checking ALL 24 accounts for successful authentication, not just the one that triggered the alert.

Score: YES (+3, multiple entities affected) / NO (0, single entity) / UNKNOWN (+2, cannot determine scope yet).

Q3: Is the threat active or historical?

An active threat requires immediate containment — the attacker is currently operating in the environment. A historical threat (discovered after the fact) requires investigation but not emergency containment. The distinction determines the Triage Trinity sequence: active threats may require containment before preservation (stop the damage first). Historical threats allow preservation before containment (no urgency to stop something that already stopped).

Active indicators: current sessions visible in sign-in logs, processes running on endpoints, network connections to external IPs. Historical indicators: the suspicious activity ended hours or days ago, no current sessions from the suspicious IP, the malicious process is no longer running.

Score: ACTIVE (+3) / HISTORICAL (0) / UNCERTAIN (+2).

Q4: Is sensitive data at risk?

Data exposure changes the incident classification from “security event” to “potential data breach.” A compromised account that accessed a SharePoint library containing customer PII is a different severity than a compromised account that accessed a public marketing site. The data-at-risk question also determines regulatory obligations — personal data exposure triggers GDPR Article 33 notification (72 hours), financial data triggers PCI DSS obligations, and health data triggers HIPAA requirements.

Score: YES (+2, sensitive data confirmed at risk) / NO (0) / PROBABLE (+1).

Q5: How urgent is containment?

Some threats require containment within minutes: active ransomware encryption, ongoing data exfiltration, live BEC email about to be sent. Others allow the responder to complete classification and preservation before containing: dormant persistence mechanisms, historical compromise discovered during a review, policy violations without active exploitation.

Score: IMMEDIATE (+2, active damage occurring) / SOON (+1, damage likely but not yet occurring) / STANDARD (0, no time pressure beyond normal triage).

Q6: What is the business impact if this is a true positive?

A compromised test account in the developer tenant has different business impact than a compromised CFO account with access to wire transfer authorisation. The business impact question scales the response: low-impact true positives get standard triage and investigation. High-impact true positives get emergency escalation to the CISO, immediate containment of the affected business process, and priority investigation.

Score: HIGH (+2, executive, financial process, production system) / MEDIUM (+1, standard business user or system) / LOW (0, test or non-critical system).

Q7: Does this trigger a regulatory notification obligation?

If the alert involves personal data (GDPR), essential service disruption (NIS2), financial system compromise (DORA/PCI DSS), or health data exposure (HIPAA), the triage responder must flag this in the triage report. The responder does not make the notification — that is a legal and management decision. The responder identifies the trigger so that management can begin the notification assessment within the regulatory timeline.

Score: YES (+2, regulatory framework triggered) / NO (0) / UNCERTAIN (+1, flag for legal review).

Q8: How confident is your classification?

This is the override question. If your score is 8-14 (probable TP) but your confidence is low — you cannot verify the data, the user is unreachable, the evidence is ambiguous — do not close the alert. Escalate with documented uncertainty. The confidence override prevents the analyst from closing an alert they are unsure about simply because the score is borderline.

This question does not add points. It provides a decision gate: HIGH confidence on any score = act on the score. LOW confidence on a score ≥ 8 = escalate regardless. LOW confidence on a score < 8 = document the uncertainty, monitor for 24 hours, and re-triage if new data appears.
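The decision gate described above can be sketched as a small function. This is a minimal illustration in Python; the function name and return strings are hypothetical, chosen only to mirror the three outcomes named in the text:

```python
def q8_decision(score: int, confidence: str) -> str:
    """Apply the Q8 confidence override to a Q1-Q7 total score.

    Q8 adds no points; it only gates what the numeric score is
    allowed to do, per the decision rules in the text.
    """
    if confidence == "HIGH":
        # High confidence: act on the numeric score directly.
        if score >= 15:
            return "confirmed TP: contain + preserve + IR"
        if score >= 8:
            return "probable TP: preserve + escalate"
        return "likely FP: close with documentation"
    # Low confidence on a score >= 8: escalate regardless.
    if score >= 8:
        return "escalate with documented uncertainty"
    # Low confidence on a score < 8: monitor and re-triage.
    return "document uncertainty, monitor 24h, re-triage on new data"
```

The gate deliberately has no path that closes a low-confidence alert scoring 8 or above, which is the property the override exists to guarantee.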

Using the scorecard

The scorecard takes 10-15 minutes to complete. The analyst reads the alert, runs the environment-specific triage queries for Q1-Q3 (the data-intensive questions), answers Q4-Q7 from environmental knowledge and the alert context, and assesses Q8 from their overall confidence. The total score maps to a classification.

Score 0-7: Likely false positive. Close the alert with documented reasoning for each question. Feed the FP back to the detection engineering team for rule tuning. If Q8 confidence is low, note the uncertainty in the closure documentation and set a 24-hour monitor.

Score 8-14: Probable true positive. Begin evidence preservation immediately. Escalate to the investigation team with the triage scorecard results. Do not wait for certainty — probable is sufficient to begin the preserve-and-contain sequence.

Score 15-20: Confirmed true positive. Execute full Triage Trinity: classify (done — confirmed TP), preserve (environment-specific volatile evidence collection), contain (environment-specific containment actions). Produce the triage report. Notify management per escalation framework.
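The scoring walkthrough above reduces to a lookup table plus a sum. The following Python sketch uses the point values from Figure TR0.9; the answer labels and function names are assumptions for illustration only:

```python
# Point values per question and answer, as defined in the scorecard.
POINTS = {
    "Q1": {"YES": 3, "NO": 0, "CANNOT_DETERMINE": 1},
    "Q2": {"YES": 3, "NO": 0, "UNKNOWN": 2},
    "Q3": {"ACTIVE": 3, "HISTORICAL": 0, "UNCERTAIN": 2},
    "Q4": {"YES": 2, "NO": 0, "PROBABLE": 1},
    "Q5": {"IMMEDIATE": 2, "SOON": 1, "STANDARD": 0},
    "Q6": {"HIGH": 2, "MEDIUM": 1, "LOW": 0},
    # PROBABLE (+1) appears in the worked CHAIN-HARVEST artifact.
    "Q7": {"YES": 2, "NO": 0, "UNCERTAIN": 1, "PROBABLE": 1},
}

def score_alert(answers: dict[str, str]) -> tuple[int, str]:
    """Sum Q1-Q7 points and map the total to a classification."""
    total = sum(POINTS[q][a] for q, a in answers.items())
    if total >= 15:
        label = "confirmed TP"
    elif total >= 8:
        label = "probable TP"
    else:
        label = "likely FP"
    return total, label

# The CHAIN-HARVEST AiTM alert (DE4-002), scored per the worked artifact:
chain_harvest = {"Q1": "YES", "Q2": "YES", "Q3": "ACTIVE",
                 "Q4": "YES", "Q5": "IMMEDIATE", "Q6": "HIGH",
                 "Q7": "PROBABLE"}
print(score_alert(chain_harvest))  # (16, 'confirmed TP')
```

Remember that Q8 is applied after this sum: a low-confidence result on any total of 8 or more escalates regardless of the label.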

Worked artifact: Scored triage for CHAIN-HARVEST AiTM alert

Alert: DE4-002 — AiTM Token Replay Detected (j.morrison, 185.220.101.42)

Q1 — Evidence beyond alert? YES (+3). Tor exit node IP, simultaneous legitimate session from Bristol, token replay authentication method.
Q2 — Scope beyond single entity? YES (+3). Password spray hit 24 accounts. j.morrison success confirmed. Other accounts may be compromised.
Q3 — Active or historical? ACTIVE (+3). Token replay session currently valid. Attacker has live access to j.morrison’s M365 environment.
Q4 — Sensitive data at risk? YES (+2). j.morrison’s mailbox contains engineering proposals and financial communications.
Q5 — Containment urgency? IMMEDIATE (+2). Active session = attacker can read email, create rules, send as j.morrison right now.
Q6 — Business impact? HIGH (+2). j.morrison has access to financial approval workflows. BEC risk is immediate.
Q7 — Regulatory trigger? PROBABLE (+1). If personal data in email was accessed, GDPR notification may apply.
Q8 — Confidence? HIGH. Three independent corroborating indicators. No alternative explanation for simultaneous Tor + Bristol sessions.

Total: 16/20 — CONFIRMED TRUE POSITIVE. Action: Full Triage Trinity. Session revocation, evidence preservation, escalation to IR team, management notification.

Try it: score a triage scenario

Scenario: An alert fires: “Unusual volume of file downloads from SharePoint.” The user is m.patel, a marketing coordinator. In the last 2 hours, m.patel downloaded 85 files from the “Executive Strategy” SharePoint library — a library m.patel has never accessed before. The downloads are from m.patel’s usual IP address (Bristol office) and usual device. No sign-in anomalies. No other alerts for m.patel.

Score each question:

Q1 — evidence beyond the alert? (no sign-in anomaly, but access to an unusual library is corroborating context)
Q2 — scope beyond single entity? (single user)
Q3 — active or historical? (downloads occurring in the last 2 hours — active)
Q4 — sensitive data? (Executive Strategy library likely contains sensitive business strategy)
Q5 — containment urgency? (active downloading but not exfiltrating to external)
Q6 — business impact? (marketing coordinator accessing executive strategy = unusual)
Q7 — regulatory? (if strategy documents contain no personal data, probably not)
Q8 — confidence? (ambiguous — could be insider threat, could be legitimate project assignment)

This scenario should score approximately 10-13 — probable TP. The correct action: preserve the SharePoint audit logs, do NOT alert m.patel (potential insider investigation), escalate to HR and legal, and investigate whether m.patel has legitimate access to this library for a current project.
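One defensible scoring of the m.patel scenario can be written out explicitly. The per-question answers below are one reasonable reading, not the only one; other analysts may score Q1 or Q6 differently, which is exactly why the classification outcome matters more than the exact total:

```python
# One defensible reading of the m.patel SharePoint scenario.
# Each value is the points awarded; the answer choice is in the comment.
m_patel = {
    "Q1": 3,  # YES: first-ever access to Executive Strategy corroborates
    "Q2": 0,  # NO: single user affected
    "Q3": 3,  # ACTIVE: downloads occurring in the last 2 hours
    "Q4": 2,  # YES: executive strategy documents are sensitive
    "Q5": 1,  # SOON: active downloading, no external exfil observed yet
    "Q6": 1,  # MEDIUM: standard business user, unusual access pattern
    "Q7": 0,  # NO: likely no personal data in strategy documents
}
total = sum(m_patel.values())
print(total)  # 10 -> probable TP (8-14): preserve and escalate
```

Scoring Q1 as CANNOT DETERMINE (+1) or Q6 as HIGH (+2) shifts the total within the 8-14 band, but every defensible reading still lands on probable TP and the same preserve-and-escalate action.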

Compliance Myth: "Experienced analysts do not need a scorecard — they can triage by instinct"

The myth: Senior analysts have seen thousands of alerts. They can classify an alert in seconds without a structured methodology. The scorecard slows them down.

The reality: Experienced analysts triage common alert patterns faster with instinct than with a scorecard. But common alerts are not where breaches are missed. Breaches are missed on novel alerts, subtle indicators, and the one alert that looks like every other FP but is actually a sophisticated attacker mimicking legitimate behavior. The scorecard does not slow experienced analysts on common alerts — it takes 2-3 minutes when the answers are obvious. The scorecard prevents experienced analysts from pattern-matching on the one alert that breaks the pattern. The 3 minutes spent on the scorecard for that alert is the 3 minutes that catches the breach everyone else missed.

Troubleshooting

“The scorecard gives a score of 8 but I am confident it is an FP.” Document your reasoning. If your confidence override in Q8 is HIGH and your professional judgment says FP despite the score, close it with detailed documentation explaining why the scoring questions produced a misleading result. This happens when the alert context makes certain questions score high for structural reasons (e.g., the affected system is production, so Q6 always scores +2, even for routine FPs). Review these cases in the monthly calibration exercise — if certain rules consistently produce misleading scores, the scorecard weighting may need environment-specific adjustment.

“I cannot answer Q1 within 15 minutes because the KQL query takes too long.” Pre-built triage queries (provided in TR2-TR4) are optimised for speed: time-filtered, entity-scoped, and returning only triage-relevant fields. If your queries take more than 60 seconds, the query needs optimisation — not the scorecard. Module TR9.4 provides the production-ready triage query packs that answer Q1-Q3 in under 2 minutes per environment.

Scorecard calibration and analyst consistency

The scorecard is a decision support tool, not an automated classifier. Two analysts scoring the same alert may produce different scores — and both may be defensible. The goal is not identical scores but CONSISTENT classifications: if one analyst scores 12 and another scores 14, both classify as probable TP and both initiate the same containment actions. The scores differ but the outcome is the same.

Calibration exercises. Monthly calibration sessions at NE: Rachel presents 5 anonymised alerts from the previous month. Each analyst independently scores all 5 using the scorecard. The team then compares scores and discusses discrepancies. Discrepancies of 1-2 points on any question are normal (different analysts weigh ambiguous evidence differently). Discrepancies of 3+ points on any question indicate a training gap — the analyst either overweighs or underweighs that evidence category. The calibration identifies the gap and the specific training module (TR2 for cloud evidence interpretation, TR3 for endpoint evidence, TR3.3 for AD alerts) that addresses it.
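The 3-point discrepancy rule lends itself to a per-question comparison across the team. This is a minimal sketch, assuming each analyst's calibration scores are stored as a dict keyed by question; the function and variable names are hypothetical:

```python
def calibration_gaps(analyst_scores: dict[str, dict[str, int]],
                     threshold: int = 3) -> list[tuple[str, int]]:
    """Flag questions where analysts' scores spread by threshold+ points.

    analyst_scores maps analyst name -> {question: points awarded}.
    A spread of 1-2 points is normal; 3+ indicates a training gap.
    """
    questions = next(iter(analyst_scores.values())).keys()
    gaps = []
    for q in questions:
        values = [scores[q] for scores in analyst_scores.values()]
        spread = max(values) - min(values)
        if spread >= threshold:
            gaps.append((q, spread))
    return gaps

# Example: two analysts disagree heavily on Q1 (evidence interpretation).
session = {"analyst_a": {"Q1": 3, "Q2": 0, "Q3": 2},
           "analyst_b": {"Q1": 0, "Q2": 0, "Q3": 3}}
print(calibration_gaps(session))  # [('Q1', 3)]
```

A flagged question then maps to the relevant training module (TR2 for cloud evidence interpretation, TR3 for endpoint evidence, TR3.3 for AD alerts).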

The override mechanism (Q8). Q8 (analyst confidence and override) exists for situations where the numerical score does not match the analyst’s professional judgment. An alert might score 7 (FP range) on Q1-Q7, but the analyst recognises a pattern from a recent threat intelligence briefing that the scorecard questions do not capture. The analyst overrides to probable TP using Q8. Similarly, an alert might score 10 (probable TP range) but the analyst knows from environmental context (a scheduled pen test, a known IT change) that the alert is a BTP. The analyst downgrades via Q8 with documented reasoning.

The override mechanism is NOT a loophole to bypass the scorecard. Every override requires documented reasoning in the triage report. At NE, Rachel reviews all overrides monthly. If an analyst overrides more than 20% of their classifications, the scorecard itself may need calibration for that environment — or the analyst may be substituting personal judgment for the structured assessment too frequently.

Score distribution analysis. Over time, the score distribution reveals systemic issues. If 80% of alerts score 0-5 (FP), the detection rules are too noisy — submit tuning requests for the highest-volume FP-generating rules. If 40% of alerts score 8-14 (indeterminate), the scorecard questions may not have enough discriminating power for your alert types — consider adding environment-specific questions. If alerts cluster at 6-8 (the FP/probable TP boundary), the boundary itself may need adjustment for your risk appetite — a lower threshold (6) catches more TPs but increases false escalations, while a higher threshold (9) reduces false escalations but risks missing genuine compromises.
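The distribution checks above can be automated over a backlog of triage totals. This is a minimal sketch using the percentages named in the text; the 30% boundary-clustering threshold is an assumption for illustration, as the text does not give one:

```python
def distribution_flags(scores: list[int]) -> list[str]:
    """Flag systemic issues from the triage score distribution."""
    n = len(scores)
    fp_rate = sum(1 for s in scores if s <= 5) / n        # 0-5 band
    indeterminate = sum(1 for s in scores if 8 <= s <= 14) / n
    boundary = sum(1 for s in scores if 6 <= s <= 8) / n  # FP/TP edge
    flags = []
    if fp_rate > 0.80:
        flags.append("detection rules too noisy: submit tuning requests")
    if indeterminate > 0.40:
        flags.append("low discriminating power: consider extra questions")
    if boundary > 0.30:  # assumed threshold; not specified in the text
        flags.append("FP/TP boundary clustering: review the 8-point cut")
    return flags
```

Run quarterly, this turns the scorecard itself into a detection-tuning feedback loop: the highest-volume FP-generating rules surface directly from the first flag.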

At NE, the score distribution after 3 months of operation: 65% in 0-7 (FP/BTP), 20% in 8-14 (probable TP), 15% in 15-20 (confirmed TP). The 65% FP rate aligns with industry averages for organisations with standard Sentinel analytics rules. Rachel’s goal is to reduce the FP rate to 45% through detection tuning — the triage data drives the tuning by identifying which rules produce the most FPs.

Beyond this investigation: The triage scorecard connects to **Detection Engineering** (detection rule quality determines the alert context available for Q1-Q3 — richer alert context enables faster scoring), **SOC Operations** (the scorecard becomes the SOC's standard triage SOP, measured via the triage quality metrics in TR14), and **Practical GRC** (the documented scorecard process satisfies audit requirements for structured incident classification under ISO 27035 and NIST SP 800-61).

You're reading the free modules of this course

The full course continues with advanced topics, production detection rules, worked investigation scenarios, and deployable artifacts. Premium subscribers get access to all courses.
