Scenario 1. Your SOC receives 400 alerts per day. Two analysts work 8-hour shifts. Average triage time is 12 minutes per alert. Your CISO asks how many alerts go untriaged daily.
Zero — the team handles all alerts with efficient prioritization.
a. At 12 minutes per alert, one analyst triages 40 alerts per 8-hour shift, so two analysts triage 80 alerts per day. 400 minus 80 leaves 320 untriaged. No amount of prioritization changes the math when triage capacity is 20% of volume.
320 alerts — 80% of the queue goes untouched.
b. Two analysts times 8 hours times 60 minutes, divided by 12 minutes per alert, equals 80 alerts triaged per day. 400 minus 80 equals 320 untriaged. This 80% deficit is worse than NE's 71% and demonstrates why enrichment automation (reducing per-alert triage time) is the highest-leverage investment.
200 — each analyst handles roughly half the queue.
c. The constraint is analyst time, not how the queue is divided between analysts. Two analysts have 960 minutes of combined working time. At 12 minutes per alert, they can process 80 alerts, not 200. The time-per-alert calculation must come before any workload distribution.
100 — after accounting for Defender's automatic resolution of routine alerts.
d. The scenario states 400 alerts requiring triage. Defender's platform automation (AIR, attack disruption) may reduce the volume that reaches analysts, but the question specifies the queue the SOC team faces. Without enrichment automation, all 400 require manual work.
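The capacity arithmetic in Scenario 1 can be checked with a few lines of Python. This is a minimal sketch: the function name `untriaged_alerts` and its parameters are illustrative, and the 4-minute triage time in the second call is a hypothetical figure, not one taken from the scenario.

```python
def untriaged_alerts(daily_alerts: int, analysts: int,
                     shift_hours: int, minutes_per_alert: int) -> tuple[int, int]:
    """Return (triaged, untriaged) alerts per day for a manual triage queue."""
    # Total analyst time available per day, in minutes.
    total_minutes = analysts * shift_hours * 60
    # Whole alerts that fit into that time.
    capacity = total_minutes // minutes_per_alert
    triaged = min(capacity, daily_alerts)
    return triaged, daily_alerts - triaged


# Scenario 1: 400 alerts, 2 analysts, 8-hour shifts, 12 minutes per alert.
triaged, untriaged = untriaged_alerts(400, 2, 8, 12)
print(triaged, untriaged)   # 80 320
print(untriaged / 400)      # 0.8

# Hypothetical: enrichment cuts triage time to 4 minutes per alert.
print(untriaged_alerts(400, 2, 8, 4))   # (240, 160)
```

The second call illustrates why per-alert triage time is the lever: the same two analysts cover three times as many alerts without any change in headcount.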
Scenario 2. A playbook queries the user's last 10 sign-ins from SigninLogs and adds the results as an incident comment. An analyst argues this is Tier 2 because it writes to the incident. What tier classification is correct?
Tier 1 — enrichment. The playbook reads data and writes a comment. It does not modify access, isolate a device, or notify anyone outside Sentinel.
a. Tier classification is determined by what the automation does, not where it writes. An incident comment is enrichment output: it adds context for the analyst. Tier 2 notification involves sending messages to humans outside Sentinel (Teams, email, tickets). The distinction matters because Tier 1 requires minimal governance while Tier 2 requires notification routing validation.
Tier 2 — because writing to an incident modifies the incident record.
b. Writing an incident comment does not change the incident's state, severity, assignment, or any entity. It adds context for the analyst. Tier 2 actions (collection, notification) involve capturing evidence to external storage or alerting humans outside the Sentinel workspace. An incident comment is internal enrichment.
Tier 3 — because SigninLogs contain evidence of compromise that could lead to containment.
c. The content of the data does not determine the tier. The tier is determined by what the automation does with the data. This playbook reads and presents. If it used the sign-in data to decide whether to revoke sessions, the revocation action would be Tier 3, but the enrichment query itself remains Tier 1.
It depends on the incident severity — Critical incidents should classify enrichment as Tier 2 for governance purposes.
d. Tier classification is based on action type, not incident severity. Enrichment is Tier 1 regardless of severity. The governance requirements for the enrichment playbook are the same whether it enriches a Low or Critical incident. Severity affects the confidence threshold for containment (Tier 3), not the tier classification of enrichment.
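The rule that tier follows action type can be expressed as a lookup. The sketch below is illustrative only: the action labels are hypothetical names, and classifying a whole playbook by its highest-tier action is an assumption that extends the reasoning above rather than a rule stated in it.

```python
from enum import IntEnum


class Tier(IntEnum):
    ENRICHMENT = 1                  # read data, add context inside Sentinel
    COLLECTION_OR_NOTIFICATION = 2  # external evidence capture, messages to humans
    CONTAINMENT = 3                 # changes access or isolates something


# Hypothetical action labels mapped to the tier of what the action does.
ACTION_TIERS = {
    "query_signin_logs": Tier.ENRICHMENT,
    "add_incident_comment": Tier.ENRICHMENT,
    "capture_evidence_to_external_storage": Tier.COLLECTION_OR_NOTIFICATION,
    "send_teams_message": Tier.COLLECTION_OR_NOTIFICATION,
    "send_email": Tier.COLLECTION_OR_NOTIFICATION,
    "revoke_sessions": Tier.CONTAINMENT,
    "isolate_device": Tier.CONTAINMENT,
}


def classify_playbook(actions: list[str]) -> Tier:
    """Assumption: a playbook takes the tier of its highest-tier action.

    Incident severity and the content of the data are deliberately not inputs.
    """
    return max(ACTION_TIERS[action] for action in actions)


# Scenario 2: read SigninLogs, write an incident comment.
print(classify_playbook(["query_signin_logs", "add_incident_comment"]).name)
# ENRICHMENT
```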
Scenario 3. Your OAuth consent detection fired 50 times in 30 days. Analyst classification: 30 true positives, 12 false positives, 8 benign true positives. You want to automate revoking the malicious consents. What approach is appropriate?
Auto-revoke at 60% confidence — OAuth consents are low blast radius.
a. 60% confidence means 40% of automated revocations disrupt legitimate applications. For NE, with 50 detections in 30 days, that is 20 unwarranted revocations (12 false positives plus 8 benign true positives). Users lose access to applications they consented to legitimately. The blast radius of OAuth revocation is higher than it appears because it breaks application integrations the user depends on.
Auto-enrich and notify only. At 60% confidence, containment stays manual. Tune the detection first to improve confidence.
b. The composite confidence is 30/50 = 60%. Benign true positives count against containment confidence because revoking a benign consent disrupts the user. At 60%, the enrichment playbook adds publisher verification status and permission scope to the incident. The analyst decides whether to revoke. The detection needs tuning (tighter scoping on publisher verification status or permission type) before containment automation is viable.
Auto-revoke with an approval gate for every match.
c. Approval gates are appropriate at 80-95% confidence per the threshold framework in Section 0.4. At 60%, the analyst receives 20 approval requests per month for non-malicious consents. The approval gate becomes noise, and the analyst starts rubber-stamping approvals without investigation, which is worse than no automation at all.
Auto-revoke at 75% confidence by excluding benign true positives from the calculation (30/40).
d. Benign true positives must be included in the containment confidence denominator. Revoking a BTP disrupts the user's legitimate application access. The correct calculation is 30/50 = 60%, not 30/40. Excluding BTPs from the confidence calculation would lead to deploying containment automation that disrupts 40% of affected users.
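The confidence calculation and the bands quoted above (approval gate at 80-95%, auto-contain at 95% and above) can be sketched as follows. The function names are illustrative, and two details are assumptions: that exactly 95% falls in the auto-contain band, and that anything below 80% means enrich and notify only.

```python
def containment_confidence(true_pos: int, false_pos: int, benign_true_pos: int) -> float:
    """True positives over ALL firings; benign true positives stay in the denominator."""
    total = true_pos + false_pos + benign_true_pos
    if total == 0:
        raise ValueError("no classified detections to compute confidence from")
    return true_pos / total


def response_band(confidence: float) -> str:
    """Map a confidence value to a response band (boundaries are assumptions)."""
    if confidence >= 0.95:
        return "auto-contain"
    if confidence >= 0.80:
        return "approval-gate"
    return "enrich-and-notify"


# Scenario 3: 30 true positives, 12 false positives, 8 benign true positives.
confidence = containment_confidence(30, 12, 8)
print(confidence, response_band(confidence))   # 0.6 enrich-and-notify

# The incorrect calculation that drops BTPs from the denominator.
print(30 / 40, response_band(30 / 40))         # 0.75 enrich-and-notify
```

Under these bands, even the inflated 75% figure would not reach the approval-gate range, so excluding BTPs does not justify automated revocation either way.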
Scenario 4. A Sentinel containment playbook is about to isolate SRV-NGE-DB01, the production ERP database server serving all 810 users. The detection confidence is 96%. What should the playbook do?
Isolate immediately — the 96% confidence exceeds the server threshold and the security risk is critical.
a. The blast radius classification from Section 0.5 categorizes database servers as business-critical infrastructure. 96% confidence is below the 98% threshold for server containment, and even at 98%, the blast radius of isolating the ERP database (all 810 users lose ERP access) requires human evaluation. Automatic isolation of business-critical infrastructure requires both the confidence threshold AND human approval.
Route to human approval with full context: server identity, blast radius impact, enrichment data, and approve/reject/delay options.
b. The ERP database is business-critical infrastructure. The playbook presents the detection data, the blast radius impact (810 users lose ERP), and the containment options to the IR lead or SOC lead. The human evaluates the trade-off between the security risk and the operational impact. For confirmed ransomware encryption in progress, the human may approve immediate isolation. For suspicious process execution, the human may choose to investigate further before isolating.
Skip isolation — the blast radius is too high for any automated action on a production database.
c. Never isolating a production server means an attacker operating on the most critical system in the environment can act without constraint. The correct approach is human approval with full context, not avoidance. The playbook adds value by surfacing the decision with enrichment data and a clear impact statement, even when the decision itself is manual.
Isolate and immediately notify the business of the outage.
d. Notify-after-isolate does not reduce the blast radius. The ERP is already offline. Notification must happen before isolation for business-critical systems. The approval gate is the mechanism that ensures human judgment evaluates the trade-off before the action executes. Post-action notification is appropriate for standard workstations, not for infrastructure that affects the entire organization.
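The gating logic for server isolation can be sketched as a function that returns both the decision and the reasons behind it. The `Asset` type and `isolation_decision` name are hypothetical. The 98% server threshold and the human-approval requirement for business-critical infrastructure come from the scenario; treating a non-critical server at or above 98% as eligible for automatic isolation is an assumption.

```python
from dataclasses import dataclass

SERVER_CONFIDENCE_THRESHOLD = 0.98  # server containment threshold from the scenario


@dataclass
class Asset:
    name: str
    business_critical: bool
    users_affected: int


def isolation_decision(asset: Asset, confidence: float) -> tuple[str, list[str]]:
    """Return (action, reasons). Any reason present routes to human approval."""
    reasons = []
    if confidence < SERVER_CONFIDENCE_THRESHOLD:
        reasons.append(f"confidence {confidence:.0%} is below the 98% server threshold")
    if asset.business_critical:
        reasons.append(f"business-critical: {asset.users_affected} users lose access")
    # Assumption: a non-critical server that meets the threshold may be auto-isolated.
    action = "human-approval" if reasons else "auto-isolate"
    return action, reasons


erp = Asset(name="SRV-NGE-DB01", business_critical=True, users_affected=810)
action, reasons = isolation_decision(erp, 0.96)
print(action)          # human-approval
for reason in reasons:
    print("-", reason)
```

Returning the reasons alongside the action matters here because the approval request is only useful if it carries the blast radius statement with it.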
Scenario 5. You want to build automation that revokes sessions, isolates the endpoint, blocks the attacker IP at the Palo Alto firewall, and sends a Teams notification. Where should this automation run?
Defender XDR custom detection with auto-actions — the detection and most responses are within the Microsoft stack.
a. Defender custom detections can isolate endpoints and contain users within the Microsoft stack, but cannot call the Palo Alto API to block the attacker IP at the firewall or post Teams notifications. The Palo Alto integration and Teams notification require external API calls that only Logic Apps (Sentinel playbooks) can provide. One external dependency is enough to move the entire workflow to Sentinel.
Sentinel playbook triggered by an automation rule. The playbook handles all four actions in one workflow with unified error handling.
b. The workflow spans Microsoft Graph (session revocation), MDE API (endpoint isolation), Palo Alto API (firewall block), and Teams connector (notification). Cross-product orchestration with external integration is the defining use case for Sentinel playbooks per the decision framework in Section 0.8. A single playbook handles the entire sequence with conditional logic and unified error management.
Split between Defender (session revocation + isolation) and Sentinel (firewall block + Teams notification).
c. Splitting creates coordination complexity. If the Defender action succeeds but the Sentinel playbook fails, the attacker's sessions are revoked but their IP is not blocked at the firewall. That partial containment may not be visible to the analyst. A single Sentinel playbook with all four actions provides unified error handling: if step 3 fails, the playbook logs the partial completion and alerts the analyst.
Azure Function with direct API calls — more performant than Logic Apps for time-sensitive containment.
d. Azure Functions are useful for complex computation or batch processing that Logic Apps handle poorly. This workflow is sequential API calls with conditions, which is standard Logic App territory. A Function adds development complexity (code to write and maintain) without a performance benefit that matters. The latency difference between a Logic App and a Function for four API calls is milliseconds, not seconds.
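The value of keeping all four actions in one workflow is that a single place knows which steps completed. The Python sketch below models that idea only; it is not Logic App code, the step names are hypothetical labels, and the stub functions stand in for the real Graph, MDE, Palo Alto, and Teams calls.

```python
from typing import Callable

Step = tuple[str, Callable[[], None]]


def run_containment(steps: list[Step]) -> dict:
    """Run steps in order; on the first failure, report what completed and what failed."""
    completed: list[str] = []
    for name, action in steps:
        try:
            action()
        except Exception as exc:
            # Partial completion is recorded explicitly so it can be surfaced to the analyst.
            return {"status": "partial", "completed": completed,
                    "failed": name, "error": str(exc)}
        completed.append(name)
    return {"status": "complete", "completed": completed, "failed": None, "error": None}


def ok() -> None:
    """Stub for an API call that succeeds."""


def firewall_down() -> None:
    """Stub for a firewall API call that fails."""
    raise ConnectionError("firewall API unreachable")


report = run_containment([
    ("revoke_sessions", ok),
    ("isolate_endpoint", ok),
    ("block_ip_at_firewall", firewall_down),
    ("send_teams_notification", ok),
])
print(report["status"], report["completed"], report["failed"])
# partial ['revoke_sessions', 'isolate_endpoint'] block_ip_at_firewall
```

In this sketch the caller is responsible for alerting the analyst when the status is `partial`. In a Logic App the equivalent behavior would typically be built with scopes and run-after conditions, which is an implementation detail this example does not model.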
Scenario 6. A Sentinel playbook has been running successfully for six months. The analyst who built it is leaving next week. There is no runbook, no version control, and no monitoring. What is the highest-priority action?
Configure monitoring immediately — you need to know when it breaks.
a. Monitoring tells you when the playbook fails, but it does not help you fix it. Without a runbook, when the monitoring alert fires, the team faces a failing playbook that nobody understands. The runbook is the prerequisite that makes monitoring actionable. Write the runbook while the builder is still available, then configure monitoring.
Write the runbook now while the builder is still available, then add monitoring and version control.
b. The runbook captures how the playbook works while the builder can explain it. Reverse-engineering a working Logic App is easier than reverse-engineering a broken one, but both are far harder than asking the person who built it to answer six questions. After the runbook: export the ARM template to version control (enables rollback), then configure monitoring (enables failure detection). All three are mandatory, but the runbook is the most time-sensitive because institutional knowledge is leaving.
Export the ARM template to Git — at minimum you need a recoverable copy.
c. The ARM template enables rollback, which is critical, but it does not explain what the playbook does, why it exists, what it triggers on, or how to fix common failures. The ARM template is JSON that defines the Logic App structure. The runbook is the human-readable document that makes the JSON useful. Both are needed, but the runbook requires the builder's input, which has a deadline.
Rebuild the playbook from scratch with proper governance from day one.
d. The existing playbook has six months of validated production operation. Rebuilding discards that validation and introduces new risk (the rebuild might have different logic, different edge case handling, different entity extraction patterns). Document it, monitor it, version-control it. Rebuilding is appropriate when the existing playbook cannot be documented or maintained, not when the only problem is missing governance artifacts.
Scenario 7. Your maturity scores are: Enrichment 3, Collection 3, Notification 3, Containment 1, Governance 1, Coordination 2, Metrics 1, Improvement 1. Your CISO wants the team to build containment automation next. What do you recommend?
Build containment immediately — it is the lowest operational dimension and the CISO's priority.
a. Containment at Level 1 while governance is also Level 1 means deploying containment playbooks without monitoring, without runbooks, and without version control. A containment playbook that makes a wrong decision and is not monitored creates the same pattern as NE's dead playbook, but with higher blast radius. Governance must reach Level 2-3 before containment is safe to deploy.
Raise governance to Level 2-3 first. Add monitoring and runbooks to the existing enrichment and notification playbooks, then build containment with governance in place.
b. Governance is the foundation dimension. The existing Level 3 enrichment and notification playbooks are running without monitoring or runbooks, so they could be silently failing right now. Adding governance to existing playbooks is fast (export ARM templates, write runbooks, configure monitoring alerts) and establishes the framework that makes containment deployment safe. The CISO gets containment in Month 2 instead of Month 1, built on infrastructure that prevents regression.
Build a metrics dashboard first — you cannot improve what you cannot measure.
c. Metrics without governance means measuring automation that might be silently failing. The monitoring component of governance (SentinelHealth + AzureDiagnostics) is the prerequisite for meaningful metrics. Build governance first, which includes monitoring. The metrics dashboard follows naturally from the monitoring data that governance produces.
Advance enrichment from Level 3 to Level 4 — maximize the value of the existing investment before expanding.
d. Enrichment at Level 3 already provides significant triage acceleration. The marginal improvement from Level 3 to Level 4 (watchlist-driven dynamic logic) is smaller than the risk reduction from bringing governance from Level 1 to Level 2-3. The maturity model principle is to fix the weakest dimensions first, because low-scoring dimensions represent systemic risks that affect every other dimension.
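The prioritization rule in Scenario 7 (governance gates everything else, then fix the weakest dimensions) can be sketched as a small function. The function name and the `governance_floor` parameter are illustrative; the floor of Level 2 reflects the "Level 2-3" guidance above, taking its lower bound as an assumption.

```python
def next_priorities(scores: dict[str, int], governance_floor: int = 2) -> list[str]:
    """Return the dimension(s) to work on next.

    Governance comes first while it is below the floor; otherwise return
    every dimension tied for the lowest score.
    """
    if scores["Governance"] < governance_floor:
        return ["Governance"]
    lowest = min(scores.values())
    return [dimension for dimension, score in scores.items() if score == lowest]


scores = {
    "Enrichment": 3, "Collection": 3, "Notification": 3, "Containment": 1,
    "Governance": 1, "Coordination": 2, "Metrics": 1, "Improvement": 1,
}
print(next_priorities(scores))
# ['Governance']

# Once governance reaches Level 3, containment is among the weakest dimensions.
print(next_priorities({**scores, "Governance": 3}))
# ['Containment', 'Metrics', 'Improvement']
```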
Scenario 8. An AiTM detection fires. The enrichment playbook confirms: MFA claim present in token, source IP from Amsterdam (user based in London), python-requests user agent, Entra ID Protection risk level High. The detection's 30-day confidence is 97%. The user is not on the VIP watchlist. What automated response is appropriate?
Enrich and notify only — 97% still means 3% false positive rate, and session revocation is disruptive.
a. At 97% confidence, 1 in 33 containment actions is a false positive. The rollback playbook restores access in under five minutes. Compare that to the alternative: 45 minutes of manual triage during which the attacker is adding MFA persistence, creating inbox forwarding rules, and accessing sensitive data. The cost of the rare false positive is far lower than the cost of delayed containment for a confirmed AiTM.
Auto-contain: revoke sessions and reset the compromised MFA method. The 97% confidence exceeds the 95% threshold, the user is not a VIP, and the rollback playbook is ready.
b. 97% confidence is in the auto-contain band (95%+) per Section 0.4. The VIP watchlist check passes (user is not a VIP). The blast radius is standard user (moderate, acceptable for auto-containment). Session revocation plus MFA reset is the appropriate containment scope for AiTM because it removes the attacker's access without disabling the entire account. Post-containment verification confirms the sessions are actually revoked.
Auto-contain with an approval gate — 97% is high but a human should confirm before acting.
c. The approval gate band is 80-95% per the threshold framework. At 97%, adding an approval gate introduces 1-2 minutes of delay while the analyst reviews and clicks approve. During an active AiTM attack, those minutes allow the attacker to add persistence (inbox rules, MFA methods, OAuth consents) that session revocation alone will not remove. At 95%+, the auto-contain decision is justified by the confidence data.
Auto-contain: revoke sessions, reset MFA, and disable the account entirely.
d. Session revocation plus MFA reset is sufficient containment for AiTM. Account disablement prevents the legitimate user from signing in entirely, which is a higher blast radius action. The threshold for account disablement is 97% per Section 0.4, but session revocation achieves containment without the additional user impact. Reserve account disablement for scenarios where the attacker has password access (beyond token-only access) or where session revocation alone is verified as insufficient.
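The AiTM decision combines three checks: confidence band, VIP status, and rollback readiness, followed by a choice of containment scope. The sketch below is one possible encoding. The function name and action labels are hypothetical, and routing VIP users or a missing rollback playbook to an approval gate is an assumption; the source only states that both checks pass in this scenario.

```python
def aitm_response(confidence: float, is_vip: bool, rollback_ready: bool,
                  attacker_has_password: bool = False) -> list[str]:
    """Return an ordered list of hypothetical action labels for an AiTM detection."""
    if confidence < 0.80:
        return ["enrich", "notify"]
    # Assumption: VIPs and a missing rollback playbook fall back to an approval gate.
    if confidence < 0.95 or is_vip or not rollback_ready:
        return ["enrich", "notify", "request_approval"]
    actions = ["revoke_sessions", "reset_mfa_method", "verify_revocation"]
    # Account disablement is reserved for password-level compromise at 97%+ confidence.
    if attacker_has_password and confidence >= 0.97:
        actions.append("disable_account")
    return actions


# Scenario 8: 97% confidence, not a VIP, rollback playbook ready, token-only access.
print(aitm_response(0.97, is_vip=False, rollback_ready=True))
# ['revoke_sessions', 'reset_mfa_method', 'verify_revocation']
```

Keeping `disable_account` behind a separate condition reflects the point in option d: meeting the confidence threshold makes disablement permissible, not necessary.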