SA0.13 Check My Knowledge

5 hours · Module 0 · Free

Check My Knowledge — SA0: The Automation Problem

Test your understanding of the automation framework established in this module. Each question presents an operational scenario — choose the response that demonstrates correct application of the three-tier model, confidence thresholds, blast radius assessment, or governance principles.

Question 1: Your SOC receives 400 alerts per day. Two analysts work 8-hour shifts. Average triage time is 12 minutes per alert. How many alerts go untriaged each day?
0 — the team handles all alerts. At 12 minutes per alert, one analyst triages 40 alerts per shift. Two analysts = 80 alerts per day. 400 - 80 = 320 untriaged. The team handles only 20% of the alert volume.
320 alerts (80%). Two analysts × 8 hours × 60 minutes ÷ 12 minutes per alert = 80 alerts triaged per day. 400 - 80 = 320 untriaged. This is an 80% deficit — worse than NE's 71%. Automation is not optional at this volume.
200 — each analyst handles half. The math does not work this way. The constraint is time: 12 minutes per alert × 400 alerts = 80 hours needed. Two analysts provide 16 hours. The deficit is 64 hours.
100 — after factoring in automatic resolution. The question states average triage time, not that any alerts are auto-resolved. With no automation, all 400 require manual triage.
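The capacity arithmetic in Question 1 can be sketched as a quick calculation (figures from the question; variable names are illustrative):

```python
# Triage capacity vs. alert volume, using the figures from Question 1.
alerts_per_day = 400
analysts = 2
shift_hours = 8
triage_minutes = 12

# Each analyst triages (8 h x 60 min) / 12 min = 40 alerts per shift.
capacity = analysts * shift_hours * 60 // triage_minutes
untriaged = alerts_per_day - capacity
deficit_pct = untriaged / alerts_per_day * 100

print(capacity, untriaged, deficit_pct)  # 80 320 80.0
```

The constraint is time, not headcount splitting: 400 alerts x 12 minutes is 80 analyst-hours of work against 16 hours available.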
Question 2: A playbook queries the user's last 10 sign-ins from SigninLogs and adds the results as an incident comment. What automation tier is this?
Tier 1 — Enrichment. The playbook reads data (SigninLogs) and writes a comment. It does not modify the user's access, isolate any device, or change any configuration. Read-only context addition = Tier 1.
Tier 2 — because it writes to the incident. Writing an incident comment is enrichment output, not notification or collection. Tier 2 notification involves sending messages to humans outside Sentinel (Teams, email). Tier 2 collection involves exporting data to storage. An incident comment is internal enrichment.
Tier 3 — because SigninLogs might contain evidence of compromise. The content of the data does not determine the tier. The tier is determined by what the automation DOES with the data. This automation reads and presents. It does not contain or respond.
It depends on the incident severity. Tier classification is based on the action type, not the incident severity. Enrichment is Tier 1 regardless of whether the incident is Low, Medium, High, or Critical severity.
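The rule behind Question 2, that the tier follows the action type rather than the data content or incident severity, can be expressed as a small lookup. The action names and the mapping below are illustrative, following the module's three-tier model:

```python
# Tier is determined by what the automation DOES, not by what data it
# touches or how severe the incident is. Action names are hypothetical.
TIER_BY_ACTION = {
    "query_signin_logs": 1,      # read-only data lookup
    "add_incident_comment": 1,   # read-only context added to the incident
    "send_teams_message": 2,     # notification to humans outside Sentinel
    "export_to_storage": 2,      # collection
    "revoke_oauth_consent": 3,   # containment: modifies access
    "isolate_device": 3,         # containment: changes device state
}

def classify(actions):
    """A playbook's tier is the highest tier among its actions."""
    return max(TIER_BY_ACTION[a] for a in actions)

# The Question 2 playbook queries SigninLogs and writes a comment:
print(classify(["query_signin_logs", "add_incident_comment"]))  # 1
```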
Question 3: Your "suspicious OAuth consent" detection has fired 50 times in the last 30 days. Analyst classification: 30 TP, 12 FP, 8 BTP. What automation tier is appropriate for containment (revoking the OAuth consent)?
Auto-revoke (Tier 3, no approval) — 60% confidence is sufficient for OAuth. 60% confidence means 40% of automated revocations disrupt legitimate applications. At 50 fires in 30 days, that is roughly 20 false revocations. Users lose access to legitimate applications. This is too high for auto-containment.
Auto-enrich and notify only (Tier 1-2). At 60% confidence, containment must remain manual. The playbook enriches the incident with publisher verification status, permission scope, and consenting user details, then presents the data to the analyst. The analyst decides whether to revoke. Consider tuning the detection to increase confidence before re-evaluating containment automation.
Auto-revoke with approval gate (Tier 3 with safeguard). An approval gate is appropriate at 80-95% confidence. At 60%, even with an approval gate, the analyst receives too many approval requests for non-malicious consents. The approval gate becomes noise. The detection needs tuning first.
71% confidence (30/42, excluding BTP). BTPs count against confidence for containment decisions. Revoking a benign true positive disrupts the user — the application they legitimately consented to stops working. The correct calculation is 30/50 = 60%.
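The containment-confidence calculation from Question 3 is simply true positives over all fires, with BTPs counted against. A minimal sketch (function name is illustrative):

```python
def containment_confidence(tp, fp, btp):
    """For containment decisions, BTPs count against confidence:
    acting on a benign true positive still disrupts the user,
    so the denominator is ALL fires, not just TP + FP."""
    return tp / (tp + fp + btp)

# Question 3 figures: 30 TP, 12 FP, 8 BTP over 30 days.
conf = containment_confidence(tp=30, fp=12, btp=8)
print(f"{conf:.0%}")  # 60%
```

At 60%, containment stays manual: the playbook enriches and notifies only.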
Question 4: An automation playbook is about to isolate SRV-NGE-DB01 (the production ERP database server serving 810 users). What should the playbook do?
Isolate immediately — the security risk outweighs the business impact. The blast radius is HIGH (810 users lose ERP access). Immediate isolation without approval is only appropriate for confirmed ransomware encryption activity. For other threat types, the production impact must be weighed against the security risk by a human.
Route to human approval with context: "SRV-NGE-DB01 is the production ERP database. Isolation will take ERP offline for 810 users. Alert details: [enrichment data]. Approve / Reject / Delay 30 min." The SOC lead or IR lead evaluates the trade-off and makes the containment decision with full context.
Skip isolation — the blast radius is too high. Never isolating a production server means the attacker can operate freely on the most critical system. The correct approach is human approval, not avoidance. For confirmed ransomware, even high-blast-radius isolation may be necessary — but a human makes that call.
Isolate and immediately notify the business. Notify-after-isolate does not reduce the blast radius. The ERP is already offline. Notification must happen BEFORE isolation for high-blast-radius systems. The approval gate is the mechanism for pre-action human judgment.
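The routing rule from Question 4 can be sketched as a guard in front of the isolation action. The blast-radius cutoff and function name below are illustrative assumptions, not values from the module:

```python
def route_isolation(host, users_affected, confirmed_ransomware=False):
    """Sketch of the Question 4 rule: high-blast-radius isolation routes
    to human approval WITH context, before any action is taken. The one
    exception is confirmed ransomware encryption activity. The 100-user
    cutoff is an illustrative assumption."""
    if confirmed_ransomware:
        return "isolate_now"  # the only auto-isolate exception
    if users_affected >= 100:
        # Present the trade-off BEFORE acting; the SOC/IR lead decides.
        return (f"approval_required: isolating {host} takes services "
                f"offline for {users_affected} users")
    return "isolate_now"

print(route_isolation("SRV-NGE-DB01", users_affected=810))
```

Note that notification after isolation would not change this routing: once the server is offline, the blast radius has already been realised.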
Question 5: You want to build automation that auto-revokes sessions, isolates the endpoint, blocks the attacker IP at the Palo Alto firewall, and sends a Teams notification. Where should this automation run?
Defender XDR custom detection with auto-actions. Defender cannot block IPs at Palo Alto or send Teams notifications. These are external integrations outside Defender's product scope.
Sentinel playbook triggered by an automation rule. The playbook can call Microsoft Graph (session revocation), MDE API (endpoint isolation), Palo Alto API (firewall block), and Teams connector (notification) — all in one workflow with conditional logic and error handling. Cross-product + external integration = Sentinel.
Split between Defender (session + isolate) and Sentinel (firewall + Teams). Splitting creates coordination complexity and partial failure scenarios. A single Sentinel playbook handles the entire workflow with unified error management.
Azure Function. Azure Functions are useful for complex logic that Logic Apps handle poorly, but this workflow is standard Logic App territory — sequential API calls with conditions. Use a Function only when you need loops with state, complex data transformation, or batch processing.
Question 6: A Sentinel playbook has been running successfully for 6 months. The analyst who built it left the company. There is no runbook, no version control, and no monitoring. What is the highest priority action?
Add monitoring immediately. Monitoring tells you when it breaks, but it does not help you fix it. Without a runbook, the team cannot troubleshoot or repair the playbook when it fails.
Write the runbook now, then add monitoring and version control. The runbook captures how the playbook works while the Logic App is still functional — reverse-engineering a working playbook is easier than reverse-engineering a broken one. Then add monitoring (to detect failures) and export the ARM template to version control (to enable rollback). All three are critical, but the runbook is the most time-sensitive because institutional knowledge about the playbook is leaving with the builder.
Export the ARM template to Git. Version control enables rollback, which is important, but the ARM template alone does not explain what the playbook does, why it exists, or how to fix it. The runbook is the highest-priority documentation.
Rebuild the playbook from scratch with proper governance. The existing playbook works. Rebuilding wastes 6 months of validated production operation. Document it, monitor it, version-control it — do not rebuild what already functions.
Question 7: Your maturity assessment scores: Enrichment=3, Collection=3, Notification=3, Containment=1, Governance=1, MSSP=2, Metrics=1, Improvement=1. What is the recommended next investment?
Containment — it is the lowest-scoring operational dimension. Containment at Level 1 (manual) while governance is also Level 1 means you would deploy containment automation without monitoring, without runbooks, and without version control. Build governance first — it makes containment automation sustainable.
Governance — bring it to Level 2-3 before advancing containment. Add monitoring for the existing enrichment and notification playbooks (they are running without it). Export ARM templates. Write runbooks. Then build containment automation with governance already in place. Governance is the foundation that prevents regression in every other dimension.
Metrics — you cannot improve what you cannot measure. Metrics are important, but adding a dashboard without governance means measuring automation that might silently fail. Governance (especially monitoring) is the prerequisite for meaningful metrics.
Enrichment — increase from Level 3 to Level 4 for maximum triage speed. Enrichment at Level 3 is already providing significant value. The marginal improvement from Level 3 to Level 4 is smaller than the risk reduction from bringing governance from Level 1 to Level 2-3. Fix the weakest dimensions first.
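The prioritisation rule from Question 7 (governance first, then the weakest operational dimension) can be sketched as follows; the function and the Level 2 target threshold are illustrative:

```python
# Maturity scores from Question 7.
scores = {
    "Enrichment": 3, "Collection": 3, "Notification": 3,
    "Containment": 1, "Governance": 1, "MSSP": 2,
    "Metrics": 1, "Improvement": 1,
}

def next_investment(scores, foundation="Governance", target=2):
    """Sketch of the rule: if the foundation dimension (governance) is
    below target, invest there before any operational dimension, because
    it prevents regression everywhere else. Otherwise raise the weakest
    remaining dimension."""
    if scores[foundation] < target:
        return foundation
    return min(scores, key=scores.get)

print(next_investment(scores))  # Governance
```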
Question 8: An AiTM detection fires. The enrichment playbook confirms: MFA claim in token, IP from Amsterdam (user is based in London), python-requests user agent, user risk High. The detection confidence over 30 days is 97%. What is the appropriate automated response?
Enrich and notify only — 97% is below 100%. Waiting for 100% confidence means waiting forever. At 97%, 1 in 33 fires is a false positive. The rollback playbook restores access in 5 minutes. The cost of the rare false positive is far lower than the cost of 45-minute manual containment for a confirmed AiTM.
Auto-contain: revoke sessions + reset MFA. 97% confidence is in the 95%+ band — auto-contain without approval gate. The VIP watchlist check still runs (if the user is a VIP, route to approval instead). The rollback playbook is ready for the 3% false positive rate. Post-containment verification confirms the sessions are actually revoked.
Auto-contain with approval gate — 97% is high but not certain. The 80-95% band uses approval gates. 97% is in the 95%+ band, which auto-contains. Adding an approval gate at 97% adds 1-2 minutes of delay for the analyst to approve — during which the attacker is adding MFA persistence, creating inbox rules, and accessing data. At 97%, the delay is not justified.
Auto-contain and auto-disable the account. Session revocation + MFA reset is sufficient containment for AiTM. Account disable is a higher-blast-radius action that prevents the legitimate user from signing in entirely. Auto-disable should be reserved for scenarios where session revocation alone is insufficient (e.g., the attacker has password access, not just token access).
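The confidence bands referenced across Questions 3 and 8 (below 80% enrich and notify only, 80–95% auto-contain behind an approval gate, 95%+ auto-contain) plus the VIP watchlist override can be sketched as a routing function; the name and signature are illustrative:

```python
def response_for(confidence, is_vip=False):
    """Confidence-band routing from the module: <80% enrich/notify only,
    80-95% auto-contain behind an approval gate, 95%+ auto-contain.
    VIP watchlist hits route to approval regardless of confidence."""
    if confidence >= 0.95:
        return "approval_gate" if is_vip else "auto_contain"
    if confidence >= 0.80:
        return "approval_gate"
    return "enrich_and_notify"

print(response_for(0.97))               # auto_contain (Question 8)
print(response_for(0.97, is_vip=True))  # approval_gate (VIP override)
print(response_for(0.60))               # enrich_and_notify (Question 3)
```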
