SA0.6 NE's Automation Landscape

5 hours · Module 0 · Free

Figure SA0.6 — Northgate Engineering's automation transformation. Current state: manual, 71% untriaged. Target state after 90-day automation program: every alert enriched, 60% auto-resolved, 0% untriaged.

Operational Objective

Before building automation, you need to understand the current state. Most organisations have some automation running (Defender AIR, attack disruption, a few Sentinel automation rules), but nobody has a complete inventory of what is automated, what is working, what is broken, and what is missing. This subsection audits Northgate Engineering's current automation state, identifies the highest-value automation gaps, and establishes the target architecture that the rest of the course builds toward.

Deliverable: A current-state automation audit methodology and the target-state architecture for NE's SOC. You will apply the same audit to your own environment and build your own target-state architecture.

⏱ Estimated completion: 25 minutes

Auditing the current state

Start with what exists. Every M365 E5 environment has automation capabilities deployed by default. The problem is that nobody on the security team knows what they are or whether they work.

Defender XDR automated investigation and response (AIR). NE enabled AIR during the initial Defender deployment in 2023. Nobody has reviewed the configuration since. The Action center (security.microsoft.com → Action center) shows 847 automated actions in the last 90 days: quarantined emails, blocked files, and 12 account containment actions. Those twelve accounts were contained by Defender without anyone on the SOC team knowing. Three of the containment actions were on VIP accounts. Nobody reviewed them, and nobody verified whether the containment was appropriate or whether the users experienced disruption.

This is the danger of invisible automation. Defender AIR is powerful, but if nobody monitors it, false positive containment goes unnoticed, legitimate automated actions are not documented in incident reports, and the SOC team makes manual containment decisions for incidents that Defender has already partially contained — creating conflicting response actions.

Action item: Review the Defender action center weekly. Understand what AIR is doing. Document AIR actions in incident reports. Configure AIR policies to match your organisation’s risk tolerance.
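Part of the weekly review can be scripted once the Action center history has been exported. The sketch below is illustrative only: the `AirAction` fields, the `CONTAINMENT_TYPES` labels, and the example accounts are hypothetical and do not reflect the actual Defender export schema. It shows one way to surface containment actions that nobody has reviewed, with VIP accounts at the top of the queue.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for actions that restrict an account; adjust to your export.
CONTAINMENT_TYPES = {"contain_user", "disable_user"}


@dataclass
class AirAction:
    action_type: str            # e.g. "quarantine_email", "contain_user"
    account: str                # affected user principal name
    reviewed_by: Optional[str]  # analyst who reviewed the action, if any


def needs_review(actions, vip_accounts):
    """Return unreviewed containment actions, VIP accounts first."""
    pending = [
        a for a in actions
        if a.action_type in CONTAINMENT_TYPES and a.reviewed_by is None
    ]
    # sorted() is stable and False sorts before True, so VIP hits come first.
    return sorted(pending, key=lambda a: a.account not in vip_accounts)


# Illustrative data, not NE's real accounts.
actions = [
    AirAction("quarantine_email", "a.smith@example.com", None),
    AirAction("contain_user", "b.jones@example.com", None),
    AirAction("contain_user", "ceo@example.com", None),
    AirAction("contain_user", "c.lee@example.com", "analyst1"),
]
queue = needs_review(actions, vip_accounts={"ceo@example.com"})
for action in queue:
    print(action.account, action.action_type)
```

Email quarantines and already-reviewed actions are filtered out here so the queue stays short. Whether that is appropriate depends on your organisation's risk tolerance.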

Sentinel automation rules. NE has four automation rules:

Rule 1: Change severity from Medium to High for alerts containing “AiTM” in the title. This is useful: AiTM alerts warrant High severity regardless of the detection’s default severity.

Rule 2: Assign all incidents from the “Microsoft Entra ID Protection” connector to the SOC queue. This works but adds no value — all incidents go to the SOC queue by default.

Rule 3: Close all incidents with severity “Informational” after 72 hours. This is problematic — some informational alerts escalate in severity when correlated with other signals. Auto-closing them prevents correlation.

Rule 4: Auto-assign High severity incidents to “j.morrison.” This is a single point of failure: when j.morrison is on leave, High severity incidents are assigned to an absent analyst.
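Two of the findings above (assignment to a single person, and blanket auto-close) are mechanical enough to check in code. The following is a minimal sketch under stated assumptions: `AutomationRule` is a hypothetical, simplified record and not the Sentinel automation rule schema, and the check does not judge whether a rule adds value (the Rule 2 finding), which still needs a human.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AutomationRule:
    name: str
    action: str                      # "set_severity", "assign", or "close"
    assignee: Optional[str] = None   # owner set by an "assign" action
    assignee_is_group: bool = False  # True for a shared queue or team


def audit_rule(rule: AutomationRule) -> List[str]:
    """Return the audit findings for one rule (empty list if none)."""
    findings = []
    if rule.action == "assign" and rule.assignee and not rule.assignee_is_group:
        findings.append("single point of failure: assigned to one person")
    if rule.action == "close":
        findings.append("auto-close may hide incidents that later correlate")
    return findings


# NE's four rules, re-expressed in the hypothetical record format.
rules = [
    AutomationRule("Rule 1", "set_severity"),
    AutomationRule("Rule 2", "assign", "SOC queue", assignee_is_group=True),
    AutomationRule("Rule 3", "close"),
    AutomationRule("Rule 4", "assign", "j.morrison"),
]
for rule in rules:
    print(rule.name, audit_rule(rule))
```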

The dead playbook. Six months ago, an analyst built a playbook that queried VirusTotal for file hashes extracted from Defender for Endpoint alerts. The playbook worked for two weeks. Then VirusTotal’s free API rate limit changed, the playbook started failing, and nobody noticed because no monitoring was configured. The playbook has been in a failed state for six months, creating the impression that “we tried automation and it didn’t work.”

What is missing

The gaps are more important than the existing automation. NE’s SOC has zero:

Enrichment playbooks. Every enrichment query is manual. When an AiTM alert fires, the analyst opens a new browser tab, navigates to Sentinel, types a KQL query for the user’s sign-in history, reads the results, copies the relevant findings, pastes them into the incident comment, and repeats for each enrichment source (IP reputation, device compliance, alert history). Five enrichment queries × 2 minutes each × 500 alerts per day = 5,000 minutes, or roughly 83 hours of manual enrichment every day.

Evidence collection playbooks. Evidence is collected after the analyst triages the alert, typically 45 minutes to 2 hours after the alert fires. By that time, session tokens have expired (access tokens issued by Microsoft Entra ID, formerly Azure AD, last roughly an hour by default), short-lived processes have terminated, and the attacker may have moved laterally. A collection playbook that fires on incident creation captures this evidence immediately.

Notification playbooks. The SOC uses email to notify stakeholders. The email is manually composed by the triaging analyst during an active incident — taking attention away from the investigation. There is no Teams channel for SOC notifications, no automated ticket creation, no MSSP coordination playbook, and no after-hours escalation mechanism.

Containment playbooks. All containment is manual. When an AiTM compromise is confirmed, the analyst opens the Entra admin center, finds the user, revokes sessions, navigates to MFA settings, removes the compromised method, checks for inbox rules, checks for mailbox delegates, and documents every action. This takes 15-25 minutes, during which the attacker may be adding additional persistence.

MSSP coordination. BlueVoyant (NE’s managed SOC partner) monitors the same Sentinel workspace. When BlueVoyant triages an alert and NE triages the same alert independently, both teams do the same work. There is no automated mechanism to indicate “BlueVoyant is triaging this” or “NE has taken ownership.”
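The enrichment estimate in the list above (five queries, two minutes each, 500 alerts per day) can be reproduced with a few lines of arithmetic. The 8-hour shift used to convert hours into analyst-equivalents is an assumption added for illustration and does not come from NE's figures.

```python
QUERIES_PER_ALERT = 5
MINUTES_PER_QUERY = 2
ALERTS_PER_DAY = 500
SHIFT_HOURS = 8  # assumption: one analyst works an 8-hour shift

minutes_per_day = QUERIES_PER_ALERT * MINUTES_PER_QUERY * ALERTS_PER_DAY
hours_per_day = minutes_per_day / 60
analyst_equivalents = hours_per_day / SHIFT_HOURS

print(f"{minutes_per_day} minutes = {hours_per_day:.1f} analyst-hours per day")
# 5000 minutes = 83.3 analyst-hours per day
print(f"about {analyst_equivalents:.1f} full shifts spent on enrichment alone")
# about 10.4 full shifts spent on enrichment alone
```

Substituting your own alert volume and query counts gives a quick estimate of what manual enrichment costs in your environment.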

The target-state architecture

The target state after completing this course is not “automate everything.” It is “automate the right things with the right safeguards, and preserve human judgment for the rest.”

Tier 1 automation (deployed Month 1): Six enrichment playbooks covering every incident type. IP reputation from TI feeds + AbuseIPDB. User risk from Entra ID Protection. Device compliance from Intune. Threat intelligence correlation from Sentinel TI tables. Alert history from previous incidents. Geo-location and impossible travel analysis. Every incident is enriched in under 30 seconds. The analyst opens an investigation-ready incident, not a raw alert.

Tier 2 automation (deployed Month 2): Three evidence collection playbooks (AiTM, endpoint, email) that capture volatile evidence at alert time. Notification pipeline with Teams adaptive cards (SOC channel), email templates (CISO for Critical), ticket creation (ServiceNow), on-call escalation (after-hours High/Critical), and MSSP coordination (auto-assign ownership to avoid duplicate triage).

Tier 3 automation (deployed Month 3): Identity containment for AiTM (session revocation + MFA reset on 95%+ confidence detection). Endpoint containment for workstations (auto-isolate on high-confidence malware/ransomware detection). Both with VIP watchlist checks, blast radius assessment, and rollback playbooks. Server containment routed to approval gates.
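The Tier 3 safeguards can be read as a small decision function. The sketch below is one possible interpretation, not NE's implementation: the `Detection` record, the `Decision` names, and the choice to send VIP hits to an approval gate are assumptions, and the blast radius assessment is omitted for brevity.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    AUTO_CONTAIN = "auto_contain"
    REQUIRE_APPROVAL = "require_approval"
    ENRICH_ONLY = "enrich_only"


@dataclass
class Detection:
    entity: str        # user or device name
    entity_kind: str   # "user", "workstation", or "server"
    confidence: float  # 0.0 to 1.0


CONFIDENCE_THRESHOLD = 0.95  # the "95%+ confidence" bar from Tier 3


def decide(detection, vip_watchlist):
    """Pick a containment path for a single detection."""
    if detection.confidence < CONFIDENCE_THRESHOLD:
        return Decision.ENRICH_ONLY       # not confident enough to act
    if detection.entity_kind == "server":
        return Decision.REQUIRE_APPROVAL  # servers go to approval gates
    if detection.entity in vip_watchlist:
        return Decision.REQUIRE_APPROVAL  # assumption: VIPs need a human
    return Decision.AUTO_CONTAIN


vips = {"ceo@example.com"}
print(decide(Detection("b.jones@example.com", "user", 0.97), vips))
print(decide(Detection("ceo@example.com", "user", 0.99), vips))
print(decide(Detection("sql-prod-01", "server", 0.99), vips))
```

Ordering the checks from most to least restrictive means a detection falls through to automatic containment only when every safeguard has passed.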

Governance (ongoing): Automation health dashboard in Sentinel (Logic App success rates, execution latency, containment action count, FP rate). Monthly review: tune thresholds, retire underperforming automation, promote validated enrichments to higher tiers. Every playbook has a documented runbook.
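The dashboard metrics named above reduce to simple aggregates over playbook run records. As a rough sketch, assuming a hypothetical `Run` record exported from wherever your run history is stored (the field names are illustrative), success rate and worst-case latency per playbook could be computed like this:

```python
from dataclasses import dataclass


@dataclass
class Run:
    playbook: str
    succeeded: bool
    latency_seconds: float


def summarize(runs):
    """Group runs by playbook and compute basic health metrics."""
    by_playbook = {}
    for run in runs:
        by_playbook.setdefault(run.playbook, []).append(run)

    summary = {}
    for name, items in by_playbook.items():
        successes = sum(1 for r in items if r.succeeded)
        summary[name] = {
            "runs": len(items),
            "success_rate": successes / len(items),
            "max_latency_seconds": max(r.latency_seconds for r in items),
        }
    return summary


# Illustrative run history.
runs = [
    Run("ip-enrichment", True, 4.0),
    Run("ip-enrichment", True, 12.0),
    Run("vt-hash-lookup", False, 30.0),
    Run("vt-hash-lookup", True, 6.0),
]
for name, stats in summarize(runs).items():
    print(name, stats)
```

False positive rate and containment action counts would need analyst verdicts joined to these records, which this sketch does not model.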

⚠ Compliance Myth: "Our managed SOC partner handles automation — we don't need to build our own"

The myth: BlueVoyant (or any MSSP) provides automation as part of their managed SOC service. Building internal automation would duplicate their capability and waste resources.

The reality: Most MSSPs automate THEIR workflow, not yours. They automate alert triage within their platform, analyst assignment within their team, and reporting within their portal. They do not automate YOUR enrichment (your watchlists, your VIP lists, your known-safe IPs), YOUR notification (your Teams channel, your CISO’s email, your ServiceNow instance), or YOUR containment (your conditional access policies, your endpoint isolation, your firewall rules). The MSSP coordinates response — they do not execute it in your tenant. Internal automation handles the actions that require your environment’s context: your users, your devices, your policies, your blast radius assessment. The MSSP and internal automation are complementary, not duplicative.

Decision point: NE’s dead VirusTotal playbook has been failing for 6 months. The SOC lead wants to delete it and start fresh. The analyst who built it wants to fix it. The correct answer takes something from both, in the right order. First diagnose why it failed. Then retire the broken version and rebuild it with proper error handling (retry logic, fallback to a second TI source, graceful degradation when the API is unavailable). Add health monitoring (alert when the playbook fails 3 times in 24 hours). Add a runbook (what the playbook does, how it works, what to check when it fails, who owns it). Only then deploy the rebuilt version. The problem was never the playbook; it was the absence of monitoring and ownership. Fix the process, then fix the playbook.
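The error-handling and monitoring requirements can be illustrated outside of Logic Apps. The sketch below is a simplified model, not the rebuilt playbook: `lookup_with_fallback` and `should_alert` are hypothetical helper names, each TI source is represented by a plain callable, and a production version would catch specific exceptions and back off between retries.

```python
from datetime import datetime, timedelta


def lookup_with_fallback(file_hash, sources, retries=2):
    """Try each TI source in order; return (source_name, verdict).

    `sources` is a list of (name, callable) pairs. Returns (None, None)
    when every source fails, so the caller can continue without a verdict.
    """
    for name, lookup in sources:
        for _ in range(retries):
            try:
                return name, lookup(file_hash)
            except Exception:  # sketch only: catch specific errors in practice
                continue
    return None, None  # graceful degradation


def should_alert(failure_times, now, threshold=3, window=timedelta(hours=24)):
    """True when at least `threshold` failures fall inside the window."""
    recent = [t for t in failure_times if now - t <= window]
    return len(recent) >= threshold


def rate_limited(file_hash):
    raise RuntimeError("429 Too Many Requests")


def second_source(file_hash):
    return "no detections"


print(lookup_with_fallback("abc123", [("primary", rate_limited),
                                      ("secondary", second_source)]))

now = datetime(2024, 1, 2, 12, 0)
failures = [now - timedelta(hours=h) for h in (1, 2, 5)]
print(should_alert(failures, now))
```

Returning `(None, None)` instead of raising lets the rest of the enrichment continue when every source is down, and the failure-count check is what would have caught the six-month outage within a day.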

Try it: Audit your current automation

Open your Sentinel workspace and Defender portal. Answer these questions:

Defender AIR: Is it enabled? Open Action Center — how many automated actions in the last 90 days? Were any containment actions on VIP accounts? Did anyone review them?
Sentinel automation rules: How many exist? List them. For each one: does it add value? Is it configured correctly? Is it a single point of failure (assigned to one analyst)?
Sentinel playbooks: How many exist? For each one: when did it last run successfully? When did it last fail? Is it monitored? Does it have a runbook? Who owns it?
What is missing? Use the NE target-state list as a checklist. Which enrichment, collection, notification, and containment playbooks do you need?

If you find dead playbooks, do not delete them yet — diagnose why they died. The failure mode tells you what monitoring and governance you need to prevent the same failure in the playbooks you build in this course.
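The playbook questions in the checklist above map onto a simple inventory record. The following sketch assumes a hypothetical `PlaybookRecord` that you fill in by hand or from an export. The 30-day staleness threshold is an arbitrary illustration, not a recommendation from the course.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class PlaybookRecord:
    name: str
    last_success: Optional[datetime]  # None if it has never succeeded
    monitored: bool
    has_runbook: bool
    owner: Optional[str]


def assess(record: PlaybookRecord, now: datetime,
           stale_after: timedelta = timedelta(days=30)) -> List[str]:
    """Return audit findings for one playbook (empty list means healthy)."""
    findings = []
    if record.last_success is None or now - record.last_success > stale_after:
        findings.append("dead or stale")
    if not record.monitored:
        findings.append("unmonitored")
    if not record.has_runbook:
        findings.append("no runbook")
    if record.owner is None:
        findings.append("no owner")
    return findings


now = datetime(2024, 6, 1)
inventory = [
    PlaybookRecord("vt-hash-lookup", datetime(2023, 12, 1), False, False, None),
    PlaybookRecord("ip-enrichment", datetime(2024, 5, 31), True, True, "soc-team"),
]
for record in inventory:
    print(record.name, assess(record, now))
```

A playbook that returns several findings at once, as the first record does, points to a governance gap in addition to a technical fault.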

You discover that Defender AIR has been automatically disabling user accounts for "confirmed compromised" alerts — 12 accounts in the last 90 days. Your SOC team did not know this was happening. What is the most important immediate action?

Disable AIR immediately. Disabling AIR removes a layer of automated defense that has been catching real compromises. The problem is not AIR — it is the lack of visibility into AIR's actions.

Review all 12 actions in the Action Center. For each one: was the containment appropriate? Was the user notified? Was the action documented in the incident report? Were any VIP accounts affected? Then configure AIR monitoring — alert the SOC team whenever AIR takes a containment action so they can review it in real-time going forward.

Leave AIR as-is — it has been working for 90 days without problems. You do not know it has been working without problems. Three of those 12 containment actions were on VIP accounts. If any were false positives, VIP users experienced disruption that nobody acknowledged or resolved. Unmonitored automation is not the same as working automation.

Reconfigure AIR to only investigate, not remediate. This removes the auto-containment capability while preserving the investigation value. But the right answer is not to reduce AIR's capability — it is to add monitoring so the SOC team has visibility. Once you have visibility, you can make informed decisions about which AIR actions to keep and which to modify.

Where this goes deeper. SA8 covers Defender XDR automation in depth — AIR configuration, attack disruption tuning, custom detection rules with auto-actions, and the relationship between Defender XDR automation and Sentinel automation. SA12 builds the full automation program including the 90-day roadmap, team structure, and metrics dashboard. The SOC Operations course covers the operational framework that automation executes — shift procedures, escalation policies, and MSSP coordination that become automated workflows in this course.

Operational Artifact — SOC Automation Current-State Audit Template

Use the NE audit methodology to assess your own SOC: inventory existing automation (Defender AIR, Sentinel rules, playbooks), assess each item (working, broken, monitored, owned), identify gaps (enrichment, collection, notification, containment, MSSP coordination), and document the target-state architecture. This audit produces the input for the 90-day automation roadmap in SA12 — you cannot build the roadmap without knowing where you start.
