10.7 Playbooks with Logic Apps
Introduction
Required role: Microsoft Sentinel Contributor for analytics rules. Sentinel Responder for incident management.
Playbooks are Logic Apps workflows triggered by Sentinel incidents or alerts. They execute multi-step automated responses: sending notifications to Teams channels, isolating compromised devices via the Defender for Endpoint API, resetting user passwords through Graph API, creating tickets in ServiceNow or Jira, and enriching incidents with threat intelligence lookups. Playbooks are the “hands” of Sentinel automation: automation rules decide what to do, and playbooks do it.
Playbook architecture
A Sentinel playbook is an Azure Logic App with a Sentinel-specific trigger. The trigger provides the incident or alert data to the workflow, which then executes a sequence of actions using Logic Apps connectors.
Trigger types:
“When Microsoft Sentinel incident creation rule was triggered” — the playbook receives the full incident object (entities, alerts, severity, status). Used for: response actions that operate on the incident as a whole (notify the SOC channel, create a ticket, escalate).
“When a response to a Microsoft Sentinel alert is triggered” — the playbook receives a single alert. Used for: entity-specific response actions (isolate the device mentioned in the alert, reset the user’s password, block the IP).
Common Logic Apps connectors for security playbooks:
Microsoft Teams: send messages to channels, post adaptive cards with approve/reject buttons.
Microsoft Defender for Endpoint: isolate device, run antivirus scan, collect investigation package.
Microsoft Entra ID (Graph API): revoke user sessions, reset password, disable account, force MFA re-registration.
Microsoft Sentinel: add comment to incident, change incident severity, change status, add tags, run KQL query.
ServiceNow / Jira: create incident ticket, update ticket, add comment.
HTTP: call any REST API (threat intelligence lookups, custom integrations).
Outlook: send email notification.
Production playbook examples
Playbook 1: Enrich incident with IP geolocation and reputation.
Trigger: When incident is created.
Step 1: Parse incident entities and extract IP addresses.
Step 2: For each IP, call a threat intelligence API (VirusTotal, AbuseIPDB, or Microsoft TI) to get reputation score and geolocation.
Step 3: Add comment to incident with enrichment results: “IP 203.0.113.47 - Location: Lagos, Nigeria. AbuseIPDB confidence: 87%. VirusTotal detections: 12/94.”
Step 4: If reputation score exceeds threshold (>80% malicious confidence), automatically escalate severity to High.
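The comment formatting in Step 3 and the escalation decision in Step 4 can be sketched in Python. This is a minimal illustration of the logic, not a Logic Apps definition: the `IpEnrichment` type, its field names, and `ESCALATION_THRESHOLD` are hypothetical, and the sample values are the figures quoted in the example comment above.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical constant mirroring the ">80% malicious confidence" rule in Step 4.
ESCALATION_THRESHOLD = 80


@dataclass
class IpEnrichment:
    """Illustrative container for one IP's lookup results (field names are assumptions)."""
    ip: str
    location: str
    abuse_confidence: int  # AbuseIPDB confidence, 0-100
    vt_detections: int     # number of engines flagging the IP
    vt_total: int          # number of engines consulted


def format_comment(e: IpEnrichment) -> str:
    """Build the incident comment text for one enriched IP (Step 3)."""
    return (
        f"IP {e.ip} - Location: {e.location}. "
        f"AbuseIPDB confidence: {e.abuse_confidence}%. "
        f"VirusTotal detections: {e.vt_detections}/{e.vt_total}."
    )


def should_escalate(results: List[IpEnrichment]) -> bool:
    """Return True if any IP exceeds the confidence threshold (Step 4)."""
    return any(e.abuse_confidence > ESCALATION_THRESHOLD for e in results)


sample = IpEnrichment("203.0.113.47", "Lagos, Nigeria", 87, 12, 94)
print(format_comment(sample))
print("Escalate to High:", should_escalate([sample]))
```

Keeping the threshold comparison strict (`>`) matches the wording "exceeds threshold"; a score of exactly 80 would not escalate under this sketch.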
Playbook 2: User compromise containment.
Trigger: When alert is triggered (brute-force-success or AiTM detection).
Step 1: Extract compromised Account entity from alert.
Step 2: Call Microsoft Graph API → revoke all refresh tokens for the user (forces re-authentication on all sessions).
Step 3: Call Microsoft Graph API → reset password to a random value.
Step 4: Call Microsoft Graph API → enable per-user MFA if not already enabled.
Step 5: Send Teams message to SOC channel: “Automated containment: [user] sessions revoked, password reset, MFA enforced. Incident: [link].”
Step 6: Add comment to Sentinel incident documenting all actions taken.
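As a rough sketch of what Steps 2 and 3 send to Microsoft Graph, the Python below builds the two requests as plain dictionaries without making any network call. The `revokeSignInSessions` action and the `passwordProfile` property are part of the Graph v1.0 user resource; the helper names, the password length and alphabet, and the choice to set `forceChangePasswordNextSignIn` are assumptions made for illustration.

```python
import secrets
import string
from typing import Dict, List

GRAPH = "https://graph.microsoft.com/v1.0"


def random_password(length: int = 24) -> str:
    """Generate a random password with the cryptographically secure `secrets` module.

    Assumption: this alphabet and length satisfy the tenant's password policy.
    """
    alphabet = string.ascii_letters + string.digits + "!@#$%^&*"
    return "".join(secrets.choice(alphabet) for _ in range(length))


def containment_requests(user_id: str) -> List[Dict]:
    """Describe the Graph calls for Step 2 (revoke sessions) and Step 3 (reset password)."""
    return [
        {
            "method": "POST",
            "url": f"{GRAPH}/users/{user_id}/revokeSignInSessions",
            "body": None,
        },
        {
            "method": "PATCH",
            "url": f"{GRAPH}/users/{user_id}",
            "body": {
                "passwordProfile": {
                    "password": random_password(),
                    # Assumption: the user must choose a new password at next sign-in.
                    "forceChangePasswordNextSignIn": True,
                }
            },
        },
    ]


for req in containment_requests("j.smith@example.com"):
    # Print only method and URL so the generated password never reaches a log.
    print(req["method"], req["url"])
```

In a Logic App the same requests would typically be issued by HTTP actions authenticated with the playbook's managed identity.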
Playbook 3: Device isolation.
Trigger: When alert is triggered (malware detection or ransomware pattern).
Step 1: Extract Host entity from alert.
Step 2: Call Defender for Endpoint API → isolate device (network isolation: the device can only communicate with the Defender cloud).
Step 3: Call Defender for Endpoint API → initiate investigation package collection.
Step 4: Send Teams notification to the endpoint team.
Step 5: Add incident comment: “Device [name] isolated. Investigation package collection initiated.”
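The two Defender for Endpoint calls in Steps 2 and 3 can be sketched as request descriptions in Python. Treat the details as assumptions to verify against the current API documentation: the base URL, the `isolate` and `collectInvestigationPackage` machine actions, and the `Comment` / `IsolationType` fields reflect the machine-action API as commonly documented, while the function name and the sample machine ID are hypothetical.

```python
from typing import Dict, List

# Assumed base URL for the Defender for Endpoint API; confirm for your tenant and region.
MDE_API = "https://api.securitycenter.microsoft.com/api"


def isolation_requests(machine_id: str, incident_ref: str) -> List[Dict]:
    """Describe the isolate (Step 2) and package-collection (Step 3) calls."""
    comment = f"Automated containment for {incident_ref}"
    return [
        {
            "method": "POST",
            "url": f"{MDE_API}/machines/{machine_id}/isolate",
            # "Full" isolation leaves only the connection to the Defender service.
            "body": {"Comment": comment, "IsolationType": "Full"},
        },
        {
            "method": "POST",
            "url": f"{MDE_API}/machines/{machine_id}/collectInvestigationPackage",
            "body": {"Comment": comment},
        },
    ]


for req in isolation_requests("hypothetical-machine-id", "incident 1234"):
    print(req["method"], req["url"], req["body"])
```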
Playbook 4: Phishing response — URL detonation and mailbox remediation.
Trigger: When alert is triggered (phishing email detection).
Step 1: Extract URL entity from alert.
Step 2: Submit URL to sandbox detonation API (if available).
Step 3: Call Microsoft Graph API → search all mailboxes for the phishing email (by subject line or sender).
Step 4: Call Microsoft Graph API → soft-delete matching emails from all recipient mailboxes.
Step 5: Add incident comment with remediation results: “Phishing email removed from X mailboxes. URL detonation: [result].”
Playbook permissions and managed identity
Playbooks execute with a managed identity that must have appropriate permissions on the target resources.
Sentinel Responder role. Required on the Sentinel workspace for playbooks that modify incidents (add comments, change severity/status).
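Microsoft Graph API permissions. Required for playbooks that manage users: User.ReadWrite.All (reset password, disable account), Directory.ReadWrite.All (revoke sessions), Mail.ReadWrite (purge phishing emails from mailboxes). These are application-level permissions granted to the Logic App’s managed identity through an app registration or managed identity grant. Note that resetting passwords with application permissions may also require assigning the identity a directory role such as User Administrator; confirm this in a test tenant before relying on it.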
Defender for Endpoint permissions. Required for playbooks that isolate devices: Machine.Isolate, Machine.CollectForensics.
Principle of least privilege. Grant each playbook only the permissions it needs for its specific actions. A playbook that only adds comments to incidents should not have Machine.Isolate permissions. Create separate managed identities for different playbook tiers: a “notification” identity with read-only permissions, a “containment” identity with write permissions, and an “enrichment” identity with API access.
Triggering playbooks from automation rules
The most common pattern: an automation rule evaluates the incident conditions and triggers the appropriate playbook.
Automation rule: “When incident is created AND severity = High AND analytics rule name contains ‘brute force success’”
Action 1: Assign to identity-analyst@northgateeng.com
Action 2: Add tag “identity-compromise”
Action 3: Run playbook “User-Compromise-Containment”
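The rule above is configured in the Sentinel portal rather than written as code, but its condition-then-actions logic can be modelled in a few lines of Python. The incident dictionary shape and the function names are hypothetical; the condition values and actions are taken from the rule as written.

```python
from typing import Dict, List, Tuple


def rule_matches(incident: Dict[str, str]) -> bool:
    """Evaluate: severity = High AND analytics rule name contains 'brute force success'."""
    return (
        incident.get("severity") == "High"
        and "brute force success" in incident.get("rule_name", "").lower()
    )


def actions_for(incident: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return the ordered actions the automation rule would apply, or none if no match."""
    if not rule_matches(incident):
        return []
    return [
        ("assign", "identity-analyst@northgateeng.com"),
        ("add_tag", "identity-compromise"),
        ("run_playbook", "User-Compromise-Containment"),
    ]


incident = {"severity": "High", "rule_name": "Brute Force Success - Entra ID"}
for action, value in actions_for(incident):
    print(action, "->", value)
```

The playbook is listed last so that assignment and tagging are already recorded when the longer-running containment workflow starts.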
The automation rule handles instant triage (assign, tag). The playbook handles complex response (revoke tokens, reset password, notify). Both execute within seconds of incident creation — the analyst receives the incident already assigned, tagged, and with containment actions in progress.
Approval workflows: human-in-the-loop automation
Not all containment actions should be fully automated. Disabling a CEO’s account based on a medium-confidence alert can cause more damage than the threat itself. Approval workflows put a human decision point in the automation chain.
Pattern: Teams adaptive card approval.
Step 1: Playbook is triggered by automation rule.
Step 2: Playbook sends a Teams adaptive card to the SOC channel: “Alert: Brute force success detected for CEO j.smith. Recommended action: revoke sessions and reset password. [Approve] [Deny] [Investigate First].”
Step 3: Playbook waits for response (with a timeout, e.g., 30 minutes).
Step 4: If “Approve” → execute containment actions. If “Deny” → add incident comment “Containment declined by [analyst].” If timeout → escalate to SOC manager and add incident comment.
When to require approval: For containment actions affecting VIP users, for actions that disrupt business operations (device isolation on a production server), and for actions with high irreversibility (account deletion, mailbox purge).
When to auto-execute without approval: For containment actions affecting standard users where the detection is high-fidelity (AiTM confirmed, honeytoken activated), and for low-impact actions (revoking refresh tokens — the user simply re-authenticates).
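The approval criteria above can be expressed as a small routing function. The sketch below is one possible reading of those criteria in Python: the action names, the `Route` enum, and the conservative default are assumptions, and the handling of the "Investigate First" button is a guess because the pattern does not specify it.

```python
from enum import Enum
from typing import Optional


class Route(Enum):
    AUTO_EXECUTE = "auto-execute"
    REQUIRE_APPROVAL = "require approval"


# Hypothetical action names grouped by the impact categories described above.
HIGH_IMPACT = {"isolate_production_server", "delete_account", "purge_mailbox"}
LOW_IMPACT = {"revoke_refresh_tokens"}


def route_action(action: str, is_vip: bool, high_fidelity: bool) -> Route:
    """Decide whether a containment action needs a human decision first."""
    if is_vip or action in HIGH_IMPACT:
        return Route.REQUIRE_APPROVAL
    if action in LOW_IMPACT or high_fidelity:
        return Route.AUTO_EXECUTE
    # Assumption: anything not clearly safe defaults to approval.
    return Route.REQUIRE_APPROVAL


def next_step(response: Optional[str]) -> str:
    """Map the adaptive card outcome to the next playbook step (None means timeout)."""
    if response == "Approve":
        return "execute containment actions"
    if response == "Deny":
        return "add incident comment: containment declined"
    if response is None:
        return "escalate to SOC manager and add incident comment"
    # Assumption: any other button, e.g. "Investigate First", defers to the analyst.
    return "leave incident open for analyst investigation"


print(route_action("reset_password", is_vip=True, high_fidelity=True).value)
print(route_action("revoke_refresh_tokens", is_vip=False, high_fidelity=False).value)
print(next_step(None))
```

Checking the VIP flag before anything else means a high-fidelity detection never bypasses approval for a VIP account.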
Logic Apps error handling patterns
Production playbooks must handle failures gracefully. If the Graph API call to reset a password fails, the playbook should not silently succeed — it must alert the analyst that manual intervention is required.
Pattern 1: Configure “Run After” for failure paths. Each Logic Apps action has a “Run After” setting. Configure a parallel failure path: if the password reset action fails → send a Teams message “AUTOMATED PASSWORD RESET FAILED for [user] — manual action required” → add incident comment documenting the failure.
Pattern 2: Timeout handling. API calls can hang. Set explicit timeouts on HTTP actions (30 seconds for most APIs, 120 seconds for operations that involve backend processing). If timeout → retry once → if retry fails → alert and document.
Pattern 3: Idempotency. If a playbook is triggered twice for the same incident (rare but possible with automation rule re-evaluation), the playbook should not perform the action twice. Add a check at the start: “Has this playbook already executed for this incident? If yes, exit.” Check by searching the incident comments for a marker string such as “Playbook-ContainUser-Executed”, which the playbook writes when it completes.
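The three patterns combine naturally: check for a prior run, attempt the action with one retry on timeout, and record either success or a failure that calls for manual action. The Python sketch below models that control flow only. In a real playbook these would be Logic Apps conditions and "Run After" paths; the function names and the in-memory `comments` list standing in for incident comments are hypothetical.

```python
from typing import Callable, List, Tuple

MARKER = "Playbook-ContainUser-Executed"


def already_executed(comments: List[str], marker: str = MARKER) -> bool:
    """Pattern 3: look for the marker left by a previous successful run."""
    return any(marker in c for c in comments)


def call_with_retry(action: Callable[[], str], retries: int = 1) -> Tuple[bool, str]:
    """Pattern 2: run the action, retrying once if it times out."""
    last_error = ""
    for attempt in range(retries + 1):
        try:
            return True, action()
        except TimeoutError as exc:
            last_error = f"attempt {attempt + 1} timed out: {exc}"
    return False, last_error


def run_playbook(comments: List[str], action: Callable[[], str]) -> str:
    """Combine the patterns; `comments` stands in for the incident's comment list."""
    if already_executed(comments):
        return "skipped: already executed"
    ok, detail = call_with_retry(action)
    if ok:
        comments.append(f"{MARKER}: {detail}")
        return "succeeded"
    # Pattern 1: the failure path documents the error and asks for manual action.
    comments.append(f"AUTOMATED ACTION FAILED - manual action required ({detail})")
    return "failed: analyst alerted"


incident_comments: List[str] = []
print(run_playbook(incident_comments, lambda: "password reset"))
print(run_playbook(incident_comments, lambda: "password reset"))
```

Because the marker is written only on success, a failed run does not block a later manual or automated retry.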
Building a playbook library
Organise playbooks into tiers based on the response actions they perform.
Tier 1: Enrichment playbooks (read-only). These playbooks add information to incidents without taking any containment action. Safe to run automatically on all incidents. Examples: IP geolocation lookup, threat intelligence enrichment, user risk score lookup, device compliance status check. Permissions: read-only on Sentinel, read-only on Graph API.
Tier 2: Notification playbooks (inform). These playbooks notify people and systems. Safe to run automatically. Examples: Teams channel notification, email alert to incident owner, ServiceNow ticket creation, Slack message to external SOC partner. Permissions: read-only on Sentinel, send permissions on Teams/email.
Tier 3: Containment playbooks (act). These playbooks take containment actions that affect users, devices, and systems. Require approval for VIP users. Examples: revoke sessions, reset password, enforce MFA, isolate device, block IP in firewall, purge phishing email. Permissions: write permissions on Graph API, Defender for Endpoint, and relevant systems.
Tier 4: Remediation playbooks (recover). These playbooks perform recovery actions after containment. Usually manual trigger only. Examples: re-enable user account after investigation confirms safety, remove device from isolation, restore mailbox rules to pre-attack state. Permissions: write permissions, typically with additional approval requirements.
Testing playbooks before production deployment
Test environment: Run playbooks against a test Sentinel workspace connected to a development tenant. Never test containment playbooks against production accounts.
Test scenarios: For each playbook, create a test incident with realistic entities. Trigger the playbook manually (from the incident → Run playbook). Verify: did each step execute? Did the API calls succeed? Did the incident comment appear? Did the notification arrive?
Staged rollout: Deploy new playbooks in “enrichment only” mode first — the playbook runs but only adds comments describing what it would do, without actually performing containment. After verifying the logic is correct over 1 week of incidents, switch to full execution.
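One way to picture the "enrichment only" mode is a single flag that gates every containment step. The Python sketch below is an illustration under that assumption; the `run_step` helper and its message formats are hypothetical, and in a Logic App the equivalent would be a parameter checked by a condition before each action.

```python
from typing import Callable, List


def run_step(description: str, action: Callable[[], None], dry_run: bool) -> str:
    """Execute a containment step, or only describe it when dry_run is set."""
    if dry_run:
        return f"[DRY RUN] Would execute: {description}"
    action()
    return f"Executed: {description}"


performed: List[str] = []  # records which actions actually ran

# Week 1 behaviour: the comment is produced, but nothing is executed.
print(run_step("revoke sessions for j.smith", lambda: performed.append("revoke"), dry_run=True))

# After verification: the same step runs for real.
print(run_step("revoke sessions for j.smith", lambda: performed.append("revoke"), dry_run=False))
print("Actions performed:", performed)
```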
Playbook monitoring and failure handling
Playbooks can fail. API calls time out, permissions expire, rate limits are hit, and target resources become unavailable. Monitor playbook execution.
Logic Apps run history. Navigate to the Logic App → Run history. Each execution shows success or failure for every step. Failed steps show the error message — typically an HTTP error code (401 Unauthorized = permission issue, 429 Too Many Requests = rate limit, 404 Not Found = resource does not exist).
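Playbook health analytics rule. Create a scheduled rule that checks for playbook failures, for instance by querying the Logic Apps diagnostic logs sent to the workspace for runs that ended in a failed state.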
Failure notification. Configure the Logic App’s “Run after” settings so that if a critical step fails, a notification step runs: send a Teams message to the SOC channel alerting that automated containment failed and manual intervention is needed.
Playbook deployment checklist
Before deploying a playbook to production, verify every item.
Permissions: The managed identity has the minimum required permissions on all target resources. Test by manually triggering the playbook against a test incident. If any step returns 403, permissions are missing.
Error handling: Every critical action has a failure path configured. Test by intentionally providing invalid input (wrong user UPN, non-existent device) to verify error handling works.
Idempotency: The playbook checks whether it has already run for this incident before executing containment actions. Test by triggering the playbook twice for the same incident.
Timeout configuration: All HTTP actions have explicit timeouts. No action relies on default timeout values (which may be too long for real-time response).
Logging: The playbook adds incident comments documenting every action taken (success or failure). After execution, the incident record contains a complete log of automated actions.
Notification on failure: If any containment step fails, a notification is sent to the SOC channel. The analyst knows automated containment failed and manual intervention is required.
Approval gates for VIP users: If the playbook performs containment actions (password reset, account disable, device isolation), VIP users require approval before execution.
Playbook cost considerations
Consumption-plan Logic Apps are billed per action execution. A playbook that runs 5 actions per incident costs approximately $0.005 per execution (the exact figure depends on which connectors the actions use). At 100 incidents per day, that is $0.50/day, which is negligible.
However, playbooks that call external APIs (threat intelligence enrichment, ITSM ticket creation) may incur additional costs from those services. A VirusTotal API enrichment playbook that runs on every incident consumes API quota rapidly — consider filtering to only run on High-severity incidents or incidents involving external IPs.
Cost optimisation: Use automation rule conditions to limit which incidents trigger expensive playbooks. “Run enrichment playbook only when severity = High AND entity includes external IP” can reduce execution count by as much as 80%, depending on the incident mix, while preserving value for the incidents that matter most.
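The arithmetic behind these figures is easy to check. The Python sketch below reuses the section's illustrative numbers (5 actions per run, about $0.005 per execution, 100 incidents per day, an 80% reduction from filtering); the per-action price of $0.001 is simply $0.005 divided by 5 and is an assumption, not a quoted price.

```python
def daily_cost(
    incidents_per_day: int,
    actions_per_run: int,
    cost_per_action: float,
    share_triggering: float = 1.0,
) -> float:
    """Estimated daily Logic Apps cost for one playbook.

    share_triggering is the fraction of incidents that pass the
    automation rule conditions and actually start the playbook.
    """
    runs = incidents_per_day * share_triggering
    return runs * actions_per_run * cost_per_action


# Illustrative figures from the text; cost_per_action is derived, not a price list value.
unfiltered = daily_cost(100, 5, 0.001)
filtered = daily_cost(100, 5, 0.001, share_triggering=0.2)

print(f"Unfiltered: ${unfiltered:.2f}/day")
print(f"Filtered (80% fewer runs): ${filtered:.2f}/day")
```

With these inputs the unfiltered playbook comes to $0.50 per day and the filtered one to $0.10 per day, so the larger saving usually comes from the external API quota rather than the Logic Apps bill itself.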
Playbook design workflow: from idea to production
Follow this structured workflow when building a new playbook.
Step 1: Define the trigger scenario. Which specific detection triggers this playbook? What entities will be available? What is the expected playbook execution frequency?
Step 2: Map the response actions. List every action the playbook should perform, in order. For each action: which API is called? What input is needed? What output is produced? What happens if it fails?
Step 3: Identify permissions. For each API call, determine: which connector (Graph API, Defender for Endpoint, Teams, HTTP), which permissions/roles are needed, and whether the managed identity has them.
Step 4: Build the Logic App. Create the Logic App in the Azure portal. Add the Sentinel trigger. Add each action step. Configure the “Run After” settings for failure paths. Add a final step that comments on the incident with a summary of all actions taken.
Step 5: Test with a synthetic incident. Create a test incident with realistic entities. Manually trigger the playbook from the incident page. Verify every step executes. Check the Logic App run history for errors. Verify the incident comment contains the expected summary.
Step 6: Staged rollout. Week 1: playbook runs in “dry-run” mode — comments on incidents describing what it would do without actually executing containment actions. Week 2: enable full execution on a subset of incidents (e.g., only Low-severity to test safely). Week 3: enable for all matching incidents.
Step 7: Production monitoring. Monitor Logic App run history daily during the first week. Set up the playbook health analytics rule (from earlier in this subsection). Add the playbook to the monthly detection review for ongoing health assessment.
Logic Apps connector reference for security playbooks
Microsoft Graph API (via HTTP action): The most versatile connector. Supports: user management (reset password, revoke sessions, disable account, force MFA re-registration), mail operations (search mailboxes, delete phishing emails, read inbox rules), group management (remove user from groups), and application management (revoke OAuth consents). Authentication: managed identity with Graph API permissions.
Microsoft Defender for Endpoint: Purpose-built connector with actions: isolate machine, unisolate machine, run antivirus scan, collect investigation package, restrict code execution, get machine information. Authentication: managed identity with Defender for Endpoint permissions.
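Microsoft Teams: Send message to channel, post adaptive card (with action buttons), create channel, send direct message. Used for: SOC notifications, approval workflows, escalation alerts. Authentication: typically a connection signed in with a dedicated user or service account, with messages posted as that user or as the workflow bot; verify which options your environment supports.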
Microsoft Sentinel: Add comment to incident, update incident (severity, status, owner, tags), get incident details, run KQL query. Used for: incident enrichment, status updates, cross-incident queries within the playbook. Authentication: Sentinel Responder role.
HTTP (generic): Call any REST API. Used for: threat intelligence enrichment (VirusTotal, AbuseIPDB, Shodan), ITSM integration (ServiceNow, Jira via REST), custom webhook notifications (Slack, PagerDuty, custom systems). Authentication: API key, OAuth token, or managed identity (depending on the target API).
Azure Key Vault: Retrieve secrets (API keys, credentials) at runtime instead of hardcoding in the Logic App. All sensitive values (TI API keys, ITSM credentials) should be stored in Key Vault and retrieved by the playbook during execution.
SOAR maturity assessment
Rate your Security Orchestration, Automation, and Response maturity to identify improvement areas.
Level 1 — Manual. All incident response is manual. No playbooks. No automation rules beyond basic routing. Analysts perform every action by hand. MTTR: hours to days.
Level 2 — Notification. Playbooks send notifications (Teams, email) when incidents are created. Analysts still perform all containment and remediation manually. MTTR reduced by faster awareness.
Level 3 — Enrichment. Playbooks automatically enrich incidents with TI data, geolocation, and user risk scores. Analysts have context immediately. Triage time reduced by 50-70%.
Level 4 — Semi-automated containment. Playbooks perform containment actions with human approval for sensitive cases. Revoke tokens and reset passwords for standard users automatically. VIP users require approval via Teams adaptive card. MTTR: minutes for automated cases, hours for approval-required cases.
Level 5 — Full SOAR. End-to-end automated response chains for high-confidence detections. Analytics rule → automation rule → enrichment playbook → containment playbook → notification → auto-closure. Manual investigation reserved for complex, ambiguous, or novel threats. MTTR: minutes across the board.
Most organisations should target Level 3 within 3 months of Sentinel deployment and Level 4 within 6 months. Level 5 requires high-fidelity analytics rules (subsection 10.2) and thorough playbook testing (this subsection) — typically achieved after 9-12 months of iterative improvement.
Try it yourself
Create a simple notification playbook: trigger on incident creation → send a Teams message (or email) with the incident title, severity, and entities. Attach the playbook to an automation rule that matches a test analytics rule. Trigger the analytics rule (by generating matching test data) and verify the notification arrives. This validates the end-to-end chain: data → analytics rule → incident → automation rule → playbook → notification.
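Before building the workflow, it can help to sketch the message the playbook should produce. The Python below formats a notification from a simplified incident record; the dictionary keys (`title`, `severity`, `entities`) are a hypothetical flattening of the incident object, not the exact schema the Sentinel trigger returns.

```python
from typing import Dict, List, Union

Incident = Dict[str, Union[str, List[str]]]


def build_notification(incident: Incident) -> str:
    """Format the title, severity, and entities into a short message."""
    entities = incident.get("entities", [])
    entity_text = ", ".join(entities) if entities else "none mapped"
    return (
        f"New incident: {incident['title']}\n"
        f"Severity: {incident['severity']}\n"
        f"Entities: {entity_text}"
    )


test_incident: Incident = {
    "title": "Test rule fired",
    "severity": "Medium",
    "entities": ["j.smith@example.com", "203.0.113.47"],
}
print(build_notification(test_incident))
```

In the Logic App designer the same fields would be inserted as dynamic content from the trigger output.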
What you should observe
The notification typically arrives within 30-60 seconds of incident creation. The Teams message (or email) includes the incident title, severity, and mapped entities. The Logic App run history shows a successful execution with all steps completed. This confirms the automation chain is functioning and provides the foundation for building more complex response playbooks.
Knowledge check
NIST CSF: DE.AE-1 (Baseline of operations established), PR.DS-1 (Data-at-rest is protected). ISO 27001: A.8.15 (Logging), A.8.16 (Monitoring activities). SOC 2: CC7.2 (Monitor system components). Every configuration in this subsection contributes to the logging and monitoring controls that auditors verify.
Check your understanding
1. Your "User Compromise Containment" playbook revokes user sessions, resets passwords, and enforces MFA. Which trigger type should it use?