1.2 Mitigate Incidents Using Microsoft Defender XDR
Domain 3 — Manage Incident Response: "Investigate and remediate ransomware and business email compromise incidents identified by automatic attack disruption." Also covers incident classification, assignment, and management across the incident lifecycle.
Introduction
An incident is a collection of correlated alerts that together describe an attack. Managing incidents — from initial triage through investigation to closure — is the core daily workflow of a SOC analyst. This subsection teaches you the complete incident lifecycle in Defender XDR: how incidents are created, how to triage them efficiently, how to investigate the underlying evidence, and how to remediate and close them.
This is not theoretical. Every shift as a SOC analyst starts with the incident queue. You will learn the practical decision framework for: which incident do I look at first? Is this a true positive or a false positive? How far has the attack progressed? What action do I take? When do I escalate?
Here is what this subsection covers:
- How incidents are created — alert grouping, entity correlation, and automatic attack disruption
- The incident queue — how to read it, sort it, and prioritize
- Incident triage — the 5-minute assessment that determines your response
- Investigation workflow — following evidence from the incident through alerts, entities, and raw data
- Remediation actions — containing and eliminating the threat
- Incident classification and closure — documenting the outcome for metrics and learning
How incidents are created
Defender XDR groups alerts into a single incident when two or more of them share a common entity or other attack context. The correlation logic evaluates:
- Shared user — multiple alerts involving the same UserPrincipalName
- Shared device — multiple alerts involving the same DeviceId
- Shared IP address — alerts referencing the same source IP
- Shared email — alerts tied to the same message or campaign
- Temporal proximity — alerts occurring within a time window that suggests a connected attack
A single alert can also become an incident if it is high-severity or triggers automated investigation.
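The shared-entity idea can be illustrated with a small sketch. This is not Defender XDR's actual correlation engine, whose internals are not described here; it is a minimal union-find grouping that assumes each alert is a dictionary with a hypothetical `id` and a set of `entities`. The alert IDs and the user, device, and mailbox names are invented for illustration, and temporal proximity is deliberately ignored.

```python
from collections import defaultdict


def group_alerts(alerts):
    """Group alerts that share at least one entity, directly or transitively."""
    parent = {a["id"]: a["id"] for a in alerts}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # entity -> id of the first alert that mentioned it
    for alert in alerts:
        for entity in alert["entities"]:
            if entity in first_seen:
                # Same entity seen before: merge the two alert groups.
                parent[find(alert["id"])] = find(first_seen[entity])
            else:
                first_seen[entity] = alert["id"]

    groups = defaultdict(list)
    for alert in alerts:
        groups[find(alert["id"])].append(alert["id"])
    return sorted(sorted(g) for g in groups.values())


# Hypothetical alerts; entity names are illustrative only.
alerts = [
    {"id": "A1", "entities": {"user:j.morrison", "ip:198.51.100.44"}},
    {"id": "A2", "entities": {"ip:198.51.100.44", "mailbox:j.morrison"}},
    {"id": "A3", "entities": {"device:LAPTOP-07"}},
    {"id": "A4", "entities": {"mailbox:j.morrison"}},
]
incidents = group_alerts(alerts)
print(incidents)
```

A4 shares nothing with A1 directly, yet it lands in the same group because both connect through A2. That transitive linking is why a single incident can span a user, a mailbox, and an IP address at once.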
Automatic attack disruption
Defender XDR can automatically disrupt attacks in progress without waiting for analyst intervention. When the correlation engine identifies a high-confidence, multi-stage attack (such as ransomware pre-encryption activity or BEC with active mail forwarding), it takes immediate containment actions:
- Disabling a compromised user account
- Isolating a compromised device
- Blocking a malicious OAuth application
These actions appear in the incident timeline with the label “Automatic attack disruption.” They are reversible — you can re-enable the account or release the device if the action was a false positive.
By the time you open the incident queue, automatic disruption may have already contained the threat. Your job shifts from "stop the active attack" to "verify the disruption was correct, assess the damage, and eradicate the attacker's access." This is a fundamental change in the SOC workflow — the first response is automated, the analyst validates and completes the response.
The incident queue — your shift start
Navigate to security.microsoft.com → Incidents & alerts → Incidents. This is the queue you will check at the beginning of every shift.
Key columns in the incident queue:
| Column | What it tells you | How to use it |
|---|---|---|
| Severity | High / Medium / Low / Informational | Triage high-severity first |
| Incident name | Auto-generated description of the attack type | Provides initial context before opening |
| Status | New / In progress / Resolved | Filter to “New” at shift start |
| Assigned to | Which analyst owns the investigation | Filter to “Unassigned” for new work |
| Categories | Attack categories (phishing, malware, credential access, etc.) | Helps prioritize by attack type |
| Alert count | Number of correlated alerts | More alerts = broader attack, may need higher priority |
| Entities | Users, devices, mailboxes, IPs involved | Quick scope assessment without opening the incident |
| Last activity | When the most recent alert was generated | Recent activity = potentially still active |
| Automated investigation | Whether AIR is running or complete | Check if automated response has already acted |
Effective queue sorting:
- Filter Status = “New” (only unworked incidents)
- Sort by Severity descending (High first)
- Within the same severity, prioritize incidents with recent “Last activity” (potentially still active) over older ones
- Check the “Automated investigation” column — if AIR has already remediated a high-severity incident, it may need only verification, not full investigation
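The sorting rules above can be expressed as a short sketch. It assumes the queue has been exported as a list of dictionaries with hypothetical `id`, `severity`, `status`, and `last_activity` fields; these field names and the sample incidents are illustrative and not a Defender XDR API schema.

```python
from datetime import datetime

SEVERITY_RANK = {"High": 0, "Medium": 1, "Low": 2, "Informational": 3}


def triage_order(incidents):
    """Return unworked incidents, High severity first, most recent activity first."""
    new = [i for i in incidents if i["status"] == "New"]
    # Two stable sorts: the secondary key (recency) first, then the primary key.
    new.sort(key=lambda i: i["last_activity"], reverse=True)
    new.sort(key=lambda i: SEVERITY_RANK[i["severity"]])
    return new


# Hypothetical queue snapshot.
queue = [
    {"id": 101, "severity": "Medium", "status": "New",
     "last_activity": datetime(2025, 1, 15, 9, 40)},
    {"id": 102, "severity": "High", "status": "New",
     "last_activity": datetime(2025, 1, 15, 7, 15)},
    {"id": 103, "severity": "High", "status": "In progress",
     "last_activity": datetime(2025, 1, 15, 9, 55)},
    {"id": 104, "severity": "High", "status": "New",
     "last_activity": datetime(2025, 1, 15, 9, 50)},
    {"id": 105, "severity": "Low", "status": "New",
     "last_activity": datetime(2025, 1, 15, 9, 58)},
]
ordered_ids = [i["id"] for i in triage_order(queue)]
print(ordered_ids)
```

Incident 103 drops out because another analyst already owns it, and 105 sorts last despite being the most recent because severity takes precedence over recency.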
Incident triage — the 5-minute assessment
When you open an incident, your first task is not a deep investigation. It is a rapid assessment to determine: is this real? Is it still active? How bad is it? What do I do next?
The 5-minute triage framework:
Minute 1: Read the incident summary. The incident page shows a narrative summary, the alerts, the affected entities, and a timeline. Read the summary and the alert names — they tell you what Defender XDR thinks happened.
Minute 2: Check the entities. How many users are affected? Which devices? Are any of them high-value (executives, IT admins, domain controllers)? A single-user incident is different from one affecting 50 users.
Minute 3: Check the timeline. When did it start? Is there recent activity? An incident with its last alert 3 hours ago may be dormant. An incident with alerts from 5 minutes ago is potentially active.
Minute 4: Check automated actions. Has automatic attack disruption acted? Has AIR completed? If the user is already disabled and the device isolated, the immediate threat is contained — your focus shifts to verification and eradication.
Minute 5: Classify and decide.
| Assessment | Classification | Next action |
|---|---|---|
| Clearly a true positive, actively ongoing | True Positive | Escalate to Tier 2 / begin investigation immediately |
| Clearly a true positive, disruption already contained it | True Positive | Verify containment, investigate scope, eradicate |
| Likely true positive, need more evidence | True Positive (Pending) | Assign to yourself, begin investigation |
| Known false positive (recognized pattern) | False Positive | Close with documentation |
| Cannot determine in 5 minutes | Unknown | Assign, investigate, decide after deeper analysis |
Triage is a classification decision, not an investigation. The goal is to determine priority and next action within 5 minutes. If you need more than 5 minutes, classify as "Unknown" and begin the investigation. The longer you spend triaging one incident, the longer other incidents sit unexamined.
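The decision table can be read as a simple lookup. The sketch below encodes it as a function; the `verdict` labels (`confirmed`, `likely`, `known_fp`, `undetermined`) are hypothetical names chosen for this example, and treating a confirmed but uncontained incident as one to escalate is an assumption, since the table does not cover that case explicitly.

```python
def triage_decision(verdict, contained=False):
    """Map a 5-minute assessment to (classification, next action).

    verdict: 'confirmed', 'likely', 'known_fp', or anything else (undetermined).
    contained: True if automatic disruption or AIR has already contained the threat.
    """
    if verdict == "confirmed":
        if contained:
            return ("True Positive",
                    "Verify containment, investigate scope, eradicate")
        # Assumption: a confirmed, uncontained incident is handled as active.
        return ("True Positive",
                "Escalate to Tier 2 / begin investigation immediately")
    if verdict == "likely":
        return ("True Positive (Pending)",
                "Assign to yourself, begin investigation")
    if verdict == "known_fp":
        return ("False Positive", "Close with documentation")
    return ("Unknown", "Assign, investigate, decide after deeper analysis")


print(triage_decision("confirmed", contained=True))
print(triage_decision("undetermined"))
```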
Investigation workflow
After triage, the investigation follows the evidence from the incident down through the alerts to the raw data.
Level 1: The incident. Shows the big picture — how many alerts, which entities, what attack categories. This is your map.
Level 2: The alerts. Each alert within the incident tells a specific part of the story. “Suspicious sign-in activity” is one chapter. “Inbox rule with suspicious keywords” is another. Click each alert to see its details, evidence, and recommended actions.
Level 3: The entities. Click on a user to see their complete activity profile: recent sign-ins, devices, email activity, alerts. Click on a device to see its timeline, alerts, and security recommendations. Entities connect the dots between alerts.
Level 4: Advanced Hunting. When the portal investigation reaches its limit — you need to answer a question the alert details do not cover — pivot to Advanced Hunting and write a KQL query. This is where Module 6 pays off.
Example Advanced Hunting results, a single timeline drawn from several tables:
| Timestamp | ActionType | Application | IPAddress | Type |
|---|---|---|---|---|
| 08:02 | EmailReceived | northgate-voicemail.com | 198.51.100.44 | EmailEvents |
| 08:14 | LogonSuccess | Exchange Online | 198.51.100.44 | IdentityLogonEvents |
| 08:15 | InboxRuleCreated | Exchange Online | 198.51.100.44 | CloudAppEvents |
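The value of this view is that events from different tables land in one chronological sequence. As an illustration of that merge step only (it is not a substitute for a KQL query in Advanced Hunting), the sketch below assumes the rows from each table have already been exported as dictionaries. The `build_timeline` helper is hypothetical; the sample rows reuse the values from the table above.

```python
def build_timeline(tables):
    """Merge rows from several tables into one list ordered by Timestamp."""
    merged = []
    for table_name, rows in tables.items():
        for row in rows:
            # Record which table each event came from, like the Type column.
            merged.append({**row, "Type": table_name})
    # "HH:MM" strings sort chronologically within a single day; real exports
    # would carry full datetimes instead.
    return sorted(merged, key=lambda r: r["Timestamp"])


tables = {
    "IdentityLogonEvents": [
        {"Timestamp": "08:14", "ActionType": "LogonSuccess",
         "Application": "Exchange Online", "IPAddress": "198.51.100.44"},
    ],
    "CloudAppEvents": [
        {"Timestamp": "08:15", "ActionType": "InboxRuleCreated",
         "Application": "Exchange Online", "IPAddress": "198.51.100.44"},
    ],
    "EmailEvents": [
        {"Timestamp": "08:02", "ActionType": "EmailReceived",
         "Application": "northgate-voicemail.com", "IPAddress": "198.51.100.44"},
    ],
}
timeline = build_timeline(tables)
for event in timeline:
    print(event["Timestamp"], event["ActionType"], event["Type"])
```

Read top to bottom, the merged output tells the story no single table shows: an email arrives, a sign-in from the same IP address follows, and an inbox rule is created one minute later.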
Remediation actions
Once the investigation confirms the threat, take action. Defender XDR provides remediation actions directly from the incident and alert views.
| Action | Where to take it | When to use it |
|---|---|---|
| Disable user account | Incident → User entity → Disable account | Confirmed account compromise — prevents further attacker access |
| Revoke user sessions | Incident → User entity → Revoke sessions | Token replay — invalidates all active tokens |
| Reset password | Incident → User entity → Reset password | Credential theft — forces new credential |
| Isolate device | Incident → Device entity → Isolate device | Confirmed device compromise — cuts network access |
| Soft delete emails | Threat Explorer or Alert → Remediate | Phishing campaign — removes emails from mailboxes |
| Block sender | Alert → Block sender | Ongoing campaign — prevents future delivery |
| Block file (custom indicator) | Settings → Indicators → Add file hash | Malware — blocks the file across all devices |
If you only reset the password, the attacker's existing token may still be valid until it expires (typically about 1 hour). Follow this order instead: disable the account (blocks new sign-ins), revoke sessions (invalidates the tokens the attacker already holds), reset the password (prevents re-authentication with the old password), and re-enable the account when the user is ready. This sequence is critical for the Module 11 containment playbook.
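The ordering argument is easier to see in a toy model. The `Account` class below is entirely hypothetical and does not call any Microsoft API. It follows this section's premise that a password reset alone does not invalidate tokens already issued; real token behavior depends on tenant configuration and token type.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    password: str
    enabled: bool = True
    tokens: set = field(default_factory=set)

    def sign_in(self, password, token):
        """Issue a token if the account is enabled and the password matches."""
        if self.enabled and password == self.password:
            self.tokens.add(token)
            return True
        return False

    def token_works(self, token):
        """A previously issued token works while the account is enabled."""
        return self.enabled and token in self.tokens


def reset_only():
    acct = Account(password="old-pw")
    acct.sign_in("old-pw", "attacker-token")   # attacker holds a session
    acct.password = "new-pw"                   # reset password, nothing else
    return acct.token_works("attacker-token")


def full_sequence():
    acct = Account(password="old-pw")
    acct.sign_in("old-pw", "attacker-token")
    acct.enabled = False        # 1. disable the account
    acct.tokens.clear()         # 2. revoke sessions
    acct.password = "new-pw"    # 3. reset the password
    acct.enabled = True         # 4. re-enable for the legitimate user
    stolen_token_ok = acct.token_works("attacker-token")
    old_password_ok = acct.sign_in("old-pw", "attacker-token-2")
    return stolen_token_ok or old_password_ok


print("Attacker retains access after reset only:", reset_only())
print("Attacker retains access after full sequence:", full_sequence())
```

In this model the reset-only path leaves the stolen token usable, while the full sequence closes both the token and the old-password routes.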
Incident classification and closure
After investigation and remediation, classify and close the incident. This documentation drives SOC metrics and learning.
| Classification | When to use | What it tells leadership |
|---|---|---|
| True Positive | The alert was correct — a real threat was detected | Detection is working. Count these to measure detection value. |
| False Positive | The alert was incorrect — no threat exists | Indicates a tuning need. Count these to measure detection accuracy. |
| Benign True Positive (labeled "Informational, expected activity" in the Defender portal) | The alert correctly detected the activity, but it was authorized (e.g., a penetration test) | Detection works, but the activity was expected. No action needed. |
Add a comment explaining what happened, what actions were taken, and any follow-up needed. This comment becomes the incident record — when your CISO asks “what happened with that high-severity incident last week,” the comment is the answer.
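As a sketch of how these classifications feed SOC metrics, the snippet below counts closed incidents by classification and computes a false positive rate. The `classification_summary` helper and the sample records are hypothetical, and the rate shown is only one possible way to measure detection accuracy.

```python
from collections import Counter


def classification_summary(closed_incidents):
    """Count closed incidents per classification and compute the false positive rate."""
    counts = Counter(i["classification"] for i in closed_incidents)
    total = sum(counts.values())
    fp_rate = counts["False Positive"] / total if total else 0.0
    return counts, fp_rate


# Hypothetical week of closed incidents.
closed = [
    {"id": 201, "classification": "True Positive"},
    {"id": 202, "classification": "True Positive"},
    {"id": 203, "classification": "False Positive"},
    {"id": 204, "classification": "Benign True Positive"},
    {"id": 205, "classification": "True Positive"},
]
counts, fp_rate = classification_summary(closed)
print(dict(counts))
print(f"False positive rate: {fp_rate:.0%}")
```

Numbers like these are only as reliable as the classifications behind them, which is why every incident should be closed with an accurate classification and a comment.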
Try it yourself
In a lab environment, your incident queue is likely empty or contains only low-severity informational alerts. This is expected — no real attacks are occurring. The key learning is the navigation: you know where the queue is, how it is sorted, and what each column means. When a real incident appears (in production or during the Module 11 simulation), you will navigate to it instinctively.
If you ran the Advanced Hunting query and got results, you practiced the most important investigation pivot: from the portal UI to KQL. The portal shows you curated views; KQL shows you the raw data. Both are needed.
Check your understanding
1. You open the incident queue at shift start. There are 3 new incidents: High severity from 10 minutes ago, Medium severity from 2 hours ago, High severity from 6 hours ago with "Automatic attack disruption" noted. Which do you triage first?
2. An incident has been auto-disrupted: the user account is disabled and the device is isolated. What is your next step?
3. During investigation, you need to see all sign-in activity for the affected user during a specific 4-hour window. The incident details show some sign-ins but you suspect there are more. Where do you go?
4. You confirm an incident is a true positive. You disable the user account and revoke sessions. Your colleague says "just reset the password, that is enough." Why is disable + revoke + reset the correct sequence, not just reset?