The Defender XDR portal is your workspace. Knowing where the buttons are is necessary but not sufficient. What matters is how you use the portal operationally: what you check first when you start your shift, how you triage the queue without wasting time on false positives, when you investigate versus when you escalate, how you document your work so the next analyst can pick up where you left off, and how you maintain your effectiveness across hundreds of alerts per week.
This subsection describes the daily workflow practiced by analysts in operational SOCs. It is based on real shift patterns, not theoretical frameworks. The specific times and sequences described here are guidelines that you should adapt to your environment, team size, and alert volume — but the underlying principles are universal.
---
Shift start routine
Every shift starts the same way, regardless of what happened on the previous shift. This routine takes approximately 15 minutes and ensures you have situational awareness before you start working the queue.
Check the incident queue (5 minutes). Open the Defender portal and navigate to Incidents. Filter to Status = New, sort by Severity descending. Count the new incidents since the last shift. Read the incident names and severities without opening them yet — you are building a mental map of what needs attention, not investigating. If any incident is marked High or Critical severity, note it. If any incident was auto-assigned by an automation rule but not yet acknowledged, note that too. The goal of this step is to answer one question: is anything on fire right now?
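If your incidents also flow into a Microsoft Sentinel workspace (the pipeline check below already allows for one), the same at-a-glance count can be pulled with a short KQL sketch. SecurityIncident and its columns are standard Sentinel schema, but adjust the time window and Status filter to match your shift length and automation rules:

```kusto
// Sketch: count new incidents by severity since the start of your shift.
// Assumes incidents are synchronized into a Microsoft Sentinel workspace.
SecurityIncident
| summarize arg_max(LastModifiedTime, *) by IncidentNumber   // keep only the latest record per incident
| where Status == "New" and CreatedTime > ago(12h)            // adjust 12h to your shift length
| summarize NewIncidents = count() by Severity
```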
Read the shift handover (3 minutes). If your SOC uses a handover document — a Teams channel, a shared OneNote, a wiki page, or even an email — read it. The previous shift's handover should tell you: what incidents were actively being investigated when they left, what actions are pending (waiting for user response, waiting for management approval to isolate a device, waiting for a vendor to respond), and whether any data pipeline issues were detected. If there is no formal handover document, check the most recent incident comments in the queue — analysts who follow good practices leave comments on the incidents they were working on.
Check data pipeline health (5 minutes). This is the step most analysts skip and most SOC managers wish they would not. If a data connector stopped flowing, your detection rules are blind — alerts that should fire will not fire, and you will have a false sense of security from an empty queue. Run a connector health check by navigating to Microsoft Sentinel → Data connectors (if your org uses Sentinel) or by running a quick Advanced Hunting query against each critical table:
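A minimal KQL sketch for that freshness check, assuming the five tables below are the ones your detections depend on (swap in whichever tables matter in your environment):

```kusto
// Sketch: report the newest event per critical table and how old it is.
// Table list is illustrative; include the tables your detections rely on.
union withsource = TableName
    IdentityLogonEvents, DeviceProcessEvents, EmailEvents, CloudAppEvents, AlertEvidence
| summarize LastEvent = max(Timestamp) by TableName
| extend DataAgeMinutes = datetime_diff('minute', now(), LastEvent)
| order by DataAgeMinutes desc
```

Run against your tenant, the output looks something like the table that follows.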
| Table | LastEvent | DataAge (minutes) |
| --- | --- | --- |
| IdentityLogonEvents | 2026-03-21 08:42 | 3 |
| DeviceProcessEvents | 2026-03-21 08:41 | 4 |
| EmailEvents | 2026-03-21 08:40 | 5 |
| CloudAppEvents | 2026-03-21 08:38 | 7 |
| AlertEvidence | 2026-03-21 08:35 | 10 |
Healthy pipeline: all tables show recent data. Normal ingestion latency for Defender XDR tables is 5-15 minutes, so a DataAge in that range is expected. If DeviceProcessEvents is stale, your endpoint detections are blind; if EmailEvents is stale, phishing alerts will not fire.
If any table shows a data age greater than 60 minutes, this is your first priority — not the incident queue. A silent data pipeline is more dangerous than a noisy queue. Escalate connector issues to your engineering team or Microsoft support immediately.
Check Threat Analytics (2 minutes). Navigate to Threat Analytics in the Defender portal and look for new threat reports that show impact on your environment. Microsoft publishes these reports when new campaigns emerge or vulnerabilities are being actively exploited. If a report shows that your environment has exposed or impacted assets, it takes priority over the standard queue — you may need to run hunting queries or apply emergency protections before working general incidents.
---
Triage methodology
After the shift start routine, you work through the incident queue. Triage is the process of quickly assessing each new incident to determine whether it requires investigation, and if so, how urgently. Effective triage is the difference between a SOC that catches real attacks early and a SOC that drowns in false positives while real attacks slip through.
The 5-minute triage rule. For each new incident, spend no more than 5 minutes on initial triage. In those 5 minutes, you need to reach one of four classifications:
True Positive (TP) — the alert accurately describes malicious or unauthorized activity. Action: assign to yourself (or the appropriate Tier 2 analyst), begin investigation or escalation.
False Positive (FP) — the alert fired on legitimate activity. Action: close the incident with a comment explaining why it is a false positive. If this alert pattern fires repeatedly on the same legitimate activity, create a suppression rule or tuning recommendation. Do not simply close it silently — the next analyst who sees the same pattern needs your reasoning.
Benign True Positive (BTP) — the alert accurately describes the activity, but the activity is authorized. Example: a penetration test triggers lateral movement alerts. Action: close with comment documenting the authorization (reference the change request number or pen test scope document).
Informational / Unknown — the evidence is insufficient for classification in 5 minutes. Action: assign to yourself, flag for deeper investigation during dedicated investigation time.
What to check during the 5-minute triage:
First, read the incident name and severity. The auto-generated incident name in Defender XDR usually describes the core alert — "Multi-stage incident involving phishing and credential theft on one endpoint" tells you a lot in one sentence.
Second, check the entities involved. How many users? How many devices? How many mailboxes? A single-user, single-device incident is likely contained. An incident involving 15 users across 8 devices is potentially widespread. (One quick way to measure this spread is sketched after this list.)
Third, read the first alert. Open the highest-severity alert in the incident. Read the alert description, check the MITRE ATT&CK mapping, and look at the evidence (process tree for endpoint alerts, email details for email alerts, sign-in details for identity alerts).
Fourth, check automated investigation status. If Defender XDR's automated investigation has already analyzed the alert and taken remediation actions, your triage is faster — review the automated findings and decide whether you agree with the classification.
Fifth, classify and act. Based on the evidence, classify as TP/FP/BTP/Unknown and take the corresponding action.
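When the entity spread is not obvious from the incident page, a short Advanced Hunting sketch against AlertEvidence can measure it. The AlertId values below are placeholders for the alerts in the incident you are triaging:

```kusto
// Sketch: how many distinct devices and users does this incident touch?
// Replace the placeholder AlertId values with the alerts from the incident.
AlertEvidence
| where AlertId in ("<alert-id-1>", "<alert-id-2>")
| summarize
    DistinctDevices = dcountif(DeviceId, isnotempty(DeviceId)),
    DistinctUsers   = dcountif(AccountUpn, isnotempty(AccountUpn)),
    EntityTypes     = make_set(EntityType)
```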
The queue is a triage list, not a task list
A common mistake for new analysts is treating every incident as a task to be completed. The queue is a triage list — your job is to quickly sort incidents by urgency and impact, then dedicate investigation time to the ones that matter. If you spend 45 minutes on a false positive that could have been classified in 3 minutes, you have lost 42 minutes that could have been spent on a real attack. Speed in triage is not about cutting corners — it is about pattern recognition that develops with experience.
---
Priority-based investigation
After triaging the queue, you shift to investigation. Investigation time should be structured, not reactive. Work incidents in priority order:
Priority 1 — Active attacks in progress. Indicators: attack disruption actions triggered, ransomware-related alerts, alerts showing active data exfiltration, alerts showing ongoing lateral movement. These incidents get your full attention immediately. If necessary, interrupt other work. (A fast hunting check for these signals is sketched after this list.)
Priority 2 — High-severity confirmed true positives. Incidents classified as TP during triage with High or Critical severity. These need investigation within the current shift. Do not defer to the next shift unless you have documented your progress in the incident comments.
Priority 3 — Medium-severity true positives and unknowns. Incidents that require investigation but are not actively progressing. These should be investigated within 24 hours.
Priority 4 — Low-severity true positives and operational items. Policy violations, informational alerts, and configuration issues. These can be batched and handled during quiet periods. Do not let these accumulate indefinitely — schedule a recurring block (30-60 minutes per shift) for clearing low-priority items.
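For a fast, portal-independent check on Priority 1 signals, a sketch like the following surfaces recent high-urgency alerts. The category values are illustrative; align them with the categories your alerts actually carry:

```kusto
// Sketch: recent alerts that suggest an active attack in progress.
// Category values are examples; adjust to the categories seen in your tenant.
AlertInfo
| where Timestamp > ago(1h)
| where Severity == "High"
    or Category in~ ("Ransomware", "Exfiltration", "LateralMovement")
| project Timestamp, Title, Category, Severity, ServiceSource
| order by Timestamp desc
```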
---
Documentation standards
Every incident you touch should have comments that allow another analyst to understand the current state without asking you. This is not bureaucracy — it is operational necessity. Analysts go on leave, shift patterns rotate, and incidents that span multiple days need continuity.
What to document in incident comments:
Your classification and reasoning. "Classified as FP. Alert fired on legitimate admin PowerShell activity by admin.t.clark running a scheduled compliance script. Script is in the approved automation list (ref: CHANGE-2026-0142)."
Your investigation progress. "Investigated the process chain. Confirmed malicious macro in invoice.docx delivered via email from compromised external account. User j.morrison's device (DESKTOP-NGE042) isolated. Investigation package collected. Pending: decode the Base64 PowerShell payload and check if the file hash appears on other devices."
Actions taken and pending. "Actions taken: device isolated, user sessions revoked, inbox rules checked (none found). Pending: password reset requires manager approval per IR policy — ticket INC-NE-2026-0321 raised. Handover to next shift if not approved by 17:00."
Escalation notes. "Escalated to Tier 2 — incident involves 12 devices and potential data exfiltration. Tier 2 lead: S. Patel, notified via Teams at 14:30."
The golden rule of documentation: another analyst should be able to read your comments and continue the investigation without contacting you. If they need to ask you questions to understand where you left off, your documentation is insufficient.
---
Shift handover
At the end of each shift, write a handover that covers three things:
Active incidents. Which incidents are you currently investigating? What is the current status? What actions are pending?
Pipeline and environment issues. Were there any data connector issues during your shift? Did any automation rules malfunction? Are there any environment-wide concerns (patch deployment in progress, scheduled maintenance windows, pen test running)?
Notable observations. Did you see any patterns in the queue that might indicate an emerging campaign? Did you create any new suppression rules? Did you identify any tuning opportunities for noisy detection rules?
Keep the handover concise — five to ten bullet points, not a two-page report. The next analyst needs a 2-minute briefing, not a novel.
---
Managing alert fatigue
Alert fatigue is the gradual degradation of an analyst's attention and response quality caused by exposure to high volumes of alerts, most of which are false positives or low-value true positives. It is the single biggest threat to SOC effectiveness and is the root cause of most "how did we miss that" post-incident reviews.
Recognizing alert fatigue in yourself: You start closing incidents without fully reading the alert details. You classify ambiguous alerts as FP without investigating because "it's probably another false positive." You stop checking the process tree and rely solely on the alert title. You skip the data pipeline health check because it has been healthy for weeks. If you notice these behaviors, take a break, switch to a different task (threat hunting, rule tuning, documentation), and return to the queue with fresh attention.
Organizational countermeasures: The most effective countermeasure is aggressive false positive reduction. Every week, review the top 10 most common alert types in your queue. For each one that is predominantly false positive, create a suppression rule, tune the detection threshold, or add an exclusion. A SOC that receives 500 alerts per week with a 90% false positive rate (50 real alerts buried in 450 noise alerts) is less effective than a SOC that receives 100 alerts per week with a 50% false positive rate (50 real alerts in 50 noise alerts). The real attack count is identical — but the analyst's ability to find them is vastly different.
Rotation and variety. Analysts who spend every shift working the same queue on the same alerts experience faster fatigue. Rotate between queue triage, threat hunting, detection engineering (writing new rules), and investigation to maintain engagement. If your team is too small for formal rotation, allocate 20-30% of each shift to non-queue activities.
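For the weekly false positive review described above, a short Advanced Hunting sketch can pull the noisiest alert titles. Note that the analyst classification is not exposed in the AlertInfo schema, so the FP count has to come from your incident or ticketing data:

```kusto
// Sketch: the most common alert titles over the last 30 days.
// FP counts are not available in AlertInfo; join them in from your case data.
AlertInfo
| where Timestamp > ago(30d)
| summarize AlertCount = count() by Title
| top 10 by AlertCount
```

Combined with your classification data, the result looks something like the table below.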
| Title | AlertCount | FPCount | AvgSeverity |
| --- | --- | --- | --- |
| Suspicious PowerShell command line | 187 | 142 | 2.1 |
| Email messages containing malicious URL removed after delivery | 93 | 12 | 1.8 |
| Suspicious process injection observed | 67 | 51 | 2.4 |
Tuning opportunity: "Suspicious PowerShell command line" fired 187 times in 30 days with 142 false positives (76% FP rate). This single alert type consumed approximately 15 hours of analyst triage time (187 × 5 min). Review the false positives to identify a common pattern (specific script, specific user, specific device) and create a suppression rule. Reducing this alert's FP rate from 76% to 20% would save ~10 hours of analyst time per month.
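To find that common pattern, a sketch like the one below groups the alert's evidence by device, account, and file. The title is taken from the example above, and the grouping columns are just a starting point:

```kusto
// Sketch: look for a common source behind a noisy alert title so it can be
// suppressed or tuned. Adjust the grouping columns to whatever varies in your FPs.
AlertEvidence
| where Timestamp > ago(30d)
| where Title == "Suspicious PowerShell command line"
| summarize Alerts = dcount(AlertId) by DeviceName, AccountUpn, FileName
| top 15 by Alerts
```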
---
The Action Center
The Action Center is the unified view of all remediation actions — both automated and manual — taken across your environment. It is reachable from the Defender portal's main navigation and has two tabs: Pending (actions awaiting analyst approval) and History (completed actions).
Pending actions appear when your automation level is set to "Semi — require approval for any remediation" or "Semi — require approval for core folders remediation." In these modes, automated investigation identifies remediation actions but waits for an analyst to approve them before executing. Common pending actions include quarantining a malicious file, removing a persistence mechanism (scheduled task, registry key), and stopping a malicious process.
Check the Pending tab at least once per shift. Pending actions that sit unapproved for days mean your automated investigation is identifying threats but you are not allowing it to remediate them — you are getting the detection benefit but losing the response benefit. If you consistently approve the same types of pending actions, consider upgrading your automation level to "Full — remediate threats automatically" for those action types.
History shows every completed action with timestamps, the analyst who approved it (or "Automated" for fully automated actions), the device and file affected, and the result (successful or failed). Use the History tab during incident reviews to verify that all remediation actions completed successfully and to document the response timeline for incident reports.