7.1 Microsoft Sentinel: SIEM + SOAR Architecture

16-20 hours · Module 7


SC-200 Exam Objective

Domain 1 — Manage a SOC Environment (40-45%): "Design and configure a Microsoft Sentinel workspace." Understanding Sentinel's architecture is prerequisite to every configuration and operational task in this module.

Introduction

Every module in this course generates data. Defender XDR generates alerts and incidents. Defender for Endpoint generates device telemetry. Entra ID generates sign-in and audit logs. Purview generates DLP and audit events. Defender for Cloud generates security alerts and recommendations. Until now, you have accessed this data through product-specific portals — the Defender portal for XDR data, the Purview portal for audit data, the Azure portal for cloud security data.

Sentinel is where all of this data converges. It is the central data platform that ingests security data from every source, stores it in a queryable format, runs detection rules against it, generates incidents when threats are found, and enables automated response through playbooks. Sentinel is not just another portal — it is the data infrastructure that makes cross-product investigation (the technique you learned in Modules 3.9, 4.10, and 5.10) operationally scalable.

This subsection teaches you what Sentinel is architecturally, how it processes data from ingestion to response, and how it differs from traditional SIEM platforms. This architectural understanding is essential for the configuration decisions in subsections 7.2 through 7.12.


What a SIEM does and why it matters

A Security Information and Event Management (SIEM) system serves five functions in security operations.

[Figure: SIEM — Five Core Functions. ① Collect — ingest logs from all sources · ② Store — retain for query and compliance · ③ Detect — run rules against ingested data · ④ Investigate — query data with KQL · ⑤ Respond — automate actions via playbooks]
Figure 7.1: The five core SIEM functions. Sentinel implements all five, plus SOAR (Security Orchestration, Automation, and Response) capabilities that extend the Respond function with full workflow automation through Logic Apps playbooks.

Collect — ingest log data from every security-relevant source: endpoints, identities, email, cloud infrastructure, network devices, firewalls, applications, and third-party security tools. Without comprehensive collection, threats that generate signals in uncollected sources are invisible.

Store — retain the collected data for investigation and compliance. Retention periods vary by regulation and operational need: some data needs 90 days for active investigation, other data needs 7 years for regulatory compliance. Storage must be cost-effective at scale — a mid-size organisation generates 10-50 GB of security log data per day.

Detect — run detection rules (analytics rules in Sentinel) against the stored data. Rules evaluate patterns that indicate threats: a sign-in from a new country followed by inbox rule creation (BEC pattern), a process creating a scheduled task after executing a suspicious download (malware persistence), or a storage account accessed from a Tor exit node (data exfiltration). When a rule matches, an alert is generated and grouped into an incident.

Investigate — query the stored data to understand what happened. KQL queries (Module 6) are the primary investigation tool. The investigation combines data from multiple tables to build the cross-product timelines you practised in Modules 3.9, 4.10, and 5.10.

Respond — take action when a threat is confirmed. Automated responses (playbooks) can disable accounts, isolate devices, block IPs, create tickets, send notifications, and orchestrate multi-step remediation workflows. Manual responses use the investigation findings to guide containment and eradication actions.
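The Detect function above can be made concrete with a hunting query. This is a minimal KQL sketch of the malware-persistence pattern (a scheduled task created shortly after a web download on the same device), assuming the Defender for Endpoint connector is streaming the Device* tables; the 10-minute window is an illustrative threshold, not a tuned value.

```kql
// Hedged sketch: scheduled-task creation within 10 minutes of an
// internet download on the same device (malware persistence pattern).
let downloads =
    DeviceFileEvents
    | where TimeGenerated > ago(1d)
    | where isnotempty(FileOriginUrl)   // file arrived from the web
    | project DeviceId, DownloadTime = TimeGenerated, DownloadedFile = FileName;
DeviceProcessEvents
| where TimeGenerated > ago(1d)
| where FileName =~ "schtasks.exe" and ProcessCommandLine has "/create"
| join kind=inner downloads on DeviceId
| where TimeGenerated between (DownloadTime .. (DownloadTime + 10m))
| project DeviceName, DownloadedFile, DownloadTime,
          TaskCreated = TimeGenerated, ProcessCommandLine
```

In a production analytics rule, this query body would run on a schedule and each result row would become an alert.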


How Sentinel differs from traditional SIEM

Traditional SIEM platforms (Splunk, IBM QRadar, ArcSight, LogRhythm) were designed for on-premises deployment. Sentinel was designed for cloud-native operation. The differences are not cosmetic — they fundamentally change the operational model.

Traditional SIEM vs Microsoft Sentinel

| Aspect | Traditional SIEM | Microsoft Sentinel |
|---|---|---|
| Deployment | On-premises servers, sized at purchase | Cloud-native, scales automatically |
| Capacity planning | Buy hardware for peak + growth | Pay per GB ingested — no hardware |
| Scaling | Buy more servers, re-architect | Scales transparently — ingest more, pay more |
| Maintenance | Patch OS, update SIEM, manage storage | Microsoft manages infrastructure |
| Query language | SPL (Splunk), AQL (QRadar), custom | KQL (shared with Defender XDR, Azure) |
| Microsoft integration | Requires connectors, often incomplete | Native first-party — one-click connectors |
| Automation | Separate SOAR product (Demisto, Phantom) | Built-in SOAR (Logic Apps playbooks) |
| Pricing model | License-based (per EPS, per source) | Consumption-based (per GB/day) |
| Time to deploy | Weeks to months | Hours (workspace creation to data ingestion) |
| Threat intelligence | Separate TI platform | Built-in TI integration + Microsoft TI |
The practical impact: With traditional SIEM, increasing your log coverage from 10 sources to 50 sources requires hardware upgrades, storage expansion, and capacity planning. With Sentinel, you enable additional data connectors and the platform scales automatically. The barrier to comprehensive visibility is budget (cost per GB), not infrastructure. This changes the security conversation from "we can't collect that data because we don't have capacity" to "we can collect any data — the question is whether the detection value justifies the ingestion cost."

The Log Analytics workspace: Sentinel’s data foundation

Sentinel does not have its own data store. It runs on top of a Log Analytics workspace — the Azure data platform that stores log data in tables and provides KQL query capability. When you “create a Sentinel workspace,” you are actually enabling the Sentinel solution on an existing (or new) Log Analytics workspace. The workspace is the foundation. Sentinel adds the security layer: analytics rules, incidents, automation, hunting, workbooks, and the security-specific features.

Understanding this architecture matters because: Log Analytics workspace settings (retention, access controls, pricing tier) directly affect Sentinel’s capabilities and cost. Some settings are configured at the workspace level (retention policies, data collection rules) and others at the Sentinel level (analytics rules, automation rules). The workspace can contain non-security data (performance counters, application logs) alongside security data — Sentinel queries can access all data in the workspace, not just security tables.
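One way to see this in practice is to query the Usage table — a standard Log Analytics table present in every workspace — which reports ingestion per table, security and non-security alike. A minimal sketch:

```kql
// Hedged sketch: list everything the workspace contains, per table.
// Usage.Quantity is reported in MB; divide by 1024 for GB.
Usage
| where TimeGenerated > ago(7d)
| summarize IngestedGB = round(sum(Quantity) / 1024, 2) by DataType
| order by IngestedGB desc
```

In a workspace that also collects performance or application data, those tables appear in the results alongside SigninLogs and SecurityAlert, illustrating that Sentinel sits on a shared data platform.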

[Figure: Sentinel Architecture — Layered Model. Data sources (M365/Defender, Entra ID, Azure Activity, Firewalls/Syslog, Custom Apps, AWS/GCP) → Log Analytics Workspace (data platform) → Microsoft Sentinel (SIEM + SOAR layer: analytics rules, incidents, automation, hunting, workbooks, TI)]
Figure 7.2: Sentinel's layered architecture. Data sources feed into the Log Analytics workspace (the data platform). Sentinel runs on top of the workspace, providing analytics rules, incidents, automation, hunting, workbooks, and threat intelligence. The workspace stores the data; Sentinel makes it actionable for security operations.

The SOAR component: automation rules and playbooks

SIEM collects data and detects threats. SOAR (Security Orchestration, Automation, and Response) automates the response. Sentinel includes SOAR natively through two mechanisms.

Automation rules are lightweight, no-code logic that runs when an incident is created or updated. They can: change the incident severity (escalate a medium-severity incident to high if it involves a VIP user), assign the incident to a specific analyst or team (route BEC incidents to the email security team), add tags for categorisation (tag incidents involving external IPs with “external-threat”), suppress known false positive patterns (auto-close incidents from a known-benign source), and trigger playbooks for more complex automation.

Automation rules are the first line of automated response — they handle the routine incident management tasks that otherwise consume analyst time. In a mature SOC, automation rules handle 60-80% of incident management actions automatically: assigning, tagging, severity adjustment, and false positive suppression. The analyst’s time is reserved for investigation and response decisions that require human judgement.

Playbooks are full automation workflows built on Azure Logic Apps. They can call any API, integrate with any service, and orchestrate multi-step response workflows. Examples: when a high-severity BEC incident is created, a playbook automatically resets the user’s password, revokes all sessions, checks for inbox forwarding rules (and deletes them if found), sends a notification to the user’s manager, creates a ticket in ServiceNow, and posts an alert to the SOC Slack channel. All within 30 seconds of the incident being created.

Playbooks bridge the gap between detection and response. Without automation, the sequence is: detection → analyst reads alert → analyst decides action → analyst takes action (multiple portal clicks). With a playbook, the sequence is: detection → playbook takes pre-approved actions immediately → analyst reviews and adjusts. The time from detection to initial containment drops from minutes (manual) to seconds (automated).

Module 9 covers analytics rules (detection) and automation rules in detail. Module 8 covers playbook creation. This subsection establishes the architectural context: Sentinel is not just a data collection platform — it is a detection and response platform where automation is a core capability, not a bolt-on.


Data flow: from source to investigation

Understanding how data flows through Sentinel clarifies the configuration decisions in subsequent subsections.

Step 1: Ingestion. Data sources send log data to the Log Analytics workspace through data connectors (Module 8). Microsoft sources (Defender XDR, Entra ID, Azure Activity) use built-in connectors with minimal configuration. Third-party sources (firewalls, SaaS applications, endpoint tools) use Syslog/CEF, API connectors, or custom data collection rules. Each data source writes to a specific table: Defender XDR writes to SecurityAlert and SecurityIncident, Entra ID writes to SigninLogs and AuditLogs, endpoints write to DeviceProcessEvents and DeviceNetworkEvents.
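You can verify which products are actually writing into these tables with a quick KQL check. A minimal sketch against the SecurityAlert table, assuming at least one Microsoft connector is enabled:

```kql
// Hedged sketch: which connected products are generating alerts.
// ProviderName and ProductName identify the source of each alert.
SecurityAlert
| where TimeGenerated > ago(7d)
| summarize Alerts = count() by ProviderName, ProductName
| order by Alerts desc
```

A connector that shows no rows here is either newly enabled (data can take time to arrive) or has generated no alerts in the window.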

Step 2: Storage. Ingested data is stored in the workspace according to the table’s log tier assignment: Analytics (full query capability, 90-day default retention), Basic (limited query, 30-day retention, lower cost), or Archive (no live query, long-term retention, lowest cost). The tier determines what you can do with the data — Analytics tier data supports full KQL queries and analytics rules. Basic tier data supports limited queries. Archive tier data must be restored before querying. Subsection 7.4 covers tier selection in detail.

Step 3: Detection. Sentinel analytics rules run on a schedule (every 5 minutes, every hour, etc.) and evaluate KQL queries against the ingested data. When a rule’s query returns results, an alert is created. Alerts are grouped into incidents by the correlation engine based on shared entities (the same user, IP, or device appears in multiple alerts). Incidents appear in the Sentinel incident queue for analyst investigation. Module 9 covers analytics rule creation.
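The BEC correlation mentioned above can be sketched as a scheduled-rule query. This version assumes the Entra ID (SigninLogs) and Office 365 (OfficeActivity) connectors are enabled; the 30-day baseline and 24-hour windows are illustrative, not tuned thresholds.

```kql
// Hedged sketch: successful sign-in from a country not seen in the
// prior 30 days, followed by inbox rule creation by the same user.
let baseline =
    SigninLogs
    | where TimeGenerated between (ago(30d) .. ago(1d))
    | where ResultType == "0"            // success
    | distinct UserPrincipalName, Location;
let newCountry =
    SigninLogs
    | where TimeGenerated > ago(1d)
    | where ResultType == "0"
    | join kind=leftanti baseline on UserPrincipalName, Location
    | project UserPrincipalName, SigninTime = TimeGenerated, NewCountry = Location;
OfficeActivity
| where TimeGenerated > ago(1d)
| where Operation == "New-InboxRule"
| join kind=inner newCountry on $left.UserId == $right.UserPrincipalName
| where TimeGenerated > SigninTime
| project UserId, NewCountry, SigninTime, RuleCreated = TimeGenerated
```

Saved as an analytics rule, each result row becomes an alert, and the correlation engine groups alerts sharing the same user into one incident.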

Step 4: Investigation. The analyst opens the incident, reviews the alerts, and uses KQL to query the workspace data for additional context. This is where the skills from Modules 1-6 converge: KQL queries (Module 6) against sign-in data (Module 1), endpoint data (Module 2), audit data (Module 3), cloud security data (Module 4), with Copilot assistance (Module 5).
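A cross-product timeline of the kind described above can be built with a union across tables. A minimal sketch for one user under investigation — the UPN is a placeholder:

```kql
// Hedged sketch: merge identity events from two tables into a single
// ordered timeline for one user. suspect is a placeholder value.
let suspect = "user@contoso.com";
union
    (SigninLogs
     | where UserPrincipalName =~ suspect
     | project TimeGenerated, Source = "SigninLogs",
               Detail = strcat("result ", ResultType, " from ", IPAddress)),
    (AuditLogs
     | where tostring(InitiatedBy.user.userPrincipalName) =~ suspect
     | project TimeGenerated, Source = "AuditLogs", Detail = OperationName)
| where TimeGenerated > ago(7d)
| order by TimeGenerated asc
```

Extending the union with DeviceProcessEvents or OfficeActivity follows the same pattern: project each table down to a common schema, then sort by time.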

Step 5: Response. Based on the investigation findings, the analyst takes containment and remediation actions — either manually (through the Defender portal or Azure portal) or through playbooks that automate pre-approved response actions. Automation rules can also trigger automatic response at the moment of incident creation, before the analyst even sees the incident.


Why Sentinel for Microsoft environments

If your security stack is primarily Microsoft (M365, Azure, Entra ID, Defender XDR), Sentinel provides native integration that no third-party SIEM can match. The Defender XDR data connector ingests all Defender product data with one click. The Entra ID connector ingests sign-in and audit data with one click. The Azure Activity connector ingests management plane data with one click. The data arrives in well-structured tables with consistent schema. The KQL queries you write for Defender XDR Advanced Hunting work in Sentinel with minimal modification (some table names differ, but the query patterns are identical).
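A hedged illustration of that portability: the Device* tables keep their names when ingested into Sentinel, so a query like the one below runs in both portals — the main edit is the timestamp column, since Advanced Hunting uses Timestamp while Log Analytics tables use TimeGenerated.

```kql
// Hedged sketch: an Advanced Hunting query ported to Sentinel.
// Only the time column changes (Timestamp -> TimeGenerated).
DeviceProcessEvents
| where TimeGenerated > ago(1d)      // Timestamp in Advanced Hunting
| where FileName =~ "powershell.exe" and ProcessCommandLine has "-enc"
| summarize Executions = count() by DeviceName
| order by Executions desc
```

Alert data is the main exception: Advanced Hunting exposes alerts through AlertInfo, while the Sentinel connector writes them to SecurityAlert.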

For organisations running mixed environments (Microsoft + third-party firewalls, non-Microsoft endpoint tools, SaaS applications), Sentinel provides Syslog/CEF connectors for network devices, API connectors for SaaS platforms, and custom data collection rules for bespoke data sources. The unified workspace means you query Microsoft data and third-party data with the same KQL, in the same workspace, with the same investigation workflow.

The alternative — running a third-party SIEM alongside Defender XDR — creates a split-brain problem: some data is in the SIEM, some is in Defender XDR, and cross-product investigation requires querying both platforms. Sentinel eliminates the split brain by serving as both the SIEM and the investigation platform.

SC-200 exam assumption

The SC-200 exam assumes Sentinel is your SIEM. Questions reference Sentinel tables (SecurityAlert, SigninLogs, DeviceProcessEvents), Sentinel features (analytics rules, automation rules, hunting, workbooks), and the unified security operations platform (Sentinel + Defender XDR). The exam does not test third-party SIEM products. Your Sentinel knowledge is the foundation for 40-45% of the exam questions.

Try it yourself

If you have a Sentinel workspace from Module 0 setup, navigate to Microsoft Sentinel in the Azure portal. Review the Overview page: the data ingestion volume chart (how much data is entering the workspace daily), the active analytics rules count, the open incidents count, and the enabled data connectors. If no Sentinel workspace exists, create one now — the setup takes 5 minutes and is covered step-by-step in subsection 7.3. Having a live workspace to work with through the rest of this module transforms the content from theoretical to hands-on.

What you should observe

The Overview page shows a dashboard with ingestion trends, incident metrics, and connector status. In a fresh lab workspace, you may see minimal data (if no connectors are enabled) or moderate data (if you connected Defender XDR and Entra ID in Module 0). The key metric to note: daily ingestion volume in GB. This number drives cost (subsection 7.5) and determines which log tiers are appropriate (subsection 7.4).
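The daily ingestion figure the Overview page summarises can also be pulled directly with KQL. A minimal sketch using the standard Usage table (Quantity is in MB):

```kql
// Hedged sketch: billable ingestion per day for the last 30 days —
// the number that drives Sentinel cost and tier decisions.
Usage
| where TimeGenerated > ago(30d)
| where IsBillable == true
| summarize DailyGB = round(sum(Quantity) / 1024, 2) by bin(TimeGenerated, 1d)
| order by TimeGenerated asc
```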


Knowledge check

Check your understanding

1. What is the relationship between a Log Analytics workspace and Microsoft Sentinel?

Correct: Sentinel runs on top of a Log Analytics workspace. The workspace is the data platform that stores log data in tables and provides KQL query capability. Sentinel adds the security layer: analytics rules, incidents, automation, hunting, workbooks, and threat intelligence. You cannot have Sentinel without a Log Analytics workspace — Sentinel is a solution enabled on the workspace, not a separate data store.
Incorrect: They are the same thing — different names for the same product
Incorrect: Sentinel replaces Log Analytics — you no longer need the workspace
Incorrect: Sentinel stores security data, Log Analytics stores non-security data

2. Your organisation currently uses Splunk. Management asks why you recommend migrating to Sentinel for an M365-heavy environment. What is the strongest argument?

Correct: Native integration with the Microsoft security ecosystem. Sentinel's first-party connectors ingest Defender XDR, Entra ID, and Azure data with one-click configuration and deliver it in well-structured tables optimised for security investigation. Splunk requires custom connectors, data parsing, and field extraction to achieve the same integration — and the result is often less complete because third-party integration cannot match the depth of first-party data access. The shared KQL language across Sentinel and Defender XDR means analysts write one query language for all investigation contexts.
Incorrect: Sentinel is free — Splunk is expensive
Incorrect: Sentinel is always better than Splunk for any environment
Incorrect: Microsoft requires Sentinel for M365 environments

3. What is the difference between an automation rule and a playbook in Sentinel?

Correct: Automation rules are lightweight, no-code logic for incident management tasks: assigning incidents, changing severity, adding tags, and triggering playbooks. Playbooks are full automation workflows built on Azure Logic Apps that can call any API, integrate with any service, and orchestrate multi-step response actions (reset passwords, isolate devices, create tickets, send notifications). Automation rules decide what to do. Playbooks execute the complex actions.
Incorrect: They are the same — playbooks is the old name for automation rules
Incorrect: Automation rules run automatically, playbooks require manual trigger
Incorrect: Automation rules are for alerts, playbooks are for incidents