10.1 Analytics Rules: Architecture and Rule Types
Introduction
Required role: Microsoft Sentinel Contributor for creating and editing analytics rules; Microsoft Sentinel Responder for incident management.
An analytics rule is a KQL query that runs on a schedule against the data in your Sentinel workspace. When the query returns results, an alert is generated. Alerts are grouped into incidents — which appear in the incident queue for analyst investigation. This is the core detection mechanism in Sentinel: you define the threat pattern as KQL, Sentinel evaluates it continuously, and when the pattern appears in your data, the SOC is notified.
This subsection teaches the four rule types, how the execution model works, and when to use each type.
How analytics rules execute
The execution model determines when and how often your KQL query runs against the workspace data.
Scheduled rules run on a timer that you configure: every 5 minutes, every hour, every 4 hours, every 24 hours. At each execution, the rule evaluates a KQL query against a lookback window (the time range of data the query examines). If the query returns one or more rows, an alert is generated for each row (or for the result set, depending on alert grouping configuration).
The relationship between schedule and lookback matters. If a rule runs every 5 minutes with a 5-minute lookback, there are no gaps — every event is evaluated exactly once. If a rule runs every hour with a 5-minute lookback, there are 55 minutes of data per hour that the rule never evaluates — a detection gap. If a rule runs every 5 minutes with a 1-hour lookback, events in the overlap window are evaluated multiple times — potentially generating duplicate alerts.
Best practice: Set the lookback equal to or slightly larger than the schedule interval. A rule that runs every 5 minutes should have a 5-minute or 10-minute lookback. A rule that runs every hour should have a 60-minute or 75-minute lookback (the extra 15 minutes provides overlap that catches events with slight ingestion latency). Never set a lookback significantly shorter than the schedule — this creates detection gaps.
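The lookback is set in the rule wizard, but many production queries also bound the window explicitly in KQL as a safeguard. A minimal sketch of that pattern (table, filter, and threshold are illustrative; the 75-minute window mirrors the hourly-schedule guidance above):

```kql
// Illustrative scheduled-rule query: hourly schedule, 75-minute explicit bound.
// The extra 15 minutes of overlap absorbs ingestion latency; alert grouping
// (covered later in this subsection) handles any duplicates from the overlap.
SigninLogs
| where TimeGenerated > ago(75m)
| where ResultType != "0"                  // failed sign-ins only
| summarize FailedCount = count() by UserPrincipalName, IPAddress
| where FailedCount > 20                   // illustrative threshold
```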
The four rule types
| Rule Type | Execution | Latency | Use Case |
|---|---|---|---|
| Scheduled | Timer (5 min to 14 days) | Minutes to hours | Most detections — custom KQL queries |
| NRT (Near-Real-Time) | Every ~1 minute | Seconds to minutes | High-priority detections requiring fastest response |
| Microsoft Security | Pass-through | Seconds | Create incidents from Defender product alerts |
| Anomaly | ML-based, periodic | Hours | Behavioural anomaly detection (UEBA-related) |
Scheduled rules are the workhorse of Sentinel detection. You write the KQL query, set the schedule, define the threshold, map entities, and configure alert grouping. Around 90% of your custom detection rules will be scheduled rules. They support the full KQL language, including join, union, and complex aggregations. Subsection 10.2 covers creation in detail.
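As a concrete sketch of the shape a scheduled rule query takes (table and threshold are illustrative; entity mapping is configured in the wizard against the projected columns):

```kql
// Brute-force sketch: many failed logons from one source IP within the lookback.
SecurityEvent
| where TimeGenerated > ago(1h)
| where EventID == 4625                    // failed Windows logon
| summarize FailedLogons = count(), TargetAccounts = dcount(TargetAccount)
    by IpAddress, Computer
| where FailedLogons >= 25                 // illustrative threshold
| project IpAddress, Computer, FailedLogons, TargetAccounts
```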
NRT (Near-Real-Time) rules run approximately every minute against the most recent data. They use a simplified KQL model with no schedule or lookback configuration (the system manages the window automatically). NRT rules do not support all KQL operators (notably, some time-based functions behave differently). Use NRT for detections that require the fastest possible response: security log cleared (Event ID 1102), a high-severity TI match, or critical honeytoken activation. Subsection 10.3 covers NRT rules.
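An NRT query stays deliberately simple; a sketch for the log-cleared example (assumes Windows Security Events are flowing into the SecurityEvent table):

```kql
// NRT sketch: the Security event log was cleared on a host.
// Event ID 1102 is high-fidelity — a single occurrence warrants an incident.
// No time filter needed: the system manages the NRT window automatically.
SecurityEvent
| where EventID == 1102
| project TimeGenerated, Computer, Account
```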
Microsoft Security rules are pass-through rules that create Sentinel incidents from alerts generated by other Microsoft security products (Defender XDR, Defender for Cloud, Entra ID Protection). They do not contain KQL — they filter incoming SecurityAlert events by provider and severity and create incidents. If you have bi-directional incident sync enabled for Defender XDR (Module 8.3), Microsoft Security rules for Defender products are redundant — disable them to avoid duplicates.
Anomaly rules use machine learning to detect behavioural anomalies — deviations from established baselines. They are pre-built by Microsoft (you do not write the KQL or train the model) and can be customised by adjusting the sensitivity threshold. Anomaly rules feed into the UEBA system (subsection 10.8). They detect patterns that are difficult to express as static KQL rules: a user who suddenly accesses 10x more files than their 30-day average, or a device that connects to 50 unique IPs in an hour when its normal baseline is 5.
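Anomaly results land in the Anomalies table, which you can query directly for hunting and tuning; a sketch (column names can vary by anomaly template, so inspect a sample row in your workspace first):

```kql
// Sketch: recent anomalies grouped by template, highest scores first.
Anomalies
| where TimeGenerated > ago(7d)
| summarize Count = count(), MaxScore = max(Score) by AnomalyTemplateName
| order by MaxScore desc
```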
Rule execution internals: what happens when a rule fires
Understanding the internal execution flow helps you troubleshoot rules that do not behave as expected.
Step 1: Query execution. At the scheduled time, Sentinel executes the rule’s KQL query against the workspace data within the lookback window. The query runs with the permissions of the Sentinel service — not a specific user.
Step 2: Threshold evaluation. If the query returns more results than the configured threshold (default: greater than 0 results), the rule triggers. If the query returns no results, or no more than the threshold, no alert is generated.
Step 3: Alert creation. For each result row (or for the entire result set, depending on event grouping), Sentinel creates an alert in the SecurityAlert table. The alert includes: the rule name, severity, MITRE mapping, entity mappings (populated from the KQL output columns), custom details, and the raw query results.
Step 4: Incident creation. Based on the incident settings (the alert grouping configuration covered later in this subsection), the alert is either added to an existing incident or creates a new incident. The incident appears in the SecurityIncident table and in the incident queue.
Step 5: Automation execution. Any automation rules matching the incident’s conditions execute immediately (subsection 10.6). If an automation rule triggers a playbook, the playbook begins executing (subsection 10.7).
The entire chain — from query execution to playbook completion — typically takes 30-120 seconds for a scheduled rule. For NRT rules, the chain starts within seconds of the event arriving in the workspace.
The SecurityAlert table: understanding alert data
Every alert generated by analytics rules is written to the SecurityAlert table. Understanding this table helps you build workbooks, automation, and cross-rule correlation.
Key columns: TimeGenerated (when the alert was created), AlertName (rule name), AlertSeverity, Tactics (MITRE ATT&CK tactics), Techniques (MITRE ATT&CK techniques), Entities (JSON array of mapped entities), ExtendedProperties (JSON object containing custom details), ProviderName (for Sentinel-generated alerts: “Azure Sentinel”), Status (New, InProgress, Resolved, Dismissed).
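A sketch for spotting the noisiest rules (assumes 30 days of alert history; adjust the window to taste):

```kql
// Daily average alert volume per analytics rule over the last 30 days.
SecurityAlert
| where TimeGenerated > ago(30d)
| where ProviderName == "Azure Sentinel"      // Sentinel-generated alerts only
| summarize TotalAlerts = count() by AlertName
| extend DailyAvg = round(TotalAlerts / 30.0, 1)
| order by DailyAvg desc
```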
Rules with DailyAvg > 10 are generating significant SOC workload. If they are true positives, consider whether the response can be automated (playbook). If they are false positives, tune the rule.
Rule capacity planning
Sentinel workspaces have practical limits on analytics rule count and execution frequency.
Rule count limits. Each workspace supports up to 512 active scheduled analytics rules. For most organisations, 100-200 active rules provide comprehensive detection. If you approach 512, review for: redundant rules covering the same detection, rules querying data sources that are no longer connected, and rules that could be consolidated (e.g., three separate brute-force rules for different thresholds → one rule with dynamic severity).
Execution resource budget. Each rule execution consumes workspace query resources (measured in CPU seconds). A workspace with 200 rules running every 5 minutes executes 2,400 queries per hour. Complex rules (multiple joins, large lookback windows) consume more resources per execution. If total resource consumption exceeds the workspace capacity, rules start queuing — increasing detection latency for all rules.
Monitoring resource consumption:
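A sketch, assuming the SentinelHealth auditing table is enabled for analytics rules (Settings → Auditing and health monitoring). The property names inside ExtendedProperties are illustrative, so inspect a sample row in your workspace before relying on them:

```kql
// Approximate per-rule query duration from rule-run health records.
SentinelHealth
| where TimeGenerated > ago(7d)
| where SentinelResourceType == "Analytics Rule"
| extend StartUTC = todatetime(ExtendedProperties.QueryStartTimeUTC),  // illustrative
         EndUTC   = todatetime(ExtendedProperties.QueryEndTimeUTC)     // property names
| where isnotempty(StartUTC) and isnotempty(EndUTC)
| summarize AvgDuration = avg(EndUTC - StartUTC), Runs = count() by SentinelResourceName
| order by AvgDuration desc
```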
Rules with AvgDuration > 60 seconds are expensive. Optimise their KQL (filter early, materialise lookups, reduce join scope) or increase the schedule interval to reduce total execution frequency.
Right-sizing the schedule: Not every rule needs a 5-minute schedule. Categorise rules by detection urgency (subsection 10.2 schedule optimisation). Move low-priority rules to 1-hour or 4-hour schedules to free up execution resources for high-priority rules.
Detection layering: defence in depth for analytics
A single detection layer (one rule per threat technique) is fragile. If the attacker modifies their technique slightly, the rule misses. Detection layering means creating multiple rules for the same threat — each targeting a different observable in the attack chain.
Example: layered detection for credential compromise.
Layer 1 (Prevention signal): Multiple failed MFA challenges from the same user in a short window. Detects: MFA fatigue/push spam attack. Table: SigninLogs.
Layer 2 (Initial access signal): Successful sign-in from an IP flagged as risky by Entra ID Protection. Detects: compromised credential used from attacker infrastructure. Table: SigninLogs.
Layer 3 (Persistence signal): New inbox rule created that forwards email to an external address. Detects: attacker establishing persistence in the compromised mailbox. Table: CloudAppEvents.
Layer 4 (Action signal): Mass email read or unusual outbound email volume from the compromised account. Detects: BEC — the attacker is reading or sending email from the compromised mailbox. Table: CloudAppEvents.
If Layer 1 fires, the analyst investigates and potentially catches the attack before Layer 2 occurs. If Layer 1 is missed (the attacker used a technique that bypasses MFA without push spam), Layer 2 or Layer 3 catches the attack at a later stage. No single rule failure allows the entire attack to proceed undetected.
Each layer should be a separate analytics rule — not one complex rule with multiple conditions. Separate rules: provide independent alerting (Layer 3 fires even if Layer 1 did not), allow different schedules per layer (Layer 1 on 5-minute NRT, Layer 4 on 1-hour scheduled), and enable independent tuning (each layer’s threshold and exclusions can be adjusted without affecting the others).
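Each layer is just another analytics rule; a sketch of Layer 3 (assumes CloudAppEvents via the Defender XDR connector — the ActionType values are illustrative, and parsing the forwarding target out of RawEventData is left to the analyst, so validate against your own events):

```kql
// Layer 3 sketch: new or modified inbox rule — review for external forwarding.
CloudAppEvents
| where TimeGenerated > ago(1h)
| where ActionType in ("New-InboxRule", "Set-InboxRule")
| project TimeGenerated, AccountDisplayName, IPAddress, ActionType, RawEventData
```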
Rule templates: starting points, not finished products
Content Hub and the Analytics rule templates gallery provide hundreds of pre-built rule templates. These are starting points that require customisation for your environment.
Template customisation checklist:
Review the KQL query. Does it reference tables that are populated in your workspace? If it queries CommonSecurityLog but you have no CEF connectors, the rule will never fire.
Adjust the threshold. Templates often use conservative thresholds designed for large enterprises. A threshold of 50 failed logons may be appropriate for a 10,000-user organisation but too high for a 200-user environment. Adjust based on your baseline volume.
Verify entity mapping. Ensure the template’s entity mappings match your data. Some templates map entities from columns that only exist in specific data source formats.
Update the schedule. Templates may default to 5-minute or 1-hour schedules. Choose a schedule appropriate for the detection’s urgency and the expected event volume.
Add custom details. Most templates include minimal custom details. Add the context fields that your analysts need for triage.
Add exclusions. Templates do not know your environment’s benign patterns. After initial deployment, identify false positive patterns and add exclusions to the KQL.
Test before enabling. Run the template’s KQL manually against 7 days of data. Review the results. If the volume is manageable and the results look accurate, enable the rule.
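The checklist above ends with a manual test; a common pattern is to widen the template's window to 7 days and bucket prospective alerts by day to gauge volume (illustrative shape — substitute the template's own logic for the stand-in filter):

```kql
// Volume test sketch: replace the template's time filter with a 7-day window,
// then count how many alerts the rule would have produced per day.
SecurityEvent
| where TimeGenerated > ago(7d)
| where EventID == 4625                    // stand-in for the template's logic
| summarize ProspectiveAlerts = count() by bin(TimeGenerated, 1d)
| order by TimeGenerated asc
```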
Template sources beyond Content Hub
Microsoft Sentinel GitHub repository (github.com/Azure/Azure-Sentinel): the official open-source repository contains hundreds of analytics rule templates, hunting queries, playbooks, and workbooks contributed by Microsoft and the community. Many templates in Content Hub originate here — the GitHub repository often has newer versions or community-contributed rules that have not yet been packaged into Content Hub solutions.
Sigma rules. Sigma is an open, vendor-neutral format for detection rules. The Sigma community maintains thousands of rules that can be converted to KQL using the sigma CLI tool. This is a rich source of detection rules for attack techniques that Content Hub does not cover. Convert with: sigma convert -t kusto -p microsoft365defender rule.yml (for Defender tables) or sigma convert -t kusto -p sentinel rule.yml (for Sentinel tables) — pipeline names have changed across backend releases, so verify the exact -t and -p values against the Kusto backend's documentation.
Your own incident library. Every true positive incident you investigate is a potential detection rule. After closing a true positive, ask: “Did an analytics rule detect this, or did I find it through manual investigation or UEBA?” If you found it manually, write a rule that would have detected it automatically — ensuring the next occurrence is caught by the detection engine rather than requiring analyst hunting.
Alert suppression vs deduplication: understanding the difference
Two mechanisms reduce duplicate or redundant alerts — they solve different problems.
Alert suppression (on the rule). Configurable in the rule wizard: “After this rule fires, suppress it for X hours.” During the suppression window, the rule does not generate new alerts — even if the KQL query returns results. Use for: rules that detect a condition that persists for a period (e.g., a misconfiguration that exists until fixed). Without suppression, the rule fires every schedule interval until the condition is resolved, generating dozens of duplicate incidents.
Alert grouping (on the incident). Configurable in the incident settings: group alerts with matching entities into a single incident. Multiple alerts are generated but consolidated into one incident. Use for: rules that detect repeated discrete events from the same source (e.g., brute-force — each failed logon generates an alert, all from the same IP, grouped into one incident).
The key difference: Suppression prevents alerts from being generated. Grouping allows alerts to be generated but organises them into manageable incidents. Use suppression when the detection fires repeatedly for the same underlying condition. Use grouping when the detection fires repeatedly for the same ongoing attack.
Alert grouping: from alerts to incidents
A single analytics rule can generate many alerts. Alert grouping determines how those alerts are organised into incidents.
No grouping (default): Every alert becomes a separate incident. Appropriate for high-fidelity rules where each alert warrants independent investigation (e.g., security log cleared — every instance is a separate potential compromise).
Group all alerts into a single incident: All alerts from this rule within a configurable time window (up to 24 hours) are grouped into one incident. Appropriate for noisy rules where multiple alerts represent the same ongoing threat (e.g., brute-force detection — 50 failed logon alerts from the same IP should be one incident, not 50).
Group alerts with matching entities: Alerts that share the same entities (same user, same IP, same device) are grouped into one incident. Different entities create separate incidents. This is the most common and most useful grouping — it creates one incident per affected entity, automatically correlating related alerts.
Example: A brute-force rule fires 30 alerts in an hour — 20 from IP 203.0.113.47 targeting user j.morrison, and 10 from IP 198.51.100.22 targeting user s.chen. With entity-based grouping on IP: two incidents (one per attacking IP). With entity-based grouping on Account: two incidents (one per target user). With no grouping: 30 incidents (unmanageable).
Rule severity and MITRE ATT&CK mapping
Every analytics rule must have a severity (Informational, Low, Medium, High) and should have at least one MITRE ATT&CK technique mapping.
Severity guidelines:
High — confirmed or highly likely threat requiring immediate investigation. Examples: security log cleared, honeytoken activated, TI match on known APT infrastructure, ransomware encryption pattern detected. Expectation: analyst investigates within 1 hour.
Medium — potential threat requiring investigation within a shift. Examples: sign-in from a sanctioned country, suspicious PowerShell execution, inbox rule forwarding to external address. Expectation: analyst investigates within 4 hours.
Low — minor policy violation or low-confidence detection. Examples: failed MFA challenge, single failed logon from unusual location, non-critical GPO change. Expectation: review within 24 hours.
Informational — automated enrichment or telemetry, no analyst action required. Examples: watchlist match for logging purposes, baseline deviation within normal range. Used for hunting context, not incident creation.
MITRE ATT&CK mapping enables the coverage analysis from Module 7.10 (Content Hub). Every rule should map to at least one ATT&CK technique — this allows you to visualise your detection coverage against the ATT&CK matrix and identify gaps. A rule detecting brute-force maps to T1110 (Brute Force). A rule detecting inbox rule creation maps to T1114.003 (Email Collection: Email Forwarding Rule). A rule detecting scheduled task creation maps to T1053 (Scheduled Task/Job).
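You can get a quick view of tactic coverage from alert history; a sketch (this assumes the Tactics column is a comma-separated string, which is the common shape — verify in your workspace):

```kql
// Alert volume by MITRE ATT&CK tactic over the last 30 days.
SecurityAlert
| where TimeGenerated > ago(30d)
| where ProviderName == "Azure Sentinel"
| mv-expand TacticRaw = split(Tactics, ",")
| summarize Alerts = count() by Tactic = trim(" ", tostring(TacticRaw))
| order by Alerts desc
```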
Every KQL query you write in this module is a question asked of your data every 5 minutes, every hour, or every day: "Has this threat pattern appeared since I last checked?" The quality of your detection — what threats you catch and how many false positives you generate — is determined by the quality of these KQL queries and the precision of their configuration. Module 6 taught you KQL. Module 8 connected the data. This module teaches you to ask the right questions.
Try it yourself
Navigate to Sentinel → Analytics → Rule templates. Browse the available templates. Filter by severity (High) and review 5 templates: read the description, examine the KQL query, check the MITRE ATT&CK mapping, and note the schedule and lookback configuration. This gives you a sense of how production detection rules are structured before you create your own in subsection 10.2.
What you should observe
High-severity rule templates typically have short schedules (5-15 minutes), focused KQL queries targeting specific threat patterns, entity mappings for the key entities (Account, IP, Host), and ATT&CK technique mappings. The KQL queries follow patterns you learned in Module 6: filter → aggregate → threshold → project entities. The templates are your starting point — you will customise them for your environment.
Knowledge check
NIST CSF: DE.AE-1 (Baseline of operations established), DE.AE-2 (Detected events are analysed). ISO 27001: A.8.15 (Logging), A.8.16 (Monitoring activities). SOC 2: CC7.2 (Monitor system components). Every configuration in this subsection contributes to the logging and monitoring controls that auditors verify.
Check your understanding
1. Your analytics rule runs every hour with a 5-minute lookback. What problem does this create?
2. A brute-force rule generates 30 alerts in an hour — 20 from one IP and 10 from another. You want one incident per attacking IP. Which alert grouping do you configure?
3. When should you use an NRT rule instead of a scheduled rule?