9.2 Creating Scheduled Analytics Rules
Introduction
Scheduled analytics rules are the primary detection mechanism in Sentinel. You define the threat pattern as a KQL query, configure how often the query runs, set the threshold for when to alert, map entities for investigation, and define how alerts group into incidents. This subsection walks through the complete creation process with production-ready examples.
The creation wizard: step by step
Navigate to Sentinel → Analytics → Create → Scheduled query rule. The wizard has five tabs: General, Set rule logic, Incident settings, Automated response, and Review + create.
Tab 1: General. Name the rule with a clear, actionable description: “Multiple failed sign-ins followed by success from same IP” is better than “Brute force.” Set severity based on the guidelines from subsection 9.1. Map to MITRE ATT&CK technique(s). Add a description that explains what the rule detects and why it matters — the analyst reading the incident needs this context.
Tab 2: Set rule logic. This is where you define the detection.
The KQL query. Write or paste your KQL query. The query must return at least one row for an alert to generate. Every column in the output is available for entity mapping and custom details. Design the query to project the fields an analyst needs for triage: timestamps, user identities, IP addresses, device names, and contextual details.
Production example — brute-force detection:
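The following sketch shows one way to write this detection, assuming the standard SigninLogs schema (where ResultType "0" marks a successful sign-in). FailureThreshold and TimeWindow are illustrative parameters to tune for your environment:

```kusto
let FailureThreshold = 10;       // minimum failures before the success
let TimeWindow = 1h;             // should match the rule's lookback period
let Failures = SigninLogs
    | where TimeGenerated > ago(TimeWindow)
    | where ResultType != "0"    // failed sign-ins
    | summarize FailureCount = count(),
                FirstFailure = min(TimeGenerated),
                LastFailure  = max(TimeGenerated)
        by IPAddress, UserPrincipalName
    | where FailureCount >= FailureThreshold;
Failures
| join kind=inner (
    SigninLogs
    | where TimeGenerated > ago(TimeWindow)
    | where ResultType == "0"    // successful sign-in from the same IP and account
    | project SuccessTime = TimeGenerated, IPAddress, UserPrincipalName
) on IPAddress, UserPrincipalName
| where SuccessTime > LastFailure    // the success came after the failure burst
| project SuccessTime, IPAddress,
          CompromisedAccount = UserPrincipalName,
          FailureCount, FirstFailure, LastFailure
```

The projected columns (IPAddress, CompromisedAccount, FailureCount, SuccessTime) are the fields the entity mapping and triage guidance in this subsection rely on.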
This query detects the most dangerous brute-force pattern: many failures followed by a success — indicating the attacker guessed the correct password. The output projects every field the analyst needs: the attacking IP, the compromised account, the failure count, and the timestamp of the successful compromise.
Production example — inbox rule forwarding to external domain:
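One possible shape, assuming the CloudAppEvents table from the Defender connector. The InternalDomains list is a placeholder for your accepted domains, and the RawEventData parsing is an assumption to verify against your tenant's data:

```kusto
let InternalDomains = dynamic(["yourcompany.com", "yourcompany.onmicrosoft.com"]);  // replace with your accepted domains
CloudAppEvents
| where TimeGenerated > ago(1h)
| where ActionType in ("New-InboxRule", "Set-InboxRule")
| extend RuleParameters = tostring(RawEventData.Parameters)
| where RuleParameters has_any ("ForwardTo", "ForwardAsAttachmentTo", "RedirectTo")
| where not(RuleParameters has_any (InternalDomains))   // forwarding target is outside your domains
| project TimeGenerated, Actor = AccountDisplayName, IPAddress,
          ActionType, RuleParameters
```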
This rule detects BEC persistence — an attacker who has compromised a mailbox and creates a forwarding rule to exfiltrate email to an external address.
Schedule configuration. Set the frequency (how often the rule runs) and the lookback period (how far back the query examines). For the brute-force rule: run every 5 minutes, lookback 1 hour (matching the TimeWindow in the query). For the inbox rule: run every 15 minutes, lookback 1 hour.
Alert threshold. Generate an alert when the query returns more than 0 results (default). You can also set a threshold: “Generate alert when number of query results is greater than 10” — useful for noisy detections where a small number of results is normal.
Event grouping. Choose between “Group all events into a single alert” (one alert per rule execution, with all result rows in the alert) or “Trigger an alert for each event” (one alert per result row). For the brute-force rule: trigger per event (each attacking IP gets its own alert). For the inbox rule: trigger per event (each forwarding rule gets its own alert).
Tab 3: Incident settings. Enable incident creation (default: enabled). Configure alert grouping as described in subsection 9.1: group by entities for entity-correlated incidents.
Tab 4: Automated response. Attach automation rules or playbooks that run when the rule generates an incident. Covered in subsections 9.6 and 9.7.
Tab 5: Review + create. Review all settings and save. The rule begins executing on the configured schedule immediately.
Query design patterns for detection rules
Pattern 1: Threshold detection. Count events and alert when the count exceeds a threshold. Used for: brute-force, port scanning, mass file download.
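A minimal sketch of the pattern, using failed Windows logons (SecurityEvent ID 4625) as an illustrative event source; the threshold of 10 is an assumption to tune against your baseline:

```kusto
SecurityEvent
| where TimeGenerated > ago(1h)
| where EventID == 4625                       // failed logon
| summarize FailedCount = count(),
            TargetedAccounts = make_set(TargetUserName, 50)
    by IpAddress
| where FailedCount > 10                      // alert only above the threshold
```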
Pattern 2: Rare event detection. Alert when a specific event type occurs that should never (or almost never) happen. Used for: security log cleared, honeytoken activation, admin role assigned to unexpected user.
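A sketch using the cleared-security-log event as the example. Event ID 1102 should essentially never occur in legitimate day-to-day operation, so no threshold is needed:

```kusto
SecurityEvent
| where TimeGenerated > ago(1h)
| where EventID == 1102                       // "The audit log was cleared"
| project TimeGenerated, Computer, Account
```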
Pattern 3: Sequence detection. Alert when event A is followed by event B within a time window. Used for: brute-force-then-success, privilege escalation chains, phishing-then-inbox-rule.
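A generic skeleton, assuming events A and B share correlating keys (here a user and an IP). The SigninLogs filters are placeholders; substitute the real definitions of A and B:

```kusto
let Window = 1h;
let EventA = SigninLogs                       // placeholder for event A (e.g. failed sign-ins)
    | where TimeGenerated > ago(Window)
    | where ResultType != "0"
    | summarize LastA = max(TimeGenerated) by UserPrincipalName, IPAddress;
let EventB = SigninLogs                       // placeholder for event B (e.g. a success)
    | where TimeGenerated > ago(Window)
    | where ResultType == "0"
    | project TimeB = TimeGenerated, UserPrincipalName, IPAddress;
EventA
| join kind=inner EventB on UserPrincipalName, IPAddress
| where TimeB > LastA                         // B occurred after A, within the window
```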
Pattern 4: Baseline deviation. Alert when current activity deviates significantly from historical norms. Used for: unusual data access volume, first-time access to sensitive resource, abnormal process execution.
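One possible shape, comparing each user's file downloads today against a rolling historical average from CloudAppEvents. The 5x multiplier and the baseline window are illustrative assumptions (scheduled rules support lookbacks of up to 14 days):

```kusto
let Current = CloudAppEvents
    | where TimeGenerated > ago(1d)
    | where ActionType == "FileDownloaded"
    | summarize TodayCount = count() by AccountId;
let Baseline = CloudAppEvents
    | where TimeGenerated between (ago(14d) .. ago(1d))
    | where ActionType == "FileDownloaded"
    | summarize BaselineTotal = count() by AccountId
    | extend DailyAvg = BaselineTotal / 13.0;  // 13 full days in the baseline window
Current
| join kind=inner Baseline on AccountId
| where TodayCount > DailyAvg * 5             // 5x the historical daily average
| project AccountId, TodayCount, DailyAvg
```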
Pattern 5: Watchlist/TI match. Alert when an observed indicator matches a threat intelligence or watchlist entry. Used for: known malicious IP, domain, or file hash detection.
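A sketch assuming a hypothetical watchlist named MaliciousIPs whose SearchKey column holds the IP addresses:

```kusto
let MaliciousIPs = _GetWatchlist('MaliciousIPs')   // hypothetical watchlist name
    | project IPAddress = tostring(SearchKey);
SigninLogs
| where TimeGenerated > ago(1h)
| where IPAddress in (MaliciousIPs)
| project TimeGenerated, UserPrincipalName, IPAddress, ResultType
```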
Testing rules before deployment
Never deploy a rule directly to production without testing. A poorly written rule generates false positives that overwhelm the SOC queue, or false negatives that miss real threats.
Test step 1: Run the query manually. Paste the KQL query into the Logs blade and run it against historical data. Does it return results? Are the results true positives (real threats or realistic test data)?
Test step 2: Estimate alert volume. Run the query against 7 days of data with the production threshold. How many alerts would it have generated per day? If the answer is more than 10 per day for a single rule, the threshold needs tuning or the query needs refinement.
Test step 3: Verify entity mapping. Check that the output columns match the entity mapping you will configure. If you want to map an Account entity, the query must output a column with the user principal name or account name.
Test step 4: Deploy in simulation mode. Set the rule to “Disabled” initially. Review the query results manually for one week. If the false positive rate is acceptable, enable the rule. If not, tune the query or threshold before enabling.
Additional production detection examples
Impossible travel detection (location-based):
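A sketch of the computation, assuming LocationDetails carries geo-coordinates for the sign-in (coverage varies by tenant):

```kusto
SigninLogs
| where TimeGenerated > ago(1h)
| where ResultType == "0"
| extend Lat = toreal(LocationDetails.geoCoordinates.latitude),
         Lon = toreal(LocationDetails.geoCoordinates.longitude)
| where isnotempty(Lat) and isnotempty(Lon)
| sort by UserPrincipalName asc, TimeGenerated asc
| extend PrevUser = prev(UserPrincipalName),
         PrevLat = prev(Lat), PrevLon = prev(Lon),
         PrevTime = prev(TimeGenerated)
| where UserPrincipalName == PrevUser          // compare consecutive sign-ins per user
| extend DistanceKm = geo_distance_2points(PrevLon, PrevLat, Lon, Lat) / 1000.0,
         Hours = datetime_diff('second', TimeGenerated, PrevTime) / 3600.0
| where Hours > 0 and DistanceKm / Hours > 1000   // faster than 1000 km/h
| project TimeGenerated, UserPrincipalName, IPAddress, DistanceKm,
          SpeedKmh = round(DistanceKm / Hours, 0)
```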
This detects sign-ins that would require travelling faster than 1000 km/h — physically impossible. Adjust the speed threshold to account for VPN use, which can make the same user appear in widely separated locations within minutes.
New admin role assignment:
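One possible implementation, assuming the AuditLogs schema for Entra ID role changes. The property parsing is an assumption to verify against your data, since modifiedProperties layouts vary:

```kusto
AuditLogs
| where TimeGenerated > ago(1h)
| where OperationName == "Add member to role"
| mv-expand TargetResource = TargetResources
| where tostring(TargetResource.type) == "User"
| extend Target = tostring(TargetResource.userPrincipalName),
         ModifiedProps = tostring(TargetResource.modifiedProperties)
| where ModifiedProps has "Global Administrator"   // or extend to a list of privileged roles
| extend Initiator = iff(isnotempty(InitiatedBy.user),
             tostring(InitiatedBy.user.userPrincipalName),
             tostring(InitiatedBy.app.displayName))
| project TimeGenerated, Initiator, Target
```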
Privileged role assignment outside of a PIM workflow may indicate an attacker escalating privileges with a compromised admin account.
Rule naming conventions
Consistent naming enables quick identification in the Analytics blade, workbooks, and automation rule conditions.
Recommended format: [Priority]-[Attack Stage]-[Description]
Examples:
- P1-InitialAccess-BruteForceSuccessFromExternalIP
- P1-Persistence-InboxRuleForwardingToExternal
- P2-LateralMovement-RDPFromUnusualSource
- P2-PrivilegeEscalation-NewGlobalAdminAssignment
- P3-Discovery-ADReconQueryBurst
The priority prefix (P1/P2/P3) enables automation rule conditions: “If analytics rule name starts with P1 → assign to senior analyst AND run containment playbook.” The attack stage prefix enables workbook grouping by kill chain phase.
Schedule optimisation: balancing detection latency with resource consumption
Every rule execution consumes workspace query resources. A workspace with 200 rules running every 5 minutes executes 2,400 queries per hour — which can cause query throttling and increased latency for all rules.
High-priority rules (P1): 5-minute schedule. Brute-force success, security log cleared, honeytoken activation, high-confidence TI match. These justify the resource cost for fast detection.
Standard rules (P2): 15-60 minute schedule. Inbox rule creation, suspicious process execution, unusual admin activity. These threats develop over minutes to hours — a 15-minute detection delay is acceptable.
Low-priority rules (P3): 1-4 hour schedule. Policy violations, posture deviations, informational anomalies. These are not time-critical and benefit from less frequent execution.
Daily rules: 24-hour schedule. Compliance checks, baseline reports, dormant account detection. Run once per day during off-peak hours to minimise resource impact.
Monitoring rule execution health
After deployment, monitor whether your rules are executing successfully.
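A health query along these lines, assuming the SentinelHealth table is populated (health monitoring must be enabled in Sentinel settings); verify the resource-type value against your own data:

```kusto
SentinelHealth
| where TimeGenerated > ago(7d)
| where SentinelResourceType == "Analytics Rule"
| summarize Runs = count(),
            Succeeded = countif(Status == "Success")
    by RuleName = SentinelResourceName
| extend SuccessRate = round(100.0 * Succeeded / Runs, 1)
| where SuccessRate < 100
| order by SuccessRate asc
```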
Rules with a success rate below 100% need investigation. Common failures: the KQL query times out (optimise the query), the query references a table that does not exist or has no data (a connector was disconnected), or a KQL syntax error was introduced during editing.
Query timeout handling. Scheduled rules have a maximum execution time. If your KQL query is complex (multiple joins, large time windows, many tables), it may time out. Optimise by: adding time filters as early as possible in the query (where TimeGenerated > ago(1h) as the first operator), reducing join complexity (materialise intermediate results with let statements), and narrowing the lookback window.
KQL query performance for analytics rules
Analytics rules run automatically — you cannot watch them execute and optimise in real time. Write efficient KQL from the start.
Performance pattern 1: Filter early. Place where TimeGenerated > ago(...) and other filters at the beginning of the query. This reduces the data volume that subsequent operators process.
Performance pattern 2: Materialise lookups. If your query joins with a watchlist or a small lookup table, materialise it first with let:
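A sketch using a hypothetical HighRiskIPs watchlist:

```kusto
let HighRiskIPs = materialize(
    _GetWatchlist('HighRiskIPs')              // hypothetical watchlist name
    | project IPAddress = tostring(SearchKey)
);
SigninLogs
| where TimeGenerated > ago(1h)
| where IPAddress in (HighRiskIPs)
| project TimeGenerated, UserPrincipalName, IPAddress
```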
The materialize function evaluates the watchlist once and caches the result — avoiding re-evaluation for each row in SigninLogs.
Performance pattern 3: Avoid unnecessary columns. Use project to select only the columns needed for the detection and entity mapping. Processing fewer columns = faster execution.
Performance pattern 4: Limit join cardinality. If joining two large tables, add filters to both sides before the join to reduce the number of rows being matched.
The starter rule library: 10 rules every Sentinel deployment needs
Deploy these 10 rules as the minimum viable detection library. They cover the most common attack techniques with well-tested KQL.
- Brute-force followed by success (T1110) — SigninLogs. Schedule: 15 min.
- Inbox rule forwarding to external (T1114.003) — CloudAppEvents. Schedule: 15 min.
- Security log cleared (T1070.001) — SecurityEvent. NRT rule.
- New Global Admin assignment (T1098.003) — AuditLogs. Schedule: 15 min.
- Suspicious PowerShell execution (T1059.001) — DeviceProcessEvents. Schedule: 15 min.
- Impossible travel (T1078) — SigninLogs. Schedule: 1 hour.
- Mass file download (T1530) — CloudAppEvents. Schedule: 1 hour.
- TI match on sign-in IP (Custom) — SigninLogs + Watchlist. Schedule: 15 min.
- OAuth consent grant to risky app (T1550.001) — AuditLogs. Schedule: 15 min.
- MFA fatigue/push spam (T1621) — SigninLogs. Schedule: 5 min.
Each rule maps to a specific MITRE technique, uses one of the five query patterns from earlier in this subsection, and includes entity mappings for Account and IP. Together, these 10 rules provide baseline detection coverage for the most common M365 attack vectors. Expand from this foundation using the detection engineering lifecycle (subsection 9.11).
Starter rule KQL: MFA fatigue and OAuth consent grant
Two of the starter rules deserve detailed KQL because they detect increasingly common attack techniques.
MFA fatigue / push spam detection (T1621):
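A sketch of the detection. ResultType 500121 is commonly associated with a strong-authentication request that failed or was not completed, but validate the code and the PushThreshold against your own sign-in logs:

```kusto
let PushThreshold = 10;                       // denied/unanswered prompts before alerting
let Window = 1h;
let Spammed = SigninLogs
    | where TimeGenerated > ago(Window)
    | where ResultType == "500121"            // MFA challenge denied or not completed
    | summarize PushCount = count(),
                LastDenied = max(TimeGenerated)
        by UserPrincipalName, IPAddress
    | where PushCount >= PushThreshold;
Spammed
| join kind=leftouter (
    SigninLogs
    | where TimeGenerated > ago(Window)
    | where ResultType == "0"
    | project ApprovedTime = TimeGenerated, UserPrincipalName
) on UserPrincipalName
| extend UserApproved = ApprovedTime > LastDenied   // approval after the push storm suggests compromise
| project UserPrincipalName, IPAddress, PushCount, LastDenied, UserApproved
```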
This detects the increasingly common MFA fatigue attack: an attacker with a valid password sends repeated MFA push notifications hoping the user will eventually approve out of frustration. The join checks whether the user subsequently approved — indicating successful compromise.
Suspicious OAuth consent grant (T1550.001):
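One possible shape, keying on the "Consent to application" operation in AuditLogs. The permission list is an illustrative starting point, not an exhaustive set:

```kusto
let RiskyPermissions = dynamic([
    "Mail.Read", "Mail.ReadWrite", "MailboxSettings.ReadWrite",
    "Files.ReadWrite.All", "Directory.ReadWrite.All", "offline_access"]);
AuditLogs
| where TimeGenerated > ago(1h)
| where OperationName == "Consent to application"
| extend AppName = tostring(TargetResources[0].displayName),
         Props   = tostring(TargetResources[0].modifiedProperties)
| where Props has_any (RiskyPermissions)
| extend Consenter   = tostring(InitiatedBy.user.userPrincipalName),
         ConsenterIP = tostring(InitiatedBy.user.ipAddress)
| project TimeGenerated, Consenter, ConsenterIP, AppName, Props
```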
OAuth consent phishing tricks users into granting a malicious application access to their mailbox, files, or directory. The rule detects consent grants for high-risk permissions — the same permissions an attacker needs for BEC and data exfiltration.
Rule deployment pre-flight checklist
Before enabling any new analytics rule, verify every item.
KQL validation: Query runs without errors in the Logs blade. Results contain the expected columns for entity mapping. No hardcoded values that should be parameters or watchlist lookups.
Schedule and lookback: Lookback >= schedule interval (no detection gaps). Schedule appropriate for detection urgency (P1: 5 min, P2: 15-60 min, P3: 1-4 hr).
Threshold: Tested against 7-30 days of historical data. Expected daily alert volume is manageable (< 10/day for most rules). Threshold set above the baseline noise level.
Entity mapping: At least Account and IP mapped (for identity detections). Entity column names match the KQL output exactly. Entity identifiers use the correct format (FullName for UPN, Address for IP).
Custom details: At least 3 context fields added. Dynamic alert title configured. Dynamic severity configured if applicable.
MITRE ATT&CK: At least one technique mapped. Tactic is appropriate for the detection stage.
Alert grouping: Entity-based grouping configured for rules that can fire multiple times per entity. Time window set appropriately.
Naming: Follows the [Priority]-[Stage]-[Description] convention. Name is searchable and unambiguous.
Documentation: Rule description explains what it detects, why it matters, and what the analyst should do. Version history tracked (in description or Git).
Try it yourself
Create a scheduled analytics rule using the brute-force detection query above. Set the schedule to every 15 minutes with a 1-hour lookback. Lower the query's FailureThreshold parameter to 5, since a lab with limited activity will not reach the production threshold. Map entities: IP (IPAddress), Account (CompromisedAccount). Enable alert grouping by the IP entity. Run the rule and verify it executes without errors. If your lab has sign-in activity, generate a few failed sign-ins followed by a success to trigger the rule.
What you should observe
The rule appears in the Active rules list with status "Enabled." After the first execution cycle (15 minutes), check the rule's last run time and result count in the Analytics blade. If the rule matched events, an incident appears in the incident queue with the mapped entities (IP and Account) visible in the incident details.
Knowledge check
Check your understanding
1. Your scheduled rule runs every 5 minutes with a 1-hour lookback. What operational issue does this create?