DE1.3 Scheduled Rules — Frequency and Trigger Logic

2-3 hours · Module 1 · Free
Operational Objective
The Frequency-Latency Trade-off: Query frequency determines how quickly the rule detects threats. A rule that runs every 5 minutes detects within 5 minutes. A rule that runs every 24 hours detects within 24 hours. But higher frequency means more query executions per day, more compute consumption, and potentially more duplicate alerts. Trigger logic determines the threshold at which query results become alerts — too low and the SOC drowns in noise, too high and real attacks slip through. This subsection teaches how to choose the right frequency for each detection, how trigger logic determines when an alert fires, and the NE-specific frequency allocation strategy.
Deliverable: A frequency selection methodology based on severity, a trigger threshold configuration approach, and the NE frequency allocation strategy.
⏱ Estimated completion: 25 minutes

Detection latency explained

Detection latency — the time between a technique executing and the alert firing — has two components: ingestion latency (how long between the event occurring and it appearing in the workspace) and rule latency (how long between the event appearing in the workspace and the next rule execution).

Ingestion latency varies by data source. Microsoft first-party connectors (SigninLogs, DeviceProcessEvents, EmailEvents) typically ingest within 1-5 minutes. Third-party connectors via CEF/Syslog vary from 1-15 minutes depending on the forwarder configuration and batching interval. Custom tables via Data Collection Rules vary widely — some near real-time, some batched hourly.

Rule latency equals the query frequency. A rule that runs every 5 minutes has a maximum rule latency of 5 minutes (the event arrives just after the last execution) and an average rule latency of 2.5 minutes. A rule that runs every hour has an average rule latency of 30 minutes.

Total detection latency = ingestion latency + rule latency. For a SigninLogs query running every 5 minutes: approximately 3 minutes ingestion + 2.5 minutes average rule latency = approximately 5.5 minutes average total detection latency. For the same query running every hour: 3 minutes + 30 minutes = 33 minutes.
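The latency arithmetic above can be captured in a small helper. A sketch in Python; the function name is illustrative (this is plain arithmetic, not a Sentinel API):

```python
def detection_latency(ingestion_min: float, frequency_min: float) -> dict:
    """Average and worst-case detection latency for a scheduled rule.

    Rule latency averages half the frequency (the event lands midway
    between executions) and peaks at the full frequency (the event
    lands just after an execution).
    """
    return {
        "average": ingestion_min + frequency_min / 2,
        "worst_case": ingestion_min + frequency_min,
    }

# SigninLogs (~3 min ingestion) at 5-minute frequency:
print(detection_latency(3, 5))    # average 5.5 min, worst case 8 min
# The same query at 1-hour frequency:
print(detection_latency(3, 60))   # average 33 min, worst case 63 min
```

The same helper reproduces the section's figures: ~5.5 minutes average at 5-minute frequency, ~33 minutes average at hourly frequency.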

DETECTION LATENCY BY FREQUENCY — NE IMPACT ANALYSIS

5-minute: avg latency ~5.5 min, 288 executions/day. CHAIN-MESH: detect recon 113 min before ransomware.
15-minute: avg latency ~10.5 min, 96 executions/day. CHAIN-HARVEST: detect AiTM 217 min before BEC sent.
30-minute: avg latency ~18 min, 48 executions/day. Suspicious inbox rules; adequate for Medium severity.
1-hour: avg latency ~33 min, 24 executions/day. Anomalous file access; Medium severity, low volume.
24-hour: avg latency ~12 hr, 1 execution/day. CA policy changes; daily review items.

NE compute budget: 5 rules × 288 + 10 rules × 96 + 15 rules × 48 + 10 rules × 24 + 10 rules × 1 = 3,370 executions/day, within workspace limits.

Figure DE1.3 — Detection latency by frequency with NE-specific examples. Higher frequency = faster detection = more compute. The NE allocation distributes compute budget across severity tiers.

Frequency selection by severity

Not every rule needs to run every 5 minutes. The frequency should match the severity and urgency of the detection — and the available compute budget.

Critical severity → NRT or 5-minute frequency. Detections where delayed detection leads to irreversible damage. Ransomware pre-encryption (vssadmin shadow copy deletion), confirmed credential compromise (LSASS memory access), active C2 beacon activation. For these, consider NRT rules (DE1.6) instead of 5-minute scheduled rules. Limit to 3-5 rules at this frequency.

High severity → 5-15 minute frequency. Detections where the attack is in progress but containment within 15 minutes prevents significant damage. AiTM token theft, lateral RDP to a domain controller, PIM Global Admin activation, suspicious inbox rule creation from a risky session. The attacker has a foothold but has not achieved their objective. Allocate 10-15 rules at this frequency.

Medium severity → 30 minutes to 4 hours. Detections where the pattern is suspicious but not confirmed malicious. Anomalous sign-in properties, unusual file access volumes, first-time access to a sensitive resource, app registration with elevated permissions. These require investigation to confirm. Running them at 5-minute frequency adds alert volume without proportional detection value. Allocate 15-25 rules.

Low/Informational severity → 12-24 hours. Detections for posture monitoring — configuration changes, new app registrations, permission grants, user lifecycle events. These are not incidents requiring immediate response. They are data points for the daily security review or the weekly posture report. Allocate 10-15 rules.

Trigger logic

The trigger condition determines when query results become an alert. The default trigger is “Alert when number of query results is greater than 0” — any returned row generates an alert.

Threshold triggers filter for volume-based patterns. A single failed sign-in is normal. Fifty failed sign-ins from one IP in 30 minutes is a password spray. Setting the trigger to “greater than 50” means the rule fires only when the aggregate threshold is exceeded — not on every individual failure. The threshold applies to the row count returned by the query, not to a field within the results. If your query uses summarize count() by IPAddress and returns 3 rows (3 IPs each exceeding the threshold), the trigger evaluates against “3 rows returned” — it fires because 3 > 0.
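The row-count semantics described above are a common pitfall, so here is a small Python sketch that mimics the trigger evaluation. The rows and function are hypothetical, not Sentinel objects:

```python
# The scheduled-rule trigger compares the NUMBER OF ROWS the query
# returned against the configured threshold -- not any field inside
# the rows. Hypothetical results of "summarize count() by IPAddress":
rows = [
    {"IPAddress": "203.0.113.5",  "FailedAttempts": 72},
    {"IPAddress": "198.51.100.9", "FailedAttempts": 55},
    {"IPAddress": "192.0.2.44",   "FailedAttempts": 61},
]

def trigger_fires(result_rows: list, threshold: int = 0) -> bool:
    """Mimics 'Alert when number of query results is greater than <threshold>'."""
    return len(result_rows) > threshold

print(trigger_fires(rows, 0))    # True: 3 rows > 0, alert fires
print(trigger_fires(rows, 50))   # False: only 3 rows were returned,
                                 # even though each IP exceeds 50 failures
```

Setting the trigger threshold to 50 here would silence a detection where every row already represents a spray source, which is why the threshold usually belongs in the KQL instead.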

Dynamic thresholds via KQL: For more sophisticated triggering, encode the threshold in the query itself rather than the trigger configuration. Use | where FailedAttempts > 50 and DistinctUsers > 5 in the KQL — the query returns 0 rows when the condition is not met (no alert) and returns rows only when the multi-dimensional threshold is exceeded (alert fires). This approach keeps the trigger at “greater than 0” while the KQL handles the complex threshold logic.

// THRESHOLD IN KQL: multi-dimensional trigger
let timeWindow = 30m;
let failThreshold = 50;
let userThreshold = 5;
SigninLogs
| where TimeGenerated > ago(timeWindow)
| where ResultType == "50126"  // Invalid username or password
| summarize
    FailedAttempts = count(),
    DistinctUsers = dcount(UserPrincipalName),
    TargetUsers = make_set(UserPrincipalName, 25),
    FirstAttempt = min(TimeGenerated),
    LastAttempt = max(TimeGenerated)
    by IPAddress
| where FailedAttempts > failThreshold
    and DistinctUsers > userThreshold
// Only rows exceeding BOTH thresholds are returned
// Trigger config: "Alert when results > 0" -- the KQL handles filtering

Event grouping within a trigger

When the query returns multiple rows and the trigger fires, Sentinel can handle the results in two ways:

Group all events into a single alert. All rows from the query execution are bundled into one alert. The alert details contain all results as an array. Use this for detections where the aggregate pattern is the finding (one password spray = one alert containing all targeted users).

Trigger an alert for each event. Each row produces its own alert. Use this for detections where each row represents an independent finding that requires separate tracking (each compromised account from a spray needs its own investigation thread).

The choice affects SOC workflow directly. A spray rule with “each event” grouping and 15 rows returned produces 15 alerts — if alert-to-incident grouping is also “one per alert” (DE1.9), the SOC sees 15 incidents for one attack. Usually wrong. A spray rule with “all events in one alert” produces 1 alert, which creates 1 incident containing the complete spray data. Usually correct.
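The alert-volume difference between the two grouping modes can be sketched in Python (the rows, account names, and function are illustrative, not Sentinel constructs):

```python
# One query execution returned 15 rows -- one per sprayed account.
rows = [{"account": f"user{i}@ne.example"} for i in range(15)]

def alerts_for(result_rows: list, grouping: str) -> list:
    """Model the two event-grouping modes as lists of alerts."""
    if grouping == "single":       # "Group all events into a single alert"
        return [result_rows] if result_rows else []
    if grouping == "per_event":    # "Trigger an alert for each event"
        return [[row] for row in result_rows]
    raise ValueError(grouping)

print(len(alerts_for(rows, "single")))     # 1 alert containing all 15 rows
print(len(alerts_for(rows, "per_event")))  # 15 alerts, one per row
```

With "one incident per alert" grouping downstream, those 15 alerts become 15 incidents for a single spray, which is the workflow problem the paragraph above describes.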

⚠ Compliance Myth: "Set all rules to 5-minute frequency for maximum detection speed"

The myth: Faster is always better. Running all rules at 5-minute frequency provides the fastest possible detection across the board.

The reality: Running 50 rules at 5-minute frequency means 50 × 288 = 14,400 query executions per day. Each execution scans data and consumes compute resources. Low-severity rules running at 5-minute frequency generate alert volume that overwhelms the SOC without proportional security value — the analyst triages these alerts hours later regardless of when they fired. The 5-minute frequency should be reserved for the 5-10 rules detecting critical and high-severity techniques. Running the remaining rules at lower frequencies reduces total daily executions by 70-80% without meaningfully increasing detection latency for non-critical detections. Northgate Engineering’s target allocation: ~3,370 daily executions across 50 rules — achievable by matching frequency to severity.

Try it yourself

Exercise: Calculate your daily query execution budget

For each active scheduled rule, calculate: 1440 / frequency_in_minutes = daily executions. Sum across all rules. Compare to your workspace query limits (check Azure Monitor documentation for current limits). If your total exceeds the limit, identify which low-severity rules can be moved to lower frequencies.

For NE's target library of 50 rules: 5 × 288 (5-min) + 10 × 96 (15-min) + 15 × 48 (30-min) + 10 × 24 (1-hr) + 10 × 1 (24-hr) = 1,440 + 960 + 720 + 240 + 10 = 3,370 daily executions. Well within workspace limits.
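The same budget arithmetic as a Python sketch, with tier values taken from the NE allocation above (the list layout is illustrative):

```python
# (frequency_in_minutes, rule_count) per NE severity tier.
allocation = [
    (5, 5),      # Critical: 5 rules at 5-minute frequency
    (15, 10),    # High: 10 rules at 15-minute frequency
    (30, 15),    # Medium: 15 rules at 30-minute frequency
    (60, 10),    # Medium/low volume: 10 rules at 1-hour frequency
    (1440, 10),  # Low/Informational: 10 rules at 24-hour frequency
]

# executions/day for one rule = 1440 minutes / frequency_in_minutes
total = sum((1440 // freq) * rules for freq, rules in allocation)
print(total)  # 3370 daily executions
```

Swapping in your own (frequency, count) pairs turns this into the execution-budget calculator the exercise asks for.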

Check your understanding

A detection rule for PIM Global Admin activation runs every 60 minutes. An attacker activates Global Admin at 14:05. The rule last ran at 14:00. When is the earliest the alert can fire, and what is the maximum detection latency?

Answer: Earliest alert: 15:00 (the next scheduled execution). Maximum detection latency: ~55 minutes — the 14:05 activation waits until the 15:00 execution, and the ~3-minute ingestion latency is absorbed within that wait because the event lands in the workspace well before the next run. For PIM Global Admin activation — which could indicate the start of CHAIN-PRIVILEGE — this latency gives the attacker nearly an hour to create app registrations, access executive mailboxes, and begin exfiltration before the SOC is alerted. This rule should run at 5-minute frequency (High severity, significant impact), reducing maximum detection latency to ~8 minutes (5-minute frequency + ~3-minute ingestion). That is the difference between catching the attacker during app registration and catching them after exfiltration is complete.

Troubleshooting: Frequency and trigger issues

“My rule fires too often.” Two causes: the threshold is too low for your environment’s baseline activity, or the frequency is creating duplicate detections across overlapping lookback windows. Run the query in Advanced Hunting for 7 days. Count unique detections per day. If it exceeds 10 per day for a single rule, increase the threshold or add filtering.

“My rule never fires.” Either the threshold is too high (no events reach it), the attack pattern does not occur, or the lookback/frequency alignment is wrong (DE1.2). Validate by running the query manually against a known time period when the activity should have occurred.

“My rule fires with stale data — alerting on events from hours ago.” The lookback window is too long relative to the frequency. If the rule runs every hour with a 24-hour lookback, events from 23 hours ago are re-evaluated on every execution. Without deduplication (alert grouping), the same event generates alerts repeatedly. Reduce the lookback to match the frequency + buffer, or configure alert grouping to suppress duplicates.
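The overlap problem above can be quantified with a quick Python sketch of how many executions re-evaluate the same event (illustrative arithmetic, not a Sentinel setting):

```python
import math

def reevaluations(frequency_min: int, lookback_min: int) -> int:
    """How many consecutive executions see the same event when the
    lookback window is longer than the frequency."""
    return math.ceil(lookback_min / frequency_min)

# Hourly rule with a 24-hour lookback: every event is scanned 24 times.
print(reevaluations(60, 1440))  # 24
# Hourly rule with a 70-minute lookback (frequency + 10-min buffer):
print(reevaluations(60, 70))    # 2 -- minimal overlap, far fewer duplicates
```

Without alert grouping, each of those overlapping executions can raise its own alert for the same event, which is exactly the stale-data symptom described.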


References used in this subsection

  • Course cross-references: DE1.2 (lookback window), DE1.5 (severity-frequency alignment), DE1.6 (NRT for sub-minute latency), DE1.9 (alert-to-incident grouping), DE9 (threshold tuning)
