1. You are hunting for users with anomalous download volume. User A downloaded 500 files this week (their normal is 400). User B downloaded 50 files this week (their normal is 5). Which user should you investigate first, and which KQL pattern identifies them?
User A — 500 downloads is a higher absolute volume. Use `summarize Count = count() by user | sort by Count desc` to rank by volume.
User B — their deviation ratio is 10x (50/5) versus User A's 1.25x (500/400). The per-entity baseline comparison (TH2.1) identifies User B because the behavior change is larger relative to their own history, even though the absolute volume is smaller. Sorting by count ranks User A higher. Sorting by deviation ratio ranks User B higher. In hunting, the behavior change is the signal — not the absolute volume.
Both equally — any deviation from baseline warrants investigation regardless of magnitude.
Neither — 500 and 50 are both below the z-score threshold of 3 for the population.
Correct. The deviation ratio (current / baseline) identifies the entity whose behavior changed the most. User B's 10x increase is a stronger hunting signal than User A's 1.25x increase. Population z-scores might rank User A higher (higher absolute volume), but the per-entity comparison correctly prioritizes User B.
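The per-entity baseline comparison can be sketched as follows. The table and column names (CloudAppEvents, ActionType, AccountUpn) and the window arithmetic are illustrative assumptions; adapt them to your schema.

```kql
// Sketch: rank users by deviation from their own baseline, not absolute volume.
// Table and column names are illustrative, not a prescribed schema.
let Baseline = CloudAppEvents
    | where Timestamp between (ago(37d) .. ago(7d))           // ~30-day history
    | where ActionType == "FileDownloaded"
    | summarize BaselineWeekly = count() / 4.0 by AccountUpn; // approximate weekly rate
CloudAppEvents
| where Timestamp > ago(7d)
| where ActionType == "FileDownloaded"
| summarize CurrentWeekly = count() by AccountUpn
| join kind=inner Baseline on AccountUpn
| extend DeviationRatio = CurrentWeekly / BaselineWeekly
| sort by DeviationRatio desc   // User B (10x) ranks above User A (1.25x)
```

Sorting by `CurrentWeekly` instead would reproduce the wrong answer: User A on top.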
2. You run make-series with step 1d (daily bins) on sign-in data to detect AiTM token replay bursts. The query returns no anomalies. A colleague suggests the AiTM session produced a burst of 40 token refreshes within a single hour. Why might the daily bins have missed it?
Daily bins aggregate the hour-level burst into a single day's total. If the user's normal daily sign-in count is 50, and the day with the AiTM burst had 90 total sign-ins (50 normal + 40 burst), the daily value of 90 may not be anomalous enough to trigger series_decompose_anomalies at the daily level — because 90 is only 1.8x the normal of 50. With hourly bins, the burst hour would show 40 sign-ins against a baseline of 2–3 per hour — a clear anomaly. The bin size must match the technique's temporal pattern. AiTM token replay produces intra-hour bursts. Hourly or sub-hourly bins are required.
Daily bins are always sufficient — the anomaly detection threshold should be lowered instead.
The query should use AADNonInteractiveUserSignInLogs instead of SigninLogs — the table choice is the issue, not the bin size.
make-series cannot detect intra-day bursts at any bin size — use a different operator.
Correct. The daily bin diluted the burst inside the day's total. The third option is also partially correct — AADNonInteractiveUserSignInLogs is the better table for token replay — but the question is specifically about why daily bins missed a known intra-hour burst, which is a bin size problem.
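Under the stated assumptions (the SigninLogs schema, a 2.5 anomaly-score threshold), an hourly-bin version might look like this:

```kql
// Hourly bins expose the intra-hour burst that daily bins dilute.
SigninLogs
| where TimeGenerated > ago(14d)
| make-series SignIns = count() default = 0
    on TimeGenerated step 1h by UserPrincipalName
| extend (Anomalies, Score, Baseline) =
    series_decompose_anomalies(SignIns, 2.5, -1, 'linefit')
| mv-expand TimeGenerated to typeof(datetime),
            SignIns to typeof(long),
            Anomalies to typeof(double)
| where Anomalies != 0   // the 40-sign-in burst hour stands out against ~2-3/hour
```

With `step 1d` the same query sees only the 90-total day, which is why it stays quiet.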
3. Your hunt query uses a let statement that references a 30-day baseline. The query joins the baseline against current data and then uses the baseline again to calculate a population average. The query times out after 10 minutes. What is the fix?
Reduce the baseline window from 30 days to 7 days to decrease data volume.
Wrap the let statement in materialize(). The baseline is referenced twice (join + aggregate), so without materialize, the full 30-day scan runs twice. materialize() caches the result after the first scan, and the second reference uses the cache. This halves the computation and should bring the query within the 10-minute timeout.
Split the query into two separate queries and run them sequentially.
Run the query in Sentinel Log Analytics instead of Advanced Hunting — it has longer timeout limits.
Correct. materialize() is the targeted fix for multi-reference subquery timeouts. The first option (shorten baseline) would work but degrades the baseline quality. The third option (split queries) would work but requires manual state management between queries. The fourth option may have different limits but does not fix the fundamental redundant computation.
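A hedged sketch of the fix, assuming a SigninLogs-based 30-day baseline; the column names are illustrative:

```kql
// materialize() caches the 30-day baseline so both references reuse one scan.
let Baseline = materialize(
    SigninLogs
    | where TimeGenerated between (ago(37d) .. ago(7d))
    | summarize BaselineCount = count() by UserPrincipalName
);
let PopAvg = toscalar(Baseline | summarize avg(BaselineCount)); // 2nd reference: served from cache
SigninLogs
| where TimeGenerated > ago(7d)
| summarize CurrentCount = count() by UserPrincipalName
| join kind=inner Baseline on UserPrincipalName                 // 1st reference: served from cache
| extend VsOwnBaseline = CurrentCount * 1.0 / BaselineCount,
         VsPopulationAvg = CurrentCount / PopAvg
```

Without `materialize()`, the `let` expression is inlined at each reference and the 30-day scan runs twice.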
4. You run autocluster() on 200 hunt results (sign-ins from new IPs). It identifies 3 clusters covering 90% of results. You exclude the clustered results and investigate the remaining 20. After investigation, 18 are legitimate (VPN changes from a provider not in your exclusion list) and 2 are genuinely suspicious. What should you do with the 18 false positives?
Discard them — they are noise and do not contribute to the hunt outcome.
Document the 18 as a new FP category (TH1.11): "VPN provider [name] IPs not in current exclusion list." Add these IPs to the HuntExclusions_VPNIPs watchlist. When the detection rule is built from this hunt, the watchlist exclusion prevents these 18 from becoming recurring false positive alerts. The 18 FPs just made the detection rule more precise before it was even deployed — they are the most useful part of the hunt.
Add them as a 4th cluster in the autocluster output and rerun the analysis.
Lower the autocluster SizeWeight to capture them in the initial clustering pass.
Correct. False positives are data. The 18 VPN-related FPs produce a watchlist update that improves both the current hunt's exclusion accuracy and the future detection rule's precision. TH1.11 covers this FP analysis loop in detail.
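In Microsoft Sentinel, the watchlist exclusion described above could be applied roughly like this. The watchlist name comes from the answer; the rest of the query shape, including the assumption that SearchKey holds the IP, is illustrative:

```kql
// Exclude documented FP IPs before the detection logic runs.
let VpnExclusions = _GetWatchlist('HuntExclusions_VPNIPs')
    | project IPAddress = tostring(SearchKey);   // assumes SearchKey holds the IP
SigninLogs
| where TimeGenerated > ago(1d)
| where IPAddress !in (VpnExclusions)
// ...remainder of the new-IP detection logic...
```

Updating the watchlist, rather than hardcoding IPs in the query, lets the same exclusion serve both the hunt and the eventual detection rule.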
5. You write a hunt query that filters SigninLogs with: `| where AppDisplayName == "microsoft azure portal"`. The query returns zero results. You know users access the Azure Portal daily. What is wrong?
The Azure Portal generates sign-ins in a different table, not SigninLogs.
Anti-pattern 1: case-sensitive comparison. The `==` operator is case-sensitive, and the actual value in SigninLogs is "Microsoft Azure Portal" (title case), not "microsoft azure portal" (lowercase). The fix: use `=~` for case-insensitive equality; `has` is also case-insensitive, but it matches whole indexed terms rather than testing exact equality. This anti-pattern returns zero results with no error: the query runs successfully but matches nothing.
The time window is too short — extend to 30 days.
AppDisplayName is a dynamic field that needs parse_json before comparison.
Correct. The `==` operator is case-sensitive in KQL, so "microsoft azure portal" does not match "Microsoft Azure Portal". Use `=~` for case-insensitive equality. This is the most common anti-pattern in hunting queries because it produces zero results (easily mistaken for "no activity") rather than an error.
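Side by side, assuming the standard SigninLogs schema (the projected columns are an illustrative choice):

```kql
// Broken: case-sensitive equality matches nothing, silently.
// SigninLogs | where AppDisplayName == "microsoft azure portal"
SigninLogs
| where TimeGenerated > ago(1d)
| where AppDisplayName =~ "microsoft azure portal"   // matches "Microsoft Azure Portal"
| project TimeGenerated, UserPrincipalName, IPAddress
```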
6. You need the most recent sign-in for each of 500 suspect users. Which approach is correct?
For each user, run a separate query: `SigninLogs | where UserPrincipalName == "user" | sort by TimeGenerated desc | take 1`. Repeat 500 times.
`SigninLogs | where UserPrincipalName in (suspectList) | sort by TimeGenerated desc | take 500` — this returns the 500 most recent sign-ins globally, not one per user.
`SigninLogs | where UserPrincipalName in (suspectList) | summarize arg_max(TimeGenerated, *) by UserPrincipalName` — this returns the most recent sign-in per user in a single pass. arg_max extracts the full row with the maximum TimeGenerated for each group. One query, 500 results (one per user), all columns preserved.
`SigninLogs | where UserPrincipalName in (suspectList) | top 1 by TimeGenerated` — this returns only the single most recent sign-in across all 500 users.
Correct. arg_max(TimeGenerated, *) by UserPrincipalName is the per-entity extraction pattern. It returns one row per user (the most recent) with all columns. The other approaches either require 500 separate queries, return global instead of per-user results, or return only one row total.
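The correct pattern written out in full; the contents of suspectList are illustrative (in practice it would hold 500 UPNs, likely loaded from a watchlist):

```kql
// One row per suspect user: their most recent sign-in, all columns preserved.
let suspectList = dynamic(["usera@contoso.com", "userb@contoso.com"]); // illustrative
SigninLogs
| where UserPrincipalName in (suspectList)
| summarize arg_max(TimeGenerated, *) by UserPrincipalName
```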
7. Your hunt finds a phishing email delivered at 09:45 UTC and a risky sign-in at 10:23 UTC for the same user. You join with: `| where SignInTime between (PhishTime .. (PhishTime + 48h))`. Another analyst points out that a different sign-in for the same user at 08:30 UTC (before the phishing email) also appears in the results. Why?
The between clause includes events before PhishTime — it is a bidirectional window.
The join matched on UserPrincipalName, so every phishing email the user received is paired with every one of their sign-ins before the between clause runs. Each phishing email defines its own 48-hour window. If the user received an earlier phishing email, the 08:30 sign-in can fall inside that earlier email's window even though it precedes the 09:45 email. The fix: add a strict-ordering filter (SignInTime > PhishTime) and match each sign-in to only the nearest preceding phishing email, not every email in range.
The between operator in KQL is inclusive of both boundaries and rounds to the nearest hour.
The 08:30 sign-in is a false positive that should be filtered out manually.
Correct. Joins that match on entity (user) with temporal filters can produce unexpected results when multiple events match in either table. The between clause constrains timing but the join may match across multiple phishing emails if the user received more than one. Strict ordering (SignInTime > PhishTime) and careful deduplication are needed.
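One way to implement the fix. EmailEvents as the phishing source and the risk-level filter are assumptions; the core of the sketch is the strict-ordering predicate and the per-sign-in deduplication:

```kql
// Strict ordering plus nearest-preceding-phish matching (illustrative sketch).
EmailEvents
| where ThreatTypes has "Phish"
| project UserPrincipalName = tolower(RecipientEmailAddress), PhishTime = Timestamp
| join kind=inner (
    SigninLogs
    | where RiskLevelDuringSignIn in ("medium", "high")
    | project UserPrincipalName = tolower(UserPrincipalName), SignInTime = TimeGenerated
) on UserPrincipalName
| where SignInTime > PhishTime and SignInTime <= PhishTime + 48h   // strict: no pre-phish sign-ins
| summarize arg_max(PhishTime, *) by UserPrincipalName, SignInTime // keep only the nearest preceding phish
```

The final `summarize` collapses the multi-window matches: each sign-in keeps only the latest phishing email that still precedes it.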
8. You want to track which applications each user accesses and how frequently, to compare a suspect user's app profile against their baseline. Which aggregation pattern produces a per-user application-to-frequency map?
summarize count() by UserPrincipalName, AppDisplayName — produces one row per user-app combination (a flat table, not a map).
make_set(AppDisplayName) by UserPrincipalName — produces a list of apps per user but without frequency counts.
First summarize count() by User, App, then summarize make_bag(bag_pack(App, Count)) by User. This produces one row per user with a JSON bag: {"Outlook": 45, "Teams": 30, "SharePoint": 12}. The bag is the per-user app profile — it maps each application to its access frequency. Compare the suspect user's bag against their baseline bag to identify new or disproportionate application access.
top-nested 10 of UserPrincipalName, top-nested 5 of AppDisplayName — produces a hierarchical frequency table but only for the top users and top apps.
Correct. The two-step pattern — summarize count per user+app, then make_bag per user — produces a key-value profile. The first option is a flat table (useful but not a map). The second loses frequency. The fourth is top-N only, not a complete profile.
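The two-step pattern written out against SigninLogs; the 30-day window is an illustrative choice:

```kql
// Two-step profile: count per user+app, then fold the pairs into a per-user bag.
SigninLogs
| where TimeGenerated > ago(30d)
| summarize AppCount = count() by UserPrincipalName, AppDisplayName
| summarize AppProfile = make_bag(bag_pack(AppDisplayName, AppCount)) by UserPrincipalName
// AppProfile per user: {"Outlook": 45, "Teams": 30, "SharePoint": 12}
```

Comparing a suspect's current bag against a baseline bag built the same way over an earlier window surfaces apps that are new or accessed at unusual rates.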