Basic summarize with count() and dcount() handles simple aggregations. Hunting requires more — conditional counting, set operations on aggregated data, rolling windows, and multi-level aggregation that answers layered questions like "for each user, across how many unique countries, accessing how many distinct applications, with how many sessions exceeding the baseline?" This subsection teaches the aggregation patterns that campaign modules use for complex behavioral analysis.
Deliverable: Fluency with countif(), make_set(), make_bag(), set intersections, and multi-pass aggregation patterns used in M365 hunting campaigns.
⏱ Estimated completion: 25 minutes
Beyond count and dcount
Every analyst knows count() and dcount(). The aggregation patterns below are less common but appear in every campaign module.
countif() avoids multiple passes over the same data. Without it, you would need separate queries for each condition, then join the results. With it, one pass produces all the counts.
make_set() and make_bag() — collecting values per entity
//Per-userprofile:allIPs,allcountries,allapps,alldevicesSigninLogs|whereTimeGenerated>ago(7d)|whereResultType==0|summarizeIPs=make_set(IPAddress,20),Countries=make_set(tostring(LocationDetails.countryOrRegion),10),Apps=make_set(AppDisplayName,15),Devices=make_set(tostring(DeviceDetail.displayName),10),SignInCount=count()byUserPrincipalName//Eachuser's complete authentication profile in one row
// make_set(x, N) collects up to N distinct values
// This profile is the basis for anomaly comparison:
// "this user normally signs in from 3 IPs in 2 countries
// accessing 5 apps from 2 devices"
// A hunt that finds them signing in from a 4th country
// accessing a 6th app from a 3rd device = investigate
Comparing two users’ profiles — shared compromise indicators
When a hunt finds two suspect users, comparing their profiles reveals whether they share infrastructure (same attacker) or are independent compromises:
Other set operations available: set_intersect() (common elements), set_union() (all elements from both), set_has_element() (membership test).
Multi-pass aggregation — layered questions
Some hunting questions require aggregating, then aggregating again. “How many users have more than 3 new IPs?” First aggregate per user (count new IPs), then aggregate across users (count users exceeding threshold).
//Buildakey-valueprofileperuser:app→sign-incountSigninLogs|whereTimeGenerated>ago(7d)|whereResultType==0|summarizeSignInCount=count()byUserPrincipalName,AppDisplayName|summarizeAppProfile=make_bag(bag_pack(AppDisplayName,SignInCount))byUserPrincipalName//Eachuserhasabag:{"Outlook":45,"Teams":30,"SharePoint":12}//Usefulforcomparingprofilesbetweenusersorbetweentimeperiods//Anattacker's app profile will differ from the user'snormalprofile//—accessingadmintoolstheuserneverused,ormail-onlyaccess//whentheusernormallyaccesses5+applications
Figure TH2.14 — Four advanced aggregation patterns. Each enables a multi-dimensional hunting analysis in a single query.
Try it yourself
Exercise: Build a per-user authentication profile
Run the make_set() profile query against your environment. Pick 3 users from the results. Examine their profiles: IPs, countries, apps, devices. Does the profile match your expectation for their role?
Then run the set_difference query. How many users have new IPs this week? This is the foundation of the TH4 campaign — you are previewing it with production data.
⚠ Compliance Myth: "Complex aggregation queries are hard to audit — keep queries simple for compliance"
The myth: Compliance teams cannot audit complex KQL queries. Simpler queries are more defensible.
The reality: Compliance audits evaluate whether the hunting methodology is followed and whether findings are documented — not whether the KQL is simple or complex. A complex query that correctly identifies behavioral anomalies with documented results is more defensible than a simple query that misses them. Document the purpose of each query in the hunt record (TH1.7). The auditor reviews the record, not the syntax. Query complexity is an analyst skill concern, not a compliance concern.
Extend these patterns
KQL also supports `variance()`, `variancep()`, and `binary_all_and()` / `binary_all_or()` for specialized aggregations. `variance()` is useful in TH12 (C2 beaconing) for detecting low-variance inter-arrival times that indicate automated communication. `binary_all_and()` is useful for flag-based analysis where multiple conditions must all be true. These are niche — the four patterns in this subsection cover 90% of hunting aggregation needs.
The full course continues with advanced topics, production detection rules, worked investigation scenarios, and deployable artifacts. Premium subscribers get access to all courses.