6.2 Analyze Query Results Using KQL
Domain 4 — Manage Security Threats: "Create custom hunting queries by using KQL." Summarization and visualization are tested directly — expect questions where you must choose the correct aggregation function for a scenario.
Introduction
In subsection 6.1, you learned to filter and shape data — narrowing millions of rows to the ones relevant to your investigation. But individual rows are only part of the picture. To understand an attack, you need to ask aggregate questions: “How many failed sign-ins came from each country?” “Which IP addresses generated the most traffic?” “What does the sign-in pattern look like over time?”
This subsection teaches you the summarize operator and its aggregation functions — the tools that transform raw log rows into answers. You will learn to count events, count unique values, group results by dimensions, create time-series data, and render visual charts. By the end, you will write a single query that answers “how many failed sign-ins per country per hour for the last 7 days” and displays the result as a time chart.
The operators covered in this subsection are:
- summarize — the aggregation engine that groups rows and computes values
- Aggregation functions — count(), dcount(), sum(), avg(), min(), max(), make_set(), make_list(), arg_max(), percentile()
- bin() — grouping timestamps into time windows for trend analysis
- render — generating visual charts from query results
- top — a shortcut for getting the N highest or lowest results
Each function is introduced with what it does and why you need it, then demonstrated with an investigation scenario and practiced in your lab.
The summarize operator — aggregating data for analysis
What it is
summarize groups rows by one or more columns and computes aggregate values for each group. It transforms a table of individual events into a table of summaries — replacing thousands of rows with a concise answer.
Why it matters
During an investigation, individual events tell you what happened. Summaries tell you the scope and pattern. “There are 47 failed sign-ins” is more actionable than scrolling through 47 individual rows. “The failures come from 3 countries, with 41 from Russia and 6 from the Netherlands” is even more actionable. summarize produces these answers.
The summarize operator groups individual rows by a key column (IPAddress) and computes aggregations (count, dcount) for each group: 47 individual failure events become 3 rows of actionable intelligence.
How it works
The basic syntax is: | summarize AggregationFunction by GroupingColumn
Start with a simple count:
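A query matching this description might look like the following sketch, assuming the Azure AD SigninLogs table, where a ResultType of "0" indicates a successful sign-in:

```kql
SigninLogs
| where ResultType != "0"                          // keep only failed sign-ins
| summarize FailedSignins = count() by IPAddress   // count failures per source IP
| order by FailedSignins desc                      // highest count first
```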
This says: “Group all failed sign-ins by IP address and count how many came from each IP. Sort highest count first.”
| IPAddress | FailedSignins |
|---|---|
| 198.51.100.44 | 47 |
| 203.0.113.88 | 12 |
| 86.12.45.89 | 3 |
Multiple aggregation functions in one query
You can compute multiple aggregations simultaneously:
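A sketch of a query that produces the columns below, again assuming the SigninLogs table:

```kql
SigninLogs
| where ResultType != "0"
| summarize
    FailCount     = count(),                    // total failures from this IP
    TargetedUsers = dcount(UserPrincipalName),  // unique accounts attacked
    FirstSeen     = min(TimeGenerated),         // start of the campaign
    LastSeen      = max(TimeGenerated),         // most recent attempt
    TargetApps    = make_set(AppDisplayName)    // deduplicated list of apps
    by IPAddress
```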
| IPAddress | FailCount | TargetedUsers | FirstSeen | LastSeen | TargetApps |
|---|---|---|---|---|---|
| 198.51.100.44 | 47 | 5 | Mar 15 08:12 | Mar 21 13:41 | ["Microsoft Office","Azure Portal","Exchange Online"] |
| 203.0.113.88 | 12 | 1 | Mar 18 22:01 | Mar 18 22:14 | ["Microsoft Office"] |
dcount() (distinct count) tells you how many unique users were targeted. make_set() creates a deduplicated list of targeted applications. min()/max() give you the campaign timespan. Note the use of count() and dcount() together.
The aggregation functions reference
These are the functions you will use inside summarize. Each has a specific investigation purpose.
| Function | What it computes | Investigation use | Example |
|---|---|---|---|
count() | Total number of rows in the group | “How many events?” | count() |
dcount(Column) | Number of unique values | “How many distinct users/IPs/apps?” | dcount(UserPrincipalName) |
sum(Column) | Total of a numeric column | “Total data volume transferred” | sum(SentBytes) |
avg(Column) | Average of a numeric column | “Average session duration” | avg(DurationMs) |
min(Column) | Smallest value | “When did this first occur?” | min(TimeGenerated) |
max(Column) | Largest value | “When was the most recent event?” | max(TimeGenerated) |
make_set(Column) | Deduplicated JSON array of values | “Which unique apps were accessed?” | make_set(AppDisplayName) |
make_list(Column) | JSON array including duplicates | “List all error codes in order” | make_list(ResultType) |
arg_max(TimeCol, *) | The row with the maximum value | “Most recent record per device” | arg_max(TimeGenerated, *) |
percentile(Column, N) | Nth percentile value | “What is the P95 response time?” | percentile(DurationMs, 95) |
count() vs dcount() — know the difference, it changes the investigation
count() tells you how many events occurred. dcount() tells you how many unique entities were involved. 47 failed sign-ins (count) from 1 user (dcount) = brute force against one person. 47 failed sign-ins from 47 users = password spray against the organization. Same count, completely different attack type. Always use both together.
The bin() function — grouping by time windows
What it is
bin() rounds timestamp values to a specified interval, grouping events into time windows. When combined with summarize, it creates time-series data — the foundation of trend analysis and visualization.
Why it matters
“47 failed sign-ins” is useful. “47 failed sign-ins, with 40 occurring in a single 5-minute window” is much more useful — it tells you the attack was concentrated, not spread over days. bin() reveals the temporal pattern.
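A sketch of the query described here, assuming the same SigninLogs table:

```kql
SigninLogs
| where ResultType != "0"
| summarize FailCount = count() by bin(TimeGenerated, 1h)  // 1-hour windows
| order by TimeGenerated asc
```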
This groups all failures into 1-hour windows and counts how many occurred in each window.
| TimeGenerated | FailCount |
|---|---|
| Mar 15 08:00 | 2 |
| Mar 15 09:00 | 0 |
| Mar 18 22:00 | 34 |
| Mar 18 23:00 | 8 |
| Mar 21 13:00 | 5 |
Use bin(TimeGenerated, 5m) for minute-by-minute analysis during an active incident, and bin(TimeGenerated, 1d) for weekly trends.
The render operator — visualizing results
What it is
render generates a visual chart from query results. It transforms tabular data into bar charts, time series, pie charts, and other visualizations directly in the Sentinel Logs query editor.
Why it matters
Patterns that are invisible in a table of numbers become obvious in a chart. A time series showing a spike is instantly readable. A bar chart showing one country dominating failure counts tells a story that 20 rows of numbers do not.
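A query matching this description, assuming SigninLogs, would combine the earlier bin() aggregation with render:

```kql
SigninLogs
| where ResultType != "0"
| summarize FailCount = count() by bin(TimeGenerated, 1h)
| render timechart     // line chart: time on x-axis, failure count on y-axis
```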
This produces a line chart with time on the x-axis and failure count on the y-axis. The attack spike becomes an unmistakable visual peak.
Visualization types:
| Render type | Best for | Example |
|---|---|---|
timechart | Trends over time | Sign-in failures per hour |
barchart | Comparing categories | Failures per country |
columnchart | Comparing categories (vertical) | Events per application |
piechart | Proportion of a whole | Success vs failure ratio |
scatterchart | Correlation between two values | Event count vs response time |
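As one illustration of a row from the table, a pie chart of the success-versus-failure ratio could be built like this (the Outcome label is an invented helper column, not part of the SigninLogs schema):

```kql
SigninLogs
| summarize Signins = count()
    by Outcome = iff(ResultType == "0", "Success", "Failure")  // bucket each row
| render piechart     // proportion of successes vs failures
```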
Try it yourself
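A query fitting this exercise, counting sign-ins per application (a sketch, assuming SigninLogs):

```kql
SigninLogs
| summarize SigninCount = count() by AppDisplayName  // events per application
| order by SigninCount desc
| render columnchart                                 // one column per application
```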
In your lab, you may see only 2-3 applications (Microsoft Office, Azure Portal, perhaps Exchange Online). In production, this query reveals the application usage profile of your organization — critical baseline knowledge for identifying when an attacker accesses an unusual application.
The top operator — a shortcut for ranked results
top N by Column returns the N rows with the highest (or lowest) value in the specified column. It is a shortcut for | order by Column desc | take N.
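A sketch matching this description, assuming SigninLogs:

```kql
SigninLogs
| where ResultType != "0"
| summarize FailCount = count() by IPAddress
| top 5 by FailCount    // the 5 noisiest source IPs
```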
Returns the 5 IPs with the most failures. Quick, clean, and used frequently in investigation for rapid prioritization.
The arg_max pattern — most recent record per entity
What it is
arg_max(TimeGenerated, *) within a summarize ... by EntityId returns the entire most recent row for each entity. This is the deduplication pattern — essential when a table contains multiple records per device, user, or IP and you want only the latest.
Why it matters
DeviceInfo contains a row for every device check-in (multiple per day). If you want the current status of each device, you need the most recent record, not all of them.
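The pattern as a query:

```kql
DeviceInfo
| summarize arg_max(TimeGenerated, *) by DeviceId   // latest row per device, all columns
```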
This returns exactly one row per device — the most recent — with all columns from that row. You used this pattern in Module 7 (if following the non-standard build order) and will use it in every device investigation.
The arg_max pattern appears in almost every device and user query
Any table that records periodic status updates (DeviceInfo, AADNonInteractiveUserSignInLogs) needs deduplication with arg_max to get the current state. Without it, you see every historical record, making counts and status reports inaccurate.
Investigation scenario: building a failed sign-in dashboard query
Combine everything from this subsection into a single comprehensive query:
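A reconstruction consistent with the output columns below, assuming SigninLogs (using Location as the country field is an assumption; check your workspace's schema):

```kql
SigninLogs
| where TimeGenerated > ago(7d)       // last 7 days
| where ResultType != "0"             // failures only
| summarize
    FailCount    = count(),
    UniqueUsers  = dcount(UserPrincipalName),
    UniqueApps   = dcount(AppDisplayName),
    FirstAttempt = min(TimeGenerated),
    LastAttempt  = max(TimeGenerated),
    Countries    = make_set(Location)
    by IPAddress
| order by FailCount desc
```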
| IPAddress | FailCount | UniqueUsers | UniqueApps | FirstAttempt | LastAttempt | Countries |
|---|---|---|---|---|---|---|
| 198.51.100.44 | 47 | 5 | 3 | Mar 15 08:12 | Mar 21 13:41 | ["RU"] |
| 203.0.113.88 | 12 | 1 | 1 | Mar 18 22:01 | Mar 18 22:14 | ["NL"] |
Check your understanding
1. You need to determine how many unique users were targeted by a specific IP address. Which aggregation function answers this?
dcount() (distinct count) is the investigation function for answering "how many unique entities." Use it alongside count() — the combination reveals attack type. Same count + low dcount = focused attack. Same count + high dcount = spray attack.
2. Your time-series chart shows a flat line all week with a single massive spike on Wednesday at 22:00. What does this pattern indicate?
A concentrated burst attack — the activity occurred in a single narrow window rather than being spread over time, which points to an automated attack. Narrow the interval (for example, bin(TimeGenerated, 5m)) to see the attack's exact start and end time, then pivot to the source IP analysis.
3. DeviceInfo contains multiple rows per device per day. You want a table showing the current status of each device (one row per device). Which pattern do you use?
The arg_max deduplication pattern is essential for any table with periodic updates. It returns one row per group key (DeviceId) with the most recent timestamp, and * means "include all columns from that row." You will use this constantly.
4. You want to understand the temporal pattern of an attack. Which bin() interval would you choose for initial analysis, and when would you change it?