In this module
1.2 The Query Pipeline — How Operators Chain Left to Right
Figure 1.2 — Operational workflow from input through documented output.
The Query Pipeline — How Operators Chain Left to Right
Introduction
Figure — The Query Pipeline — How Operators Chain. Applied to security investigation workflows at NE.
KQL is a pipeline language. You start with a data source (a table), then pipe the data through a sequence of operators using the | (pipe) character. Each operator takes the table from the previous step, transforms it, and passes the result to the next operator. The final operator's output is the query result.
This is fundamentally different from SQL, where you write a declarative statement and the engine decides the execution order. In KQL, you control the execution order explicitly — the operators execute top to bottom, left to right, exactly as you write them. The order you choose affects both the result and the performance.
The anatomy of a KQL query
SigninLogs // Step 1: start with a table
| where TimeGenerated > ago(24h) // Step 2: filter rows by time
| where ResultType != 0 // Step 3: filter to failures only
| extend Country = tostring( // Step 4: extract country from
LocationDetails.countryOrRegion) // dynamic column
| summarize // Step 5: aggregate
FailCount = count(),
DistinctUsers = dcount(UserPrincipalName)
by IPAddress, Country
| where FailCount > 20 // Step 6: filter aggregated results
| sort by FailCount desc // Step 7: order output// SLOW: summarize all data, then filter
SigninLogs
| where TimeGenerated > ago(30d)
| summarize count() by UserPrincipalName, IPAddress
| where UserPrincipalName == "j.morrison@northgateeng.com"
// FAST: filter first, then summarize the small dataset
SigninLogs
| where TimeGenerated > ago(30d)
| where UserPrincipalName == "j.morrison@northgateeng.com"
| summarize count() by IPAddress// Extend BEFORE summarize: Country is available for grouping
SigninLogs
| where TimeGenerated > ago(24h)
| extend Country = tostring(LocationDetails.countryOrRegion)
| summarize count() by Country
// Extend AFTER summarize: Country does not exist yet — ERROR
SigninLogs
| where TimeGenerated > ago(24h)
| summarize count() by UserPrincipalName
| extend Country = tostring(LocationDetails.countryOrRegion) // ERROR: LocationDetails doesn't exist after summarizeSigninLogs
| where TimeGenerated > ago(24h)
| where ResultType != "0"
| extend Country = tostring(LocationDetails.countryOrRegion)
// | summarize FailCount = count() by IPAddress, Country // <-- commenting this out
// | where FailCount > 20 // <-- and this
// | sort by FailCount desc // <-- and this// Run each of these separately:
SigninLogs | where TimeGenerated > ago(24h) | count
// → 1,200,000
SigninLogs | where TimeGenerated > ago(24h) | where ResultType != "0" | count
// → 45,000
SigninLogs | where TimeGenerated > ago(24h) | where ResultType != "0"
| extend Country = tostring(LocationDetails.countryOrRegion) | count
// → 45,000 (extend does not change row count)
SigninLogs | where TimeGenerated > ago(24h) | where ResultType != "0"
| extend Country = tostring(LocationDetails.countryOrRegion)
| summarize FailCount = count() by IPAddress, Country | count
// → 3,200 (aggregation reduces to unique groups)// SLOW: join two full tables (millions × millions)
SigninLogs
| where TimeGenerated > ago(24h)
| join kind=inner OfficeActivity on $left.UserPrincipalName == $right.UserId
// FAST: aggregate each, then join the summaries (thousands × thousands)
let signInSummary = SigninLogs
| where TimeGenerated > ago(24h)
| summarize SignIns = count() by UserPrincipalName;
let activitySummary = OfficeActivity
| where TimeGenerated > ago(24h)
| summarize Actions = count() by UserId;
signInSummary
| join kind=leftouter activitySummary
on $left.UserPrincipalName == $right.UserId// Query A: extend THEN where
SigninLogs
| where TimeGenerated > ago(24h)
| extend Country = tostring(LocationDetails.countryOrRegion)
| where Country == "US"
| summarize count() by UserPrincipalName
// Query B: where THEN extend
SigninLogs
| where TimeGenerated > ago(24h)
| where tostring(LocationDetails.countryOrRegion) == "US"
| summarize count() by UserPrincipalName// Query C: summarize THEN extend
SigninLogs
| where TimeGenerated > ago(24h)
| summarize TotalEvents = count(), FailCount = countif(ResultType != "0") by UserPrincipalName
| extend FailRate = round(100.0 * FailCount / TotalEvents, 1)
| where FailRate > 50
// Query D: extend THEN summarize (WRONG)
SigninLogs
| where TimeGenerated > ago(24h)
| extend FailRate = round(100.0 * countif(ResultType != "0") / count(), 1) // ERROR
| where FailRate > 50// Step 1: start with the table and time filter — verify data exists
SigninLogs
| where TimeGenerated > ago(1h)
| count
// → 12,500 rows. Good — data exists.
// Step 2: add the entity filter — verify the user has activity
SigninLogs
| where TimeGenerated > ago(1h)
| where UserPrincipalName =~ "suspect@northgateeng.com"
| count
// → 47 rows. Good — the user has recent sign-ins.
// Step 3: add projection — see the relevant columns
SigninLogs
| where TimeGenerated > ago(1h)
| where UserPrincipalName =~ "suspect@northgateeng.com"
| project TimeGenerated, IPAddress, AppDisplayName, ResultType,
tostring(LocationDetails.countryOrRegion)
// → 47 rows with 5 columns. Review the data — what patterns emerge?
// Step 4: add aggregation based on what you observed
SigninLogs
| where TimeGenerated > ago(1h)
| where UserPrincipalName =~ "suspect@northgateeng.com"
| summarize count() by IPAddress, tostring(LocationDetails.countryOrRegion)
// → 3 rows: 2 IPs from GB, 1 IP from NG. The Nigerian IP is suspicious.
// Step 5: focus on the suspicious IP
SigninLogs
| where TimeGenerated > ago(1h)
| where UserPrincipalName =~ "suspect@northgateeng.com"
| where IPAddress == "198.51.100.44"
| project TimeGenerated, AppDisplayName, ResultType
| sort by TimeGenerated asc
// → Timeline of the adversary session.SigninLogs // | count → ?
| where TimeGenerated > ago(24h) // | count → ?
| where ResultType != 0 // | count → ?
| summarize count() by IPAddress // | count → ?
| where count_ > 10 // | count → ?Try it yourself
Take a query you wrote recently and add | count after eve...
Take a query you wrote recently and add | count after every operator to see how the row count changes at each stage. For example:
Run each version (uncomment one | count at a time). This shows you exactly how many rows each operator processes — and where the biggest reductions happen. The operator with the largest row reduction is your most important filter.
Check your understanding
1. You write a query: SigninLogs | summarize count() by IPAddress | where IPAddress has "10.0.". A colleague suggests rewriting it as: SigninLogs | where IPAddress has "10.0." | summarize count() by IPAddress. Both return the same result. Which is better and why?
The myth: Reading the Microsoft KQL documentation is sufficient to learn this concept. The documentation explains the syntax — what more do you need?
The reality: Documentation teaches SYNTAX. This course teaches APPLICATION. The KQL docs show how summarize works with generic examples. This course shows how summarize reveals a password spray pattern hiding in 45,000 sign-in events. The difference between knowing the syntax and knowing WHEN and WHY to use it in a security context is the difference between an analyst who copies queries and one who writes them. Every operator in this course is taught through a security investigation scenario — not abstract data manipulation.
Troubleshooting
"The query returns an error I do not understand." KQL error messages reference the specific line and operator that failed. Read the error message from left to right: it names the operator, the expected input type, and the actual input type. Most errors are type mismatches (passing a string where a datetime is expected) or field name typos. The getschema operator shows every field name and type for any table: TableName | getschema.
"The query runs but returns unexpected results." Add | take 10 after each operator in the pipeline and examine the intermediate output. This reveals WHERE the data transforms in a way you did not expect. Debug the pipeline stage by stage, not the entire query at once.
You are writing a KQL query and cannot remember the exact operator syntax. Do you check the Microsoft documentation or ask Claude?
Check the documentation first for SYNTAX questions (exact parameter names, return types, operator behavior). The KQL documentation at learn.microsoft.com/kusto is the authoritative source. Claude may generate plausible but incorrect syntax — especially for newer operators or recently changed behavior. Use Claude for DESIGN questions: 'I need to detect accounts with sign-ins from more than 3 countries in 24 hours — what is the best KQL approach?' Claude excels at designing the query strategy; the documentation excels at providing the exact syntax.
You've learned how KQL processes data.
K0 gave you the query language's place in the Microsoft security stack. K1 took you through the semantics — tables, operators, the pipe model, and why KQL isn't SQL. Now you write the queries that find what attackers hope you miss.
- 12 modules of query craft — filtering and shaping, joins and unions, time-series analysis, summarisation, string manipulation, and geospatial analysis
- 68 KQL exercises — every one with a realistic dataset, a reference solution, and a discussion of alternative approaches
- K11 — Threat Hunting with KQL — the course's flagship module. Hypothesis-driven methodology, MITRE ATT&CK-aligned hunting across 7 techniques, UEBA composite risk scoring, and retroactive IOC sweeps
- K13 Capstone — The Hunting Lab — three complete investigation scenarios requiring every query skill from the course
- Hunt management and ROI metrics — the operating model that justifies KQL hunt programs to leadership