In this module

1.7 Check My Knowledge

3-4 hours · Module 1 · Free

Check My Knowledge

1. You run a query against DeviceProcessEvents and use extend ParentPID = toint(InitiatingProcessParentId). The query returns fewer rows after you add | where ParentPID > 0. Some missing rows have a valid InitiatingProcessParentId. What happened?

InitiatingProcessParentId may contain values that toint() cannot convert (non-numeric strings, values exceeding 32-bit int range, or null values). The toint() conversion returned null for those rows, and null > 0 evaluates to false, excluding them. Fix: use tolong() instead (process IDs can exceed 32-bit range), and add an explicit null check: where isnotnull(ParentPID) and ParentPID > 0. Or, investigate the rows with failed conversion: where isempty(InitiatingProcessParentId) or not(InitiatingProcessParentId matches regex "^[0-9]+$").

DeviceProcessEvents has corrupted data in that column

The where filter should use >= instead of >

Type conversion failure → null → comparison failure → silent row exclusion. Always verify conversion results, and use tolong() for process IDs that may exceed 32-bit int range.

2. You write this query to find sign-ins from hosting providers:

SigninLogs
| where TimeGenerated > ago(7d)
| where NetworkLocationDetails has "hosting"
| summarize count() by UserPrincipalName

A colleague points out that this misses sign-ins where NetworkLocationDetails is null. Why does this matter for threat detection?

Adversaries using certain authentication methods (service principal sign-ins, token-based access, some API calls) may not populate the NetworkLocationDetails field. If the field is null, the has "hosting" check fails, and the sign-in is excluded from results. The adversary's hosting-provider sign-in is invisible. For detection rules, this is critical — the rule silently misses a class of adversary behavior. The fix: separate the detection into two queries — one for sign-ins with NetworkLocationDetails containing "hosting", and one for sign-ins with null NetworkLocationDetails (which should be investigated separately as potentially suspicious).

It does not matter — null NetworkLocationDetails means the sign-in is legitimate

The query should use contains instead of has

Null fields hide adversary activity. Authentication methods that do not populate location fields are exactly the methods adversaries prefer. A detection rule that ignores nulls has a blind spot where the adversary operates.

3. Why does the pipeline order matter in this query?

SigninLogs
| where TimeGenerated > ago(30d)
| extend City = tostring(LocationDetails.city)
| summarize dcount(City) by UserPrincipalName
| where dcount_City > 5

What happens if you move the extend after the summarize?

If you move extend after summarize, the query fails. After summarize, the only columns that exist are UserPrincipalName and dcount_City. The LocationDetails column no longer exists — it was consumed by the summarize operation. You cannot extend a column from data that has already been aggregated away. The pipeline is sequential: each operator works on the output of the previous step, and summarize reduces the column set to only the by-columns and aggregation columns. Any column you need in the output must be referenced before or within the summarize.

The query produces the same result — order does not matter for extend

The query runs but produces fewer rows

After summarize, only the by-columns and aggregation columns exist. All other columns from the source table are gone. The pipeline is sequential — you cannot reference data that was consumed by a previous operator.

4. You are reviewing a colleague's detection rule. The query starts with SigninLogs | extend Country = tostring(LocationDetails.countryOrRegion) | where TimeGenerated > ago(1h). What would you change and why?

Move the TimeGenerated filter BEFORE the extend. Currently, the extend runs against ALL rows in SigninLogs before the time filter reduces the dataset. If SigninLogs has 30 days of data (millions of rows), the extend processes every row — extracting JSON from LocationDetails for rows that will be immediately discarded by the time filter. Rewritten: SigninLogs | where TimeGenerated > ago(1h) | extend Country = tostring(LocationDetails.countryOrRegion). The time filter reduces to ~50,000 rows first, then extend processes only those 50,000. For a detection rule running every 5 minutes, this difference matters — the optimized version runs in seconds, the original may take 30+ seconds.

Nothing — the query engine automatically optimizes the order

Remove the extend entirely — it is not needed

Filter before extending. The query engine provides some automatic optimization, but explicitly placing time filters first is a best practice that guarantees partition pruning occurs before any computation.

💬

How was this module?

Your feedback helps us improve the course. One click is enough — comments are optional.

Thank you — your feedback has been received.

You've learned how KQL processes data.

K0 gave you the query language's place in the Microsoft security stack. K1 took you through the semantics — tables, operators, the pipe model, and why KQL isn't SQL. Now you write the queries that find what attackers hope you miss.

12 modules of query craft — filtering and shaping, joins and unions, time-series analysis, summarisation, string manipulation, and geospatial analysis
68 KQL exercises — every one with a realistic dataset, a reference solution, and a discussion of alternative approaches
K11 — Threat Hunting with KQL — the course's flagship module. Hypothesis-driven methodology, MITRE ATT&CK-aligned hunting across 7 techniques, UEBA composite risk scoring, and retroactive IOC sweeps
K13 Capstone — The Hunting Lab — three complete investigation scenarios requiring every query skill from the course
Hunt management and ROI metrics — the operating model that justifies KQL hunt programs to leadership

Unlock the full course with Premium See Full Syllabus

← Previous Next →