6.1 Construct KQL Statements for Microsoft Sentinel
Construct KQL Statements for Microsoft Sentinel
Domain 4 — Manage Security Threats: "Identify threats by using Kusto Query Language (KQL)" and "Create custom hunting queries by using KQL." This subsection builds the foundation for every KQL exam question.
Introduction
This subsection teaches you the fundamental building blocks of KQL — the operators you will use in every query you write for the rest of this course and your career as a SOC analyst.
KQL (Kusto Query Language) was designed by Microsoft for querying large volumes of structured log data. It is the query language behind Microsoft Sentinel, Microsoft Defender XDR Advanced Hunting, Azure Monitor, and Azure Data Explorer. If you have used SQL before, KQL will feel partially familiar — both query tabular data — but KQL is read-only (you cannot insert, update, or delete data) and its syntax flows differently.
If you have never used a query language before, that is fine. This subsection assumes no prior query experience. You will write your first query within the next few minutes, and by the end of this subsection you will construct a complete investigation query that traces a brute-force attack from first failure to successful compromise.
Here is what you will learn in this subsection, in order:
- The pipe model — how KQL processes data step by step, and why understanding this model is the key to writing any query
- The
whereoperator — filtering rows to find the data relevant to your investigation, with every comparison operator you need for security work - The
projectoperator — controlling which columns appear in your output, and why column order matters - The
extendoperator — adding new calculated columns that transform raw log data into investigation-ready information - The
letstatement — naming values and sub-queries for clarity and reuse, including a critical detection pattern you will use in Module 9 - The
order byoperator — sorting results for timeline reconstruction - The
searchoperator — finding data when you do not know which table contains it - A complete investigation query — built step by step from a realistic alert scenario, combining every operator into a single query that tells the story of an attack
Each operator is introduced with its purpose, taught through a security investigation example, and practiced in your lab environment. You will run every query yourself, see the output, and understand what it means before moving to the next operator.
Open your Sentinel workspace now (portal.azure.com → Microsoft Sentinel → Logs) and keep it alongside this page. You will use it throughout.
The pipe model — how KQL thinks
Every KQL query follows the same pattern: start with a table, then transform the data step by step using the pipe character (|). Each step takes the output of the previous step and does something to it — filtering, shaping, computing, or sorting.
Think of it like a funnel:
- Start wide — select a table with millions of rows
- Filter down — keep only the rows that matter
- Shape the output — choose which columns to display
- Summarize or sort — organize the results for analysis
Here is the simplest possible KQL query:
| |
This returns every row in the SigninLogs table. In a production environment with 500 users, that could be hundreds of thousands of rows for a single day. It is technically valid KQL, but useless for investigation — you would drown in data.
Now add the pipe to start filtering:
| |
This says: “Start with everything in SigninLogs. Then, keep only the rows where the timestamp is within the last hour.” The pipe (|) means “take the output from the line above and pass it to the next operation.”
Run this in your lab environment now. Paste the query into the Sentinel Logs editor and click Run.
| TimeGenerated | UserPrincipalName | IPAddress | ResultType | Location | AppDisplayName | ... |
|---|---|---|---|---|---|---|
| 14:32:01 | j.morrison@yourdomain.onmicrosoft.com | 86.12.45.89 | 0 | GB | Microsoft Office | ... |
| 14:28:44 | s.patel@yourdomain.onmicrosoft.com | 86.12.45.89 | 0 | GB | Azure Portal | ... |
ResultType = 0 means success. Your lab will show the sign-ins you generated in Module 0.6. Notice how many columns the table returns — SigninLogs has over 50 columns. Most are irrelevant to any given investigation. You will learn to control which columns appear using project later in this subsection.The key mental model: every line in a KQL query transforms the data from the line above. The table name is your starting point. Every pipe adds a transformation. The final output is what appears in your results pane. This pipe-by-pipe flow is how you should read, write, and debug every KQL query.
The where operator — filtering to what matters
What it is
where evaluates a condition against every row and keeps only the rows where the condition is true. Everything else is discarded. It is the most frequently used operator in security KQL — nearly every query starts with one or more where clauses that narrow millions of rows down to the handful relevant to your investigation.
Why it matters
A production Sentinel workspace ingests tens of millions of events per day. Without where, every query returns all of them. With where, you ask precise questions: “Show me failed sign-ins from outside our network in the last hour.” Each where clause narrows the dataset further, like adding filters to a camera lens until only the subject is in focus.
How it works
Syntax: | where ColumnName Operator Value
Start with the query you already ran, and add a filter for failed sign-ins:
| |
Run this now. You are looking at failed sign-ins in the last 24 hours. ResultType != "0" means “not successful” — every non-zero ResultType code represents a specific failure reason.
| TimeGenerated | UserPrincipalName | IPAddress | ResultType | ResultDescription |
|---|---|---|---|---|
| 13:41:22 | j.morrison@yourdomain.onmicrosoft.com | 198.51.100.44 | 50126 | Invalid username or password |
| 13:41:19 | j.morrison@yourdomain.onmicrosoft.com | 198.51.100.44 | 50126 | Invalid username or password |
ResultType 50126 specifically means wrong password. In your lab, you may not see failures unless you deliberately entered a wrong password during Module 0.6 setup — that is expected.You now have a filter chain: all sign-ins → last 24 hours → failures only. Each where reduces the dataset further.
has for text searches (fast, indexed). Use contains only when you need partial-word matching.The comparison operators for security work
These are the operators you will use inside where clauses. You do not need to memorize this table — you will internalize them through use. But reference it when you need an operator and cannot remember the syntax.
| Operator | Meaning | Example | When to use it |
|---|---|---|---|
== | Equals (case-sensitive) | where ResultType == "0" | Exact match on a known value |
!= | Not equals | where ResultType != "0" | Exclude a known value |
>, <, >=, <= | Greater/less than | where TimeGenerated > ago(1h) | Time ranges, numeric comparisons |
has | Contains whole term (case-insensitive, indexed) | where UserPrincipalName has "morrison" | Find a user by partial name — fast |
contains | Contains substring (case-insensitive, not indexed) | where ResultDescription contains "password" | Search within free-text fields — slow |
startswith | Begins with | where IPAddress startswith "198.51" | Filter by IP range prefix |
in | Matches any value in a list | where ResultType in ("50126", "50053") | Check against multiple known values |
!in | Does NOT match any value | where AppDisplayName !in ("Microsoft Office", "Azure Portal") | Exclude a list of known-good values |
between | Within a range | where TimeGenerated between (datetime(2026-03-20) .. datetime(2026-03-21)) | Precise time window during an incident |
has vs contains — this matters more than you thinkhas searches for whole terms using the table's built-in index. It is fast — milliseconds on a table with millions of rows. contains searches for any substring match without using an index. It is slow — seconds or minutes on the same table. In almost every security investigation, has gives the same result as contains. Use has by default. Switch to contains only when you need to match a partial word inside a larger string (rare). This is one of the most impactful performance habits you can build early.
Combining conditions
Use and and or to build compound filters. You can chain where clauses on separate lines (implicit and) or combine conditions in a single line.
These two queries are identical:
| |
Method 1 is preferred for readability — each line states one filter condition, making the query easier to understand and debug. During an active investigation, readable queries save time.
The third filter (IPAddress !startswith "86.12") excludes your office network (assuming 86.12.x.x is your known IP range). Now you are filtering out the noise — your own users who mistyped passwords — and focusing on external failed sign-in attempts.
Try it yourself — build it before you read on
| |
Why this query matters for investigation: If j.morrison’s account is compromised, the attacker will access applications beyond the user’s normal pattern. Filtering out expected apps (Office, Azure Portal) reveals unexpected access — the applications the attacker is targeting. This pattern appears in Module 11 (AiTM investigation) when identifying post-compromise activity.
Note the operator choices: has for the username (fast indexed search for a whole term). !in for the application exclusion (check against a list of known-good values). These choices are deliberate — and choosing the right operator becomes instinct with practice.
where first (reduces rows), compute with extend second (adds columns), shape with project third (selects columns), sort with order by last. Placing project before where removes columns you need for filtering.The project operator — controlling what you see
What it is
project selects which columns appear in your query output and discards the rest. It is the KQL equivalent of choosing which columns to display in a spreadsheet.
Why it matters
SigninLogs has over 50 columns. During an investigation, you need 5-8 of them. Without project, your results pane is overwhelmed with irrelevant data — authentication protocols, resource IDs, correlation IDs — none of which help you answer “who signed in from where and did it succeed?” project strips the noise and lets you focus on the columns that answer your investigation question.
How it works
Add project to the query you have been building:
| |
Run this. Your results now show exactly the columns relevant to a failed sign-in investigation — when it happened, who was targeted, from what IP, why it failed, what country, and which application. The 40+ irrelevant columns are gone. The data is immediately readable.
The project variants
| Variant | What it does | When to use it |
|---|---|---|
project | Keep ONLY these columns | Default choice — you know exactly which columns you need |
project-away | Remove these columns, keep everything else | When you want most columns but need to remove a few (e.g., stripping PII before sharing results) |
project-rename | Rename a column | When column names are unclear: project-rename FailureReason = ResultDescription |
project-reorder | Change column display order without removing any | When you want key columns first but need access to all others |
project permanently removes columns from the query pipelineAfter a project statement, the columns you did not include no longer exist for subsequent operations. If you write | project TimeGenerated, UserPrincipalName on line 3, then | where IPAddress == "198.51.100.44" on line 4, the query fails — IPAddress was removed on line 3. Rule: place project at the end of the query, after all filtering and computation. Filter first, compute second, project last.
The extend operator — adding computed columns
What it is
extend adds a new column to every row based on a calculation or transformation. Unlike project, it does not remove any existing columns — it adds to them. The new column exists for all subsequent operations in the query.
Why it matters
Raw log data is factual but not always useful in its raw form. A timestamp of 2026-03-21T08:14:22Z is precise, but during an investigation you want to know “how long ago was this?” An IP address of 198.51.100.44 is specific, but you want to know “is this internal or external?” An email field of j.morrison@northgateeng.com contains useful structure, but you need just the domain. extend transforms raw data into investigation-ready context.
How it works
Add extend to compute the time since each failed sign-in:
| |
| TimeGenerated | HoursAgo | UserPrincipalName | IPAddress | ResultDescription |
|---|---|---|---|---|
| 13:41:22 | 1.3 | j.morrison@... | 198.51.100.44 | Invalid username or password |
| 08:12:05 | 6.8 | m.chen@... | 203.0.113.88 | Account is locked |
HoursAgo gives you an instant sense of recency without mental timestamp arithmetic. 1.3 hours = very recent, possibly still active. 6.8 hours = happened during the morning, lower urgency. During a fast-moving investigation, eliminating mental math frees attention for analysis.extend patterns you will use in every investigation
These four patterns appear so frequently that they are worth learning now. You will use them throughout Modules 7-15.
| |
Each creates a new column without modifying existing data. Together, they transform raw IPAddress = 198.51.100.44 into NetworkZone = External, Country = US, TimeCategory = After Hours — which is immediately useful for investigation.
Try it yourself — build it before you read on
hourofday(TimeGenerated) returns the hour as an integer 0-23. | |
Why this matters for investigation: After-hours sign-ins from unfamiliar cities are a primary indicator of account compromise. The AiTM attacker in Module 11 replayed stolen tokens at 3am from a different country. This query would flag that immediately — “After Hours” from a city the user has never signed in from. You are building the investigation pattern now; you will use this exact logic in Modules 11 and 13.
The let statement — naming things for clarity and reuse
What it is
let assigns a name to a value, expression, or sub-query. It creates a reference you can use later in the query. let does not change the data — it makes the query more readable, more maintainable, and more powerful.
Why it matters
Without let, complex queries become unreadable walls of text. With let, you name your building blocks and the query reads like a set of instructions: “define the time window, define the target user, then search for their sign-ins.” During an active investigation where you run the same query pattern against 5 different users, let means changing one line instead of hunting through 10 lines of logic.
More importantly, let enables sub-queries — defining a dataset in one block and referencing it in the main query. This unlocks the most important investigation patterns in KQL.
How it works — simple let (naming a value)
| |
This does exactly the same thing as writing ago(24h) and the full UPN inline. But when you need to change the time window to 7 days or switch to investigating s.patel, you change one line at the top. During an investigation where you are running the same pattern for multiple users, this saves minutes per query — and during an active incident, minutes matter.
How it works — let with a sub-query (the powerful pattern)
This is where let transforms KQL from a query tool into an investigation engine.
| |
Read this carefully — it is one of the most important query patterns in this entire course.
The let block creates a list of IP addresses that had more than 10 failed sign-ins. It asks: “which IPs are brute-forcing accounts?” The main query then finds SUCCESSFUL sign-ins from those exact IPs. It asks: “did any of those brute-forcing IPs eventually get in?”
The combination answers the critical investigation question: “Did the brute-force attack succeed?”
| TimeGenerated | UserPrincipalName | IPAddress | AppDisplayName |
|---|---|---|---|
| 08:14:22 | j.morrison@... | 198.51.100.44 | Exchange Online |
| 08:15:01 | j.morrison@... | 198.51.100.44 | SharePoint Online |
let statement must end with a semicolonThe most common KQL syntax error for beginners. Every let line ends with ;. The main query after the let blocks does NOT end with a semicolon. If your query fails with "Syntax error: expected ';'", check the let statements first.
The order by operator — sorting results for investigation
What it is
order by (alias: sort by) arranges rows by one or more columns. It does not filter or transform — it reorders.
Why it matters
In security investigation, sort order determines how you read the data. desc (newest first) answers “what just happened?” — the default for alert triage. asc (oldest first) answers “what happened in sequence?” — the default for timeline reconstruction. Choosing the right sort order is the difference between seeing the most recent event first (triage mode) and reading the attack story from beginning to end (investigation mode).
How it works
| |
The search operator — when you don’t know where to look
What it is
search scans across all columns in one or more tables for a value. It is the broadest possible query — the KQL equivalent of Ctrl+F across your entire workspace.
Why it matters
During an investigation, you often start with a single IOC — an IP address, a domain name, a file hash — and you do not know which table contains data about it. Is this IP in SigninLogs? DeviceNetworkEvents? CommonSecurityLog? search finds it everywhere, then you switch to targeted where queries for the actual investigation.
How it works
| |
search for discovery, not investigationsearch is slow because it scans every column in every row. Use it once to discover which tables contain your IOC. Then switch to TableName | where ColumnName == "value" for the actual investigation. If you use search throughout an investigation, your queries will take 10-100x longer than targeted queries.
Putting it all together — your first complete investigation query
You have now learned every operator needed to build a real investigation query. This section combines them into a single query, built step by step from a realistic scenario.
Scenario: An alert fires: “Multiple failed sign-ins followed by a success from IP 198.51.100.44.” You need to determine who was targeted, whether the attack succeeded, and what the attacker did after gaining access.
Build this with your lab open. Run each step.
Step 1: Find all activity from the suspicious IP.
| |
This gives you everything — failed and successful — from that IP. But it is too raw. Add context.
Step 2: Classify each sign-in and add time context.
| |
Every row now has a clear SUCCESS/FAILED label and a recency value.
Step 3: Shape and sort for timeline reading.
| |
Sorting asc (oldest first) gives you a timeline — read the attack story from top to bottom.
| TimeGenerated | HoursAgo | Outcome | User | App | Result |
|---|---|---|---|---|---|
| 08:12:01 | 6.8 | FAILED | j.morrison | Microsoft Office | Invalid password |
| 08:12:03 | 6.8 | FAILED | j.morrison | Microsoft Office | Invalid password |
| 08:12:05 | 6.8 | FAILED | j.morrison | Microsoft Office | Invalid password |
| 08:12:44 | 6.7 | FAILED | s.patel | Azure Portal | Invalid password |
| 08:12:46 | 6.7 | FAILED | s.patel | Azure Portal | Account locked |
| 08:14:22 | 6.7 | SUCCESS | j.morrison | Exchange Online | |
| 08:15:01 | 6.7 | SUCCESS | j.morrison | SharePoint Online |
Capstone exercise
Write this from scratch — no looking back
Close this page (or scroll so you cannot see the queries above). Open your Sentinel Logs editor. Write a query from memory that does the following:
1. Queries SigninLogs for the last 7 days
2. Filters to a specific user (pick one of your test users)
3. Adds a column classifying each sign-in as "Business Hours" or "After Hours"
4. Adds a column showing the country from LocationDetails
5. Adds a column showing "Success" or "Failed"
6. Shows only: TimeGenerated, the time category, the country, the application name, and the outcome
7. Sorts by time, newest first
Do not look at the course until you have a working query. If the query fails, read the error message, remove lines from the bottom until it runs, identify the broken line, fix it. The debugging is part of the learning.
| |
Self-check: Did you use has (not contains) for the username? Did you remember tostring() for the JSON extraction? Did you place project after all extend statements? If your query works but looks different from this solution, that is fine — multiple valid approaches exist. What matters is that the output answers the question: “What did this user do, when, from where, and did it succeed?”
Common mistakes and how to fix them
| Error message you see | What went wrong | How to fix it |
|---|---|---|
'ColumnName' is not a recognized column | You used project earlier in the query and removed the column you are now referencing | Move project to the end of the query, after all filtering and extend operations |
Syntax error: expected ';' | Missing semicolon after a let statement | Add ; at the end of every let line. The main query after the let blocks does NOT have a semicolon. |
| Query returns 0 results | Time range too narrow, or a filter is excluding everything | Widen ago() to 7d. Remove where clauses one at a time from the bottom to identify which filter is too aggressive. |
Cannot convert from 'dynamic' to 'string' | Filtering or projecting a JSON/dynamic field without type conversion | Wrap the field in tostring(): tostring(LocationDetails.city) |
| Query runs but takes 60+ seconds | Using search across all tables, or contains on a large table | Replace search with targeted where on specific tables. Replace contains with has. |
The expression refers to column 'X' which doesn't exist | Typo in column name, or the column exists in a different table | Check the table schema: `TableName |
When a query fails, do not stare at 10 lines trying to spot the mistake. Remove the last line. Run it. Still fails? Remove the next-to-last line. Run it. The moment the query succeeds, the line you just removed contains the error. Read the error message for that specific line and fix it. This is faster than reading and is the same approach experienced analysts use in production.
Check your understanding
1. You are investigating a suspicious IP address (203.0.113.45) but you do not know which Sentinel table contains data about it. Which operator do you use first, and why would you switch to a different approach once you find the data?
search finds where data lives. where investigates it efficiently. Using search for an entire investigation wastes time. Using where against every table manually wastes effort. The two-step approach — discover with search, investigate with where — is the standard workflow.2. Your query uses | where UserPrincipalName contains "morrison" and takes 45 seconds on a table with 10 million rows. How do you make it faster without changing the results?
has vs contains choice is the highest-impact KQL performance optimization. On large tables, the difference is often 10-50x. A time range reduction also helps, but the operator change is the more targeted fix because it addresses the root cause (missing index usage).3. You write | project TimeGenerated, UserPrincipalName on line 3, then | where IPAddress == "198.51.100.44" on line 4. The query fails. Why?
project permanently removes columns from the pipeline. Everything after it can only reference the columns you explicitly kept. Place project at the end — always.4. An alert says "Brute force attack detected against j.morrison." You need to determine whether the attack succeeded. What is the correct investigation approach?
let sub-query correlates failures and successes from the same source — the only approach that answers "did the brute force work?"5. You are writing a query during an active incident. It is 12 lines long and fails with a syntax error. What is the fastest debugging approach?