2.2 Core Operators
Core Operators
These are the operators you will use in every investigation query. Master these six and you can answer the majority of security questions: where filters rows, project selects columns, extend adds calculated columns, summarize aggregates data, sort orders results, and take limits output.
where — Filter Rows
where keeps rows that match a condition. It is the operator you will use more than any other.
| |
This returns successful sign-ins to Microsoft Teams in the last seven days. Multiple where clauses act as logical AND — every condition must be true. You can combine conditions in a single where with and and or:
| |
Comparison operators:
| Operator | Meaning |
|---|---|
== | Equals (case-sensitive for strings) |
!= | Not equals |
=~ | Equals (case-insensitive) |
>, <, >=, <= | Numeric and datetime comparisons |
in | Matches any value in a list (case-sensitive) |
in~ | Matches any value in a list (case-insensitive) |
!in | Does not match any value in a list |
between | Within a range (inclusive) |
The in operator is particularly useful for filtering on known values:
| |
This returns sign-ins that failed with specific error codes: invalid credentials (50126), account locked (50053), or MFA required but not completed (50074).
has vs contains — The Performance Difference
This is the most important distinction in KQL string matching. Getting it wrong costs you minutes per query on large tables.
has checks the term index — it looks up complete words and runs fast. contains scans every character of every string in the column — it runs slow. On a table with millions of rows, has returns in seconds while contains can take minutes.
| |
| |
Both return similar results in most cases, but has runs faster because it checks the term index for “paypal” rather than scanning every character of every email address.
When you must use contains: when your search term is part of a larger word and not a standalone term. For example, searching for “admin” inside “sysadmin” requires contains because “admin” is a substring of the term “sysadmin”, not a separate term.
The full family of string operators:
| Operator | Behaviour | Performance |
|---|---|---|
has | Term match (case-insensitive) | Fast (indexed) |
has_cs | Term match (case-sensitive) | Faster |
contains | Substring match (case-insensitive) | Slow (scan) |
contains_cs | Substring match (case-sensitive) | Slightly faster scan |
startswith | Prefix match (case-insensitive) | Moderate |
endswith | Suffix match (case-insensitive) | Moderate |
matches regex | Regular expression | Slowest — use sparingly |
has_any | Term match against a list | Fast |
has_all | All terms must match | Fast |
Case-sensitive variants (the _cs suffix) are faster than their case-insensitive counterparts because the engine can use stricter index matching. Use them when you know the exact casing — for example, when matching known IOC hashes.
project — Select and Rename Columns
project controls which columns appear in your output. It reduces the width of your result set, making it easier to read and reducing the data the engine processes.
| |
You can rename columns in the same step:
| |
If you want most columns but need to remove a few, project-away is more efficient than listing every column you want to keep. | project-away TenantId, OperationName, CorrelationId drops those three and keeps everything else.
extend — Add Calculated Columns
extend adds a new column while keeping all existing columns. Use it for calculations, extractions, and transformations.
| |
This adds two columns: HourOfDay (the hour the sign-in occurred) and Domain (extracted from the user’s email). The original columns remain — extend never removes data.
Common extend patterns for security:
| |
summarize — Aggregate Data
summarize groups rows and computes aggregations. This is how you count things, find unique values, and identify patterns.
| |
Key aggregation functions:
| Function | What it does | Example use |
|---|---|---|
count() | Count rows per group | Failed sign-ins per user |
dcount(column) | Count distinct values | Unique IPs per user |
make_set(column) | Collect distinct values into an array | All countries a user signed in from |
make_list(column) | Collect all values (including duplicates) | All applications accessed |
arg_max(col, *) | Return the row with the maximum value | Most recent sign-in per user |
arg_min(col, *) | Return the row with the minimum value | First sign-in per user |
sum(column) | Sum a numeric column | Total data volume per device |
avg(column) | Average of a numeric column | Average response time |
countif(condition) | Count rows matching a condition | Failed vs successful in one query |
Multiple aggregations in one query:
| |
This gives you a complete sign-in profile per user in a single query: total sign-ins, failures, successes, how many distinct IPs they used, and which countries they signed in from.
sort and take — Order and Limit
sort by orders results. take limits the number of rows returned.
| |
This returns the top 10 most frequent alerts in the last week — a quick view of what is generating the most noise in your environment.
| top 10 by FailedAttempts desc is equivalent to | sort by FailedAttempts desc | take 10. Use whichever reads more naturally.
Operator decision: which string operator should you use?
Try it yourself
| |
An IP targeting many distinct users with failures is a strong indicator of password spray. An IP targeting one user with many failures is brute force. The dcount(UserPrincipalName) distinguishes between the two — this is a pattern you will use repeatedly in investigation scenarios.
Check your understanding
1. What is the practical difference between has and contains?
2. What does extend do differently from project?
3. You want to count failed sign-ins per user and also collect the list of IP addresses used. Which operators achieve this?
4. What is the most efficient way to find the single most recent sign-in for each user?