In this module

1.5 Null Handling — The Silent Query Killer

3-4 hours · Module 1 · Free

Operational Objective

This subsection teaches null handling — the silent query killer — a core KQL skill for security investigation and detection engineering in Microsoft Sentinel and Defender XDR. Every concept is demonstrated against security log data from the Northgate Engineering environment.

Deliverable: Working proficiency with the KQL operators and patterns covered in this subsection, demonstrated through security investigation exercises.

Estimated completion: 25 minutes

Figure 1.5 — Operational workflow from input through documented output.

Null Handling — The Silent Query Killer

Introduction

Figure — Null Handling — The Silent Query Killer. Applied to security investigation workflows at NE.

Null values in KQL cause more incorrect investigation conclusions than any other single issue. A query that silently excludes rows because of nulls can make you believe an attacker did not access a mailbox, did not create a forwarding rule, or did not authenticate from an external IP — when in fact the evidence exists but your query missed it because a comparison field was null.

This subsection teaches you to identify, handle, and defend against null-related query failures.

What null means in KQL

print isempty("")         // true — empty string
print isempty(null)       // true — null
print isnull("")          // false — empty string is not null
print isnull(null)        // true — null is null
print isnotempty("")      // false
print isnotempty(null)    // false
print isnotnull("")       // true — empty string is not null
print isnotnull(null)     // false

// ALL of these return false:
print null == "hello"
print null != "hello"    // yes, even != returns false
print null == null        // even null compared to itself
print null > 0
print null < 100

// You want all sign-ins NOT from the UK
SigninLogs
| where tostring(LocationDetails.countryOrRegion) != "GB"

SigninLogs
| extend Country = tostring(LocationDetails.countryOrRegion)
| where Country != "GB" or isempty(Country)

// When filtering for "not X", always include nulls
| where FieldName != "value" or isempty(FieldName)

// Replace null with a meaningful default
| extend Country = coalesce(
    tostring(LocationDetails.countryOrRegion), 
    "No location data")

// Only process rows where the field has data
| where isnotempty(tostring(LocationDetails.countryOrRegion))
| extend Country = tostring(LocationDetails.countryOrRegion)

// Explicitly check what has null values and why
SigninLogs
| where TimeGenerated > ago(24h)
| extend Country = tostring(LocationDetails.countryOrRegion)
| summarize 
    WithLocation = countif(isnotempty(Country)),
    WithoutLocation = countif(isempty(Country))

// inner join: null keys NEVER match — rows with null are excluded from both sides
// leftouter join: null keys on the left are preserved (with null for right columns)
// leftanti join: null keys on the left ARE included (they have no match on the right)

// Instead of inner join (loses null-IP rows):
SigninLogs
| where TimeGenerated > ago(24h)
| join kind=inner (Watchlist) on IPAddress

// Use leftouter join (preserves null-IP rows):
SigninLogs
| where TimeGenerated > ago(24h)
| join kind=leftouter (Watchlist) on IPAddress
| extend WatchlistMatch = isnotempty(WatchlistColumn)

// Version 1 (null-blind): misses sign-ins with no location data
SigninLogs
| where TimeGenerated > ago(1h)
| where tostring(LocationDetails.countryOrRegion) != "GB"

// Version 2 (null-aware): catches sign-ins with no location AND non-GB
SigninLogs
| where TimeGenerated > ago(1h)
| extend Country = tostring(LocationDetails.countryOrRegion)
| where Country != "GB" or isempty(Country)
| extend LocationStatus = iff(isempty(Country), "No location data", Country)

// count() counts all rows, including those with null values
// dcount(Column) counts distinct non-null values
// countif(condition) counts rows where condition is true — null fails the condition

// sum(), avg(), min(), max() all ignore null values
// If ALL values are null, these return null (not 0)

| summarize 
    DistinctIPs = dcount(IPAddress),
    NullIPCount = countif(isempty(IPAddress)),
    TotalEvents = count()

print 5 + int(null)          // → null
print strcat("hello", "")    // → "hello"  (empty string, not null)
print strcat("hello", tostring(null))  // → "hello"  (null converts to empty string in strcat)
print strlen(tostring(null)) // → 0  (null becomes empty string)

SigninLogs
| where TimeGenerated > ago(24h)
| extend SessionDuration = datetime_diff('minute', TimeGenerated, CreatedDateTime)
// If CreatedDateTime is null for some rows, SessionDuration is null for those rows
// A subsequent where SessionDuration > 30 silently drops the null rows

| extend SessionDuration = datetime_diff('minute', TimeGenerated, 
    coalesce(CreatedDateTime, TimeGenerated))
// If CreatedDateTime is null, use TimeGenerated as fallback → SessionDuration = 0

print isnull(int(null))        // true — the value is null
print isempty("")              // true — the string is empty
print isempty(tostring(null))  // true — null converts to empty string, which is empty
print isnull("")               // false — empty string is NOT null
print isnotempty("")           // false — empty string is empty
print isnotempty("hello")      // true

SigninLogs
| where TimeGenerated > ago(24h)
| summarize 
    TotalRows = count(),
    NullIP = countif(isempty(IPAddress)),
    NullLocation = countif(isempty(tostring(LocationDetails.countryOrRegion))),
    NullUA = countif(isempty(UserAgent)),
    NullDevice = countif(isempty(tostring(DeviceDetail.operatingSystem))),
    NullRisk = countif(isempty(RiskLevelDuringSignIn)),
    NullCAP = countif(array_length(ConditionalAccessPolicies) == 0)
| extend
    PctNullIP = round(100.0 * NullIP / TotalRows, 1),
    PctNullLocation = round(100.0 * NullLocation / TotalRows, 1),
    PctNullUA = round(100.0 * NullUA / TotalRows, 1),
    PctNullDevice = round(100.0 * NullDevice / TotalRows, 1),
    PctNullRisk = round(100.0 * NullRisk / TotalRows, 1),
    PctNullCAP = round(100.0 * NullCAP / TotalRows, 1)

| extend Country = coalesce(
    tostring(LocationDetails.countryOrRegion),
    tostring(LocationDetails.state),
    "Unknown"
)

| extend IPType = case(
    isempty(IPAddress), "No IP recorded",
    ipv4_is_in_range(IPAddress, "10.0.0.0/8"), "Corporate",
    ipv4_is_in_range(IPAddress, "172.16.0.0/12"), "Corporate",
    ipv4_is_in_range(IPAddress, "192.168.0.0/16"), "Corporate",
    "External"
)

Expand for Deeper Context

Null means "no value." It is not zero, not an empty string, not false. It is the absence of a value entirely.

- null — the value does not exist - "" — the value exists and is an empty string - 0 — the value exists and is the number zero - false — the value exists and is the boolean false

These are four different things. KQL treats them differently:

Key distinction: isempty() returns true for both null and empty string. isnull() returns true only for null. In security queries, you almost always want isnotempty() — which catches both null and empty string.

How null breaks queries

Comparison with null always returns false:

This means any where clause that compares against a column containing null will exclude that row — regardless of the comparison operator.

Real-world impact:

This query excludes sign-ins with null location data — sign-ins where the location was not recorded. If the adversary used a technique that suppresses location data (certain API-based sign-ins, service principal auth), their sign-ins disappear from your results. You conclude "no sign-ins from outside the UK" when the correct conclusion is "no sign-ins from outside the UK among sign-ins that have location data."

The safe version:

This includes rows where Country is null or empty — making them visible instead of invisible.

The null-safe patterns

Pattern 1: Include nulls explicitly

Pattern 2: Coalesce to a default

Pattern 3: Check for null before using a field

Pattern 4: Separate null investigation

If WithoutLocation is a significant percentage, investigate those sign-ins separately — they may be service principals, managed identities, or suppressed location events.

Null in join operations

Null handling becomes critical in joins. When you join two tables on a key column, rows with null keys behave differently depending on the join type:

This has investigation consequences. If you join SigninLogs to a watchlist on IPAddress, and some sign-in events have null IPAddress, those events are excluded from an inner join. If the adversary's sign-in events have null IP addresses (certain authentication flows do not record IP), they vanish from your investigation results.

Safe join pattern for investigation:

The leftouter join preserves all sign-in events, adding watchlist data where it matches and null where it does not. You can then filter or flag matches without losing the non-matching rows.

Building null-aware detection rules

Every detection rule should be reviewed for null blindness. Use this checklist:

1. Identify the key filtering columns — which columns does the where clause reference? 2. Check null rates — run the null audit query from the Try It exercise for each column 3. Assess security impact — if a null value in this column could represent adversary activity, the detection rule has a blind spot 4. Add null handling — either include nulls in the filter (or isempty(Column)) or create a separate alert for null values in critical columns

Example: null-aware detection for external sign-ins:

Version 2 is the correct pattern for production detection rules. The LocationStatus column makes it immediately visible to the triaging analyst whether the alert fired because of a foreign country or missing location data — two different investigation paths.

Null in aggregation functions

Aggregation functions handle null differently:

This means dcount(IPAddress) does not count rows where IPAddress is null. If 10 sign-ins have no IP address, they are excluded from the distinct count. For security investigations, this can undercount the adversary's footprint.

Safe dcount pattern:

Now you see the complete picture: the distinct IP count, how many events had no IP, and the total.

Null propagation in expressions

Null is contagious in arithmetic and string expressions. Any operation involving null produces null:

This propagation creates subtle bugs in calculated columns:

Defense: Use coalesce to provide default values before calculation:

The isempty vs isnull vs isnotnull distinction

KQL has multiple null-checking functions that behave differently:

For string columns: Use isempty() — it catches both null and empty string, which is what you want. An IPAddress that is null and an IPAddress that is "" are both "missing" for investigation purposes.

For numeric columns: Use isnull() — numeric columns cannot be "empty", only null.

For dynamic columns: Use isempty() after converting to string, or check the specific field: isempty(tostring(LocationDetails.countryOrRegion)).

Counting nulls — the visibility audit query

Before writing any detection rule, audit the null rates in the columns your rule depends on. This query produces a null rate report for SigninLogs:

Run this query on your workspace. The results tell you which columns are unreliable for filtering. If 40% of sign-ins have no RiskLevelDuringSignIn value, a detection rule that filters on where RiskLevelDuringSignIn == "high" is blind to 40% of sign-in events — any of which could be adversary activity.

This audit query is not a one-time exercise. Run it monthly. Null rates change as authentication patterns change (new applications, new device types, conditional access policy modifications). A detection rule that worked last month may have a new blind spot this month because a null rate shifted.

The coalesce function — your null safety net

coalesce returns the first non-null, non-empty value from a list of arguments:

If countryOrRegion is populated, use it. If it is null but state is populated, use state. If both are null, use "Unknown". The fallback chain ensures every row has a value — no nulls propagate into downstream logic.

Detection rule pattern — null-safe IP classification:

This handles null IPs explicitly instead of letting them fall through to the "External" classification (which would be a false positive — a null IP is not necessarily external).

SigninLogs
| where TimeGenerated > ago(7d)
| summarize 
    Total = count(),
    NullIP = countif(isempty(IPAddress)),
    NullLocation = countif(isempty(
        tostring(LocationDetails.countryOrRegion))),
    NullUserAgent = countif(isempty(UserAgent)),
    NullAppName = countif(isempty(AppDisplayName))
| extend 
    PctNullIP = round(100.0 * NullIP / Total, 1),
    PctNullLocation = round(100.0 * NullLocation / Total, 1),
    PctNullUserAgent = round(100.0 * NullUserAgent / Total, 1),
    PctNullAppName = round(100.0 * NullAppName / Total, 1)

Try it yourself

Run this null audit against your SigninLogs:

Any field with more than 5% null values is a field where where filters will silently exclude a significant portion of your data. Note these fields — every query you write against SigninLogs should account for their null rate.

Check your understanding

1. During an investigation, you run: SigninLogs | where tostring(LocationDetails.countryOrRegion) != "GB" and get zero results. Can you conclude that all sign-ins came from the UK?

No. This query excludes rows where the country field is null — and null fails all comparisons including !=. If an adversary signed in via an API call or a method that does not record location data, their sign-in has a null country and would be excluded from your results. The query shows "zero non-UK sign-ins among sign-ins that have location data" — not "all sign-ins came from the UK." To correctly answer the question, add: or isempty(tostring(LocationDetails.countryOrRegion)) and check if any null-location rows appear.

Yes — the query correctly filters for non-UK sign-ins

No — the query needs to use != "United Kingdom" instead

Null fails all comparisons. A query filtering for "not X" silently excludes nulls. During investigations, always account for null values — adversary activity can hide in the nulls.

Compliance Myth: "You can learn null handling from documentation alone"

The myth: Reading the Microsoft KQL documentation is sufficient to learn this concept. The documentation explains the syntax — what more do you need?

The reality: Documentation teaches SYNTAX. This course teaches APPLICATION. The KQL docs show how summarize works with generic examples. This course shows how summarize reveals a password spray pattern hiding in 45,000 sign-in events. The difference between knowing the syntax and knowing WHEN and WHY to use it in a security context is the difference between an analyst who copies queries and one who writes them. Every operator in this course is taught through a security investigation scenario — not abstract data manipulation.

Troubleshooting

"The query returns an error I do not understand." KQL error messages reference the specific line and operator that failed. Read the error message from left to right: it names the operator, the expected input type, and the actual input type. Most errors are type mismatches (passing a string where a datetime is expected) or field name typos. The getschema operator shows every field name and type for any table: TableName | getschema.

"The query runs but returns unexpected results." Add | take 10 after each operator in the pipeline and examine the intermediate output. This reveals WHERE the data transforms in a way you did not expect. Debug the pipeline stage by stage, not the entire query at once.

Decision point

You are writing a KQL query and cannot remember the exact operator syntax. Do you check the Microsoft documentation or ask Claude?

Check the documentation first for SYNTAX questions (exact parameter names, return types, operator behavior). The KQL documentation at learn.microsoft.com/kusto is the authoritative source. Claude may generate plausible but incorrect syntax — especially for newer operators or recently changed behavior. Use Claude for DESIGN questions: 'I need to detect accounts with sign-ins from more than 3 countries in 24 hours — what is the best KQL approach?' Claude excels at designing the query strategy; the documentation excels at providing the exact syntax.

You've learned how KQL processes data.

K0 gave you the query language's place in the Microsoft security stack. K1 took you through the semantics — tables, operators, the pipe model, and why KQL isn't SQL. Now you write the queries that find what attackers hope you miss.

12 modules of query craft — filtering and shaping, joins and unions, time-series analysis, summarisation, string manipulation, and geospatial analysis
68 KQL exercises — every one with a realistic dataset, a reference solution, and a discussion of alternative approaches
K11 — Threat Hunting with KQL — the course's flagship module. Hypothesis-driven methodology, MITRE ATT&CK-aligned hunting across 7 techniques, UEBA composite risk scoring, and retroactive IOC sweeps
K13 Capstone — The Hunting Lab — three complete investigation scenarios requiring every query skill from the course
Hunt management and ROI metrics — the operating model that justifies KQL hunt programs to leadership

Unlock the full course with Premium See Full Syllabus

← Previous Next →