In this module

1.4 Type Conversion and Coercion

3-4 hours · Module 1 · Free

Operational Objective

This subsection teaches type conversion and coercion — a core KQL skill for security investigation and detection engineering in Microsoft Sentinel and Defender XDR. Every concept is demonstrated against security log data from the Northgate Engineering environment.

Deliverable: Working proficiency with the KQL operators and patterns covered in this subsection, demonstrated through security investigation exercises.

Estimated completion: 25 minutes

Figure 1.4 — Change management pipeline for production automation.

Type Conversion and Coercion

Introduction

Figure — Type Conversion and Coercion. Applied to security investigation workflows at NE.

When data types do not match — a string that contains a number, a dynamic field that needs to be a string for comparison, a datetime that needs to be an integer for arithmetic — you need explicit type conversion. KQL provides conversion functions for every type. Using them correctly prevents the silent failures covered in subsection 1.3.

The conversion functions

Function	Converts to	Example	Security use case
`tostring()`	string	`tostring(LocationDetails.city)`	Extract string from dynamic JSON
`toint()`	int (32-bit)	`toint(ResultType)`	Convert string ResultType for numeric comparison
`tolong()`	long (64-bit)	`tolong(Properties.targetResources[0].id)`	Convert large numeric IDs
`toreal()`	real (float)	`toreal(LocationDetails.geoCoordinates.latitude)`	Extract coordinates for distance calculation
`todatetime()`	datetime	`todatetime(Properties.createdDateTime)`	Convert string dates from audit logs
`tobool()`	bool	`tobool(DeviceDetail.isManaged)`	Convert string "true"/"false" to boolean
`totimespan()`	timespan	`totimespan("1:30:00")`	Create duration values for arithmetic
`todecimal()`	decimal	`todecimal(Amount)`	High-precision numeric conversion
`parse_json()`	dynamic	`parse_json(AdditionalDetails)`	Parse string containing JSON into queryable object
`todynamic()`	dynamic	`todynamic(RawData)`	Alias for parse_json

Conversion behavior on failure

Every to* function returns null when conversion fails — not an error. This is both useful and dangerous:

// toint on a non-numeric string returns null, not an error
print toint("not_a_number")  // output: null

// toint on a numeric string works
print toint("50126")          // output: 50126

// todatetime on garbage returns null
print todatetime("not_a_date") // output: null

// If ResultType contains "0", "50126", and "Interrupted"...
SigninLogs
| extend ResultCode = toint(ResultType)
| where ResultCode == 0
// "Interrupted" rows convert to null, then fail the == 0 check
// They are silently excluded — you never see them

// WRONG: comparing dynamic to string silently fails in some contexts
SigninLogs
| where LocationDetails.countryOrRegion == "GB"

// CORRECT: explicit conversion
SigninLogs
| where tostring(LocationDetails.countryOrRegion) == "GB"

AuditLogs
| extend CreatedDate = todatetime(
    TargetResources[0].modifiedProperties[0].newValue)
| where CreatedDate > ago(7d)

// Calculate sign-in duration in minutes
SigninLogs
| extend DurationMinutes = datetime_diff('minute', 
    todatetime(AppliedConditionalAccessPolicies[0].enforcedGrantControls),
    TimeGenerated)

// When a column contains JSON as a string, not as a dynamic type
| extend ParsedData = parse_json(RawEventData)
| extend ActionName = tostring(ParsedData.ActionName)
| extend TargetUser = tostring(ParsedData.Target[0].Id)

SigninLogs
| extend Country = coalesce(
    tostring(LocationDetails.countryOrRegion),
    "Unknown"
)

// Some sign-in events store the IP in different fields
SigninLogs
| extend BestIP = coalesce(IPAddress, 
    tostring(DeviceDetail.ipAddress), 
    "No IP recorded")

SigninLogs
| where TimeGenerated > ago(24h)
| where ResultType == "0"
| extend AuthDetails = parse_json(tostring(AuthenticationDetails))
| extend 
    PrimaryMethod = tostring(AuthDetails[0].authenticationMethod),
    PrimaryResult = tostring(AuthDetails[0].succeeded),
    SecondaryMethod = tostring(AuthDetails[1].authenticationMethod),
    SecondaryResult = tostring(AuthDetails[1].succeeded)
| where PrimaryMethod != "" and SecondaryMethod != ""
| summarize count() by PrimaryMethod, SecondaryMethod

SigninLogs
| where TimeGenerated > ago(24h)
| where not(ipv4_is_in_range(IPAddress, "10.0.0.0/8"))
    and not(ipv4_is_in_range(IPAddress, "172.16.0.0/12"))
    and not(ipv4_is_in_range(IPAddress, "192.168.0.0/16"))
| where ResultType == "0"

| extend IPNum = tolong(split(IPAddress, ".")[0]) * 16777216
    + tolong(split(IPAddress, ".")[1]) * 65536
    + tolong(split(IPAddress, ".")[2]) * 256
    + tolong(split(IPAddress, ".")[3])
| sort by IPNum asc

SigninLogs
| where TimeGenerated > ago(24h)
| extend ConvertedResult = toint(ResultType)
| summarize 
    TotalRows = count(),
    ConvertedOK = countif(isnotnull(ConvertedResult)),
    ConvertedNull = countif(isnull(ConvertedResult))
| extend FailRate = round(100.0 * ConvertedNull / TotalRows, 2)

SigninLogs
| where TimeGenerated > ago(1h)
| where ResultType == "0"
| mv-expand CAPolicy = parse_json(tostring(ConditionalAccessPolicies))
| extend 
    PolicyName = tostring(CAPolicy.displayName),
    PolicyResult = tostring(CAPolicy.result),
    GrantControls = tostring(CAPolicy.enforcedGrantControls)
| where PolicyResult == "success"
| summarize AppliedPolicies = make_set(PolicyName) by UserPrincipalName

SigninLogs
| where TimeGenerated > ago(7d)
| where UserPrincipalName =~ "target@northgateeng.com"
| extend LocalHour = hourofday(TimeGenerated + 1h)  // UTC+1 for BST
| summarize count() by LocalHour
| sort by LocalHour asc

// An employee in New York (UTC-4) sends a suspicious email at 9am local = 13:00 UTC
// A recipient in London (UTC+1) clicks the link at 2pm local = 13:00 UTC
// Both events have the same UTC timestamp — the click was instant.
// Without UTC awareness, you might think the click happened 5 hours after the email.

SigninLogs
| where TimeGenerated > ago(24h)
| where ResultType != "0"
| extend FailureReason = case(
    ResultType == "50126", "Invalid password",
    ResultType == "50074", "MFA required but not completed",
    ResultType == "53003", "Blocked by Conditional Access",
    ResultType == "50053", "Account locked out",
    ResultType == "50057", "Account disabled",
    ResultType == "50058", "Session interrupted",
    ResultType == "50076", "MFA required — user did not complete",
    ResultType == "500121", "Authentication failed during MFA",
    ResultType == "700016", "Application not found in tenant",
    strcat("Error code: ", ResultType)  // default: show raw code
)
| summarize count() by FailureReason
| sort by count_ desc

Expand for Deeper Context

The dangerous part: if you convert a column and the conversion fails for some rows, those rows silently get null values. Any subsequent where filter on that column excludes the null rows without warning.

Rule: after any type conversion, check for null results. Use isnotnull() or isnotempty() to verify conversions succeeded before filtering.

Practical conversion patterns

Converting dynamic fields to strings for comparison:

Converting string timestamps to datetime:

Some log fields store timestamps as strings rather than datetime. The AuditLogs table's ActivityDateTime is datetime, but nested properties may contain string dates:

Converting between numeric types for arithmetic:

Parsing JSON strings into queryable objects:

The coalesce function

coalesce() returns the first non-null value from a list of arguments. It is the primary tool for handling conversion failures and missing data:

If LocationDetails.countryOrRegion is null (the sign-in has no location data), coalesce returns "Unknown" instead of null. This prevents null values from propagating through the query and silently excluding rows from downstream operations.

Multi-column coalesce for finding the best available value:

Advanced conversion patterns for security data

Converting authentication details from dynamic arrays:

The AuthenticationDetails column in SigninLogs is a dynamic array where each element describes one authentication step (password, MFA push, FIDO2 key). Extracting information requires array access + type conversion:

This shows which authentication methods are being used together — password + push notification, password + FIDO2, passwordless + none. The conversion chain is: access array element → convert to string → use in comparison.

Converting IP addresses for range comparison:

IP addresses are strings in KQL, which means you cannot use numeric range operators. To check if an IP falls within a CIDR range, use the ipv4_is_in_range() function:

No type conversion needed — ipv4_is_in_range handles the string-to-IP conversion internally. But if you need to sort or compare IPs numerically, you must convert them:

Safe conversion with error tracking:

For production queries where you need to know if conversions failed (not just silently dropping rows), track the failure rate:

If the failure rate is above 0%, investigate which values failed conversion and decide whether to handle them separately or filter them out.

The parse_json pattern for nested dynamic data

Many security log columns contain serialised JSON stored as strings. The ConditionalAccessPolicies column in SigninLogs, for example, is a dynamic array where each element is a JSON object describing a conditional access policy evaluation:

The conversion chain: ConditionalAccessPolicies is dynamic → tostring() serialises it → parse_json() re-parses it (sometimes necessary when the dynamic value is double-encoded) → mv-expand creates one row per array element → tostring() on each field extracts the string value.

This pattern appears in every investigation that needs to understand why a sign-in was allowed or blocked. The conditional access evaluation data is the authoritative record of which policies applied, what controls were enforced, and what the outcome was.

Timestamp conversion and timezone handling

All timestamps in KQL are stored in UTC. The TimeGenerated column is always UTC, regardless of the user's local timezone. This has three investigation implications:

1. User activity times must be mentally converted. If a user reports suspicious activity "at 3pm" and they are in London (UTC+1 in BST), search for 14:00 UTC: where TimeGenerated between (datetime(2026-03-24T14:00:00Z) .. datetime(2026-03-24T15:00:00Z)).

2. Time-of-day analysis requires explicit conversion:

The + 1h offset converts UTC to BST. Without this, a sign-in at midnight local time appears as 23:00 in the query results — potentially causing you to misclassify normal activity as after-hours.

3. Cross-timezone correlation must align timestamps:

Always work in UTC for query logic. Convert to local time only in the final project for human-readable output.

Batch conversion with `case` and `iff`

When a column contains coded values that need translation for readable output, use case for multiple mappings or iff for binary:

This transforms opaque error codes into human-readable reasons. The case statement evaluates in order and returns the first match. The final argument (no condition) is the default — any ResultType not explicitly mapped shows the raw code prefixed with "Error code:".

This conversion pattern is essential for SOC dashboards and IR reports. An incident report that says "50126 failures from IP 198.51.100.44" means nothing to a CISO. "Invalid password failures from IP 198.51.100.44" communicates the attack type immediately.

SigninLogs
| where TimeGenerated > ago(24h)
| take 100
| extend ResultCode = toint(ResultType)
| extend ConversionWorked = isnotnull(ResultCode)
| summarize 
    Total = count(),
    Converted = countif(ConversionWorked),
    Failed = countif(not(ConversionWorked))

Try it yourself

Run this query and examine the results:

If Failed is greater than zero, some ResultType values in your data are not numeric. Check what they are: | where not(ConversionWorked) | distinct ResultType. These are the values a toint conversion would silently turn to null.

Check your understanding

1. You convert a column with toint() and use it in a where filter. The query returns fewer rows than expected. What is the most likely cause?

Some values in the column could not be converted to int, so toint() returned null for those rows. The where filter then excluded the null rows because null fails all comparison operations (null == 0 is false, null > 10 is false, null != 5 is false). The rows were not errors — they were silently dropped. To diagnose: add | where isnotnull(converted_column) as a separate step and compare the count, or use countif(isnotnull(...)) to see how many conversions failed.

The toint function has a rounding error

The where filter is case-sensitive

Type conversion failures produce null, and null fails all comparisons. This is the most common cause of silently missing rows in KQL queries.

NE environmental considerations

NE's detection environment includes specific factors that influence this rule's operation:

Device diversity: 768 P2 corporate workstations with full Defender for Endpoint telemetry, 58 P1 manufacturing workstations with basic cloud-delivered protection, and 3 RHEL rendering servers with Syslog-only coverage. Rules targeting DeviceProcessEvents operate with full fidelity on P2 devices but may have reduced visibility on P1 devices. Manufacturing workstations in Sheffield and Sunderland represent a detection gap for endpoint-level detections.

Expand for Deeper Context

Network topology: 11 offices connected via Palo Alto SD-WAN with full-mesh connectivity. The SD-WAN firewall logs feed CommonSecurityLog in Sentinel. Cross-site lateral movement generates firewall allow events that correlate with DeviceLogonEvents — enabling multi-source detection that single-table rules cannot achieve.

User population: 810 users with distinct behavioral profiles — office workers (predictable hours, consistent applications), field engineers (variable hours, travel patterns), IT administrators (elevated privilege, broad access patterns), and manufacturing operators (fixed shifts, limited application access). Each user population has different detection baselines.

Compliance Myth: "You can learn type conversion and coercion from documentation alone"

The myth: Reading the Microsoft KQL documentation is sufficient to learn this concept. The documentation explains the syntax — what more do you need?

The reality: Documentation teaches SYNTAX. This course teaches APPLICATION. The KQL docs show how summarize works with generic examples. This course shows how summarize reveals a password spray pattern hiding in 45,000 sign-in events. The difference between knowing the syntax and knowing WHEN and WHY to use it in a security context is the difference between an analyst who copies queries and one who writes them. Every operator in this course is taught through a security investigation scenario — not abstract data manipulation.

Troubleshooting

"The query returns an error I do not understand." KQL error messages reference the specific line and operator that failed. Read the error message from left to right: it names the operator, the expected input type, and the actual input type. Most errors are type mismatches (passing a string where a datetime is expected) or field name typos. The getschema operator shows every field name and type for any table: TableName | getschema.

"The query runs but returns unexpected results." Add | take 10 after each operator in the pipeline and examine the intermediate output. This reveals WHERE the data transforms in a way you did not expect. Debug the pipeline stage by stage, not the entire query at once.

Decision point

You are writing a KQL query and cannot remember the exact operator syntax. Do you check the Microsoft documentation or ask Claude?

Check the documentation first for SYNTAX questions (exact parameter names, return types, operator behavior). The KQL documentation at learn.microsoft.com/kusto is the authoritative source. Claude may generate plausible but incorrect syntax — especially for newer operators or recently changed behavior. Use Claude for DESIGN questions: 'I need to detect accounts with sign-ins from more than 3 countries in 24 hours — what is the best KQL approach?' Claude excels at designing the query strategy; the documentation excels at providing the exact syntax.

You've learned how KQL processes data.

K0 gave you the query language's place in the Microsoft security stack. K1 took you through the semantics — tables, operators, the pipe model, and why KQL isn't SQL. Now you write the queries that find what attackers hope you miss.

12 modules of query craft — filtering and shaping, joins and unions, time-series analysis, summarisation, string manipulation, and geospatial analysis
68 KQL exercises — every one with a realistic dataset, a reference solution, and a discussion of alternative approaches
K11 — Threat Hunting with KQL — the course's flagship module. Hypothesis-driven methodology, MITRE ATT&CK-aligned hunting across 7 techniques, UEBA composite risk scoring, and retroactive IOC sweeps
K13 Capstone — The Hunting Lab — three complete investigation scenarios requiring every query skill from the course
Hunt management and ROI metrics — the operating model that justifies KQL hunt programs to leadership

Unlock the full course with Premium See Full Syllabus

← Previous Next →