In this module

1.1 The Tabular Data Model — Everything Is a Table

3-4 hours · Module 1 · Free

Operational Objective

The Shape Mismatch: Analysts who learned SQL expect queries to return single values, individual rows, or mixed result types. KQL returns tables — always. Every operator consumes a table and produces a table. Understanding this pipeline model is the foundation for writing queries that produce the results you expect and debugging queries that do not.

Deliverable: Understanding of KQL's tabular pipeline model — every operator takes a table in and produces a table out.

Estimated completion: 25 minutes

The Tabular Data Model — Everything Is a Table

Introduction

Every KQL query starts with a table and ends with a table. This is the single most important concept in KQL — and the one that experienced SQL users most frequently misunderstand. In SQL, a query can return a scalar value, a single row, a set of rows, or a table. In KQL, the output is always a table. A query that returns one number returns a table with one column and one row. A query that returns nothing returns an empty table with column headers but zero rows.

Understanding this matters because every operator in KQL accepts a table as input and produces a table as output. The pipeline chains these operators together: operator 1 produces a table → operator 2 consumes that table and produces a new table → operator 3 consumes that new table. If you understand the input and output shapes at each stage, you can predict the result of any query.

Figure K1.1 — Every KQL operator takes a table as input and produces a table as output. The pipeline reduces data at each step.

What a table is in KQL

A KQL table has:

SigninLogs
| getschema

// How many rows in the last hour?
SigninLogs
| where TimeGenerated > ago(1h)
| count

// What columns exist and what types are they?
SigninLogs
| getschema

// What does a sample row look like?
SigninLogs
| where TimeGenerated > ago(1h)
| take 1

// Step 1: How much data is there?
NewTable
| where TimeGenerated > ago(1h)
| count

// Step 2: What columns exist?
NewTable
| getschema

// Step 3: What does the data look like?
NewTable
| where TimeGenerated > ago(1h)
| take 5

// Step 4: What are the key entities?
NewTable
| where TimeGenerated > ago(1h)
| summarize dcount(AccountName), dcount(DeviceName) 

// Step 5: What are the common values in categorical columns?
NewTable
| where TimeGenerated > ago(1h)
| summarize count() by ActionType
| sort by count_ desc
| take 20

search *
| distinct $table
| sort by $table asc

SigninLogs
| where TimeGenerated > ago(1h)
| take 1
| extend 
    Country = tostring(LocationDetails.countryOrRegion),
    City = tostring(LocationDetails.city),
    Latitude = toreal(LocationDetails.geoCoordinates.latitude),
    Longitude = toreal(LocationDetails.geoCoordinates.longitude),
    OS = tostring(DeviceDetail.operatingSystem),
    Browser = tostring(DeviceDetail.browser),
    FirstAuthMethod = tostring(AuthenticationDetails[0].authenticationMethod),
    FirstAuthResult = tobool(AuthenticationDetails[0].succeeded)

.show table SigninLogs policy retention

Expand for Deeper Context

- Columns — each with a name and a fixed data type (string, datetime, int, long, real, bool, dynamic, timespan, guid, decimal) - Rows — each containing one value per column - No inherent row order — rows in a table have no guaranteed sequence unless you explicitly sort

Security log tables in Sentinel follow this model. SigninLogs is a table. Each row is one sign-in event. Each column is an attribute of that event: TimeGenerated (datetime), UserPrincipalName (string), IPAddress (string), ResultType (string — yes, it is a string even though it contains numbers), Location (dynamic — a JSON object containing country, city, state).

Run this query to see the schema of any table:

The output is itself a table — with columns for ColumnName, ColumnOrdinal, DataType, and ColumnType. Every KQL operation, including schema inspection, produces tabular output.

Tables you will use throughout this course

Security operations in Microsoft environments primarily query these tables:

Table	Source	What each row represents
`SigninLogs`	Entra ID	One interactive user sign-in attempt (success or failure)
`AADNonInteractiveUserSignInLogs`	Entra ID	One non-interactive sign-in (token refresh, background SSO)
`AuditLogs`	Entra ID	One directory change (user created, role assigned, policy modified)
`OfficeActivity`	M365	One user action in Exchange, SharePoint, or OneDrive
`DeviceProcessEvents`	Defender for Endpoint	One process creation on an onboarded device
`DeviceNetworkEvents`	Defender for Endpoint	One network connection from an onboarded device
`DeviceFileEvents`	Defender for Endpoint	One file creation, modification, or deletion on an onboarded device
`DeviceLogonEvents`	Defender for Endpoint	One logon event on an onboarded device
`EmailEvents`	Defender for Office 365	One email message processed
`EmailUrlInfo`	Defender for Office 365	One URL found in an email
`CloudAppEvents`	Defender for Cloud Apps	One SaaS application activity event
`SecurityAlert`	All Defender products	One security alert from any Microsoft security product
`SecurityIncident`	Sentinel	One incident (which may contain multiple alerts)

Each table has dozens of columns. You do not need to memorise them — getschema and the Sentinel documentation are your references. What you need to understand is that every row in every table represents one discrete event, and every column represents one attribute of that event.

The output is always a table

This concept has practical implications:

A count is a table. Run SigninLogs | count and the output is a table with one column (Count) and one row (the number). You can pipe this into further operations — SigninLogs | count | where Count > 1000 is valid KQL.

A summarize is a table. Run SigninLogs | summarize count() by IPAddress and the output is a table with columns IPAddress and count_. The by clause becomes the row keys. The aggregation function becomes a new column. This output table can be joined, filtered, extended, summarized again — any tabular operation.

A join produces a table. When you join SigninLogs to OfficeActivity, the output is a table with columns from both tables. The column names are preserved — if both tables have a column named TimeGenerated, the output has TimeGenerated (from the left) and TimeGenerated1 (from the right, automatically renamed).

An empty result is a table. A query that matches zero rows still returns the column structure. This matters for detection rules — a scheduled analytics rule that returns zero rows means "no detection." A rule that returns one or more rows means "generate an alert."

Examining table shapes

Before writing any complex query, check the shape of your data:

These three queries — count, schema, sample — are the first thing you run against any table you have not worked with before. They tell you: how much data you are dealing with, what fields are available, and what the data actually looks like.

Table schemas change over time

Microsoft periodically adds, renames, or deprecates columns in security log tables. A query that worked last month may fail this month because a column was renamed. This is particularly common with:

- SigninLogs — Microsoft has added columns for CAE (Continuous Access Evaluation), authentication strength, and cross-tenant access over the past year - DeviceProcessEvents — new columns for SHA256 hashes and process integrity levels - AuditLogs — expanded TargetResources structure for directory role assignments

Defense against schema changes: Always use getschema to verify column names before writing queries against tables you have not queried recently. If a query that previously worked returns an error about a missing column, check whether the column was renamed (common) or deprecated (rare).

Exploring an unfamiliar table

When you encounter a table for the first time — perhaps during an investigation that takes you into CloudAppEvents or AADServicePrincipalSignInLogs — follow this systematic exploration:

These five queries take less than a minute to run and give you a complete understanding of any table. Step 5 is particularly important — the ActionType (or equivalent) column tells you what kinds of events the table records, which determines what security questions it can answer.

Custom tables and the `_CL` suffix

In addition to the built-in tables, your Sentinel workspace may have custom log tables. These are created by Data Collection Rules (DCRs) that ingest data from non-standard sources — on-premises firewalls, custom applications, third-party security tools. Custom tables always end with the _CL suffix: Firewall_CL, ProxyLogs_CL, ApplicationAudit_CL.

Custom tables follow the same tabular model: columns with fixed types, rows representing individual events, no inherent ordering. The queries you learn in this course work identically on custom tables.

To see all tables in your workspace:

This returns every table that contains data. It is the starting point for discovering what data sources are available for investigation and detection.

The dynamic type — JSON inside your tables

The dynamic type deserves special attention because it contains most of the investigation-critical data in security logs. Columns like LocationDetails, DeviceDetail, AuthenticationDetails, ConditionalAccessPolicies, and AdditionalFields are all dynamic — they store JSON objects or arrays inside a single column.

Accessing data inside dynamic columns requires dot notation for objects and bracket notation for arrays:

Three critical rules for dynamic columns:

Rule 1: Always wrap dynamic access in a type conversion function. LocationDetails.countryOrRegion returns a dynamic value, not a string. Comparing it with == may fail silently. tostring(LocationDetails.countryOrRegion) converts it to a string that works with all string operators.

Rule 2: Array access uses zero-based indexing. AuthenticationDetails[0] is the first authentication step. AuthenticationDetails[1] is the second (typically MFA). Index out of bounds returns null — it does not error.

Rule 3: Nested access chains from left to right. LocationDetails.geoCoordinates.latitude drills into the LocationDetails object, then into the nested geoCoordinates object, then extracts the latitude field. Each dot traverses one level of nesting.

Why some columns look like strings but are not

A common frustration: you see a value in the query results that looks like a simple string — "50126" in the ResultType column — but your comparison where ResultType == 50126 returns no results. The value is stored as the string "50126", not the integer 50126. String-to-integer comparison fails silently in KQL — it does not produce an error, it simply matches zero rows.

The fix: always check the schema first (getschema), and match the type in your comparison. If ResultType is a string, compare with a string: where ResultType == "50126". If you need it as an integer for arithmetic, convert explicitly: extend ResultCode = toint(ResultType).

This is not a minor inconvenience — it is the source of the most common detection rule bug. A rule that filters on where ResultType == 0 instead of where ResultType == "0" silently matches zero events and generates zero alerts. The rule appears to be working (no errors), but it is blind. Every KQL practitioner encounters this bug at least once. Schema awareness prevents it.

Table retention and its impact on queries

Every table in your Sentinel workspace has a retention period — how long data is stored before being deleted. Retention varies by table:

- SecurityEvent, Syslog: Typically 90 days (configurable up to 730 days) - SigninLogs: 30 days default on Log Analytics, up to 730 with extended retention - DeviceProcessEvents: 30 days in advanced hunting, 90 days in Sentinel - AuditLogs: 30 days default

When you query SigninLogs | where TimeGenerated > ago(90d) on a workspace with 30-day retention, you get 30 days of data — not an error. KQL does not warn you that the time range exceeds retention. It simply returns whatever data exists, which may be far less than you expected.

Investigation implication: If you are investigating an incident that started 45 days ago but SigninLogs retention is 30 days, the first 15 days of adversary activity are gone. This is why incident responders immediately export relevant logs to long-term storage (Azure Blob, Log Analytics archive tier) during the initial response phase — before retention deletes the evidence.

Check retention for any table you query:

Or in the Azure portal: Log Analytics workspace → Tables → select table → Manage table → Retention settings.

Try it yourself

Run getschema against three tables you use regularly

Run getschema against three tables you use regularly. For each, identify:

How many columns does the table have?
How many columns are string type vs dynamic type?
Which column contains the primary entity (user, device, IP) for that table?

Understanding the schema before writing queries prevents the most common class of errors: referencing columns that do not exist, treating string columns as integers, and missing data buried in dynamic (JSON) columns.

Check your understanding

1. You run SigninLogs | where ResultType == 0 | count and get the result 14,523. What is the shape of the output?

A table with one column (Count) and one row (14523). Even a single number is returned as a tabular result in KQL. This means you can pipe the count into further operations — for example, you could join this count to another query or use it in a conditional. The output is never a bare scalar; it is always a table.

A single number: 14523

A table with one row containing all columns from SigninLogs

In KQL, every output is a table. count produces a table with one column and one row. Understanding this principle is foundational to understanding how operators chain together.

Compliance Myth: "KQL is just SQL with different syntax"

The myth: If you know SQL, you know KQL. Just swap SELECT for project and FROM for the table name.

The reality: KQL and SQL share superficial similarities but have fundamentally different execution models. SQL is set-based (define WHAT you want, the engine decides HOW). KQL is pipeline-based (define the SEQUENCE of transformations, each operating on the previous output). SQL subqueries and nested JOINs have no direct KQL equivalent. KQL's let statements and tabular pipeline produce different query patterns than SQL. Analysts who try to write KQL like SQL produce verbose, slow, incorrect queries. Learn KQL as a NEW language, not a SQL dialect.

Troubleshooting

"My query returns a table with columns but zero rows." The pipeline produced an empty table — one of the where filters eliminated all rows. Debug by removing filters one at a time from the bottom up: does the query return rows without the last where? If yes: that filter is too restrictive. Check for field name typos, case-sensitivity issues, or incorrect comparison operators.

Decision point

You are writing a KQL query and cannot remember the exact operator syntax. Do you check the Microsoft documentation or ask Claude?

Check the documentation first for SYNTAX questions (exact parameter names, return types, operator behavior). The KQL documentation at learn.microsoft.com/kusto is the authoritative source. Claude may generate plausible but incorrect syntax — especially for newer operators or recently changed behavior. Use Claude for DESIGN questions: 'I need to detect accounts with sign-ins from more than 3 countries in 24 hours — what is the best KQL approach?' Claude excels at designing the query strategy; the documentation excels at providing the exact syntax.

You've learned how KQL processes data.

K0 gave you the query language's place in the Microsoft security stack. K1 took you through the semantics — tables, operators, the pipe model, and why KQL isn't SQL. Now you write the queries that find what attackers hope you miss.

12 modules of query craft — filtering and shaping, joins and unions, time-series analysis, summarisation, string manipulation, and geospatial analysis
68 KQL exercises — every one with a realistic dataset, a reference solution, and a discussion of alternative approaches
K11 — Threat Hunting with KQL — the course's flagship module. Hypothesis-driven methodology, MITRE ATT&CK-aligned hunting across 7 techniques, UEBA composite risk scoring, and retroactive IOC sweeps
K13 Capstone — The Hunting Lab — three complete investigation scenarios requiring every query skill from the course
Hunt management and ROI metrics — the operating model that justifies KQL hunt programs to leadership

Unlock the full course with Premium See Full Syllabus

← Previous Next →