2.6 Exam Relevance and References

90 minutes · Module 2 · Free

SC-200 Exam Relevance

KQL appears across every domain of the SC-200 exam. The exam does not test you on KQL syntax in isolation — it tests whether you can use KQL to solve investigation problems. Expect questions that present a scenario and ask which query or operator achieves the stated goal.

What the exam tests

Build KQL statements — write queries using where, summarize, extend, project, join, and let. The exam expects you to read a query and identify what it does, or choose the correct query for a given investigation task.

Analyze data by using KQL — use time functions, aggregation, and string operators to answer security questions. This includes interpreting query output and knowing which table contains the data you need.

Use KQL across investigation scenarios — the exam frequently presents a scenario (compromised account, phishing campaign, suspicious device activity) and asks you to select the query that finds the answer.

Exam tip: operator selection questions

A common exam format: "A security analyst needs to find users who signed in from more than 3 countries in the last 7 days. Which query achieves this?" You will see 4 options with similar-looking KQL. The differences will be in the operator choice (has vs contains), the aggregation function (count vs dcount), or the join kind (inner vs innerunique). Being precise about what each operator does — not just approximately understanding it — is what the exam rewards.

KQL Quick Reference

Keep this reference open during practice. Every operator and function covered in this module, in one place.

Operators

where — filter rows by condition
project — select and optionally rename columns
project-away — remove specific columns
extend — add calculated columns (keeps existing)
summarize — aggregate rows by group
sort by — order results
take / limit — return N rows
top N by — shortcut for sort + take
join kind=X — combine tables on shared key
mv-expand — expand array into rows
render — visualise results as chart
let — declare variable or sub-query
union — combine rows from multiple tables
getschema — show table column names and types

Aggregation Functions

count() — count rows per group
dcount(col) — count distinct values
countif(condition) — count rows matching condition
make_set(col) — collect distinct values into array
make_list(col) — collect all values into array
arg_max(col, *) — row with maximum value
arg_min(col, *) — row with minimum value
sum(col) — sum numeric column
avg(col) — average numeric column
min(col) / max(col) — minimum / maximum value
percentile(col, N) — Nth percentile

String Operators (by preference order)

== — exact match (case-sensitive)
=~ — exact match (case-insensitive)
has — term match, indexed, fast (default choice)
has_cs — term match, case-sensitive, fastest
has_any(list) — matches any term in list
has_all(list) — matches all terms
contains — substring scan, slow (partial words only)
startswith — prefix match
endswith — suffix match
matches regex — regex, slowest (last resort)
in / !in — value in / not in list

Time Functions

ago(duration) — relative time from now
now() — current time
between(start .. end) — fixed time range
bin(col, interval) — bucket timestamps
datetime_diff(unit, end, start) — time difference
hourofday(ts) — hour 0-23
dayofweek(ts) — day of week
startofday(ts) — midnight of that day
format_datetime(ts, format) — formatted output

Parsing Functions

parse_json(string) — parse JSON string to dynamic
parse_url(string) — decompose URL into components
split(string, delimiter) — split string into array
extract(regex, group, string) — regex extraction
tostring(dynamic) — convert dynamic to string
toint(string) / tolong(string) — convert to integer
todatetime(string) — convert to datetime

References

KQL language reference — Microsoft Learn
Advanced hunting schema — Microsoft Defender XDR
Sentinel data source schema reference
Log Analytics demo environment — practise queries with real-looking data

Module 2 — Key takeaways

KQL queries are pipelines — data flows through operators separated by pipes, each transforming the data
Always filter on TimeGenerated first — it allows the engine to skip data partitions
Default to has for string matching. Use contains only for partial-word substrings
summarize with count(), dcount(), and make_set() answers most investigation questions
arg_max(TimeGenerated, *) returns the most recent row per group — the most common pattern
mv-expand is essential for working with JSON arrays in security tables
Build queries incrementally — run after each line, check output, add the next operator
Brute force: summarize by user. Password spray: summarize by IP. Same operators, different grouping
let statements make detection rules tuneable without rewriting query logic
Always check both SigninLogs and AADNonInteractiveUserSignInLogs during compromise investigations

Module 2 — Final knowledge check

1. Write the correct order of operations for an efficient investigation query.

summarize → where → project → sort

Table → where (time first) → where (filters) → extend → summarize → project → sort

project → where → summarize

Start with the table. Filter by time first (performance). Add specific filters. Calculate with extend. Aggregate with summarize. Select output columns with project. Sort for readability. This pattern — filter early, project late — is the foundation of efficient KQL.

2. A colleague writes: where SenderFromAddress contains "phish". What would you recommend instead?

Use has instead — it checks the term index and is significantly faster on large datasets

The query is fine as written

Use == instead

has uses the term index and runs orders of magnitude faster than contains on tables with millions of rows. Since "phish" would be a standalone term in an email address, has is the correct choice. contains scans every character of every string — only use it for partial-word substrings.

3. You need to find all users who signed in from more than 3 countries in a single day. Which combination of operators achieves this?

summarize dcount(Location) by UserPrincipalName, then where dcount > 3

where Location > 3

project Location, count()

dcount() counts distinct values. Summarizing dcount of the country field by user gives you how many different countries each user signed in from. Filtering where that count exceeds 3 identifies potential impossible travel or compromised accounts.

4. The ConditionalAccessPolicies field contains a JSON array. You need to find which specific policy blocked a sign-in. What is the correct approach?

Use where ConditionalAccessStatus == "failure"

Use mv-expand to expand the array, then filter on the individual policy result field

Use parse_json on the ConditionalAccessStatus field

ConditionalAccessStatus tells you the overall result, but not which policy. The ConditionalAccessPolicies array contains each policy with its name and individual result. mv-expand creates one row per policy, letting you filter to find the specific policy that blocked the sign-in.

5. You need to check whether a phishing email recipient subsequently signed in from the same IP as the email sender. Which approach works?

Put both queries in the same where clause

Use a let statement to get phishing sender IPs from EmailEvents, then join to SigninLogs on IP address

Use mv-expand on the email table

This is a cross-table correlation. A let statement captures the phishing email data, then a join matches those IPs against sign-in logs. This let + join pattern is how you trace attack chains across different data sources.

6. An IP shows 50 failed sign-ins against 40 different user accounts in one hour. Which attack pattern is this?

Brute force

Password spray

Token replay

One IP targeting many distinct accounts is the password spray signature. Each account sees only 1-2 failures (staying below lockout thresholds), but the IP reveals the pattern when you summarize by IPAddress and count distinct UserPrincipalNames.

7. What does arg_max(TimeGenerated, *) do when used with summarize ... by UserPrincipalName?

Returns the complete row with the latest timestamp for each user — showing their most recent sign-in with all columns

Returns the maximum timestamp only

Returns all rows for the user with the most sign-ins

arg_max returns the entire row (all columns, indicated by *) that has the maximum value of the specified column. Combined with by UserPrincipalName, it returns one row per user — the row with the most recent TimeGenerated. This is the standard pattern for getting the latest state of each entity.

8. You want to detect sign-ins between midnight and 6am. Which time function extracts the hour?

hourofday(TimeGenerated) — returns 0-23

bin(TimeGenerated, 1h)

format_datetime(TimeGenerated, "HH")

hourofday() returns the integer hour (0-23) from a timestamp. You can then filter with where Hour < 6 to find activity before 6am. bin() buckets timestamps for aggregation (different use case). format_datetime() produces a string (less efficient for numeric comparison).

9. What is the performance difference between has and contains on a table with 10 million rows?

No difference

has uses the term index (seconds); contains scans every character of every string (potentially minutes)

contains is faster because it is simpler

The term index is the key. Kusto automatically indexes terms (words separated by non-alphanumeric characters) in string columns. has looks up the index — like using an index in a book. contains reads every page. On 10 million rows, this is the difference between a 2-second query and a 2-minute query.

10. When should you use an anti join?

When you need all matching rows from both tables

When you want to deduplicate results

When you need rows from the left table that have no match in the right table — exclusion queries

Anti joins find non-matches. Examples: users who received phishing but have not signed in since (no compromise), devices that have not reported in 30 days (potentially offline), IPs sending email that are not in your authorised sender list (potential spoofing).

2.5 Investigation Patterns and Query Organisation →