2.6 Exam Relevance and References

90 minutes · Module 2 · Free

SC-200 Exam Relevance

KQL appears across every domain of the SC-200 exam. The exam does not test you on KQL syntax in isolation — it tests whether you can use KQL to solve investigation problems. Expect questions that present a scenario and ask which query or operator achieves the stated goal.

What the exam tests

Build KQL statements — write queries using where, summarize, extend, project, join, and let. The exam expects you to read a query and identify what it does, or choose the correct query for a given investigation task.

Analyze data by using KQL — use time functions, aggregation, and string operators to answer security questions. This includes interpreting query output and knowing which table contains the data you need.

Use KQL across investigation scenarios — the exam frequently presents a scenario (compromised account, phishing campaign, suspicious device activity) and asks you to select the query that finds the answer.

Exam tip: operator selection questions

A common exam format: "A security analyst needs to find users who signed in from more than 3 countries in the last 7 days. Which query achieves this?" You will see 4 options with similar-looking KQL. The differences will be in the operator choice (has vs contains), the aggregation function (count vs dcount), or the join kind (inner vs innerunique). Being precise about what each operator does — not just approximately understanding it — is what the exam rewards.

KQL Quick Reference

Keep this reference open during practice. Every operator and function covered in this module, in one place.

Operators

where — filter rows by condition
project — select and optionally rename columns
project-away — remove specific columns
extend — add calculated columns (keeps existing)
summarize — aggregate rows by group
sort by — order results
take / limit — return N rows
top N by — shortcut for sort + take
join kind=X — combine tables on shared key
mv-expand — expand array into rows
render — visualise results as chart
let — declare variable or sub-query
union — combine rows from multiple tables
getschema — show table column names and types

Aggregation Functions

count() — count rows per group
dcount(col) — count distinct values
countif(condition) — count rows matching condition
make_set(col) — collect distinct values into array
make_list(col) — collect all values into array
arg_max(col, *) — row with maximum value
arg_min(col, *) — row with minimum value
sum(col) — sum numeric column
avg(col) — average numeric column
min(col) / max(col) — minimum / maximum value
percentile(col, N) — Nth percentile

String Operators (by preference order)

== — exact match (case-sensitive)
=~ — exact match (case-insensitive)
has — term match, indexed, fast (default choice)
has_cs — term match, case-sensitive, fastest
has_any(list) — matches any term in list
has_all(list) — matches all terms
contains — substring scan, slow (partial words only)
startswith — prefix match
endswith — suffix match
matches regex — regex, slowest (last resort)
in / !in — value in / not in list

Time Functions

ago(duration) — relative time from now
now() — current time
between(start .. end) — fixed time range
bin(col, interval) — bucket timestamps
datetime_diff(unit, end, start) — time difference
hourofday(ts) — hour 0-23
dayofweek(ts) — day of week
startofday(ts) — midnight of that day
format_datetime(ts, format) — formatted output

Parsing Functions

parse_json(string) — parse JSON string to dynamic
parse_url(string) — decompose URL into components
split(string, delimiter) — split string into array
extract(regex, group, string) — regex extraction
tostring(dynamic) — convert dynamic to string
toint(string) / tolong(string) — convert to integer
todatetime(string) — convert to datetime

References

Module 2 — Key takeaways

  • KQL queries are pipelines — data flows through operators separated by pipes, each transforming the data
  • Always filter on TimeGenerated first — it allows the engine to skip data partitions
  • Default to has for string matching. Use contains only for partial-word substrings
  • summarize with count(), dcount(), and make_set() answers most investigation questions
  • arg_max(TimeGenerated, *) returns the most recent row per group — the most common pattern
  • mv-expand is essential for working with JSON arrays in security tables
  • Build queries incrementally — run after each line, check output, add the next operator
  • Brute force: summarize by user. Password spray: summarize by IP. Same operators, different grouping
  • let statements make detection rules tuneable without rewriting query logic
  • Always check both SigninLogs and AADNonInteractiveUserSignInLogs during compromise investigations

Module 2 — Final knowledge check

1. Write the correct order of operations for an efficient investigation query.

summarize → where → project → sort
Table → where (time first) → where (filters) → extend → summarize → project → sort
project → where → summarize

2. A colleague writes: where SenderFromAddress contains "phish". What would you recommend instead?

Use has instead — it checks the term index and is significantly faster on large datasets
The query is fine as written
Use == instead

3. You need to find all users who signed in from more than 3 countries in a single day. Which combination of operators achieves this?

summarize dcount(Location) by UserPrincipalName, then where dcount > 3
where Location > 3
project Location, count()

4. The ConditionalAccessPolicies field contains a JSON array. You need to find which specific policy blocked a sign-in. What is the correct approach?

Use where ConditionalAccessStatus == "failure"
Use mv-expand to expand the array, then filter on the individual policy result field
Use parse_json on the ConditionalAccessStatus field

5. You need to check whether a phishing email recipient subsequently signed in from the same IP as the email sender. Which approach works?

Put both queries in the same where clause
Use a let statement to get phishing sender IPs from EmailEvents, then join to SigninLogs on IP address
Use mv-expand on the email table

6. An IP shows 50 failed sign-ins against 40 different user accounts in one hour. Which attack pattern is this?

Brute force
Password spray
Token replay

7. What does arg_max(TimeGenerated, *) do when used with summarize ... by UserPrincipalName?

Returns the complete row with the latest timestamp for each user — showing their most recent sign-in with all columns
Returns the maximum timestamp only
Returns all rows for the user with the most sign-ins

8. You want to detect sign-ins between midnight and 6am. Which time function extracts the hour?

hourofday(TimeGenerated) — returns 0-23
bin(TimeGenerated, 1h)
format_datetime(TimeGenerated, "HH")

9. What is the performance difference between has and contains on a table with 10 million rows?

No difference
has uses the term index (seconds); contains scans every character of every string (potentially minutes)
contains is faster because it is simpler

10. When should you use an anti join?

When you need all matching rows from both tables
When you want to deduplicate results
When you need rows from the left table that have no match in the right table — exclusion queries