2.6 Exam Relevance and References
SC-200 Exam Relevance
KQL appears across every domain of the SC-200 exam. The exam does not test you on KQL syntax in isolation — it tests whether you can use KQL to solve investigation problems. Expect questions that present a scenario and ask which query or operator achieves the stated goal.
What the exam tests
Build KQL statements — write queries using where, summarize, extend, project, join, and let. The exam expects you to read a query and identify what it does, or choose the correct query for a given investigation task.
Analyze data by using KQL — use time functions, aggregation, and string operators to answer security questions. This includes interpreting query output and knowing which table contains the data you need.
Use KQL across investigation scenarios — the exam frequently presents a scenario (compromised account, phishing campaign, suspicious device activity) and asks you to select the query that finds the answer.
A common exam format: "A security analyst needs to find users who signed in from more than 3 countries in the last 7 days. Which query achieves this?" You will see 4 options with similar-looking KQL. The differences will be in the operator choice (has vs contains), the aggregation function (count vs dcount), or the join kind (inner vs innerunique). Being precise about what each operator does — not just approximately understanding it — is what the exam rewards.
KQL Quick Reference
Keep this reference open during practice. Every operator and function covered in this module, in one place.
Operators
where — filter rows by conditionproject — select and optionally rename columnsproject-away — remove specific columnsextend — add calculated columns (keeps existing)summarize — aggregate rows by groupsort by — order resultstake / limit — return N rowstop N by — shortcut for sort + takejoin kind=X — combine tables on shared keymv-expand — expand array into rowsrender — visualise results as chartlet — declare variable or sub-queryunion — combine rows from multiple tablesgetschema — show table column names and types
Aggregation Functions
count() — count rows per groupdcount(col) — count distinct valuescountif(condition) — count rows matching conditionmake_set(col) — collect distinct values into arraymake_list(col) — collect all values into arrayarg_max(col, *) — row with maximum valuearg_min(col, *) — row with minimum valuesum(col) — sum numeric columnavg(col) — average numeric columnmin(col) / max(col) — minimum / maximum valuepercentile(col, N) — Nth percentile
String Operators (by preference order)
== — exact match (case-sensitive)=~ — exact match (case-insensitive)has — term match, indexed, fast (default choice)has_cs — term match, case-sensitive, fastesthas_any(list) — matches any term in listhas_all(list) — matches all termscontains — substring scan, slow (partial words only)startswith — prefix matchendswith — suffix matchmatches regex — regex, slowest (last resort)in / !in — value in / not in list
Time Functions
ago(duration) — relative time from nownow() — current timebetween(start .. end) — fixed time rangebin(col, interval) — bucket timestampsdatetime_diff(unit, end, start) — time differencehourofday(ts) — hour 0-23dayofweek(ts) — day of weekstartofday(ts) — midnight of that dayformat_datetime(ts, format) — formatted output
Parsing Functions
parse_json(string) — parse JSON string to dynamicparse_url(string) — decompose URL into componentssplit(string, delimiter) — split string into arrayextract(regex, group, string) — regex extractiontostring(dynamic) — convert dynamic to stringtoint(string) / tolong(string) — convert to integertodatetime(string) — convert to datetime
References
- KQL language reference — Microsoft Learn
- Advanced hunting schema — Microsoft Defender XDR
- Sentinel data source schema reference
- Log Analytics demo environment — practise queries with real-looking data
Module 2 — Key takeaways
- KQL queries are pipelines — data flows through operators separated by pipes, each transforming the data
- Always filter on TimeGenerated first — it allows the engine to skip data partitions
- Default to
hasfor string matching. Usecontainsonly for partial-word substrings summarizewithcount(),dcount(), andmake_set()answers most investigation questionsarg_max(TimeGenerated, *)returns the most recent row per group — the most common patternmv-expandis essential for working with JSON arrays in security tables- Build queries incrementally — run after each line, check output, add the next operator
- Brute force: summarize by user. Password spray: summarize by IP. Same operators, different grouping
letstatements make detection rules tuneable without rewriting query logic- Always check both SigninLogs and AADNonInteractiveUserSignInLogs during compromise investigations
Module 2 — Final knowledge check
1. Write the correct order of operations for an efficient investigation query.
2. A colleague writes: where SenderFromAddress contains "phish". What would you recommend instead?
3. You need to find all users who signed in from more than 3 countries in a single day. Which combination of operators achieves this?
4. The ConditionalAccessPolicies field contains a JSON array. You need to find which specific policy blocked a sign-in. What is the correct approach?
5. You need to check whether a phishing email recipient subsequently signed in from the same IP as the email sender. Which approach works?
6. An IP shows 50 failed sign-ins against 40 different user accounts in one hour. Which attack pattern is this?
7. What does arg_max(TimeGenerated, *) do when used with summarize ... by UserPrincipalName?
8. You want to detect sign-ins between midnight and 6am. Which time function extracts the hour?
where Hour < 6 to find activity before 6am. bin() buckets timestamps for aggregation (different use case). format_datetime() produces a string (less efficient for numeric comparison).9. What is the performance difference between has and contains on a table with 10 million rows?
10. When should you use an anti join?