6.8 Real-World Query Building Exercises

8-12 hours · Module 6 · Free

Real-World Query Building Exercises

Introduction

This subsection contains five investigation scenarios of increasing difficulty. Each gives you a situation description and asks you to construct a query from a blank editor. There are no starter queries, no hints in the prompt, and no partial solutions — just the scenario and the question.

This is the test of whether the module worked. If you can write these queries from scratch, you have KQL fluency. If you struggle, revisit the specific subsection that covers the pattern you need.

Open your Sentinel Logs editor alongside this page. For each scenario, write and run the query before revealing the solution.

Scenario 1: Identify the scope of a password spray

Situation: Your SIEM fires an alert: “Password spray detected from IP range 203.0.113.0/24.” You need to determine how many users were targeted, how many were compromised, and which applications were accessed.

Your task: Write a query that shows, for each IP in the 203.0.113.x range over the last 7 days: the total failure count, the number of unique targeted users, whether any sign-in succeeded, and which applications were accessed by successful sign-ins.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
let attackRange = "203.0.113.";
SigninLogs
| where TimeGenerated > ago(7d)
| where IPAddress startswith attackRange
| summarize
    TotalAttempts = count(),
    Failures = countif(ResultType != "0"),
    Successes = countif(ResultType == "0"),
    TargetedUsers = dcount(UserPrincipalName),
    CompromisedUsers = dcountif(UserPrincipalName, ResultType == "0"),
    Apps = make_set(iff(ResultType == "0", AppDisplayName, ""), 10)
    by IPAddress
| extend Apps = set_difference(Apps, dynamic([""]))
| order by Successes desc, Failures desc

Key technique: dcountif() counts distinct values only when a condition is met — it counts unique users that had successful sign-ins, not just unique users who were targeted. set_difference removes the empty string from the Apps list (which was added by the iff for failed sign-ins).

Scenario 2: Trace a phishing email to compromise

Situation: You receive a user report: “I clicked a link in an email from support@northgate-voicemail.com about 2 hours ago and entered my password on what I now think was a fake page.” The user is j.morrison.

Your task: Write a query that finds the phishing email, then correlates it with any sign-in activity for j.morrison from a new IP address within 1 hour of the email delivery. Show the email time, sign-in time, IP address, country, and application.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
let targetUser = "j.morrison";
EmailEvents
| where TimeGenerated > ago(4h)
| where SenderFromAddress has "northgate-voicemail.com"
| where RecipientEmailAddress has targetUser
| project EmailTime = TimeGenerated, RecipientEmailAddress, Subject
| join kind=inner (
    SigninLogs
    | where TimeGenerated > ago(4h)
    | where UserPrincipalName has targetUser
    | where ResultType == "0"
    | extend Country = tostring(LocationDetails.countryOrRegion)
    | project SigninTime = TimeGenerated, UserPrincipalName, IPAddress, Country, AppDisplayName
) on $left.RecipientEmailAddress == $right.UserPrincipalName
| where SigninTime between (EmailTime .. (EmailTime + 1h))
| project EmailTime, SigninTime, IPAddress, Country, AppDisplayName, Subject

Key technique: The join with a time window (between) connects the email delivery to the subsequent sign-in. This is the email-to-compromise correlation from subsection 6.3 applied to a real scenario.

Situation: Before investigating whether a sign-in from Russia is suspicious for j.morrison, you need to establish their normal sign-in countries over the last 30 days.

Your task: Write a query showing each country j.morrison has signed in from in the last 30 days, the number of sign-ins per country, the first and last sign-in from that country, and the percentage of total sign-ins each country represents.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
let targetUser = "j.morrison";
let totalSignins = toscalar(
    SigninLogs
    | where TimeGenerated > ago(30d)
    | where UserPrincipalName has targetUser
    | where ResultType == "0"
    | count);
SigninLogs
| where TimeGenerated > ago(30d)
| where UserPrincipalName has targetUser
| where ResultType == "0"
| extend Country = tostring(LocationDetails.countryOrRegion)
| summarize
    SigninCount = count(),
    FirstSeen = min(TimeGenerated),
    LastSeen = max(TimeGenerated)
    by Country
| extend PercentOfTotal = round(SigninCount * 100.0 / totalSignins, 1)
| order by SigninCount desc

Key technique: toscalar() converts a single-value query result into a scalar variable that can be used in calculations. This lets you compute percentages — “Russia represents 0.1% of j.morrison’s sign-ins” is more informative than “Russia had 1 sign-in.”

Scenario 4: Find accounts accessed from both a known-bad IP and a legitimate IP

Situation: During an investigation, you identified 198.51.100.44 as an attacker IP. You need to find users who were accessed from BOTH this attacker IP AND their normal IP — indicating the attacker successfully used their credentials alongside the legitimate user.

Your task: Write a query that finds users who had successful sign-ins from 198.51.100.44 AND from at least one other IP in the last 7 days. Show the user, attacker sign-in time, legitimate sign-in time, and both IPs.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
let attackerIP = "198.51.100.44";
let compromisedUsers =
    SigninLogs
    | where TimeGenerated > ago(7d)
    | where ResultType == "0"
    | where IPAddress == attackerIP
    | distinct UserPrincipalName;
SigninLogs
| where TimeGenerated > ago(7d)
| where ResultType == "0"
| where UserPrincipalName in (compromisedUsers)
| extend IsAttacker = iff(IPAddress == attackerIP, "Attacker", "Legitimate")
| summarize
    AttackerSignins = countif(IsAttacker == "Attacker"),
    LegitSignins = countif(IsAttacker == "Legitimate"),
    AttackerIPs = make_set(iff(IsAttacker == "Attacker", IPAddress, ""), 5),
    LegitIPs = make_set(iff(IsAttacker == "Legitimate", IPAddress, ""), 5)
    by UserPrincipalName
| where LegitSignins > 0
| project UserPrincipalName, AttackerSignins, LegitSignins, AttackerIPs, LegitIPs

Key technique: A two-step approach using let to identify compromised users, then re-querying to see ALL their sign-in activity. The iff() within make_set() splits IPs into attacker vs legitimate groups. Users with both attacker and legitimate sign-ins confirm concurrent access — the hallmark of a token replay attack.

Scenario 5: Build a SOC shift-start dashboard query

Situation: You are starting your SOC shift. Write a single query that gives you a health overview: for each data source table in the workspace, show the event count in the last 24 hours, the time since the last event, and flag any table where the last event is more than 2 hours old.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
union withsource=TableName *
| where TimeGenerated > ago(24h)
| summarize
    EventCount = count(),
    LastEvent = max(TimeGenerated)
    by TableName
| extend MinutesSinceLastEvent = datetime_diff('minute', now(), LastEvent)
| extend Status = iff(MinutesSinceLastEvent > 120, "STALE - INVESTIGATE", "Healthy")
| order by MinutesSinceLastEvent desc
| project TableName, EventCount, LastEvent, MinutesSinceLastEvent, Status

Key technique: union withsource=TableName * combines all tables and adds a column identifying which table each row came from. The iff() classifies tables as healthy or stale. Any table showing “STALE” may have a disconnected connector — this query catches data gaps before they affect an investigation.

Check your understanding

1. You completed all 5 scenarios. Which KQL concepts from earlier subsections did you use most frequently?

let for parameterization, where for filtering, summarize with multiple aggregation functions, extend with iff() for classification, and join for cross-table correlation. These five constructs appear in every investigation query. If any felt unfamiliar during the exercises, revisit the subsection that covers it.

search and render

extract and parse

order by and take

The core investigation toolkit is let + where + summarize + extend + join. String functions and visualization are important supporting skills, but these five are the foundation of every query you will write in Modules 9-15.

← 6.7 KQL Performance Optimization and Query Debugging 6.9 Module Summary →