TH1.9 Advanced Collection: Multi-Table Correlation Patterns

3-4 hours · Module 1 · Free

Operational Objective

The most dangerous M365 attacks cross data source boundaries — the authentication is in SigninLogs, the persistence is in AuditLogs, the data access is in CloudAppEvents, the exfiltration is in DeviceNetworkEvents. Single-table hunting finds indicators. Multi-table correlation finds attack chains. This subsection teaches the join patterns that campaign modules use to connect evidence across the M365 telemetry landscape.

Deliverable: Production-ready KQL join patterns for the six most common multi-table correlations in M365 hunting, with performance optimization guidance.

⏱ Estimated completion: 30 minutes

Single tables find events. Joins find attacks.

An anomalous sign-in in SigninLogs is an indicator. That sign-in followed by an inbox rule creation in CloudAppEvents followed by an OAuth consent in AuditLogs is an attack chain. The difference is the join.

Six multi-table correlation patterns appear repeatedly across the campaign modules. Learn these patterns here. Apply them in TH4 through TH13.

Pattern 1: Authentication + directory change correlation

The most common hunting correlation: did the user who signed in anomalously also make directory changes?

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
// Pattern: anomalous sign-in → directory modification in same window
// Use case: AiTM → MFA registration, role assignment, CA policy change
let anomalyWindow = 24h;
let suspectSignIns = SigninLogs
| where TimeGenerated > ago(7d)
| where ResultType == 0
| where RiskLevelDuringSignIn in ("medium", "high")
| project SignInTime = TimeGenerated,
    UserPrincipalName, IPAddress,
    tostring(LocationDetails.countryOrRegion);
// Join: what directory changes did these users make?
AuditLogs
| where TimeGenerated > ago(7d)
| where Result == "success"
| extend Actor = tostring(InitiatedBy.user.userPrincipalName)
| join kind=inner suspectSignIns
    on $left.Actor == $right.UserPrincipalName
| where TimeGenerated between (
    SignInTime .. (SignInTime + anomalyWindow))
// Directory changes within 24h of a risky sign-in
// Relevant operations: "User registered security info",
//   "Add member to role", "Update conditional access policy",
//   "Consent to application"
| project TimeGenerated, Actor, OperationName,
    SignInTime, IPAddress

Pattern 2: Authentication + cloud app activity correlation

What did the user do after signing in? The sign-in is in SigninLogs. The activity is in CloudAppEvents.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
// Pattern: sign-in from new IP → cloud app activity from that session
// Use case: post-compromise activity mapping
let suspectUsers = SigninLogs
| where TimeGenerated > ago(7d)
| where ResultType == 0
// ... (your indicator filter from the hunt hypothesis)
| distinct UserPrincipalName, IPAddress;
CloudAppEvents
| where TimeGenerated > ago(7d)
| join kind=inner suspectUsers
    on $left.AccountId == $right.UserPrincipalName
| where IPAddress == suspectUsers.IPAddress
// Activity from the same IP as the anomalous sign-in
// Look for: New-InboxRule, Set-Mailbox, file downloads,
//   sharing operations, consent events
| summarize Actions = make_set(ActionType, 20),
    ActionCount = count()
    by AccountId, IPAddress
| where ActionCount > 5
// Users with 5+ actions from the anomalous IP warrant investigation

Pattern 3: Email delivery + authentication correlation

Was a phishing email delivered before the anomalous sign-in? This is the AiTM confirmation pattern.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
// Pattern: phishing email → compromised sign-in within N hours
// Use case: confirming AiTM attack chain
let maxPhishToSignIn = 48h;
let riskySignIns = SigninLogs
| where TimeGenerated > ago(7d)
| where ResultType == 0
| where RiskLevelDuringSignIn in ("medium", "high")
| project SignInTime = TimeGenerated,
    UserPrincipalName, IPAddress;
EmailEvents
| where TimeGenerated > ago(7d)
| where DeliveryAction == "Delivered"
| where ThreatTypes has_any ("Phish", "Malware")
| join kind=inner riskySignIns
    on $left.RecipientEmailAddress == $right.UserPrincipalName
| where SignInTime between (
    TimeGenerated .. (TimeGenerated + maxPhishToSignIn))
// Phishing email delivered → risky sign-in within 48 hours
// This correlation is the strongest single indicator of AiTM
| project EmailTime = TimeGenerated, Subject,
    SenderFromAddress, SignInTime, IPAddress,
    UserPrincipalName

Pattern 4: Cloud activity + endpoint activity correlation

Did the cloud-plane compromise reach the endpoint? The OAuth consent is in AuditLogs. The file download is in CloudAppEvents. The execution is in DeviceProcessEvents.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
// Pattern: cloud compromise → endpoint execution
// Use case: tracking attack pivot from cloud to endpoint
let compromisedUsers = datatable(UPN:string)
["j.morrison@northgateeng.com"];  // From hunt finding
// Cloud activity
let cloudActions = CloudAppEvents
| where TimeGenerated > ago(7d)
| where AccountId in (compromisedUsers)
| project CloudTime = TimeGenerated, ActionType,
    Application, AccountId;
// Endpoint activity for the same user
DeviceProcessEvents
| where TimeGenerated > ago(7d)
| where AccountUpn in (compromisedUsers)
| project EndpointTime = TimeGenerated,
    FileName, ProcessCommandLine,
    DeviceName, AccountUpn
// Manual correlation: examine cloud and endpoint timelines
//   side by side for the compromised user
// Look for: file downloads in cloud followed by execution on endpoint
// Look for: browser-based credential theft followed by tool deployment

The core AiTM detection pattern: token refreshes from IPs that do not match interactive sign-ins.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
// Pattern: interactive baseline → non-interactive anomaly
// Use case: AiTM token replay (TH4 primary detection)
let baselineWindow = 30d;
let detectionWindow = 7d;
let interactiveBaseline = SigninLogs
| where TimeGenerated between (
    ago(baselineWindow + detectionWindow) .. ago(detectionWindow))
| where ResultType == 0 and IsInteractive == true
| summarize KnownIPs = make_set(IPAddress, 30)
    by UserPrincipalName;
AADNonInteractiveUserSignInLogs
| where TimeGenerated > ago(detectionWindow)
| where ResultType == 0
| join kind=inner interactiveBaseline on UserPrincipalName
| where not(IPAddress in (KnownIPs))
// Token refresh from IP not seen in 30-day interactive baseline
// This is the signature of token replay
| summarize RefreshCount = count(),
    Apps = make_set(AppDisplayName, 5)
    by UserPrincipalName, IPAddress
| where RefreshCount > 3
// Multiple refreshes = sustained session from anomalous IP

What happened after an application was consented? The consent is in AuditLogs. The application’s subsequent authentication is in AADServicePrincipalSignInLogs.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
// Pattern: consent event → app sign-in → data access
// Use case: OAuth persistence investigation (TH6)
let recentConsents = AuditLogs
| where TimeGenerated > ago(90d)
| where OperationName == "Consent to application"
| where Result == "success"
| extend AppId = tostring(TargetResources[0].id)
| extend ConsentedBy = tostring(
    InitiatedBy.user.userPrincipalName)
| project ConsentTime = TimeGenerated,
    AppId, ConsentedBy;
AADServicePrincipalSignInLogs
| where TimeGenerated > ago(90d)
| join kind=inner recentConsents on AppId
| where TimeGenerated > ConsentTime
// App sign-ins AFTER consent was granted
| summarize
    SignInCount = count(),
    UniqueIPs = dcount(IPAddress),
    IPs = make_set(IPAddress, 5)
    by AppId, ConsentedBy
// High sign-in count from many IPs post-consent = suspicious
// Low sign-in count from expected IP = likely legitimate app

Performance optimization

Multi-table joins are expensive. Advanced Hunting will timeout on poorly structured joins against large tables.

Pre-filter before joining. Always reduce each table to the minimum rows needed before the join. The let pattern — creating a filtered dataset and then joining against it — is not a style preference. It is a performance necessity. A join between two unfiltered 30-day tables will timeout. A join between a filtered set of 50 suspect users and a 7-day window of activity will complete in seconds.

Use kind=inner for most hunting joins. Inner join returns only rows that match on both sides. This is usually what you want — “show me the cloud activity for users who had anomalous sign-ins.” Use kind=leftouter only when you need to see users who had anomalous sign-ins but no cloud activity (the absence of activity may itself be an indicator).

Avoid join on high-cardinality columns. Joining on IPAddress across two large tables produces massive intermediate result sets. Instead, pre-filter one table to a small set of IPs (suspect IPs from the hypothesis), then join.

Figure TH1.9 — Six multi-table correlation patterns. Each connects two or more M365 data sources to trace attack chains across the telemetry landscape.

Try it yourself

Exercise: Run Pattern 5 in your environment

Pattern 5 (interactive → non-interactive IP mismatch) is the most immediately useful. Run it against your Sentinel workspace or Advanced Hunting.

Examine the results. How many users show non-interactive token refreshes from IPs not in their 30-day interactive baseline? Most will be legitimate (VPN, mobile network switching). Look for users where the non-interactive IP is in a different country from all interactive sign-ins — those warrant enrichment across the five dimensions from TH1.4.

⚠ Compliance Myth: "We can hunt effectively within a single data source"

The myth: Each data source can be hunted independently. Cross-table correlation is an advanced technique for later.

The reality: The most dangerous M365 attacks deliberately cross data source boundaries because they know most detection rules query a single table. AiTM phishing spans EmailEvents (delivery), SigninLogs (authentication), AuditLogs (persistence), and CloudAppEvents (post-compromise). A hunt that examines only SigninLogs finds the anomalous sign-in but misses the phishing email that confirms it and the inbox rule that proves persistence. Single-table hunting is valid for orientation queries (step 1). The indicator, enrichment, and pivot queries (steps 2–4) almost always require multi-table correlation to build sufficient confidence for escalation.

Extend these patterns

The six patterns cover the most common correlations. Campaign modules introduce technique-specific correlations that extend these — for example, TH12 (ransomware pre-encryption) correlates DeviceProcessEvents with DeviceFileEvents and DeviceNetworkEvents in a time-sequenced chain (reconnaissance tool execution → backup disruption → C2 beaconing). The principle is the same: pre-filter each table, join on entity (user, device, IP), and constrain by time window. The specific tables and correlation logic change per technique.

📋 Operational Artifact — Multi-Table Correlation Quick Reference

Pattern → Tables → Join Key → Use Case
Auth + Directory → SigninLogs + AuditLogs → UserPrincipalName → Post-auth persistence
Auth + Cloud App → SigninLogs + CloudAppEvents → AccountId/UPN → Post-auth activity
Email + Auth → EmailEvents + SigninLogs → RecipientEmail/UPN → AiTM confirmation
Cloud + Endpoint → CloudAppEvents + DeviceProcessEvents → UPN → Cloud-to-endpoint pivot
Interactive + Non-Interactive → SigninLogs + AADNonInteractive → UPN → Token replay
Audit + SPN → AuditLogs + AADServicePrincipalSignInLogs → AppId → OAuth persistence

References Used in This Subsection

Microsoft. “KQL Join Operator.” Microsoft Learn. https://learn.microsoft.com/en-us/kusto/query/join-operator
Microsoft. “Advanced Hunting — Query Best Practices.” Microsoft Learn. https://learn.microsoft.com/en-us/defender-xdr/advanced-hunting-best-practices

You're reading the free modules of this course

The full course continues with advanced topics, production detection rules, worked investigation scenarios, and deployable artifacts. Premium subscribers get access to all courses.

View Pricing See Full Syllabus

← TH1.8 Hypothesis Prioritization and Backlog Management TH1.10 Behavioral Baselining Methodology →