TH0.10 M365 Data Sources for Hunting

3-4 hours · Module 0 · Free

Operational Objective

Every hunt query targets a specific data table. If you do not know what each table contains, what it records, and what it misses, you cannot scope hunts effectively, you cannot interpret results accurately, and you will miss evidence that exists in a table you did not think to query. This subsection is the reference guide to the M365 telemetry tables used throughout this course — what each one records, what hunting questions it answers, and where the blind spots are.

Deliverable: A working knowledge of every M365 data source relevant to threat hunting, including what each table captures, what it does not capture, and which hunt campaigns depend on it.

⏱ Estimated completion: 30 minutes

The tables that matter

You do not need to memorize every column in every table. You need to know which table to query for which question, what the table records, and — critically — what it does not record. The gaps in telemetry are as important as the content, because a hunt that queries a table missing the relevant data produces a false negative: “we looked and found nothing” when the truth is “we looked in the wrong place.”

Identity and authentication tables

SigninLogs — interactive user sign-ins. Every time a user opens a browser and authenticates to Entra ID, the event appears here. Contains: user principal name, IP address, location (country, city), device details (OS, browser), conditional access evaluation results, risk level, MFA requirement and method, authentication protocol, application accessed, result code.

What it answers: Where are users signing in from? Which users are authenticating from new locations? Which sign-ins bypassed MFA? Which conditional access policies applied?

What it misses: Application-based sign-ins and token refreshes — those go to AADNonInteractiveUserSignInLogs. Service principal authentication — that goes to AADServicePrincipalSignInLogs. A hunt that only queries SigninLogs misses the entire token replay attack surface.

Hunt campaigns that use it: TH4 (authentication anomalies), TH7 (privilege escalation), TH10 (lateral movement).
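
As an illustration of the MFA question above, the following is a minimal sketch rather than a production detection. It assumes the standard SigninLogs columns in a Sentinel workspace (ResultType, AuthenticationRequirement, AppDisplayName) and an arbitrary 7-day lookback; adjust both to your environment.

```kql
// Sketch: successful interactive sign-ins that completed with a single factor.
// Assumptions: ResultType "0" indicates success, and AuthenticationRequirement
// is populated in your tenant. The 7-day window is illustrative only.
SigninLogs
| where TimeGenerated > ago(7d)
| where ResultType == "0"
| where AuthenticationRequirement == "singleFactorAuthentication"
| summarize
    SignIns = count(),
    Apps = make_set(AppDisplayName, 10),
    IPs = make_set(IPAddress, 10)
    by UserPrincipalName, Location
| sort by SignIns desc
```

Expect legitimate results here (for example, sign-ins covered by trusted-location exclusions); the output is a starting list to review against your conditional access design, not a list of incidents.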

AADNonInteractiveUserSignInLogs — token refreshes and application-based sign-ins. When an application uses a refresh token to obtain a new access token, the event appears here. This is where AiTM token replay is visible — the attacker’s stolen refresh token generating new access tokens from the attacker’s IP.

What it answers: Which IPs are refreshing tokens for each user? Are refresh events coming from IPs that differ from the user’s interactive sign-in IPs? Which applications are using tokens most frequently?

What it misses: The initial interactive authentication — that is in SigninLogs. The actual data access performed with the token — that is in CloudAppEvents or application-specific audit logs.

Hunt campaigns: TH4 (primary table for token replay detection), TH6 (OAuth app sign-in patterns).

// Quick check: are both authentication tables ingested?
// Both are required for comprehensive identity hunting
// isfuzzy=true keeps the query running if one table does not exist in the workspace
union isfuzzy=true
    (SigninLogs | where TimeGenerated > ago(1d)
    | summarize Count = count() | extend Table = "SigninLogs"),
    (AADNonInteractiveUserSignInLogs | where TimeGenerated > ago(1d)
    | summarize Count = count()
    | extend Table = "AADNonInteractiveUserSignInLogs")
| project Table, Count
// If AADNonInteractiveUserSignInLogs returns 0 or no row at all, AiTM token replay is invisible
// This is the single most impactful data source gap in M365 hunting
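
Once both tables are present, the question "are refresh events coming from IPs that differ from the user's interactive sign-in IPs?" can be sketched as an anti-join. This is a hedged starting point, not the course's TH4 query: the 7-day lookback is an assumption, and the results will include benign noise such as mobile carriers, VPN egress changes, and IPv6 rotation.

```kql
// Sketch: non-interactive (token refresh) activity from IPs the same user
// never used for a successful interactive sign-in in the same window.
// Assumptions: 7-day lookback; ResultType "0" indicates success.
let lookback = 7d;
let interactiveIPs =
    SigninLogs
    | where TimeGenerated > ago(lookback)
    | where ResultType == "0"
    | extend User = tolower(UserPrincipalName)
    | distinct User, IPAddress;
AADNonInteractiveUserSignInLogs
| where TimeGenerated > ago(lookback)
| where ResultType == "0"
| extend User = tolower(UserPrincipalName)
| join kind=leftanti interactiveIPs on User, IPAddress
| summarize
    Events = count(),
    Apps = make_set(AppDisplayName, 10),
    FirstSeen = min(TimeGenerated),
    LastSeen = max(TimeGenerated)
    by User, IPAddress
| sort by Events desc
```

The anti-join keeps only user and IP pairs that have no interactive counterpart, which is the pattern a replayed token would produce. It cannot distinguish an attacker from a legitimate device that simply has not prompted the user recently.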

AADServicePrincipalSignInLogs — service principal authentication. Applications that authenticate with their own credentials (client secret or certificate) rather than on behalf of a user. This is where compromised application credentials appear.

What it answers: Which service principals are authenticating? From which IPs? How frequently? Has the authentication pattern changed?

What it misses: What the application does after authentication — that requires correlating with CloudAppEvents or MicrosoftGraphActivityLogs.

Hunt campaigns: TH6 (OAuth app abuse — post-consent behavior).
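
The "has the authentication pattern changed?" question can be approximated by comparing recent source IPs against a baseline. The sketch below assumes a 14-day baseline and a 1-day comparison window, both arbitrary, and uses the ServicePrincipalId, ServicePrincipalName, IPAddress, and ResourceDisplayName columns from the table schema.

```kql
// Sketch: service principals authenticating from an IP not seen in the
// preceding baseline period. Window lengths are illustrative assumptions.
let baseline =
    AADServicePrincipalSignInLogs
    | where TimeGenerated between (ago(14d) .. ago(1d))
    | distinct ServicePrincipalId, IPAddress;
AADServicePrincipalSignInLogs
| where TimeGenerated > ago(1d)
| join kind=leftanti baseline on ServicePrincipalId, IPAddress
| summarize
    SignIns = count(),
    Resources = make_set(ResourceDisplayName, 10)
    by ServicePrincipalName, ServicePrincipalId, IPAddress
| sort by SignIns desc
```

Service principals hosted on cloud infrastructure with rotating egress addresses will appear here routinely, so treat a hit as a prompt to check what the application did next rather than as a finding on its own.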

AuditLogs (Entra ID) — directory changes. User creation, deletion, and modification. Group membership changes. Role assignments. Application consent. Conditional access policy changes. MFA method registration.

What it answers: Who changed what in the directory? Were any roles assigned outside PIM? Were any conditional access policies weakened? Were any new MFA methods registered?

What it misses: Authentication events (those are in SigninLogs). Data access events (those are in CloudAppEvents). On-premises AD changes (those are in IdentityDirectoryEvents if Defender for Identity is deployed).

Hunt campaigns: TH6 (consent events), TH7 (role and policy changes), TH4 (MFA registration correlation).
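
A minimal sketch of how these questions translate into a query follows. The OperationName values listed are ones commonly seen in Entra ID audit data, but they are an assumption here: confirm the exact strings in your own tenant (for example with `AuditLogs | distinct OperationName`) before relying on the filter.

```kql
// Sketch: recent directory changes relevant to the questions above.
// Assumption: the OperationName strings below match your tenant's audit data.
AuditLogs
| where TimeGenerated > ago(7d)
| where OperationName in (
    "Add member to role",
    "Update conditional access policy",
    "Consent to application",
    "User registered security info")
| extend ActorUser = tostring(InitiatedBy.user.userPrincipalName)
| extend ActorApp = tostring(InitiatedBy.app.displayName)
| extend Actor = iff(isnotempty(ActorUser), ActorUser, ActorApp)
| extend TargetUpn = tostring(TargetResources[0].userPrincipalName)
| extend TargetName = tostring(TargetResources[0].displayName)
| extend Target = iff(isnotempty(TargetUpn), TargetUpn, TargetName)
| project TimeGenerated, OperationName, Actor, Target, Result
| sort by TimeGenerated desc
```

InitiatedBy and TargetResources are dynamic columns, which is why the actor and target are extracted explicitly; only the first target resource is shown, so events with several targets need a closer look at the raw record.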

Cloud application and email tables

CloudAppEvents — Defender for Cloud Apps telemetry. The richest M365 hunting table for cloud-plane activity. Records Exchange Online operations (inbox rules, mail forwarding, message access), SharePoint/OneDrive file operations, Teams activity, Power Platform operations, and third-party SaaS activity visible to Defender for Cloud Apps.

What it answers: What did the user do after signing in? Were inbox rules created? Were files downloaded from SharePoint? Were sharing links created? What applications accessed the data?

What it misses: Authentication events (SigninLogs). Email content and delivery details (EmailEvents). Endpoint activity (Device* tables). If Defender for Cloud Apps is not connected, this table is empty — and a significant portion of your cloud hunting surface is dark.

Hunt campaigns: TH5 (inbox rules), TH8 (data exfiltration), TH11 (shadow IT), TH13 (insider threat).
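
To make the inbox rule question concrete, the sketch below lists rule creation and modification events. It assumes CloudAppEvents is streamed to a Sentinel workspace (hence TimeGenerated; in Defender Advanced Hunting the time column is Timestamp) and that the ActionType values shown are the ones your tenant records.

```kql
// Sketch: inbox rule creation and modification recorded by Defender for Cloud Apps.
// Assumptions: ActionType strings match your tenant; 7-day lookback is illustrative.
CloudAppEvents
| where TimeGenerated > ago(7d)
| where ActionType in ("New-InboxRule", "Set-InboxRule", "UpdateInboxRules")
| project TimeGenerated, AccountDisplayName, ActionType, IPAddress, CountryCode, RawEventData
| sort by TimeGenerated desc
```

The rule's conditions and actions (for example, a forward-to address or a move-to-folder target) sit inside RawEventData, so the projection keeps that column for manual inspection.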

EmailEvents — email delivery telemetry from Defender for Office 365. Records email delivery actions, threat detections, sender/recipient, subject, delivery location.

What it answers: Was a phishing email delivered to a user before an anomalous sign-in? What emails did a compromised account send? Were phishing emails sent from a compromised internal account?

What it misses: Message bodies (URL and attachment metadata live in EmailUrlInfo and EmailAttachmentInfo). Post-delivery user actions on the email. Inbox rule processing after delivery.

Hunt campaigns: TH4 (phishing correlation), TH5 (post-compromise email activity).
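
The first question, whether a phishing email reached a user shortly before a sign-in, can be sketched as a join between EmailEvents and SigninLogs. The 4-hour correlation window is an arbitrary assumption, and the join assumes the recipient address equals the user principal name, which is not true in every tenant.

```kql
// Sketch: successful sign-ins within 4 hours after a phishing-classified
// email was delivered to the same user.
// Assumptions: recipient address == UPN; 4h window and 7d lookback are illustrative.
let phish =
    EmailEvents
    | where TimeGenerated > ago(7d)
    | where ThreatTypes has "Phish" and DeliveryAction == "Delivered"
    | project
        EmailTime = TimeGenerated,
        User = tolower(RecipientEmailAddress),
        SenderFromAddress,
        Subject;
SigninLogs
| where TimeGenerated > ago(7d)
| where ResultType == "0"
| extend User = tolower(UserPrincipalName)
| join kind=inner phish on User
| where TimeGenerated between (EmailTime .. (EmailTime + 4h))
| project EmailTime, SignInTime = TimeGenerated, User, SenderFromAddress, Subject, IPAddress, Location, AppDisplayName
| sort by EmailTime asc
```

Most rows will be ordinary sign-ins that happen to follow an email, so the output is most useful when filtered further, for example to sign-ins from a location the user has not used before.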

Endpoint tables

DeviceProcessEvents — process execution on Defender for Endpoint-managed devices. Every process creation, with parent process, command line, file hash, user context, and timestamp.

What it answers: What processes executed? What were the command lines? What parent processes spawned them? Are there unusual process trees?
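
As one example of an unusual process tree, the sketch below looks for Office applications spawning script interpreters or shells. The parent and child process lists are illustrative assumptions, not an exhaustive set, and the query assumes the table is available in Sentinel with a TimeGenerated column.

```kql
// Sketch: Office applications launching shells or script hosts.
// Assumption: the process name lists are examples only; extend them as needed.
DeviceProcessEvents
| where TimeGenerated > ago(7d)
| where InitiatingProcessFileName in~ ("winword.exe", "excel.exe", "outlook.exe", "powerpnt.exe")
| where FileName in~ ("powershell.exe", "pwsh.exe", "cmd.exe", "wscript.exe", "cscript.exe", "mshta.exe")
| project TimeGenerated, DeviceName, AccountName, InitiatingProcessFileName, FileName, ProcessCommandLine
| sort by TimeGenerated desc
```

The `in~` operator makes the comparison case-insensitive, which matters because process file names are not consistently cased across devices.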

DeviceFileEvents — file creation, modification, deletion, and rename events on endpoints.

DeviceRegistryEvents — registry key creation, modification, and deletion. Critical for persistence detection — autostart entries, service creation, scheduled task registration.

DeviceNetworkEvents — network connections from endpoints. Destination IP, port, protocol, process that initiated the connection.

DeviceLogonEvents — logon events on endpoints. Local and remote (RDP, network) logon types.

These five tables collectively provide the endpoint hunting surface. They are required for TH9 (endpoint persistence), TH10 (lateral movement), and TH12 (ransomware pre-encryption).

What they miss: Cloud-plane activity (identity, email, SaaS). An attacker operating entirely through the browser or Graph API without dropping files or executing processes on the endpoint generates zero events in these tables.

Supplementary tables

IdentityLogonEvents / IdentityDirectoryEvents — Defender for Identity telemetry. On-premises Active Directory logon and directory change events. Required for hybrid hunting in TH10. If not ingested, cloud-to-on-prem pivot detection has no on-prem visibility.

MicrosoftGraphActivityLogs — Graph API call activity. What applications and users are doing through the API. Relatively new (2024). If ingested, dramatically enriches TH5 (inbox rule creation via Graph) and TH6 (post-consent data access via Graph). If not ingested, these API-based attack paths are invisible.
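
If the table is ingested, inbox rule changes made through Graph can be surfaced with a query along these lines. This is a sketch under assumptions: it relies on the Graph mail rules path containing the segment "messageRules" and treats POST and PATCH as the write methods of interest.

```kql
// Sketch: Graph API calls that create or modify inbox rules.
// Assumptions: the request path contains "messageRules"; POST/PATCH are the
// relevant write methods; 7-day lookback is illustrative.
MicrosoftGraphActivityLogs
| where TimeGenerated > ago(7d)
| where RequestUri contains "messageRules"
| where RequestMethod in ("POST", "PATCH")
| project TimeGenerated, AppId, UserId, ServicePrincipalId, IPAddress, RequestMethod, RequestUri, ResponseStatusCode
| sort by TimeGenerated desc
```

The log records the request, not the rule contents, so a hit here identifies which application and identity made the change and still needs to be correlated with mailbox audit data to see what the rule does.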

OfficeActivity — legacy Office 365 audit log connector. Overlaps with CloudAppEvents but with different schema and less enrichment. If your environment uses OfficeActivity instead of CloudAppEvents, the hunt queries need adaptation — the campaigns in this course are written for CloudAppEvents.
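
To give a sense of the adaptation involved, here is a hedged sketch of an inbox rule query written against OfficeActivity instead of CloudAppEvents. The column names (Operation, UserId, ClientIP, Parameters, OfficeWorkload) come from the OfficeActivity schema; the operation names are assumptions to verify in your own data.

```kql
// Sketch: inbox rule events from the legacy OfficeActivity table.
// Assumption: Operation strings match your tenant's audit records.
OfficeActivity
| where TimeGenerated > ago(7d)
| where OfficeWorkload == "Exchange"
| where Operation in ("New-InboxRule", "Set-InboxRule", "UpdateInboxRules")
| project TimeGenerated, UserId, Operation, ClientIP, Parameters
| sort by TimeGenerated desc
```

The main differences from CloudAppEvents are the filter column (Operation rather than ActionType), the identity column (UserId rather than AccountDisplayName), and the rule details, which appear in Parameters rather than RawEventData.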

Figure TH0.10 — M365 hunting data source map. Three clusters (identity, cloud apps, endpoint) cover the three attack planes. Each cluster has common ingestion gaps noted.

The retention question

Advanced Hunting queries the last 30 days of data. If your hunt hypothesis covers a longer window — and long-dwell hypotheses (APT, supply chain) often do — you need to query through Sentinel’s Log Analytics interface (which respects your configured retention period) or use search jobs for archived data.

Check your retention:

// How much history do you have for hunting-critical tables?
// Note: Usage records ingestion per table, so this measures how long each
// table has been ingesting within the 90-day lookback. It is a proxy; the
// configured retention period is set per table in the workspace settings.
Usage
| where TimeGenerated > ago(90d)
| where DataType in (
    "SigninLogs", "AADNonInteractiveUserSignInLogs",
    "AuditLogs", "CloudAppEvents", "SecurityAlert")
| summarize
    EarliestData = min(TimeGenerated),
    LatestData = max(TimeGenerated)
    by DataType
| extend ObservedDays = datetime_diff('day', LatestData, EarliestData)
| sort by ObservedDays desc
// If ObservedDays falls noticeably short of the 90-day lookback for any table,
// long-window hunts on that table are limited
// Consider configuring archive tiers for hunting-critical tables

Try it yourself

Exercise: Audit your hunting data estate

Extend the data source check query from above so that it covers every table in this subsection, then run it. For each of the three clusters (identity, cloud apps, endpoint), answer:

Identity: Are all four tables ingested? If AADNonInteractiveUserSignInLogs is missing, TH4 (authentication anomalies) will have a critical blind spot.

Cloud apps: Is CloudAppEvents populated? If not, is Defender for Cloud Apps connected? Is MicrosoftGraphActivityLogs enabled?

Endpoint: Are all five Device* tables populated? If not, is Defender for Endpoint deployed to all relevant device groups?

Document the gaps. Each gap is either a prerequisite to fix before hunting that domain or a known limitation to record in hunt records that depend on the missing table.
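
One way to run this audit in a single pass is sketched below. It assumes all tables are expected in the same Sentinel workspace and uses a 1-day window; the cluster labels are simply the groupings from this subsection. Tables that are missing or empty show a count of 0.

```kql
// Sketch: one-pass ingestion check across the hunting tables in this subsection.
// Assumptions: single workspace; 1-day window; cluster labels are illustrative.
let expected = datatable(TableName: string, Cluster: string) [
    "SigninLogs", "Identity",
    "AADNonInteractiveUserSignInLogs", "Identity",
    "AADServicePrincipalSignInLogs", "Identity",
    "AuditLogs", "Identity",
    "CloudAppEvents", "Cloud apps",
    "EmailEvents", "Cloud apps",
    "MicrosoftGraphActivityLogs", "Cloud apps",
    "DeviceProcessEvents", "Endpoint",
    "DeviceFileEvents", "Endpoint",
    "DeviceRegistryEvents", "Endpoint",
    "DeviceNetworkEvents", "Endpoint",
    "DeviceLogonEvents", "Endpoint"
];
let observed =
    union withsource=TableName isfuzzy=true
        SigninLogs, AADNonInteractiveUserSignInLogs,
        AADServicePrincipalSignInLogs, AuditLogs,
        CloudAppEvents, EmailEvents, MicrosoftGraphActivityLogs,
        DeviceProcessEvents, DeviceFileEvents, DeviceRegistryEvents,
        DeviceNetworkEvents, DeviceLogonEvents
    | where TimeGenerated > ago(1d)
    | summarize Count = count() by TableName;
expected
| join kind=leftouter observed on TableName
| project Cluster, TableName, Count = coalesce(Count, 0)
| sort by Cluster asc, TableName asc
```

The left outer join against the expected list is what makes gaps visible: a plain union would silently omit tables that do not exist or hold no rows. A nonzero count only shows that some data arrives, not that coverage is complete.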

⚠ Compliance Myth: "We ingest all our logs into Sentinel — we have full visibility"

The myth: If data is flowing into Sentinel, it is available for hunting. Log ingestion equals visibility.

The reality: Ingestion is necessary but not sufficient. Many organizations ingest SigninLogs but not AADNonInteractiveUserSignInLogs — leaving the entire token replay attack surface invisible. Many ingest CloudAppEvents but have not connected all relevant data sources in Defender for Cloud Apps — leaving specific application activities unreported. Some ingest endpoint tables but only from a subset of devices (servers but not workstations, or managed devices but not BYOD). The audit is not “is the table ingested?” but “is the table ingested completely, for all relevant entities, with sufficient retention?” Each gap is a hunting blind spot.
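
The "for all relevant entities" part of that audit can be approximated for endpoints by counting how many distinct devices report into each table. This is a sketch: the 7-day window is an assumption, and the resulting numbers only mean something when compared against your own device inventory.

```kql
// Sketch: distinct reporting devices per endpoint table over the last 7 days.
// Assumption: compare Devices against an inventory count you maintain elsewhere.
union withsource=TableName isfuzzy=true
    DeviceProcessEvents, DeviceFileEvents, DeviceRegistryEvents,
    DeviceNetworkEvents, DeviceLogonEvents
| where TimeGenerated > ago(7d)
| summarize
    Devices = dcount(DeviceName),
    LastEvent = max(TimeGenerated)
    by TableName
| sort by Devices asc
```

A table whose device count is far below the others, or far below the number of devices you manage, points to partial onboarding or a filtered connector rather than to a quiet fleet.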

Extend this reference

Microsoft's data source landscape evolves. New tables appear (MicrosoftGraphActivityLogs was introduced in 2024). Existing tables gain new columns. Defender for Cloud Apps adds new application connectors. Before starting any hunt campaign, check the Microsoft Learn documentation for the specific table to confirm the columns you need are available and populated in your environment. The table schemas in this subsection are accurate as of the course publication date but may have expanded since then.

📋 Operational Artifact — Data Source Quick Reference for Hunt Campaigns

Per campaign, confirm these tables are ingested before hunting:
TH4 (Auth anomalies): SigninLogs ☐ + AADNonInteractiveUserSignInLogs ☐ + AuditLogs ☐
TH5 (Inbox rules): CloudAppEvents ☐ + EmailEvents ☐
TH6 (OAuth abuse): AuditLogs ☐ + AADServicePrincipalSignInLogs ☐
TH7 (Privilege escalation): AuditLogs ☐ + SigninLogs ☐
TH8 (Data exfiltration): CloudAppEvents ☐
TH9 (Endpoint persistence): DeviceProcessEvents ☐ + DeviceRegistryEvents ☐ + DeviceFileEvents ☐
TH10 (Lateral movement): SigninLogs ☐ + DeviceLogonEvents ☐ + IdentityLogonEvents ☐
TH11 (Shadow IT): CloudAppEvents ☐
TH12 (Ransomware staging): DeviceProcessEvents ☐ + DeviceNetworkEvents ☐ + DeviceFileEvents ☐
TH13 (Insider threat): CloudAppEvents ☐ + SigninLogs ☐ + DeviceFileEvents ☐

References Used in This Subsection

Microsoft. “Advanced Hunting Schema Reference.” Microsoft Learn. https://learn.microsoft.com/en-us/defender-xdr/advanced-hunting-schema-tables
Microsoft. “Microsoft Sentinel — Data Connectors Reference.” Microsoft Learn. https://learn.microsoft.com/en-us/azure/sentinel/data-connectors-reference
Microsoft. “MicrosoftGraphActivityLogs.” Microsoft Learn. https://learn.microsoft.com/en-us/graph/microsoft-graph-activity-logs-overview
