EI1.11 Building a Sign-In Baseline
Figure EI1.11: Security hardening lifecycle from assessment through continuous monitoring.
Figure: Building a Sign-In Baseline.
Why baselines matter more than rules
Static detection rules use fixed thresholds: "alert if more than 10 failed sign-ins in 5 minutes" or "alert if sign-in from a blocked country." These rules catch known patterns but miss subtle anomalies: a sign-in from Belgium is not from a "blocked country" but may be anomalous if your company has no employees in Belgium.
Baseline-driven detection compares current behavior against established normal patterns. Instead of "alert on sign-in from Russia" (which misses Belgium), the approach is "alert on sign-in from any country this user has never signed in from in the past 30 days." Instead of "alert on more than 10 failed sign-ins" (which misses a slow spray at 5 per hour), the approach is "alert when the hourly failure rate exceeds 200% of the historical average."
Building the baseline is the prerequisite. Without it, you have rules. With it, you have detection.
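The failure-rate comparison described above can be sketched in KQL. This is a minimal illustration rather than one of the module's own rules: it works tenant-wide instead of per user, the `lookback` and `multiplier` names are assumptions introduced here, and hours with no failures produce no row, so they are left out of the average.

```kql
// Sketch: flag hours in the last day whose failure count exceeds
// 200% of the historical hourly average (tenant-wide).
let lookback = 30d;     // assumed baseline window
let multiplier = 2.0;   // "200% of the historical average"
let historicalAvg = toscalar(
    SigninLogs
    | where TimeGenerated between (ago(lookback) .. ago(1d))
    | where ResultType != "0"   // ResultType is a string column; "0" = success
    | summarize Failures = count() by bin(TimeGenerated, 1h)
    | summarize avg(Failures));
SigninLogs
| where TimeGenerated > ago(1d)
| where ResultType != "0"
| summarize Failures = count() by Hour = bin(TimeGenerated, 1h)
| extend HistoricalAvg = historicalAvg
| where Failures > multiplier * HistoricalAvg
| order by Hour desc
```

A slow spray at 5 failures per hour never trips a fixed "more than 10" threshold, but it does stand out here if the tenant normally sees one or two failures per hour.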
The baseline dimensions
A complete sign-in baseline covers five dimensions:
Geographic baseline: which countries and cities do your users sign in from? What is the expected set? This baseline enables anomaly detection for unexpected locations and impossible travel.
// EI1.11 - Geographic baseline: normal countries per user
SigninLogs
| where TimeGenerated > ago(30d)
| where ResultType == 0
| extend Country = tostring(LocationDetails.countryOrRegion)
| where isnotempty(Country)
| summarize
    Countries = make_set(Country),
    CountryCount = dcount(Country),
    SignInCount = count()
    by UserPrincipalName
| order by CountryCount desc
// Save this result. Any user who later signs in from a country
// not in their baseline set triggers an investigation

// EI1.11 - Temporal baseline: sign-in hours per user
SigninLogs
| where TimeGenerated > ago(30d)
| where ResultType == 0
| extend HourOfDay = hourofday(TimeGenerated) // TimeGenerated is UTC
| extend DayOfWeek = toint(dayofweek(TimeGenerated) / 1d) // 0=Sun, 6=Sat
| extend IsWeekend = DayOfWeek in (0, 6)
| summarize
    WeekdayHours = make_set_if(HourOfDay, not(IsWeekend)),
    WeekendActivity = countif(IsWeekend),
    TotalSignIns = count()
    by UserPrincipalName
| extend EarliestNormalHour = array_sort_asc(WeekdayHours)[0]
| extend LatestNormalHour = array_sort_desc(WeekdayHours)[0]
// Users who normally sign in 8-18 on weekdays but suddenly show
// a 3 AM weekend sign-in warrant investigation

// EI1.11 - Device baseline: normal devices per user
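SigninLogs
| where TimeGenerated > ago(30d)
| where ResultType == 0
| extend DeviceOS = tostring(DeviceDetail.operatingSystem)
| extend DeviceId = tostring(DeviceDetail.deviceId)
// Count sign-ins per OS first: arg_max cannot wrap count() directly
| summarize
    OSSignIns = count(),
    DevicesOnOS = dcountif(DeviceId, isnotempty(DeviceId)) // empty DeviceId values are not counted
    by UserPrincipalName, DeviceOS
| summarize
    DeviceOSes = make_set(DeviceOS),
    DeviceCount = sum(DevicesOnOS), // assumes each DeviceId reports a single OS
    arg_max(OSSignIns, DeviceOS)
    by UserPrincipalName
| project-rename MostUsedOS = DeviceOS, MostUsedOSSignIns = OSSignIns
// A user who normally uses Windows 11 appearing on Linux = anomaly
// A user who normally has 1-2 devices appearing with 5 = anomaly

// EI1.11 - Application baseline: normal apps per user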
SigninLogs
| where TimeGenerated > ago(30d)
| where ResultType == 0
| summarize
    Apps = make_set(AppDisplayName),
    AppCount = dcount(AppDisplayName)
    by UserPrincipalName
| order by AppCount desc
// A finance user who normally uses Outlook, Teams, and SharePoint
// suddenly accessing "Azure Portal" or "Microsoft Graph Explorer"
// warrants investigation: these are tools used for administration,
// not typical finance work

// EI1.11 - IP baseline: normal IPs per user (last 30 days)
SigninLogs
| where TimeGenerated > ago(30d)
| where ResultType == 0
| summarize
    KnownIPs = make_set(IPAddress, 50),
    IPCount = dcount(IPAddress)
    by UserPrincipalName
| order by IPCount desc
// Users with many distinct IPs: likely mobile workers or VPN users
// Users with 1-2 IPs: likely office-based, so a new IP is anomalous
// Cross-reference with named locations in CA for trusted IP identification

The composite baseline query
Combining all five dimensions into a single per-user profile:
// EI1.11 - Comprehensive user baseline (30-day reference)
// Store this result as a saved query; re-run monthly to update
SigninLogs
| where TimeGenerated > ago(30d)
| where ResultType == 0
| extend
    Country = tostring(LocationDetails.countryOrRegion),
    DeviceOS = tostring(DeviceDetail.operatingSystem),
    HourOfDay = hourofday(TimeGenerated)
| summarize
    // Geographic
    Countries = make_set(Country, 10),
    CountryCount = dcount(Country),
    // Temporal
    ActiveHours = make_set(HourOfDay, 24),
    // Device
    DeviceTypes = make_set(DeviceOS, 5),
    DeviceOSCount = dcount(DeviceOS),
    // Application
    Apps = make_set(AppDisplayName, 20),
    AppCount = dcount(AppDisplayName),
    // IP
    IPCount = dcount(IPAddress),
    // Volume
    TotalSignIns = count(),
    AvgDailySignIns = count() / 30.0,
    // Risk
    RiskySignIns = countif(RiskLevelDuringSignIn in ("medium", "high")),
    // Time range
    FirstSeen = min(TimeGenerated),
    LastSeen = max(TimeGenerated)
    by UserPrincipalName
| order by TotalSignIns desc

This query produces one row per user with their complete 30-day behavioral profile. Save it. Re-run it monthly. When you need to assess whether a specific sign-in is anomalous, compare the sign-in's properties against this baseline for that user.
Using the baseline for anomaly detection
The baseline enables a new class of detection: deviations from established patterns. Here is the core pattern for baseline-driven anomaly detection:
// EI1.11 - Detect sign-ins from new countries (not in 30-day baseline)
let baseline = SigninLogs
    | where TimeGenerated between (ago(30d) .. ago(1d))
    | where ResultType == 0
    | extend Country = tostring(LocationDetails.countryOrRegion)
    | where isnotempty(Country)
    | summarize BaselineCountries = make_set(Country) by UserPrincipalName;
SigninLogs
| where TimeGenerated > ago(24h)
| where ResultType == 0
| extend Country = tostring(LocationDetails.countryOrRegion)
| where isnotempty(Country)
| join kind=inner baseline on UserPrincipalName
// Per-row membership test against that user's baseline set
| where not(set_has_element(BaselineCountries, Country))
| project
    TimeGenerated, UserPrincipalName, AppDisplayName,
    NewCountry = Country, IPAddress,
    BaselineCountries,
    RiskLevelDuringSignIn
| order by TimeGenerated desc
// Every result is a user signing in from a country they did not
// use during the baseline window: a strong anomaly signal.
// The inner join skips users with no baseline rows (for example, new accounts).

This pattern (establish baseline, compare current activity, flag deviations) is the foundation of the detection rules in EI13. The detection rules automate this comparison and fire alerts when deviations exceed defined thresholds.
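The same pattern carries over to the other baseline dimensions. As a sketch under the same assumptions as the country query above (30-day window, successful sign-ins only), this variant flags applications a user did not access during the baseline period. The `BaselineApps` and `NewApp` names are introduced here for illustration.

```kql
// Sketch: applications a user signed in to during the last 24 hours
// that are absent from that user's 30-day application baseline.
let baseline = SigninLogs
    | where TimeGenerated between (ago(30d) .. ago(1d))
    | where ResultType == "0"   // ResultType is a string column; "0" = success
    | summarize BaselineApps = make_set(AppDisplayName) by UserPrincipalName;
SigninLogs
| where TimeGenerated > ago(24h)
| where ResultType == "0"
| join kind=inner baseline on UserPrincipalName
| where not(set_has_element(BaselineApps, AppDisplayName))
| summarize
    FirstSeen = min(TimeGenerated),
    SignIns = count(),
    IPs = make_set(IPAddress, 10)
    by UserPrincipalName, NewApp = AppDisplayName
| order by FirstSeen desc
```

Summarizing per user and application keeps the output to one row per new application rather than one row per sign-in, which is usually easier to triage.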
Maintaining and refreshing baselines
A baseline is not a one-time exercise. User behavior changes legitimately: people travel, change roles, adopt new applications, and switch devices. A baseline that is never refreshed becomes increasingly inaccurate, producing false positives for legitimate behavior changes and potentially missing real anomalies because the baseline no longer reflects current patterns.
The recommended refresh cadence is monthly. Re-run the composite baseline query at the start of each month using a rolling 30-day window. Compare the new baseline against the previous month's baseline to identify drift: users who have legitimately expanded their geographic footprint, adopted new applications, or changed their working hours.
// EI1.11 - Baseline drift detection
// Compare this month's country set against last month's for each user
let currentBaseline = SigninLogs
    | where TimeGenerated > ago(30d)
    | where ResultType == 0
    | extend Country = tostring(LocationDetails.countryOrRegion)
    | summarize CurrentCountries = make_set(Country) by UserPrincipalName;
let previousBaseline = SigninLogs
    | where TimeGenerated between (ago(60d) .. ago(30d))
    | where ResultType == 0
    | extend Country = tostring(LocationDetails.countryOrRegion)
    | summarize PreviousCountries = make_set(Country) by UserPrincipalName;
currentBaseline
| join kind=inner previousBaseline on UserPrincipalName
| extend NewCountries = set_difference(CurrentCountries, PreviousCountries)
| extend DroppedCountries = set_difference(PreviousCountries, CurrentCountries)
| where array_length(NewCountries) > 0 or array_length(DroppedCountries) > 0
| project UserPrincipalName, NewCountries, DroppedCountries,
    CurrentCountries, PreviousCountries
// Results: users whose geographic pattern changed between months
// New countries: investigate or update baseline as legitimate
// Dropped countries: may indicate role change or account compromise remediation

Organizational baseline vs per-user baseline
Individual user baselines are the most precise but are impractical to manage for organizations with thousands of users. An alternative is the organizational baseline: a set of norms that apply across the tenant.
The geographic organizational baseline is the list of countries where the organization has employees, offices, or approved remote workers. Any sign-in from outside this list is anomalous at the organizational level, regardless of individual user history.
// EI1.11 - Organizational baseline: expected countries
// Any sign-in from outside these countries is an organizational anomaly
let orgCountries = dynamic(["US", "GB", "CA", "DE"]); // Your countries
SigninLogs
| where TimeGenerated > ago(24h)
| where ResultType == 0
| extend Country = tostring(LocationDetails.countryOrRegion)
| where Country !in (orgCountries) and isnotempty(Country)
| summarize SignIns = count(), Users = make_set(UserPrincipalName, 10)
    by Country
| order by SignIns desc
// Fast organizational-level anomaly check; no per-user baseline needed
// Useful as a complement to per-user baselines, not a replacement

Try it yourself
Try It: Build Your Lab Baseline
Environment: Your M365 developer tenant with Sentinel workspace.
Exercise: Run the composite baseline query against your developer tenant. Because the tenant is new, the baseline will be limited, but it will show the sign-in patterns from your lab setup activities.
Answer these questions from the baseline results:
1. How many distinct countries appear in your baseline?
2. How many distinct applications have you accessed?
3. What are your active hours?
4. How many distinct IP addresses have you used?
Save the composite baseline query in your Sentinel workspace (Logs > Save > Save as query). You will re-run this baseline at the end of each module to see how your sign-in patterns change as you configure the lab environment.
The myth: Identity Protection builds its own baseline of user behavior and detects anomalies automatically. We do not need to build our own baseline.
The reality: Identity Protection's baseline is a black box: you cannot query it, inspect it, or tune it beyond the coarse risk policy thresholds (low/medium/high). It detects generic anomalies (unfamiliar sign-in properties, atypical travel) but does not detect organization-specific anomalies (a finance user accessing the Azure Portal, a UK employee signing in from Belgium, an after-hours sign-in from a user who never works evenings). Your custom baseline captures the patterns specific to your environment. Custom detection rules built on this baseline catch anomalies that Identity Protection's generic model misses. Both are valuable: Identity Protection as the broad detection layer and custom baselines as the precision layer.
A sign-in log shows a successful authentication from an IP in a country where NE has no employees. MFA was satisfied by push notification. The user says they approved the MFA prompt while traveling. Do you accept this explanation?
Verify, do not accept at face value. Check: does the user have a travel request on file? Does the IP geo-location match the claimed travel destination? Does the device fingerprint match the user's enrolled device? Are there other sign-ins from NE's corporate IP in the same time window (which would indicate the user is NOT traveling)? An attacker who stole credentials and is bombarding the user with MFA prompts (MFA fatigue) gets the same 'I approved it' response from a confused user. The investigation confirms or refutes the travel explanation within 5 minutes.
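Several of these checks can be answered from one query over the user's sign-ins around the suspect event. The sketch below is only an illustration: `targetUser`, `suspectTime`, `searchWindow`, and `corporateRanges` are hypothetical placeholders (the address range is a documentation range, not NE's real network), and travel requests live outside SigninLogs, so that check stays manual.

```kql
// Sketch: list every sign-in for one user around a suspect event and
// mark which ones came from corporate address space.
let targetUser = "user@example.com";                 // hypothetical account
let suspectTime = datetime(2024-01-15 14:30:00);     // hypothetical event time
let searchWindow = 4h;                               // assumed time window
let corporateRanges = dynamic(["203.0.113.0/24"]);   // placeholder range
SigninLogs
| where TimeGenerated between ((suspectTime - searchWindow) .. (suspectTime + searchWindow))
| where UserPrincipalName =~ targetUser
| extend
    Country = tostring(LocationDetails.countryOrRegion),
    DeviceId = tostring(DeviceDetail.deviceId),
    DeviceOS = tostring(DeviceDetail.operatingSystem),
    FromCorporateIP = ipv4_is_in_any_range(IPAddress, corporateRanges)
| project TimeGenerated, IPAddress, Country, FromCorporateIP,
    DeviceId, DeviceOS, AppDisplayName, ResultType
| order by TimeGenerated asc
```

Rows where `FromCorporateIP` is true close to the foreign sign-in contradict the travel explanation. A `DeviceId` that matches the user's enrolled device supports it.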