TH2.3 series_decompose_anomalies()
The function that does the statistics for you
TH2.1 taught manual statistical outlier detection: calculating percentiles and z-scores yourself. series_decompose_anomalies() automates this at a more sophisticated level. It decomposes a time series into three components: seasonality (repeating patterns like weekday/weekend cycles), trend (the longer-term direction of the series), and residual (everything left over). The baseline, or expected value, is the sum of the seasonal and trend components. Anomalies are data points where the residual exceeds a threshold.
The function handles non-stationary data (trends), periodic patterns (work hours, business cycles), and noise — without requiring you to model any of these manually.
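To make the decomposition concrete, the sketch below runs series_decompose() (the companion function listed in the references) on a small synthetic series. Everything in it is an assumption for illustration: the values, the weekend dip, the injected spike on day 17, and the column names. It only shows which components come back, not how a production query should look.

```kql
// Illustrative only: 28 synthetic daily values with a weekend dip and one injected spike.
range Day from 0 to 27 step 1
| extend Timestamp = datetime(2024-01-01) + Day * 1d
| extend Value = 100.0
    + iff(Day % 7 >= 5, -60.0, 0.0)   // lower volume on Saturday and Sunday
    + iff(Day == 17, 150.0, 0.0)      // made-up spike that should surface in the residual
| order by Day asc                    // sort first so make_list keeps chronological order
| summarize Timestamp = make_list(Timestamp), Value = make_list(Value)
// Seasonality = 7 (weekly period in daily bins), Trend = 'avg'
| extend (baseline, seasonal, trend, residual) = series_decompose(Value, 7, 'avg')
| mv-expand Timestamp to typeof(datetime),
            Value to typeof(double),
            baseline to typeof(double),
            seasonal to typeof(double),
            trend to typeof(double),
            residual to typeof(double)
| project Timestamp, Value, baseline, seasonal, trend, residual
```

In the output, baseline is the sum of seasonal and trend, and residual is what remains of Value once the baseline is subtracted. series_decompose_anomalies() scores that residual.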
Basic anomaly detection
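A minimal sketch of a basic run, assuming daily sign-in counts per user over 30 days from SigninLogs. The 30-day window, the 1-day bin, and the output column names (anomalies, score, baseline) are assumptions chosen to match the fields discussed in this subsection, not a prescribed query.

```kql
// Sketch only: daily sign-in counts per user, scored with the default threshold of 1.5.
let lookback = 30d;
SigninLogs
| where TimeGenerated >= startofday(ago(lookback))
| make-series SignInCount = count() default = 0
    on TimeGenerated from startofday(ago(lookback)) to startofday(now()) step 1d
    by UserPrincipalName
| extend (anomalies, score, baseline) = series_decompose_anomalies(SignInCount, 1.5)
// Turn the per-user arrays back into one row per user-day.
| mv-expand TimeGenerated to typeof(datetime),
            SignInCount to typeof(long),
            anomalies to typeof(int),
            score to typeof(double),
            baseline to typeof(double)
| where anomalies != 0
| project TimeGenerated, UserPrincipalName, SignInCount, baseline, anomalies, score
| order by abs(score) desc
```

Each surviving row is one user-day flagged as anomalous. Sorting by the absolute score puts the strongest positive and negative deviations at the top.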
Interpreting the output
anomalies = 1 (positive anomaly): The value is significantly higher than expected. For sign-in counts: more sign-ins than the user’s pattern predicts. For download volumes: more downloads than usual. This is the primary hunting signal — increased activity may indicate compromise.
anomalies = -1 (negative anomaly): The value is significantly lower than expected. For sign-in counts: fewer sign-ins than usual. This can indicate that the user’s account has been taken over and the attacker is signing in through a different path (non-interactive instead of interactive), that the user has been locked out, or that the user is on leave, in which case any remaining activity on an account that should be dormant deserves a closer look. Negative anomalies are underused in hunting; they can be as informative as positive ones.
anomalies = 0: Normal. The value falls within the expected range given the baseline, trend, and seasonality. These data points require no investigation.
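score: The anomaly score measures how far the residual falls outside the expected range. With the default detection method (`ctukey`, a Tukey fence test over the 10th to 90th percentile range), it is expressed relative to that percentile spread rather than in standard deviations. Higher absolute scores indicate stronger anomalies. Use the score to rank results and investigate the highest scores first.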
Tuning sensitivity
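The threshold parameter (1.5 in the example above) controls how far a value's residual must fall outside the expected range to qualify as anomalous. With the default detection method it is the k value of a Tukey fence test, a multiple of the percentile spread of the residuals rather than a count of standard deviations. The default is 1.5.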
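One way to see the effect of the threshold is to score the same series twice and compare how many points are flagged. The sketch below assumes the same daily per-user series from SigninLogs; the window and the column names are illustrative assumptions.

```kql
// Sketch only: count flagged user-days at the default threshold versus a stricter one.
let lookback = 30d;
SigninLogs
| where TimeGenerated >= startofday(ago(lookback))
| make-series SignInCount = count() default = 0
    on TimeGenerated from startofday(ago(lookback)) to startofday(now()) step 1d
    by UserPrincipalName
| extend (flagsDefault, scoreDefault, baselineDefault) = series_decompose_anomalies(SignInCount, 1.5)
| extend (flagsStrict, scoreStrict, baselineStrict) = series_decompose_anomalies(SignInCount, 3.0)
// Expand both flag arrays in lockstep so each row is one user-day.
| mv-expand flagsDefault to typeof(int), flagsStrict to typeof(int)
| summarize AnomaliesAt1_5 = countif(flagsDefault != 0),
            AnomaliesAt3_0 = countif(flagsStrict != 0)
```

The gap between the two counts gives a rough sense of how much noise a stricter threshold removes in your environment.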
Guidance: Start with the default 1.5. If results are too noisy (hundreds of anomalies), increase to 2.0 or 3.0. If results are empty, decrease to 1.0. Campaign modules specify the threshold they have found effective for each technique.
Applied example: detecting authentication volume spikes
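A hedged sketch of how a spike hunt could look, assuming hourly sign-in counts per user over 14 days from SigninLogs. The lookback, the bin size, and the `minVolume` floor are hypothetical values chosen for illustration and would need tuning per environment.

```kql
// Sketch only: hourly authentication volume per user, keeping upward spikes.
let lookback = 14d;
let minVolume = 10;   // hypothetical floor to suppress spikes on near-idle accounts
SigninLogs
| where TimeGenerated >= startofday(ago(lookback))
| make-series AuthCount = count() default = 0
    on TimeGenerated from startofday(ago(lookback)) to bin(now(), 1h) step 1h
    by UserPrincipalName
| extend (anomalies, score, baseline) = series_decompose_anomalies(AuthCount, 1.5)
| mv-expand TimeGenerated to typeof(datetime),
            AuthCount to typeof(long),
            anomalies to typeof(int),
            score to typeof(double),
            baseline to typeof(double)
// Positive anomalies only: more authentications than the user's pattern predicts.
| where anomalies == 1 and AuthCount >= minVolume
| project TimeGenerated, UserPrincipalName, AuthCount, baseline, score
| order by score desc
```

Comparing AuthCount with baseline in each row shows how far the hour departed from what the decomposition expected for that user.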
Figure TH2.3: How series_decompose_anomalies() works. The function decomposes the time series into seasonality, trend, and residual (the baseline is seasonality plus trend), then flags data points where the residual exceeds the sensitivity threshold.
Try it yourself
Exercise: Run anomaly detection on your sign-in data
Run the basic anomaly detection query (first query in this subsection) against your SigninLogs. How many user-day combinations are flagged as anomalous?
Examine the top 5 by score. For each: is the anomaly explainable (weekend work, holiday, user travel) or unexpected? This is the analysis step — the function identified the statistical anomalies, you determine whether they are security-relevant.
Then run with threshold 3.0 instead of 1.5. How many results survive the stricter threshold? The difference between 1.5 and 3.0 results is the sensitivity band you will tune per campaign.
The myth: series_decompose_anomalies() as a scheduled query provides automated hunting. Deploy it and the known-unknown layer is addressed.
The reality: Automated anomaly detection operates in the unknown-unknown layer of the detection pyramid (TH0.3) — it flags statistical deviations without understanding what they mean. Hunting operates in the known-unknown layer — it tests specific hypotheses about specific techniques. series_decompose_anomalies() is a hunting tool, not a hunting replacement. It identifies which data points are statistically unusual. The analyst determines whether “unusual” means “compromised,” “legitimate change,” or “noise.” The function does the math. Hunting does the judgment.
Extend this function
series_decompose_anomalies() accepts additional parameters for advanced use: `Seasonality` (-1 to auto-detect, which is the default; 0 for none; or the period in bins, e.g., 7 for a weekly cycle in a daily series), `Trend` (`'avg'`, the default; `'linefit'` to fit a linear trend; or `'none'`), and `Test_points` (number of points at the end of the series to exclude from the learning process, so they are scored against a baseline built from the earlier points). The `Test_points` parameter is particularly useful for hunting: set it to 7 (for 7 days of detection window) with a 30-day series, and the function learns the baseline from the first 23 days and scores the last 7 days against it, equivalent to the baseline + detection window pattern from TH1.10.
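The baseline plus detection window pattern can be sketched as follows, assuming a 30-day daily series per user from SigninLogs with the last 7 points held out of learning. The explicit filter on the final 7 days is an assumption added so that only the detection window is reported; the column names are illustrative.

```kql
// Sketch only: learn from the first 23 days, report anomalies in the last 7.
let lookback = 30d;
let detectionWindow = 7d;
SigninLogs
| where TimeGenerated >= startofday(ago(lookback))
| make-series SignInCount = count() default = 0
    on TimeGenerated from startofday(ago(lookback)) to startofday(now()) step 1d
    by UserPrincipalName
// Arguments: Threshold = 1.5, Seasonality = -1 (auto-detect), Trend = 'avg', Test_points = 7
| extend (anomalies, score, baseline) = series_decompose_anomalies(SignInCount, 1.5, -1, 'avg', 7)
| mv-expand TimeGenerated to typeof(datetime),
            SignInCount to typeof(long),
            anomalies to typeof(int),
            score to typeof(double),
            baseline to typeof(double)
// Keep only the held-out detection window.
| where TimeGenerated >= startofday(ago(detectionWindow)) and anomalies != 0
| project TimeGenerated, UserPrincipalName, SignInCount, baseline, anomalies, score
| order by abs(score) desc
```

Holding the last points out of learning keeps a recent burst of attacker activity from being absorbed into the baseline it is measured against.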
References Used in This Subsection
- Microsoft. “KQL series_decompose_anomalies().” Microsoft Learn. https://learn.microsoft.com/en-us/kusto/query/series-decompose-anomalies-function
- Microsoft. “KQL series_decompose().” Microsoft Learn. https://learn.microsoft.com/en-us/kusto/query/series-decompose-function
- Course cross-references: TH0.3 (detection pyramid — anomaly detection layer), TH1.10 (baselining), TH4 (authentication), TH8 (exfiltration)