6.7 Connector Validation and Ongoing Monitoring
75 minutes · Module 6
By the end of this subsection, you will have a validation checklist for new connectors and a weekly monitoring routine that catches connector failures before they create blind spots.
New connector validation checklist
Every time you enable a new connector, run through this checklist before marking it as complete:
| Check | Query or action | Expected result |
|---|---|---|
| Data arriving | Query the target table for events in the last hour | EventCount > 0 |
| Correct table | Verify data lands in the expected table (e.g., CommonSecurityLog not Syslog) | Table name matches connector documentation |
| Fields populated | Check critical columns are not null | SourceIP, DeviceAction, TimeGenerated populated |
| Ingestion latency | Run the latency query from 6.6 | Average < 5 minutes |
| Volume matches estimate | Compare actual 24-hour volume to your pre-connection estimate | Within 30% of estimate |
| DCR filtering active | If a DCR was applied, verify volume reduction | Reduction matches expected percentage |
| No duplicates | Run the duplicate detection query from 6.6 | Zero or near-zero duplicates |
| Analytics rules compatible | Run any planned analytics rules against the new data | Rules return expected results |
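The "fields populated" check from the checklist can be done with one quick query. A minimal sketch, assuming a CEF source landing in CommonSecurityLog — SourceIP and DeviceAction are standard CommonSecurityLog columns; swap in the table and columns your connector actually uses:

```kql
// Sketch: null-rate check for critical columns after enabling a connector.
// Table and column names assume a CEF source; adjust for your connector.
CommonSecurityLog
| where TimeGenerated > ago(1h)
| summarize
    EventCount = count(),
    EmptySourceIP = countif(isempty(SourceIP)),
    EmptyDeviceAction = countif(isempty(DeviceAction))
| extend EmptySourceIPPct = round(EmptySourceIP * 100.0 / EventCount, 1)
```

A nonzero EventCount with a near-zero empty percentage passes both the "data arriving" and "fields populated" checks in one run.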
Create a validation template
Copy this checklist into your SOC wiki and fill it out for every new connector. The completed checklist serves as documentation — when someone asks "when was the Palo Alto connector set up and is it working?" you have the answer, with evidence.
Weekly monitoring routine
Run these three queries every Monday (or at the start of each shift in a 24/7 SOC):
1. Connector health — all tables:
```kql
union withsource=TableName *
| where TimeGenerated > ago(24h)
| summarize
    EventCount = count(),
    LastEvent = max(TimeGenerated),
    HoursAgo = round(datetime_diff('minute', now(), max(TimeGenerated)) / 60.0, 1)
    by TableName
| where HoursAgo > 2
| sort by HoursAgo desc
```

Expected result: no rows — all tables have events within the last 2 hours.
What to look for: Any table appearing in this result has a data gap. A 2-hour threshold during business hours is the alert level. During weekends or off-hours, some tables (like AuditLogs) may naturally go quiet — adjust the threshold if your org has low weekend activity.
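One way to handle tables that are naturally quiet is to drive the threshold per table rather than globally. A sketch of that refinement — the table names and quiet-hour limits below are illustrative, not a recommended list:

```kql
// Sketch: per-table quiet-hour thresholds (the datatable below is
// illustrative — replace it with the tables your connectors feed).
let Expected = datatable(TableName: string, MaxHoursQuiet: real) [
    "CommonSecurityLog", 2.0,
    "SigninLogs", 2.0,
    "AuditLogs", 12.0   // naturally quiet on weekends
];
union withsource=TableName *
| where TimeGenerated > ago(7d)
| summarize LastEvent = max(TimeGenerated) by TableName
| extend HoursAgo = round(datetime_diff('minute', now(), LastEvent) / 60.0, 1)
| join kind=inner Expected on TableName
| where HoursAgo > MaxHoursQuiet
| project TableName, LastEvent, HoursAgo, MaxHoursQuiet
```

This version only alerts on tables you have declared you expect to be continuous, so weekend lulls in audit data stop generating noise.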
2. Volume anomaly check:
```kql
let baseline =
    Usage
    | where TimeGenerated between (ago(8d) .. ago(1d))
    | where IsBillable == true
    // Sum the 7-day window and divide by 7 so the baseline is a
    // daily figure comparable to TodayMB below
    | summarize AvgDailyMB = round(sum(Quantity) / 7.0, 0) by DataType;
Usage
| where TimeGenerated > ago(1d)
| where IsBillable == true
| summarize TodayMB = round(sum(Quantity), 0) by DataType
| join kind=inner baseline on DataType
| extend PercentChange = round((TodayMB - AvgDailyMB) * 100.0 / AvgDailyMB, 0)
| where abs(PercentChange) > 50
| project DataType, TodayMB, AvgDailyMB, PercentChange
| sort by abs(PercentChange) desc
```

Example result:

| DataType | TodayMB | AvgDailyMB | PercentChange |
|---|---|---|---|
| CommonSecurityLog | 8,450 | 2,100 | +302% |
| SigninLogs | 180 | 1,200 | -85% |
What to look for: Two anomaly types matter, for different reasons. Spikes (+302%) are an unexpected cost increase — investigate the cause (DDoS, a config change, a new verbose log source). Drops (-85%) are data loss — your detections are blind. A drop in SigninLogs means your token replay and brute-force rules are not firing. Fix drops before spikes: missing data is worse than extra data.
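When the anomaly check flags a spike in a multiplexed table like CommonSecurityLog, the next step is attribution: which source is suddenly loud. A minimal sketch using the standard DeviceVendor and DeviceProduct columns:

```kql
// Sketch: attribute a CommonSecurityLog volume spike to its source.
CommonSecurityLog
| where TimeGenerated > ago(1d)
| summarize Events = count() by DeviceVendor, DeviceProduct, bin(TimeGenerated, 1h)
| sort by Events desc
```

The hourly bin also shows when the spike started, which you can line up against change records (firewall policy edits, new appliances onboarded).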
3. Analytics rule health:
```kql
SentinelHealth
| where TimeGenerated > ago(7d)
| where SentinelResourceType == "Analytic rule"
| where Status == "Failure"
| summarize FailCount = count(), LastFailure = max(TimeGenerated)
    by SentinelResourceName
| sort by FailCount desc
```

Example result:

| SentinelResourceName | FailCount | LastFailure |
|---|---|---|
| Token replay from novel IP | 14 | 2026-03-21 08:15 |
| Inbox rule with suspicious keywords | 2 | 2026-03-19 14:22 |
What to look for: 14 failures for the token replay rule means it has been broken for days — 14 scheduled runs failed. Common cause: the rule references a table that was moved to Basic tier (join not supported) or a table that stopped receiving data. 2 failures may be transient (service hiccup) — monitor but do not panic.
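To see why a rule is failing, not just how often, pull the individual failure records — SentinelHealth includes a Description column with the failure detail. A sketch, filtering on the illustrative rule name from the example above:

```kql
// Sketch: read the failure descriptions for one broken analytics rule.
// The rule name here is illustrative — substitute the failing rule.
SentinelHealth
| where TimeGenerated > ago(7d)
| where SentinelResourceType == "Analytic rule"
| where Status == "Failure"
| where SentinelResourceName == "Token replay from novel IP"
| project TimeGenerated, Description
| sort by TimeGenerated desc
```

The Description text usually points directly at the broken dependency (a missing table, an unsupported operator), which turns a day of guessing into a five-minute fix.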
Three queries. Five minutes. Every week.
These three queries — connector health, volume anomalies, rule health — are the minimum monitoring for any Sentinel deployment. They catch the large majority of operational problems before they impact detection capability. Save them as favorites in your workspace and run them at the start of every shift or every Monday.
Check your understanding
1. The volume anomaly query shows SigninLogs at -85% compared to the 7-day average. What is the correct response priority?
   - Highest priority — fix immediately
   - Medium priority — investigate next week
   - Low priority — users can still sign in

Answer: Highest priority — fix immediately. SigninLogs data loss means token replay detection, brute force detection, and impossible travel detection are all blind; every minute without sign-in data is a minute you cannot detect account compromise. Data drops are the highest-priority operational issue in a SIEM. Your security posture degrades silently — no alerts fire, no one notices until an investigation returns empty results. Fix ingestion gaps before building new detections, before tuning existing rules, before everything else.