9.9 Workbooks and Security Reporting

16-20 hours · Module 9


Introduction

Workbooks are Sentinel’s reporting and visualisation layer. They are Azure Monitor workbooks integrated into the Sentinel portal — interactive dashboards that display KQL query results as charts, tables, heatmaps, and time series. While analytics rules detect threats and automation rules handle response, workbooks show the SOC how the operation is performing: how many incidents were handled, which attack types are trending, where detection gaps exist, and how individual analysts are performing.


Workbook architecture

A workbook is a collection of visualisation tiles arranged on a canvas. Each tile runs a KQL query and displays the results in a chosen format: bar chart, line chart, pie chart, table, heatmap, tile, or markdown text. Tiles can be parameterised — allowing the viewer to filter the entire workbook by time range, severity, analyst, or data source.

Content Hub workbooks. Many Content Hub solutions include pre-built workbooks. The Entra ID solution includes a sign-in analysis workbook. The Defender XDR solution includes an incident overview workbook. These are excellent starting points — install them, review the KQL, and customise for your environment.

Custom workbooks. For operational dashboards that reflect your specific SOC workflow, create custom workbooks. Navigate to Sentinel → Workbooks → Add workbook.


The SOC operational dashboard

Every SOC needs a primary operational dashboard that answers these questions at a glance.

How many incidents are open? By severity and status.

SecurityIncident
| where Status != "Closed"
| summarize Count = count() by Severity, Status
| order by Severity asc

What is the incident trend? Daily incident creation over the last 30 days, broken down by severity.

SecurityIncident
| where TimeGenerated > ago(30d)
| summarize DailyCount = count() by bin(TimeGenerated, 1d), Severity

Visualise as a stacked bar chart. An upward trend in High-severity incidents demands immediate attention. A steady or downward trend indicates effective detection tuning and threat mitigation.

What are the top attack types? Incident classification breakdown — what threats is the SOC finding?

SecurityIncident
| where TimeGenerated > ago(30d)
| where Status == "Closed" and Classification == "TruePositive"
| extend Tactics = parse_json(tostring(AdditionalData)).tactics
| mv-expand Tactics
| summarize Count = count() by Tactic = tostring(Tactics)
| order by Count desc
| take 10

How is the team performing? Mean time to triage (MTTT) and mean time to resolve (MTTR) by analyst.

SecurityIncident
| where TimeGenerated > ago(30d)
| where Status == "Closed"
| where isnotempty(tostring(Owner.assignedTo))
| extend MTTT_hours = datetime_diff('hour', FirstModifiedTime, CreatedTime) // first modification approximates triage
| extend MTTR_hours = datetime_diff('hour', ClosedTime, CreatedTime)
| summarize
    IncidentsClosed = count(),
    AvgMTTT = avg(MTTT_hours),
    AvgMTTR = avg(MTTR_hours),
    MedianMTTR = percentile(MTTR_hours, 50)
    by Analyst = tostring(Owner.assignedTo)
| order by IncidentsClosed desc

Where are the detection gaps? MITRE ATT&CK coverage — which techniques have analytics rules and which do not. This visualisation requires querying the analytics rule configuration (covered in subsection 9.11, Detection Engineering Lifecycle).


Analyst workload distribution

Balance the SOC workload across the team with a workload distribution tile.

// Analyst workload: current open assignments
SecurityIncident
| where Status in ("New", "Active")
| extend AssignedTo = tostring(Owner.assignedTo)
| summarize
    OpenIncidents = count(),
    HighSev = countif(Severity == "High"),
    MedSev = countif(Severity == "Medium"),
    LowSev = countif(Severity == "Low"),
    OldestIncident_hours = max(datetime_diff('hour', now(), CreatedTime))
    by AssignedTo
| order by OpenIncidents desc

Visualise as a stacked bar chart (incidents per analyst, coloured by severity). Analysts with disproportionately high workload need help — either through reassignment or by identifying automation opportunities for their most common incident types. The OldestIncident_hours column flags analysts who have stale incidents that need attention.


Rule effectiveness dashboard

Track which analytics rules deliver the most value to the SOC.

// Rule effectiveness: true positives, MTTR, and volume
SecurityIncident
| where TimeGenerated > ago(30d)
| where Status == "Closed"
| extend RuleName = Title // incident Title typically carries the rule name; alertProductNames holds product names, not rules
| summarize
    Total = count(),
    TruePositives = countif(Classification == "TruePositive"),
    FalsePositives = countif(Classification == "FalsePositive"),
    AvgMTTR = avg(datetime_diff('hour', ClosedTime, CreatedTime))
    by RuleName
| extend TPRate = round(TruePositives * 100.0 / Total, 1)
| extend Effectiveness = case(
    TPRate > 80 and Total > 5, "★★★ High",
    TPRate > 50 and Total > 3, "★★ Medium",
    TPRate > 0, "★ Low",
    "No TPs")
| where Total > 2
| order by TruePositives desc

Display as a table with conditional formatting: green background for High effectiveness, yellow for Medium, red for Low. This tile drives the monthly detection review (subsection 9.11) — rules with Low effectiveness are candidates for tuning or retirement.


Threat category breakdown

Show which attack categories are targeting your environment — essential for both SOC awareness and executive reporting.

// Threat breakdown by MITRE tactic: what are attackers doing?
SecurityIncident
| where TimeGenerated > ago(30d)
| where Classification == "TruePositive"
| extend Tactics = parse_json(tostring(AdditionalData)).tactics
| mv-expand Tactics
| summarize IncidentCount = count() by tostring(Tactics)
| order by IncidentCount desc

Visualise as a horizontal bar chart or donut chart. The distribution reveals your threat profile: if 60% of true positives are “Initial Access” (credential attacks), your environment is primarily targeted through identity compromise — prioritise identity detection and MFA hardening. If “Execution” dominates, endpoint threats are the primary vector — prioritise endpoint detection and application control.


Geographic threat map

For environments with external-facing sign-ins, a geographic heatmap shows where attacks originate.

// Geographic distribution of failed sign-ins from external IPs
SigninLogs
| where TimeGenerated > ago(30d)
| where ResultType != "0"
| where not(ipv4_is_private(IPAddress))
| extend Country = tostring(LocationDetails.countryOrRegion)
| where isnotempty(Country)
| summarize AttackCount = count(), UniqueIPs = dcount(IPAddress)
    by Country
| order by AttackCount desc
| take 20

Visualise as a map tile (Azure Monitor workbooks support map visualisations with latitude/longitude). Countries with disproportionately high attack volumes that are not business-relevant locations may warrant geographic blocking via conditional access policy — an actionable insight directly from the workbook.


Workbook parameter patterns

Parameters make workbooks interactive and reusable. These patterns cover the most common needs.

Time range parameter: Add as the first parameter on every workbook. Default: 30 days for operational dashboards, 7 days for investigation dashboards. All tiles reference this parameter: | where TimeGenerated > {TimeRange:start}.

Severity filter parameter: Multi-select dropdown: High, Medium, Low, Informational. Allows analysts to focus on specific severity levels. All incident tiles add: | where Severity in ({SeverityFilter}).

Analyst filter parameter: Dropdown populated from a KQL query: SecurityIncident | distinct tostring(Owner.assignedTo). Enables per-analyst performance views. SOC managers use this to review individual analyst metrics.

Data source parameter: Multi-select: Identity, Endpoint, Email, Network, Custom. Maps to tags added by automation rules (subsection 9.6). Enables filtering all tiles by detection category.
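A single tile query can wire several of these parameters together. The sketch below assumes the parameter names TimeRange, SeverityFilter, and AnalystFilter described above, plus an "All" default value for the analyst dropdown; adjust the names to match your workbook.

```kql
// Sketch: one tile query driven by the workbook parameters above.
// {TimeRange}, {SeverityFilter}, and {AnalystFilter} are assumed parameter names.
SecurityIncident
| where TimeGenerated > {TimeRange:start} and TimeGenerated < {TimeRange:end}
| where Severity in ({SeverityFilter})
| extend AssignedTo = tostring(Owner.assignedTo)
| where "{AnalystFilter}" == "All" or AssignedTo == "{AnalystFilter}"
| summarize OpenIncidents = count() by Severity
```

Parameter substitution happens before the query runs, so the `{...}` tokens are replaced with the viewer's selections and every tile that references them filters in step.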


Weekly trend analysis

A weekly trend tile shows whether the SOC’s operational posture is improving or degrading.

// Weekly SOC performance trend
SecurityIncident
| where TimeGenerated > ago(90d)
| where Status == "Closed"
| summarize
    Incidents = count(),
    TruePositives = countif(Classification == "TruePositive"),
    AvgMTTR_hours = avg(datetime_diff('hour', ClosedTime, CreatedTime)),
    TPRate = round(countif(Classification == "TruePositive") * 100.0 / count(), 1)
    by Week = bin(TimeGenerated, 7d)
| order by Week asc

Visualise as a multi-line chart: one line for incident volume, one for MTTR, one for TP rate. Healthy trends: stable or declining incident volume (indicating effective prevention), declining MTTR (faster response), and stable or increasing TP rate (better detection quality). Unhealthy trends: rising incident volume without corresponding true positives (more noise), rising MTTR (overwhelmed team), declining TP rate (degrading rule quality).


Executive reporting workbook

The executive audience needs different information: risk posture summary, incident trend over time, compliance metrics, and cost efficiency. Build a separate workbook for executives with high-level KPIs.

Key executive metrics:

Total incidents detected this month. Percentage resolved within SLA. Mean time to contain (from incident creation to containment action — not closure). Top 3 threat categories. Cost per incident (from Module 8.10). Month-over-month trend.

Workbook parameters for executive use: Default the time range to 30 days (executives review monthly). Add a “Department” parameter if incidents can be attributed to business units. Remove technical detail — executives need outcomes, not KQL.


Executive KPI definitions

Define these KPIs clearly for executive reporting. Each KPI should have: a definition, the target value, the current value (from KQL), and a trend indicator (improving/stable/degrading).

Security posture score: Percentage of High and Medium-severity incidents resolved within SLA. Target: 95%+. Calculation: closed incidents within SLA ÷ total closed incidents × 100. Trend: compare this month to the 3-month rolling average.
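As a sketch, the posture score can be computed directly from SecurityIncident. The 4-hour and 24-hour thresholds below are assumptions drawn from the SLA table later in this section; substitute your own targets.

```kql
// Security posture score: % of High/Medium incidents resolved within SLA
// (assumed thresholds: 4h for High, 24h for Medium)
SecurityIncident
| where TimeGenerated > ago(30d)
| where Status == "Closed" and Severity in ("High", "Medium")
| extend ResolutionHours = datetime_diff('hour', ClosedTime, CreatedTime)
| extend WithinSLA = iff(
    (Severity == "High" and ResolutionHours <= 4)
    or (Severity == "Medium" and ResolutionHours <= 24), 1, 0)
| summarize PostureScore = round(sum(WithinSLA) * 100.0 / count(), 1)
```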

Threat detection effectiveness: Number of True Positive incidents per month. A rising number can indicate either improving detection or increasing threat activity — context matters. Pair with the false positive rate to distinguish: rising TP with stable FP = more threats detected. Rising TP with rising FP = noisier environment.
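A minimal sketch of that pairing, grouping closed incidents by month:

```kql
// TP vs FP per month: rising TP with stable FP suggests more threats detected;
// rising TP with rising FP suggests a noisier environment
SecurityIncident
| where TimeGenerated > ago(90d)
| where Status == "Closed"
| summarize
    TruePositives = countif(Classification == "TruePositive"),
    FalsePositives = countif(Classification == "FalsePositive")
    by Month = startofmonth(TimeGenerated)
| order by Month asc
```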

Mean time to contain (MTTC): Time from incident creation to the first containment action (device isolation, password reset, session revocation). More meaningful than MTTR for executives because containment stops the threat — closure may include documentation time that does not affect security posture. Target: under 2 hours for High severity.

// Executive MTTC calculation
// FirstModifiedTime is used as a proxy for the first containment action;
// for precision, log containment explicitly via incident comments or automation rule tags
SecurityIncident
| where TimeGenerated > ago(30d)
| where Status == "Closed"
| where Classification == "TruePositive"
| where Severity in ("High", "Medium")
| extend ContainmentTime = datetime_diff('hour', FirstModifiedTime, CreatedTime)
| summarize
    AvgMTTC_hours = round(avg(ContainmentTime), 1),
    MedianMTTC_hours = round(percentile(ContainmentTime, 50), 1),
    IncidentsContained = count()
    by bin(TimeGenerated, 7d)
| order by TimeGenerated asc

SLA tracking tile

Define service level agreements for incident response and track compliance.

Severity   Triage SLA   Containment SLA   Resolution SLA
High       30 minutes   2 hours           4 hours
Medium     2 hours      8 hours           24 hours
Low        8 hours      24 hours          72 hours

// SLA compliance by severity
SecurityIncident
| where TimeGenerated > ago(30d)
| where Status == "Closed"
| extend TriageMinutes = datetime_diff('minute', FirstModifiedTime, CreatedTime)
| extend ResolutionHours = datetime_diff('hour', ClosedTime, CreatedTime)
| extend TriageSLA = case(
    Severity == "High" and TriageMinutes <= 30, "Met",
    Severity == "Medium" and TriageMinutes <= 120, "Met",
    Severity == "Low" and TriageMinutes <= 480, "Met",
    "Breached")
| extend ResolutionSLA = case(
    Severity == "High" and ResolutionHours <= 4, "Met",
    Severity == "Medium" and ResolutionHours <= 24, "Met",
    Severity == "Low" and ResolutionHours <= 72, "Met",
    "Breached")
| summarize
    TotalIncidents = count(),
    TriageSLAMet = countif(TriageSLA == "Met"),
    ResolutionSLAMet = countif(ResolutionSLA == "Met")
    by Severity
| extend TriageCompliance = round(TriageSLAMet * 100.0 / TotalIncidents, 1)
| extend ResolutionCompliance = round(ResolutionSLAMet * 100.0 / TotalIncidents, 1)

Display as a colour-coded table. Green for compliance above 95%. Yellow for 80-95%. Red for below 80%.


Workbook maintenance schedule

Workbooks degrade if not maintained. Queries break when tables are renamed, KQL functions change, or new data sources require updated tiles.

Monthly: Verify all tiles load without errors. Check that query time ranges and parameters produce current data. Update any hardcoded values (thresholds, IP ranges, user lists) that may have changed.

Quarterly: Review whether the workbook tiles still match SOC reporting needs. Remove tiles that nobody reads. Add tiles for new use cases identified in the quarterly detection review. Update the executive workbook with any new KPIs management has requested.

After major changes: When you deploy new analytics rules, add new data connectors, or restructure the automation rule library, review the workbooks for impact. A new data source may need a new data source health tile. New analytics rules may change the rule effectiveness metrics.


Data source health workbook

Complement the SOC operational dashboard with a data source health workbook that shows connector status (from Module 8.9).

// Data source health tile: connector freshness
let ExpectedTables = datatable(TableName:string, Category:string) [
    "SigninLogs", "Identity",
    "DeviceProcessEvents", "Endpoint",
    "EmailEvents", "Email",
    "CommonSecurityLog", "Network",
    "SecurityEvent", "Windows Hosts",
    "Syslog", "Linux Hosts"
];
union withsource=TableName *
| where TimeGenerated > ago(4h)
| summarize LastEvent = max(TimeGenerated), EventCount = count() by TableName
| join kind=inner ExpectedTables on TableName
| extend MinutesSinceLastEvent = datetime_diff('minute', now(), LastEvent)
| extend Status = case(
    MinutesSinceLastEvent < 30, "Healthy",
    MinutesSinceLastEvent < 120, "Delayed",
    "Down")
| project Category, TableName, Status, MinutesSinceLastEvent, EventCount

Display as a colour-coded table: green for Healthy, yellow for Delayed, red for Down. This workbook is the shift-start health check from Module 7.11 in visual form — the first thing every analyst checks at the start of their shift.


Workbook sharing and access control

Shared workbooks are saved to the Sentinel workspace and visible to all users with Sentinel Reader (or above) permissions. Use for: SOC operational dashboards, team performance reports, and data source health monitors.

Private workbooks are saved to the individual user and not shared. Use for: personal investigation templates, draft dashboards under development.

Workbook templates. Save a polished workbook as a template in Content Hub or as a gallery template — this allows other team members (or other Sentinel deployments) to install your workbook as a starting point.


Detection coverage heatmap

The most valuable workbook tile for detection engineering: a heatmap showing MITRE ATT&CK technique coverage — which techniques have active rules firing and which are silent.

// Detection coverage by technique: 90-day firing analysis
SecurityIncident
| where TimeGenerated > ago(90d)
| where Classification == "TruePositive"
| extend Techniques = parse_json(tostring(AdditionalData)).techniques
| mv-expand Techniques
| summarize TruePositiveCount = count() by tostring(Techniques)
| order by TruePositiveCount desc

Visualise as a bar chart. Techniques with zero true positives may indicate: rules that are not detecting real threats (tune or retire), techniques that are not being used against your environment (expected — not every technique is relevant), or detection gaps where no rule exists (create one).


Alert funnel analysis

Track the journey from raw alerts to closed incidents to understand SOC efficiency and rule quality.

// Alert-to-incident funnel: 30-day analysis
let TotalAlerts = toscalar(SecurityAlert
| where TimeGenerated > ago(30d)
| count);
let TotalIncidents = toscalar(SecurityIncident
| where TimeGenerated > ago(30d)
| count);
let TriagedIncidents = toscalar(SecurityIncident
| where TimeGenerated > ago(30d)
| where Status != "New"
| count);
let ClosedIncidents = toscalar(SecurityIncident
| where TimeGenerated > ago(30d)
| where Status == "Closed"
| count);
let TruePositives = toscalar(SecurityIncident
| where TimeGenerated > ago(30d)
| where Classification == "TruePositive"
| count);
print
    RawAlerts = TotalAlerts,
    Incidents = TotalIncidents,
    Triaged = TriagedIncidents,
    Closed = ClosedIncidents,
    TruePositives = TruePositives,
    TriageRate = round(TriagedIncidents * 100.0 / TotalIncidents, 1),
    TPRate = round(TruePositives * 100.0 / ClosedIncidents, 1)

Display as a funnel visualisation. The narrowing from alerts → incidents → true positives shows where volume is filtered at each stage. A wide funnel (many alerts, few true positives) indicates noisy rules. A narrow funnel (few alerts, most are true positives) indicates well-tuned detection.


Workbook design best practices

Use parameters for interactivity. Add time range, severity, and analyst parameters to every operational workbook. Parameters let each viewer filter to their area of responsibility without creating separate workbooks.

Group tiles logically. Top row: summary KPIs (total open incidents, MTTT, MTTR). Middle: trend charts (incident volume over time, severity distribution). Bottom: detail tables (individual incidents, rule performance). This mirrors the analyst workflow: scan summary → identify trends → drill into detail.

Limit to 6-8 tiles per tab. Workbooks with 20+ tiles load slowly and overwhelm the viewer. Use tabs (workbook groups) to organise: “Overview” tab with KPIs, “Rule Performance” tab with per-rule metrics, “Analyst Performance” tab with per-person metrics, “Health” tab with connector status.

Use conditional formatting. Colour-code values: green for healthy/good (MTTR within SLA, connector healthy), yellow for warning (MTTR approaching SLA, connector delayed), red for critical (MTTR exceeded, connector down). The analyst should be able to spot problems in 2 seconds by scanning colours — not by reading numbers.

Pin the workbook to the Azure dashboard. For the primary SOC operational workbook, pin it to the Azure portal dashboard so it appears immediately when analysts open the portal — no navigation required.


Incident volume forecasting

Add a tile that projects future incident volume based on historical trends.

// 7-day incident forecast based on 30-day linear trend
SecurityIncident
| where TimeGenerated > ago(30d)
| make-series DailyCount = count() on TimeGenerated from ago(30d) to now() step 1d
| extend (RSquare, Slope, Variance, RVariance, Intercept, LineFit) = series_fit_line(DailyCount)
| extend AvgDaily = todouble(series_stats_dynamic(DailyCount).avg)
| extend Next7DayForecast = round(AvgDaily + Slope * 7, 0)
| extend Next30DayForecast = round(AvgDaily + Slope * 30, 0)
| project AvgDaily, Slope, Next7DayForecast, Next30DayForecast

Rising forecasts indicate: new analytics rules generating more incidents (expected after rule deployment), a genuine increase in threat activity, or degrading rule quality (more false positives). Falling forecasts indicate: effective rule tuning, reduced threat activity, or connector failures reducing data volume (investigate).

Try it yourself

Create a workbook with three tiles: (1) open incidents by severity (bar chart), (2) daily incident trend over 30 days (line chart), and (3) data source health table. Add a time range parameter. Save as a shared workbook. This is the minimum viable SOC dashboard — the starting point that you refine over time as you identify additional visualisation needs.

What you should observe

The workbook renders KQL results as interactive visualisations. The time range parameter filters all tiles simultaneously. In a lab with few incidents, the charts may be sparse — the structure and queries are the deliverable, not the data volume. In production, these tiles provide at-a-glance SOC situational awareness.


Knowledge check

Check your understanding

1. Your SOC manager asks for a monthly report showing which MITRE ATT&CK techniques your analytics rules cover and which have gaps. What do you build?

Correct answer: A workbook tile that queries the analytics rule configuration to extract MITRE ATT&CK technique mappings, compares them against the full ATT&CK matrix, and displays a heatmap showing covered techniques (green) vs uncovered techniques (red). This is the detection coverage visualisation, covered in detail in subsection 9.11 (Detection Engineering Lifecycle).

Incorrect options: exporting the incident list to Excel; running a KQL query and copying the results; concluding that this data is not available in Sentinel.