7.5 Data Retention and Cost Management

16-20 hours · Module 7


SC-200 Exam Objective

Domain 1 — Manage a SOC Environment: "Manage data retention for XDR and Microsoft Sentinel tables." Cost management is an operational skill that the exam tests because uncontrolled costs lead to management pressure to reduce coverage — which weakens security posture.

Introduction

Sentinel cost is driven by two factors: data ingestion volume (how much data enters the workspace per day) and data retention duration (how long the data is stored). Every decision in this module affects one or both: data connectors determine ingestion volume, log tier selection determines ingestion cost per GB, and retention policies determine storage cost. This subsection teaches you to manage both proactively so that Sentinel’s cost is predictable, justifiable, and optimised without degrading security capability.


Retention policies

Sentinel data retention is configured at two levels: workspace default and per-table override.

Workspace default retention applies to all tables that do not have a table-specific retention policy. The default is 90 days. The first 90 days of retention are included in the Sentinel per-GB ingestion price at no additional charge. Retention beyond 90 days incurs a per-GB/month storage charge.

Per-table retention overrides the workspace default for specific tables. Use this when different tables need different retention periods: security investigation tables (SigninLogs, SecurityAlert) may need 180 days for long-running investigations, while verbose operational tables (AzureMetrics) may only need 30 days.

To configure per-table retention: Log Analytics workspace → Tables → select a table → Manage table → set the Interactive retention (Analytics tier) and Total retention (including Archive).

Recommended Retention Settings by Table Category
Table Category      | Example Tables                            | Analytics Retention | Archive | Rationale
Core investigation  | SigninLogs, SecurityAlert, CloudAppEvents | 180 days            | 1 year  | Active investigations may span 6 months
Endpoint telemetry  | DeviceProcessEvents, DeviceFileEvents     | 90 days             | 1 year  | Most endpoint investigations close within 90d
Email events        | EmailEvents, UrlClickEvents               | 90 days             | 1 year  | Email investigations typically close within 90d
Azure operations    | AzureActivity                             | 90 days             | 3 years | Compliance may require 3+ year retention
Operational metrics | AzureMetrics, Heartbeat                   | 30 days             | None    | Low security value, short retention sufficient
Compliance audit    | AuditLogs, OfficeActivity                 | 90 days             | 7 years | Regulatory retention requirements
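The retention figures above translate into storage cost in a simple way: at steady state, every day of retention beyond the 90 included days holds one day's worth of ingestion in paid retention. A minimal sketch, with an assumed per-GB/month price (check the current Azure pricing page for real figures):

```python
# Steady-state GB held in paid retention for one table, and its monthly cost.
# The per-GB/month price is an illustrative assumption, not an official figure.
RETENTION_PRICE_PER_GB_MONTH = 0.10  # GBP, assumed

def extra_retention_gb(daily_gb, total_retention_days, included_days=90):
    """GB retained beyond the included 90 days once the table reaches steady state."""
    return daily_gb * max(total_retention_days - included_days, 0)

# SigninLogs at 6 GB/day kept for 180 days: 90 paid days x 6 GB = 540 GB
paid_gb = extra_retention_gb(6.0, 180)                  # 540.0
monthly_cost = paid_gb * RETENTION_PRICE_PER_GB_MONTH   # ~54 GBP/month
```

The same arithmetic explains why the table above pushes long retention into Archive: archive storage is priced well below analytics retention, so the 90 extra paid days are the expensive part.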

Cost estimation and monitoring

Before deploying Sentinel in production, estimate the monthly cost based on expected data volume. After deployment, monitor actual cost against the estimate and adjust.

Pre-deployment estimation. Identify each data source you plan to connect. Estimate the daily ingestion volume for each (Microsoft documentation provides per-source estimates, or use the Azure pricing calculator). Multiply by 30 for monthly volume. Apply the per-GB price for your pricing tier. Add retention costs for data retained beyond 90 days.
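As a concrete sketch of those estimation steps (the per-source volumes are invented for illustration; the £5.22/GB figure matches the approximate PAYG price used elsewhere in this subsection):

```python
# Pre-deployment cost estimate: per-source daily volumes -> monthly ingestion cost.
PAYG_PRICE_PER_GB = 5.22   # approximate per-GB PAYG ingestion price, GBP

daily_gb_by_source = {     # assumed daily volumes for the planned connectors
    "SecurityEvent": 15.0,
    "SigninLogs":     6.0,
    "AzureActivity":  4.0,
}

monthly_gb = sum(daily_gb_by_source.values()) * 30   # 25 GB/day -> 750 GB/month
ingestion_cost = monthly_gb * PAYG_PRICE_PER_GB      # ~3,915 GBP/month
print(f"{monthly_gb:.0f} GB/month, ~£{ingestion_cost:,.2f} ingestion")
```

Retention beyond 90 days would be added on top of this ingestion figure, per table.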

Post-deployment monitoring. The Usage table tracks actual ingestion:

// Monthly ingestion by data type (cost planning query)
Usage
| where TimeGenerated > ago(30d)
| where IsBillable == true
| summarize MonthlyGB = sum(Quantity) / 1024 by DataType
| extend EstMonthlyCost = MonthlyGB * 5.22  // Approximate per-GB PAYG price
| order by MonthlyGB desc
| extend CumulativePct = row_cumsum(MonthlyGB) / toscalar(
    Usage | where TimeGenerated > ago(30d) | where IsBillable == true
    | summarize sum(Quantity) / 1024) * 100
Expected Output — Monthly Ingestion by Data Type

DataType            | Monthly GB | Est. Cost | Cumulative %
SecurityEvent       | 450 GB     | £2,349    | 45%
SigninLogs          | 180 GB     | £940      | 63%
AzureMetrics        | 120 GB     | £626      | 75%
DeviceProcessEvents | 80 GB      | £418      | 83%
Other (15 types)    | 170 GB     | £888      | 100%
Cost optimisation target: SecurityEvent (Windows security events) accounts for 45% of ingestion. If the collection level is "All Events," reducing it to "Common" can cut SecurityEvent volume by 50-70%. AzureMetrics at 12% could move to Basic tier for a 60% cost reduction on that table. Against the £5,221 total above, these two optimisations reduce total cost by roughly 30-40% with minimal security impact.

Ingestion optimisation techniques

Technique 1: Windows Security Event collection level. The SecurityEvent table is often the highest-volume table. The collection level (set in the data collection rule) determines how many event types are collected. “All Events” collects everything including verbose audit success events. “Common” collects the security-relevant subset. “Minimal” collects only critical events. For most SOC operations, “Common” provides the events needed for investigation without the volume cost of “All Events.”

Technique 2: Transformation at ingestion. Data collection rules (DCRs) can transform data during ingestion — filtering out unwanted records, removing unnecessary columns, and parsing complex fields. This reduces the stored volume (and cost) while preserving the investigation-relevant data. Example: filter SigninLogs to exclude service principal sign-ins from a specific application that generates 10,000 events per day with zero security value.

// Identify high-volume, low-value data for ingestion filtering
SigninLogs
| where TimeGenerated > ago(7d)
| where AppDisplayName == "Azure AD Sync" and ResultType == "0"
| summarize DailyCount = count() by bin(TimeGenerated, 1d)
| extend DailyGB = DailyCount * 2.0 / (1024.0 * 1024.0)  // Assuming ~2 KB per event

Technique 3: Commitment tier right-sizing. If your consistent daily ingestion exceeds 100 GB, a commitment tier provides a lower per-GB price. Set the tier at your minimum consistent daily volume (not average, not peak) to ensure you benefit from the discount every day.
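The right-sizing logic can be sketched as follows. The tier rates below are invented for illustration; take real commitment-tier prices from the Azure pricing page:

```python
# Commitment-tier right-sizing: size the tier at the minimum consistent
# daily volume. All prices are illustrative assumptions, GBP per GB.
PAYG_PER_GB = 5.22
TIER_PER_GB = {100: 4.30, 200: 4.10}   # assumed effective rate at each tier

def daily_cost(gb, tier=None):
    if tier is None:
        return gb * PAYG_PER_GB
    # You pay for the full tier even on a quiet day; overage bills at the tier rate.
    return max(gb, tier) * TIER_PER_GB[tier]

week = [110, 140, 120, 105, 160, 115, 130]            # GB/day over one week
floor_gb = min(week)                                  # 105: every day clears 100 GB
tier = max(t for t in TIER_PER_GB if t <= floor_gb)   # choose the 100 GB tier

payg_week = sum(daily_cost(d) for d in week)          # 880 GB at 5.22
tier_week = sum(daily_cost(d, tier) for d in week)    # 880 GB at 4.30
```

Sizing at the minimum (105 GB) rather than the average (~126 GB) or peak (160 GB) guarantees the discounted rate applies every day with no wasted committed capacity.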

Technique 4: M365 E5 data grant. M365 E5 licences include a per-user data ingestion grant for qualifying Microsoft data. Verify whether your licences provide this benefit — it can reduce Sentinel cost by 20-40% for M365-heavy organisations.
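To estimate what the grant is worth, a rough sketch follows. The per-user daily allowance below is an assumption (commonly cited as 5 MB/user/day of qualifying data); verify the actual figure against your licence terms:

```python
# Rough monthly value of the M365 E5 data grant.
USERS = 1000
GRANT_MB_PER_USER_PER_DAY = 5   # assumed allowance -- confirm for your licences
PAYG_PRICE_PER_GB = 5.22        # approximate PAYG price, GBP

free_gb_per_month = USERS * GRANT_MB_PER_USER_PER_DAY * 30 / 1024   # ~146.5 GB
monthly_saving = free_gb_per_month * PAYG_PRICE_PER_GB              # ~£765/month
```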

Technique 5: Basic tier for non-security tables. Move tables that do not support security investigation or analytics rules to Basic tier. AzureMetrics, ContainerLog, AppTraces, and other operational tables are prime candidates (subsection 7.4).


Transformation at ingestion: practical examples

Data collection rules (DCRs) can filter and transform data during ingestion — reducing volume before it counts against your ingestion cost. This is the most precise cost optimisation technique because you control exactly which records and columns are stored.

Example 1: Filter verbose non-interactive sign-ins. The AADNonInteractiveUserSignInLogs table can generate 10x the volume of interactive SigninLogs in environments with many service principals and automated processes. Most of these events are routine token refreshes with zero security value. A DCR transformation can filter these at ingestion:

// DCR transformation: keep only failed non-interactive sign-ins
// and sign-ins from outside approved IP ranges
source
| where ResultType != "0"  // Keep failures
    or not(ipv4_is_in_any_range(IPAddress, "10.0.0.0/8", "172.16.0.0/12"))  // Or external IPs

This transformation can reduce AADNonInteractiveUserSignInLogs volume by 80-90% while retaining the security-relevant events (failures, external sources).

Example 2: Remove unnecessary columns. Some tables contain columns that you never query. A DCR transformation can drop them at ingestion, reducing per-record size:

// DCR transformation: drop columns rarely used in investigation
source
| project-away OriginalRequestId, CorrelationId,
    ResourceDisplayName, HomeTenantId

Column removal reduces record size by 15-25% for tables with many metadata columns. The savings compound across millions of records.

Example 3: Aggregate high-frequency events. For data sources that generate one event per second per device (like performance counters or heartbeat-style telemetry), aggregating to one event per minute reduces volume by 60x while preserving trend data. Note that DCR transformations operate on one record at a time and do not support summarize, so this aggregation is done with a scheduled aggregation (for example, an Azure Monitor summary rule) rather than an ingestion-time transformation:

// Scheduled aggregation: reduce performance data to 1-minute intervals
source
| summarize avg(CpuUsage), max(MemoryUsage)
    by bin(TimeGenerated, 1m), DeviceName
Test transformations before production deployment

Every record you filter at ingestion is a record you cannot query later. If an investigation requires data that was filtered by a DCR transformation, the data is permanently lost. Test transformations by running the filter query against historical data first: "If I had applied this filter last month, would any investigation have been affected?" Only deploy the transformation after confirming that no investigation-relevant data is lost.
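That dry run can also be done offline. A minimal sketch, assuming you have exported a sample of last month's records with a flag marking which ones an investigation actually touched (all field names here are hypothetical):

```python
# Offline dry run of the Example 1 filter: would it have dropped anything
# an investigation needed? Record fields are assumptions for the sketch.
def keep(record):
    # Same intent as the DCR filter: keep failures, and successes from outside
    # private ranges (a prefix check stands in for real CIDR matching here).
    return record["ResultType"] != "0" or not record["IPAddress"].startswith("10.")

history = [
    {"ResultType": "0",     "IPAddress": "10.1.2.3",    "investigated": False},
    {"ResultType": "50126", "IPAddress": "10.1.2.3",    "investigated": True},
    {"ResultType": "0",     "IPAddress": "203.0.113.7", "investigated": True},
]

dropped = [r for r in history if not keep(r)]
safe_to_deploy = not any(r["investigated"] for r in dropped)
```

Only deploy when `safe_to_deploy` holds across a representative sample, ideally covering a full month of investigations.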


Budget alerting and cost anomaly detection

Beyond reporting, configure automated alerts that detect cost anomalies before they become budget overruns.

Azure Cost Management alerts. In the Azure portal → Cost Management → Budgets → create a budget for the Sentinel workspace resource group. Set a monthly budget based on your expected ingestion cost. Configure alert thresholds at 75%, 90%, and 100% of budget. When a threshold is reached, an email notification is sent to the budget owner (typically the Security Manager or SOC Lead). This provides early warning before costs exceed expectations.
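The threshold logic itself is simple; Azure Cost Management evaluates it for you, but a sketch makes the arithmetic concrete (the budget figure is an assumption):

```python
BUDGET = 5000.0                   # assumed monthly budget, GBP
THRESHOLDS = (0.75, 0.90, 1.00)   # alert at 75%, 90%, and 100% of budget

def crossed(month_to_date_spend):
    """Which alert thresholds the current spend has crossed."""
    return [int(t * 100) for t in THRESHOLDS if month_to_date_spend >= t * BUDGET]

crossed(4600.0)   # at £4,600 spend the 75% and 90% alerts have fired
```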

Ingestion anomaly detection with KQL. Create an analytics rule that detects sudden ingestion spikes — which may indicate a misconfigured connector, an attack generating excessive events, or a new data source that was connected without cost analysis:

// Detect ingestion spikes: 3x the 7-day average for any data type
let baseline = Usage
| where TimeGenerated between(ago(8d) .. ago(1d))
| where IsBillable == true
| summarize AvgDailyGB = sum(Quantity) / 1024 / 7 by DataType;
Usage
| where TimeGenerated > ago(1d)
| where IsBillable == true
| summarize TodayGB = sum(Quantity) / 1024 by DataType
| join kind=inner baseline on DataType
| where TodayGB > AvgDailyGB * 3
| project DataType, TodayGB, AvgDailyGB,
    SpikeMultiple = round(TodayGB / AvgDailyGB, 1)

This rule fires when any data type’s daily ingestion exceeds 3x its 7-day average — catching both sudden connector failures (which can dump queued data) and gradual configuration drift.


Cost reporting for management

Management needs monthly cost reports that justify Sentinel spending and demonstrate cost optimisation progress.

Report structure:

- Current month ingestion: total GB, cost, trend vs previous month.
- Breakdown by data type: which data sources drive the most cost.
- Cost per incident: total monthly cost / incidents investigated (demonstrates value per investigation).
- Optimisation actions taken: what changes reduced cost this month.
- Forecast: expected cost for next month based on current ingestion trend.

The cost-per-incident metric is the most powerful justification: if Sentinel costs £5,000/month and the SOC investigates 50 incidents, the cost per incident is £100. Compare this to the cost of missing an incident (data breach average cost: £3.5M per IBM Cost of a Data Breach Report). The £100/incident investment is trivial compared to the risk reduction it provides.
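In code form, using the figures from the paragraph above:

```python
# Cost-per-incident metric from the example figures above.
monthly_cost = 5000.0     # Sentinel spend, GBP/month
incidents = 50            # incidents investigated this month
cost_per_incident = monthly_cost / incidents       # £100 per investigation
avg_breach_cost = 3_500_000.0                      # IBM figure cited above
risk_ratio = avg_breach_cost / cost_per_incident   # breach costs 35,000x as much
```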


The daily cap: why it is dangerous in production

The daily cap sets a maximum ingestion volume per day. When the cap is reached, all data ingestion stops until midnight UTC. This was discussed in subsection 7.3, but it bears repeating because the exam tests it specifically.

The exam scenario: “Your Sentinel cost increased by 40% last month due to a new data connector. Management asks you to set a daily cap to control costs. What do you do?”

The correct answer: Do not set a daily cap. Instead: identify the specific connector that caused the 40% increase, evaluate whether the data is necessary for security operations, and either optimise the connector’s configuration (reduce collection level, add ingestion filters) or accept the cost increase with justification. A daily cap that stops data ingestion during a security incident is a greater risk than the cost increase.

If management insists on cost control: implement a commitment tier (which provides a lower per-GB rate), move non-essential tables to Basic tier, or apply transformation rules to filter unnecessary records — all of which reduce cost without creating blind spots.

A daily cap is not cost management — it is visibility management

When the cap triggers, you do not save money (the data still exists at the source). You lose visibility (the data is not ingested). An attacker who knows your cap triggers at 14:00 can time their attack for the afternoon, knowing your SIEM is blind. This is not hypothetical — cost-driven visibility gaps are a known attack vector in environments with aggressive data caps.

Try it yourself

Run the monthly ingestion query above against your Sentinel workspace. Identify the top 3 data types by volume. For each, determine: is this data used for analytics rules? Is it used for investigation queries? Could it move to Basic tier? Could the collection level be reduced? Could ingestion filters reduce the volume? This exercise builds the cost optimisation skill that is essential for sustainable Sentinel operations.

What you should observe

In a lab environment, total ingestion is likely 1-3 GB/day — well within the free tier. In production, you will likely see 2-3 data types that account for 60-80% of total cost. These are the optimisation targets: small changes to high-volume tables have the largest cost impact.


Knowledge check

Check your understanding

1. SecurityEvent accounts for 45% of your ingestion cost at "All Events" collection level. How do you reduce this cost without losing security investigation capability?

a) Change the Windows Security Event collection level from "All Events" to "Common"
b) Move SecurityEvent to Basic tier
c) Set a daily cap of 10 GB for SecurityEvent
d) Disconnect the Windows Security Events connector

Answer: (a). The Common level includes the security-relevant events (authentication, process creation, privilege changes) that SOC investigations require, while excluding verbose audit success events that rarely provide investigation value. This typically reduces SecurityEvent volume by 50-70%. The events removed are verbose operational events, not the security events used in analytics rules and investigation queries.