10.10 Hunting with Notebooks

14-18 hours · Module 10

Introduction

KQL is powerful for structured queries against tabular data. But some hunting analysis requires capabilities beyond KQL: statistical modelling, machine learning, data visualisation beyond workbook charts, integration with external APIs, and complex multi-step analysis with intermediate results. Jupyter notebooks in Sentinel provide these capabilities — running Python code in a notebook environment that connects to your Sentinel workspace data.

When notebooks add value

Use notebooks when: Your analysis requires Python libraries (pandas, scikit-learn, NetworkX, matplotlib) that KQL does not provide. You need iterative, exploratory analysis where each step builds on the previous result. You want to apply machine learning models (clustering, classification, anomaly detection) to hunting data. You need to integrate with external APIs during the analysis (threat intelligence enrichment, WHOIS lookups, DNS resolution). You want publication-quality visualisations for hunt reports.

Use standard KQL when: The analysis can be expressed as a single or small set of KQL queries. The hunting pattern maps to one of the six patterns from subsection 10.3. Performance is important (KQL executes directly on the workspace; notebooks add an intermediary hop). The analyst team is KQL-proficient but not Python-proficient.

For most SOC operations, KQL is sufficient. Notebooks are a specialist tool for advanced hunting scenarios — not a replacement for day-to-day KQL hunting.

The Sentinel notebook environment

Navigate to Sentinel → Threat management → Notebooks. Sentinel provides a built-in notebook experience powered by Azure Machine Learning (AML) or direct Jupyter integration.

Setup: Create or connect an Azure Machine Learning workspace. Launch a notebook from the Sentinel portal or upload a custom notebook. The notebook connects to your Sentinel workspace through the MSTICPy Python package and authenticates with your Azure credentials.

MSTICPy (Microsoft Threat Intelligence Python Security Tools) is the primary library for Sentinel notebook hunting. It provides: data connectors (query Sentinel tables from Python), threat intelligence lookups (VirusTotal, OTX, AbuseIPDB, Shodan), visualisations (process trees, timeline plots, network graphs), and analysis functions (anomaly detection, geo-clustering, domain analysis).

Notebook hunting example: network graph analysis

KQL can identify individual connections. A notebook can visualise the entire network of connections as a graph — revealing patterns invisible in tabular data.

// Step 1: KQL query to extract connection data (run from the notebook via MSTICPy)
DeviceNetworkEvents
| where TimeGenerated > ago(7d)
| where ActionType == "ConnectionSuccess"
| where not(ipv4_is_private(RemoteIP))
| summarize ConnectionCount = count() by DeviceName, RemoteIP
| where ConnectionCount > 10

In the notebook, load this data into a pandas DataFrame, then build a network graph with NetworkX:

# Python (conceptual sketch, not KQL); df holds the results of the Step 1 query
import networkx as nx
import matplotlib.pyplot as plt

G = nx.from_pandas_edgelist(df, source='DeviceName',
    target='RemoteIP', edge_attr='ConnectionCount')
# Identify highly connected external IPs (potential C2); skip the internal device nodes
centrality = nx.degree_centrality(G)
remote_ips = set(df['RemoteIP'])
top_nodes = sorted(((n, c) for n, c in centrality.items() if n in remote_ips),
    key=lambda x: x[1], reverse=True)[:10]
# Visualise: node size scales with centrality
nx.draw(G, with_labels=True, node_size=[centrality[n] * 1000 for n in G.nodes])
plt.show()

The network graph reveals: which external IPs are connected to the most internal devices (high degree centrality = potential C2 infrastructure), clusters of devices connecting to the same external IPs (potential botnet or shared C2), and isolated connections that stand out from the normal pattern.
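To make the centrality ranking concrete, here is a minimal standard-library sketch. It is not the NetworkX implementation. It assumes a small hand-made list of rows shaped like the KQL output. The device names and the `rank_remote_ips` helper are hypothetical. It uses the usual definition of degree centrality: a node's number of neighbours divided by the number of other nodes in the graph.

```python
from collections import defaultdict

# Hypothetical sample rows: (DeviceName, RemoteIP, ConnectionCount)
rows = [
    ("DESKTOP-A", "203.0.113.47", 120),
    ("DESKTOP-B", "203.0.113.47", 45),
    ("DESKTOP-C", "203.0.113.47", 60),
    ("DESKTOP-A", "198.51.100.22", 15),
    ("DESKTOP-C", "192.0.2.156", 30),
]


def rank_remote_ips(rows):
    """Return (remote_ip, degree_centrality) pairs, highest first."""
    neighbours = defaultdict(set)
    for device, remote_ip, _count in rows:
        neighbours[device].add(remote_ip)
        neighbours[remote_ip].add(device)
    remote_ips = {remote_ip for _device, remote_ip, _count in rows}
    node_count = len(neighbours)  # devices plus remote IPs
    scores = {ip: len(neighbours[ip]) / (node_count - 1) for ip in remote_ips}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_remote_ips(rows)
for ip, score in ranking:
    print(f"{ip}: {score:.2f}")
```

In this sample, the IP contacted by all three devices ranks first. On real data, expect well-known cloud and CDN addresses at the top as well, so the ranking is a starting point for triage rather than a verdict.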

Notebook hunting example: time series anomaly detection

KQL can compare today’s count to a 30-day average. A notebook can apply proper statistical time series decomposition — separating trend, seasonality, and anomalies.

# Python (conceptual sketch; argument and column names can differ between MSTICPy versions)
from msticpy.analysis.timeseries import timeseries_anomalies_stl
# Load sign-in volume per hour for 30 days
ts_data = qry_prov.exec_query("""
    SigninLogs
    | where TimeGenerated > ago(30d)
    | summarize Count = count() by bin(TimeGenerated, 1h)
    | order by TimeGenerated asc
""")
# STL decomposition: trend + seasonal + residual (hourly data, 24-hour cycle)
decomposed = timeseries_anomalies_stl(ts_data, time_column='TimeGenerated',
    data_column='Count', period=24, score_threshold=3)
# The 'anomalies' column is 1 for anomalously high hours and -1 for anomalously low hours
anomalies = decomposed[decomposed['anomalies'] != 0]

This finds hours where sign-in volume was anomalously high or low — accounting for daily and weekly seasonality that a simple average comparison misses.
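The effect of seasonality can be illustrated without any libraries. The sketch below is a simpler stand-in for time series decomposition, not the MSTICPy method. It compares each hour with the median of the same weekday and hour in other weeks and scores the difference with a robust z-score based on the median absolute deviation. The `hour_of_week_anomalies` helper, the synthetic counts, and the 3.5 threshold are all assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import median


def hour_of_week_anomalies(counts, threshold=3.5, min_mad=1.0):
    """counts maps an hourly datetime to a sign-in count.

    Flags hours whose count deviates strongly from the median of the
    same weekday/hour slot across all weeks in the data.
    """
    slots = defaultdict(list)
    for ts, count in counts.items():
        slots[(ts.weekday(), ts.hour)].append(count)
    flagged = []
    for ts, count in sorted(counts.items()):
        values = slots[(ts.weekday(), ts.hour)]
        centre = median(values)
        # Floor the MAD so identical historical values do not divide by zero
        mad = max(median(abs(v - centre) for v in values), min_mad)
        score = 0.6745 * (count - centre) / mad  # robust z-score
        if abs(score) > threshold:
            flagged.append((ts, count, round(score, 1)))
    return flagged


# Synthetic data: four weeks, busy during weekday office hours, quiet otherwise
start = datetime(2026, 3, 2)  # a Monday
counts = {}
for h in range(4 * 7 * 24):
    ts = start + timedelta(hours=h)
    week = h // (7 * 24)
    busy = ts.weekday() < 5 and 9 <= ts.hour < 17
    counts[ts] = (500 if busy else 20) + week
counts[datetime(2026, 3, 22, 3)] = 500  # injected spike: Sunday 03:00

flagged = hour_of_week_anomalies(counts)
print(flagged)
```

The injected Sunday 03:00 spike is flagged, while weekday office hours with the same volume are not, because each hour is judged against its own slot rather than a global average.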

Notebook hunting example: geographic clustering

Identify geographic clusters of sign-in activity that may indicate attacker infrastructure.

# Python (conceptual sketch):
from sklearn.cluster import DBSCAN
import numpy as np

# Load sign-in locations for a specific user
signin_data = qry_prov.exec_query("""
    SigninLogs
    | where TimeGenerated > ago(90d)
    | where UserPrincipalName == "j.morrison@northgateeng.com"
    | where ResultType == "0"
    | extend Lat = toreal(LocationDetails.geoCoordinates.latitude)
    | extend Lon = toreal(LocationDetails.geoCoordinates.longitude)
    | project TimeGenerated, IPAddress, Lat, Lon, Location
""")

# Drop sign-ins without coordinates; DBSCAN cannot handle missing values
signin_data = signin_data.dropna(subset=['Lat', 'Lon']).copy()

# Cluster locations using DBSCAN with great-circle (haversine) distance.
# Haversine expects [lat, lon] in radians, so eps is in radians too: km / Earth radius.
# The 50 km radius is an illustrative choice; tune it for your environment.
coords = np.radians(signin_data[['Lat', 'Lon']].to_numpy(dtype=float))
clustering = DBSCAN(eps=50 / 6371.0, min_samples=3, metric='haversine').fit(coords)
signin_data['Cluster'] = clustering.labels_

# Noise label (-1) = sign-ins from unusual locations
outliers = signin_data[signin_data['Cluster'] == -1]

DBSCAN clustering groups sign-in locations into geographic clusters. The user’s regular locations (home, office) form dense clusters. Outlier sign-ins (attacker locations) are classified as noise (cluster = -1). This is more sophisticated than simple “first-time country” detection — it identifies outlier locations even within known countries.
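The neighbourhood radius (`eps`) decides what counts as "the same place", so its units matter. When coordinates are passed in radians with a great-circle (haversine) distance, `eps` is an angle in radians, not kilometres or degrees. The standard-library sketch below shows the conversion and sanity-checks it against city distances. The helper names and the city coordinates are illustrative assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def km_to_radians(km):
    """Convert a surface distance in km to an angle in radians (usable as eps)."""
    return km / EARTH_RADIUS_KM


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


# Approximate city-centre coordinates, for illustration only
london = (51.5074, -0.1278)
reading = (51.4543, -0.9781)
new_york = (40.7128, -74.0060)

print(f"London to Reading:  {haversine_km(london, reading):.0f} km")
print(f"London to New York: {haversine_km(london, new_york):.0f} km")
print(f"eps for a 50 km radius: {km_to_radians(50):.5f} rad")
print(f"Distance covered by eps = 0.5 rad: {0.5 * EARTH_RADIUS_KM:.0f} km")
```

An `eps` of 0.5 radians spans more than 3,000 km, which would merge London and much of Europe into one cluster. A radius of tens of kilometres corresponds to an `eps` of a few thousandths of a radian.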

Notebook hunting example: process tree visualisation

Visualise the complete process execution chain on a compromised endpoint — revealing the attacker’s toolchain from initial execution to final payload.

# Python (conceptual sketch; check the process tree documentation for your MSTICPy version)
from msticpy.vis import process_tree

# Load process events for a compromised device.
# Keep all columns: the tree builder needs the parent and creation-time fields to link processes.
proc_data = qry_prov.exec_query("""
    DeviceProcessEvents
    | where TimeGenerated > ago(24h)
    | where DeviceName == "DESKTOP-NGE042"
""")

# Build the tree (the schema is inferred from the column names), then plot the built tree
proc_tree = process_tree.build_process_tree(proc_data, show_summary=True)
process_tree.plot_process_tree(proc_tree)

The visual process tree shows: the root process (explorer.exe), the user-launched application (outlook.exe), the malicious child process (powershell.exe spawned by the macro), and all subsequent processes the attacker launched. In KQL, this requires multiple self-joins and is difficult to visualise. In a notebook, the process tree renders as an interactive visual — click any node to see the full command line and timestamps.
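The core idea behind any process tree is linking each process to its parent by ID. The following standard-library sketch shows that linkage on a handful of hypothetical events. It is not MSTICPy's implementation, and the `render_tree` helper, the field names, and the PIDs are invented for illustration.

```python
from collections import defaultdict

# Hypothetical process events: pid, parent pid (ppid) and image name
events = [
    {"pid": 100, "ppid": 1, "name": "explorer.exe"},
    {"pid": 200, "ppid": 100, "name": "outlook.exe"},
    {"pid": 400, "ppid": 200, "name": "powershell.exe"},
    {"pid": 500, "ppid": 400, "name": "whoami.exe"},
    {"pid": 510, "ppid": 400, "name": "cmd.exe"},
    {"pid": 600, "ppid": 100, "name": "chrome.exe"},
]


def render_tree(events):
    """Return indented text lines, one per process, children under parents."""
    children = defaultdict(list)
    known_pids = {event["pid"] for event in events}
    for event in events:
        children[event["ppid"]].append(event)
    # A root is any process whose parent is not present in the data set
    roots = [event for event in events if event["ppid"] not in known_pids]
    lines = []

    def walk(event, depth):
        lines.append("  " * depth + f'{event["name"]} (pid {event["pid"]})')
        for child in children[event["pid"]]:
            walk(child, depth + 1)

    for root in roots:
        walk(root, 0)
    return lines


tree_lines = render_tree(events)
print("\n".join(tree_lines))
```

Real telemetry needs more care than this sketch: operating systems reuse PIDs, so production tree builders usually key each process on PID plus creation time and device.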

MSTICPy data providers and queries

MSTICPy abstracts the Sentinel query interface, making it easy to run common hunting queries without writing raw KQL.

# Python (conceptual sketch):
from datetime import datetime
from msticpy.data import QueryProvider

# Connect to Sentinel workspace (the name refers to a workspace entry in msticpyconfig.yaml)
qry_prov = QueryProvider("MSSentinel")
qry_prov.connect(workspace="your-workspace-name")

# Use built-in query templates
# List available queries:
qry_prov.list_queries()

# Run a pre-built query. Parameter names vary by query;
# pass "?" as the only argument to print a query's help text.
ip_signins = qry_prov.Azure.list_aad_signins_for_ip(
    ip_address_list=["203.0.113.47"],
    start=datetime(2026, 3, 1),
    end=datetime(2026, 3, 22))

# Run a custom KQL query:
custom_results = qry_prov.exec_query("""
    SigninLogs
    | where TimeGenerated > ago(7d)
    | where RiskLevelDuringSignIn != "none"
    | summarize count() by UserPrincipalName, RiskLevelDuringSignIn
""")

MSTICPy provides hundreds of pre-built query templates covering: Azure AD/Entra ID sign-in analysis, endpoint process investigation, network connection analysis, email analysis, and Azure resource activity. These templates accelerate notebook-based hunting — you do not need to write every query from scratch.

Notebook deployment guide

Option 1: Azure Machine Learning workspace. The production-grade approach. Create an AML workspace in the same subscription as your Sentinel workspace. Launch notebooks from the Sentinel portal; they execute on AML compute. Supports persistent storage, scheduled execution, collaboration, and GPU compute for ML models.

Option 2: Local Jupyter. For testing and personal hunting. Install Jupyter locally (pip install notebook msticpy). Connect to Sentinel using MSTICPy’s QueryProvider. Simpler setup but no collaboration or persistence.

Option 3: Azure Synapse / Databricks. For organisations with existing data analytics platforms. Connect to the Sentinel Log Analytics workspace via the Azure Monitor API. Use Spark for large-scale analysis.

Recommended path: Start with Content Hub notebook templates on AML (Option 1). Run pre-built templates to learn the workflow. Customise templates for your environment. Build custom notebooks only when the pre-built templates do not cover your specific hunting need.

MSTICPy threat intelligence enrichment

MSTICPy integrates with multiple TI providers for in-notebook enrichment during hunts.

# Python (conceptual sketch; provider names and result columns can differ between MSTICPy versions)
from msticpy.context.tilookup import TILookup

# API keys are read from the TIProviders section of msticpyconfig.yaml
ti_lookup = TILookup(providers=["VirusTotal", "AbuseIPDB"])

# Enrich a list of suspicious IPs from a hunting query
suspicious_ips = ["203.0.113.47", "198.51.100.22", "192.0.2.156"]
results = ti_lookup.lookup_iocs(data=suspicious_ips)

# Review the enrichment: one row per IP and provider, with a severity rating.
# Provider-specific details (reputation score, geolocation, associated domains) sit in the other columns.
results[["Ioc", "Provider", "Severity"]]

In-notebook TI enrichment eliminates the need to manually check each IP in VirusTotal. The hunter runs a query, extracts suspicious IPs, enriches them all at once, and uses the enrichment results to prioritise which findings warrant full investigation.
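The prioritisation step can be sketched with plain Python. The example below assumes enrichment results have already been reduced to hypothetical (IP, provider, severity) rows. The severity labels, the `prioritise` helper, and the verdicts themselves are invented for illustration and do not come from any real lookup.

```python
# Assumed severity scale, lowest to highest
SEVERITY_RANK = {"information": 0, "warning": 1, "high": 2}

# Hypothetical enrichment rows: (ioc, provider, severity)
rows = [
    ("203.0.113.47", "VirusTotal", "high"),
    ("203.0.113.47", "AbuseIPDB", "high"),
    ("198.51.100.22", "VirusTotal", "information"),
    ("198.51.100.22", "AbuseIPDB", "warning"),
    ("192.0.2.156", "VirusTotal", "information"),
    ("192.0.2.156", "AbuseIPDB", "information"),
]


def prioritise(rows):
    """Order IoCs by worst severity, then by how many providers raised a concern."""
    worst = {}
    concerns = {}
    for ioc, _provider, severity in rows:
        rank = SEVERITY_RANK[severity]
        worst[ioc] = max(worst.get(ioc, 0), rank)
        concerns[ioc] = concerns.get(ioc, 0) + (1 if rank > 0 else 0)
    return sorted(worst, key=lambda ioc: (worst[ioc], concerns[ioc]), reverse=True)


queue = prioritise(rows)
print(queue)
```

Ranking on the worst verdict first means a single strong signal is not diluted by providers that have no data on the indicator.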

Notebook-to-analytics-rule conversion

When a notebook-based hunt discovers a pattern that should be detected automatically, convert the notebook analysis into a KQL analytics rule.

Conversion workflow:

Step 1: Identify the detection logic in the notebook. Which step produced the “this is suspicious” result? What were the criteria?

Step 2: Express the logic in pure KQL. If the notebook used Python libraries (clustering, ML models), simplify: can the detection be approximated with KQL operators? Often, a complex ML model in a notebook can be approximated with a KQL threshold or statistical deviation check that captures 80% of the detection value.

Step 3: If pure KQL cannot express the detection, consider: running the notebook on a schedule (via Azure Machine Learning pipelines) and writing results to a custom table that an analytics rule monitors.

Example: The geographic clustering notebook found that sign-ins from cluster -1 (outliers) correlate with compromises. In KQL, approximate with: “sign-in from a country the user has not visited in 90 days AND the IP is not in a known VPN range”. This captures most of the same outliers without requiring ML clustering, although it misses outlier locations inside known countries. The same applies to the time series notebook: an hour with 500 sign-ins on a Tuesday at 10am may be normal (peak business hours), while 500 sign-ins on a Sunday at 3am is anomalous (same volume, different context), so a KQL approximation should compare each hour against the same hour and weekday in previous weeks rather than against a flat average.
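Before replacing notebook logic with a simpler rule, it helps to measure how much of the notebook's detection value the rule keeps. The sketch below assumes you have exported two sets of sign-in identifiers for the same time window: those the notebook flagged and those the candidate KQL rule flagged. The `approximation_coverage` helper and the sample identifiers are hypothetical.

```python
def approximation_coverage(notebook_outliers, rule_hits):
    """Compare what a simplified rule flags with what the notebook flagged."""
    notebook_outliers = set(notebook_outliers)
    rule_hits = set(rule_hits)
    caught = notebook_outliers & rule_hits
    coverage = len(caught) / len(notebook_outliers) if notebook_outliers else 0.0
    return {
        "coverage": coverage,                             # share of notebook findings kept
        "missed": sorted(notebook_outliers - rule_hits),  # findings the rule loses
        "extra": sorted(rule_hits - notebook_outliers),   # new alerts the rule adds
    }


# Hypothetical sign-in identifiers from one hunting window
notebook_outliers = ["s1", "s2", "s3", "s4", "s5"]
rule_hits = ["s1", "s2", "s3", "s4", "s9"]

report = approximation_coverage(notebook_outliers, rule_hits)
print(report)
```

Reviewing the `missed` and `extra` lists by hand shows whether the lost detections matter and whether the rule introduces noise the notebook did not.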

Content Hub notebook templates

Microsoft provides notebook templates through Content Hub: “Guided Investigation - Process Alerts,” “Entity Explorer - Account,” “Guided Hunting - Anomalous Sign-In,” and others. These templates provide pre-built analysis workflows that you can execute against your workspace data without writing Python from scratch.

Using templates: Install the template from Content Hub. Open in your notebook environment. Configure the workspace connection. Run the cells sequentially. Review the visualisations and analysis output.

Customising templates: Modify the KQL queries to match your environment’s table names and column formats. Adjust thresholds and parameters. Add additional analysis cells for your specific hunting needs.

Notebook limitations

Skill requirement: Notebooks require Python proficiency. Most SOC analysts are KQL-proficient, not Python-proficient. Training investment is needed before notebooks become productive.

Performance: Notebooks query Sentinel data via the Azure Monitor API, which is slower than direct KQL execution in the Logs blade. Large datasets require longer load times.

Not real-time: Notebooks are batch analysis tools, not real-time monitoring. Use Livestream (subsection 10.6) for real-time hunting.

Reproducibility: Notebooks are interactive, and each execution may produce different results depending on the data state. Document the parameters, time ranges, and data sources used in each notebook execution to ensure reproducibility.
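One lightweight way to document a run is to save a small manifest next to the notebook output. The sketch below is one possible shape for such a record, not a Sentinel or MSTICPy feature. The `build_run_manifest` helper and its field names are assumptions. It also uses an absolute time range in the query, because relative ranges such as `ago(30d)` return different data every time the notebook is re-run.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_run_manifest(query, start, end, tables, parameters):
    """Describe one notebook execution so it can be repeated later."""
    return {
        "executed_at": datetime.now(timezone.utc).isoformat(),
        "time_range": {"start": start.isoformat(), "end": end.isoformat()},
        "tables": sorted(tables),
        "parameters": parameters,
        # The hash shows whether the query text changed between runs
        "query_sha256": hashlib.sha256(query.encode("utf-8")).hexdigest(),
    }


query = """
SigninLogs
| where TimeGenerated between (datetime(2026-03-01) .. datetime(2026-03-22))
| summarize Count = count() by bin(TimeGenerated, 1h)
"""

manifest = build_run_manifest(
    query=query,
    start=datetime(2026, 3, 1, tzinfo=timezone.utc),
    end=datetime(2026, 3, 22, tzinfo=timezone.utc),
    tables=["SigninLogs"],
    parameters={"score_threshold": 3, "bin_size": "1h"},
)
print(json.dumps(manifest, indent=2))
```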

Try it yourself

Navigate to Sentinel → Notebooks. If an AML workspace is available, launch a Content Hub notebook template (e.g., "Guided Hunting - Anomalous Sign-In"). Configure the workspace connection and run the template cells. Review the output visualisations. If no AML workspace is available, review the notebook templates in the Sentinel GitHub repository to understand the structure and analysis patterns — you can set up the environment later when the skill requirement is met.

What you should observe

The notebook connects to your workspace, queries data via MSTICPy, and produces analysis outputs: tables, charts, and anomaly highlights. The template handles the Python complexity — you provide the workspace credentials and review the results. For most organisations, notebooks are a Phase 2 capability — deployed after KQL-based hunting is well-established.

Knowledge check

Check your understanding

1. When should you use a notebook instead of standard KQL for hunting?

When the analysis requires capabilities beyond KQL: machine learning, network graph analysis, time series decomposition, external API integration during analysis, or complex multi-step exploratory analysis. For structured queries against tabular data, standard KQL in the Logs blade or Hunting blade is faster and simpler. Notebooks are a specialist tool, not a daily hunting replacement.

Always — notebooks are better than KQL

Only when KQL is too slow

Notebooks are deprecated — use KQL only

Notebooks for advanced analysis beyond KQL capabilities. KQL for daily hunting. Most SOC operations are well-served by KQL alone.
