1.2 The AI Capabilities Matrix for Security Operations
Knowing what AI can do in general is not enough. You need to know what it can do for your specific security operations functions — and where it introduces risk. This subsection maps AI capabilities to the core functions of a security operations team, with an honest assessment of effectiveness and failure modes for each.
The output of this subsection is a capabilities matrix you can adapt to your team — a structured assessment of where AI adds genuine value, where it requires heavy verification, and where it should not be used at all.
The security operations function map
A security operations team performs six core functions. AI impacts each differently.
1. Alert triage and initial assessment
| Capability | AI effectiveness | Verification requirement |
|---|---|---|
| Read alert details and summarize in plain language | High — LLMs excel at translating technical alert data into readable summaries | Low — summary quality is visible on inspection |
| Assess alert severity based on context | Moderate — AI can cross-reference against known patterns but lacks your environmental context | High — AI does not know if the flagged user is a VIP, if the IP is a known partner, or if the activity matches a planned change |
| Recommend initial triage action (investigate, close, escalate) | Low-Moderate — AI recommendations are generic without operational context | High — never auto-close or auto-escalate based on AI recommendation alone |
| Enrich alert with additional context | High — AI can generate queries to pull related sign-in data, device data, and historical alerts | Moderate — verify the enrichment queries produce valid results |
Operational note: Alert triage is where AI delivers the most immediate time savings. A Tier 1 analyst spending 5 minutes reading and interpreting an alert can reduce that to 1 minute with AI-generated summaries and enrichment queries. Over 50 alerts per day, that is 200 minutes saved — more than 3 hours. The risk is triage automation without verification. AI-generated triage recommendations must be reviewed by a human analyst before action.
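The "reviewed by a human analyst before action" rule can be sketched as an explicit approval gate. This is an illustrative Python sketch, not part of any AI platform's API — `TriageRecommendation` and `apply_triage` are hypothetical names:

```python
from dataclasses import dataclass

@dataclass
class TriageRecommendation:
    alert_id: str
    action: str          # "investigate", "close", or "escalate"
    summary: str
    confidence: float

def apply_triage(rec: TriageRecommendation, analyst_approved: bool) -> str:
    """Gate every AI triage recommendation behind explicit human approval.

    AI output alone never closes or escalates an alert; an analyst must
    confirm the recommendation before any action is taken.
    """
    if not analyst_approved:
        return "pending_review"   # no action until a human signs off
    if rec.action not in {"investigate", "close", "escalate"}:
        raise ValueError(f"unknown action: {rec.action}")
    return rec.action

rec = TriageRecommendation("alrt-001", "close", "Benign sign-in from known VPN", 0.91)
apply_triage(rec, analyst_approved=False)  # -> "pending_review"
apply_triage(rec, analyst_approved=True)   # -> "close"
```

The design point is that the AI's confidence score never appears in the gating logic — approval is a human decision, not a threshold.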
2. Investigation and evidence analysis
| Capability | AI effectiveness | Verification requirement |
|---|---|---|
| Generate investigation queries (KQL, SPL, Sigma) | High — the highest-value use case for most teams | High — verify table names, column names, and filter logic against your schema |
| Analyze log data and identify anomalies | High for pattern recognition, Moderate for contextual interpretation | High — AI identifies patterns but cannot assess business context |
| Reconstruct attack timelines from log evidence | High — AI excels at organizing chronological evidence into narrative form | High — verify every timestamp, every IP, every user reference against source data |
| Cross-correlate data from multiple sources | Moderate — AI can identify matching fields across datasets you provide | High — ensure the correlation logic is correct (same field meaning across sources) |
Operational note: The investigation feedback loop — generate query → execute in SIEM → paste results to AI → analyze → generate follow-up query — is the core AI-assisted investigation pattern. Each cycle takes 2-3 minutes. A complete investigation that takes 2 hours manually completes in 30-45 minutes with AI assistance. The human analyst executes every query and validates every finding. AI generates and analyzes. The human decides and acts.
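The feedback loop above can be sketched as a simple control structure. This is a hedged illustration: `ask_ai` and `execute_in_siem` are stand-ins for your AI platform and SIEM API calls, and the stopping condition is simplified:

```python
def investigate(initial_question, ask_ai, execute_in_siem, max_cycles=5):
    """Run the AI-assisted investigation loop.

    The AI generates a query, the analyst executes it in the SIEM, and the
    results feed the next prompt. The analyst validates every query and
    finding along the way; the loop stops when the analysis reports no
    further leads or the cycle cap is reached.
    """
    findings = []
    prompt = initial_question
    for _ in range(max_cycles):
        query = ask_ai(f"Generate a KQL query: {prompt}")
        results = execute_in_siem(query)        # human executes and reviews
        analysis = ask_ai(f"Analyze these results: {results}")
        findings.append(analysis)
        if "no further leads" in analysis.lower():
            break
        prompt = analysis                       # feed into the next cycle
    return findings

# Stubs for demonstration only — replace with real AI and SIEM calls.
ask_ai_stub = lambda prompt: "No further leads identified."
siem_stub = lambda query: []
findings = investigate("Who signed in from 203.0.113.5?", ask_ai_stub, siem_stub)
```

Note the separation of roles baked into the structure: the AI never touches the SIEM directly; `execute_in_siem` represents the human analyst running the query.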
3. Detection engineering
| Capability | AI effectiveness | Verification requirement |
|---|---|---|
| Generate detection rules from technique descriptions | High — provide the attack in log-level terms and AI produces functional queries | High — test every rule against 30 days of historical data before deployment |
| Map rules to MITRE ATT&CK techniques | High — AI maps to the correct parent technique in most cases | Moderate — verify sub-technique specificity |
| Generate rule documentation | Very High — the highest-ROI use case in detection engineering | Low — documentation quality is visible on review |
| Estimate false positive rates | Moderate — AI can generate estimation queries but cannot predict your environment’s noise level | High — run the estimation query and evaluate results yourself |
| Produce rules in multiple query languages | High — AI can produce the same logic in KQL, SPL, and Sigma | High — verify syntax for each target platform |
Operational note: Detection engineering is where AI delivers the highest compounding value. A detection rule runs 24/7 after deployment. AI cuts the time from threat advisory to deployed rule from 1-2 days to 2-3 hours. Over a year, that acceleration means your detection coverage expands 3-5x faster than manual rule development allows.
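The "test every rule against historical data" requirement can be illustrated with a minimal backtest harness. In practice you would run the KQL/SPL rule against 30 days of data in your SIEM; this sketch only shows the false-positive accounting, with the rule expressed as a Python predicate over parsed log rows (all field names and values are illustrative):

```python
def backtest(rule, rows, known_incident_ids):
    """Count hits, true positives, and false positives for a candidate rule.

    rows: parsed historical log records; known_incident_ids: records tied to
    confirmed incidents, used as ground truth for the backtest window.
    """
    hits = [r for r in rows if rule(r)]
    tp = [r for r in hits if r["id"] in known_incident_ids]
    fp = [r for r in hits if r["id"] not in known_incident_ids]
    return {"hits": len(hits), "true_positives": len(tp), "false_positives": len(fp)}

rows = [
    {"id": "e1", "event": "4625", "count": 50},   # failed logons, real incident
    {"id": "e2", "event": "4625", "count": 45},   # noisy service account
    {"id": "e3", "event": "4624", "count": 1},    # normal successful logon
]
rule = lambda r: r["event"] == "4625" and r["count"] >= 40
result = backtest(rule, rows, known_incident_ids={"e1"})
# result == {"hits": 2, "true_positives": 1, "false_positives": 1}
```

A rule that looks clean in an AI-generated draft can still hit every noisy service account in your environment — the backtest, not the draft, decides whether it ships.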
4. Incident response documentation
| Capability | AI effectiveness | Verification requirement |
|---|---|---|
| Draft IR reports from raw investigation notes | Very High — the strongest AI capability for security teams | Moderate — verify facts against evidence, remove AI inferences presented as facts |
| Generate executive summaries from technical findings | Very High — AI translates technical content to business language effectively | Moderate — verify business impact statements and ensure no technical inaccuracies |
| Draft stakeholder communications (board, regulators, employees) | High — AI adapts tone and content for different audiences | High — every external communication must be human-reviewed before sending |
| Produce post-incident review documentation | High — AI structures lessons learned and recommendations from investigation data | Moderate — verify recommendations are feasible in your environment |
Operational note: IR documentation is the most universally time-consuming task in security operations and the one where AI provides the most dramatic efficiency gain. A 4-6 hour report writing effort compresses to 1-2 hours with AI assistance. Module 4 covers the four-pass methodology in depth.
5. Security automation and scripting
| Capability | AI effectiveness | Verification requirement |
|---|---|---|
| Generate PowerShell, Python, and Bash scripts | High — AI generates functional code for most security automation tasks | Very High — review every script for credential handling, error handling, and edge cases |
| Generate SOAR playbook logic | Moderate — AI produces the logic but SOAR platform-specific syntax varies | High — verify against your SOAR platform documentation |
| Review and improve existing scripts | High — AI identifies bugs, security issues, and improvement opportunities | Moderate — verify the suggestions against your environment |
| Generate deployment documentation | Very High — AI documents what a script does, how to deploy it, and what parameters to configure | Low — documentation quality is visible on review |
Operational note: AI-generated code carries the same liability as human-written code. If it runs in production and causes damage, you are responsible. The human-in-the-loop framework (Module 5) ensures that AI-generated scripts pass through code review, dev testing, dry-run execution, and monitored deployment before touching production.
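The four-stage pipeline named in the note can be modeled as a sequence of ordered gates. The gate names follow the text; the checks themselves are placeholders for your real review process:

```python
# Gates an AI-generated script must pass, in order, before production.
GATES = ["code_review", "dev_testing", "dry_run", "monitored_deployment"]

def deployment_status(passed: set) -> str:
    """Return the next required gate, or 'production-ready' once all pass.

    Gates are ordered: a script cannot skip ahead, so the first unmet gate
    is always the blocker.
    """
    for gate in GATES:
        if gate not in passed:
            return f"blocked: {gate} required"
    return "production-ready"

deployment_status({"code_review"})   # -> "blocked: dev_testing required"
deployment_status(set(GATES))        # -> "production-ready"
```

Encoding the gates explicitly makes the liability point concrete: there is no code path from "AI generated it" to "it runs in production" that bypasses human review.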
6. Compliance, policy, and governance
| Capability | AI effectiveness | Verification requirement |
|---|---|---|
| Draft security policies from frameworks and context | High — AI produces structured policy drafts informed by framework requirements | Moderate — verify framework citations and adapt to organizational specifics |
| Conduct compliance gap analysis | High — AI identifies gaps between current state and framework requirements | High — verify gap assessments against actual implementation, not described implementation |
| Cross-map controls across frameworks (ISO → NIST → SOC 2) | High — AI identifies equivalent controls across frameworks | High — verify scope alignment, not just keyword matching |
| Generate risk assessment documentation | Moderate — AI produces reasonable likelihood/impact scores | High — risk appetite and political context are human judgments |
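The "verify scope alignment, not just keyword matching" requirement for cross-framework mapping can be made concrete with a toy check. The control IDs below are real framework identifiers (ISO 27001:2022 A.8.2 "Privileged access rights"; NIST SP 800-53 AC-6 "Least Privilege", AC-2 "Account Management"), but the scope tags are simplified illustrations, not authoritative mappings:

```python
# Toy scope model: each control is tagged with the areas it covers.
iso_controls = {"A.8.2": {"scope": {"accounts", "admin"}}}
nist_controls = {
    "AC-6": {"scope": {"accounts", "admin"}},
    "AC-2": {"scope": {"accounts"}},
}

def scope_aligned(iso_id: str, nist_id: str, min_overlap: float = 1.0) -> bool:
    """Accept a mapping only if the NIST control covers the full ISO scope.

    An AI keyword match alone would accept both candidates below; comparing
    scope coverage rejects the partial one.
    """
    iso_scope = iso_controls[iso_id]["scope"]
    nist_scope = nist_controls[nist_id]["scope"]
    overlap = len(iso_scope & nist_scope) / len(iso_scope)
    return overlap >= min_overlap

scope_aligned("A.8.2", "AC-6")  # True  — full scope coverage
scope_aligned("A.8.2", "AC-2")  # False — AC-2 misses the admin scope
```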
Building your capabilities matrix
The tables above are generic. Your capabilities matrix must reflect your team’s specific context. The exercise below produces that artifact.
Try it yourself
Create a table with these columns:
| Security Function | Current Process | Time Per Task | AI Opportunity | Expected Time Savings | Verification Level | Priority |
Fill in each row for your team’s top 10 most time-consuming security operations tasks. For each task, assess: how you do it today, how long it takes, where AI could assist, how much time it would save, how much verification the AI output would require, and therefore what priority it should have in your AI adoption plan.
Sort the completed table by Priority (High → Low). The top 3-5 items are where you focus first — the highest time savings with manageable verification overhead.
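One way to make the Priority column sortable is to quantify each row as minutes saved per week, discounted by a verification-overhead factor. The weighting and the sample figures below are illustrative, not prescriptive:

```python
def priority_score(weekly_minutes_saved: int, verification_overhead: float) -> float:
    """Net weekly benefit: time saved minus the verification cost it incurs.

    verification_overhead is the fraction of the saved time spent re-checking
    AI output (0.1 = light review, 0.8 = near-total re-verification).
    """
    return weekly_minutes_saved * (1 - verification_overhead)

# Hypothetical figures for three common tasks.
tasks = {
    "investigation query generation": priority_score(300, 0.3),   # 210.0
    "IR documentation":               priority_score(240, 0.25),  # 180.0
    "rule documentation":             priority_score(150, 0.1),   # 135.0
}
ranked = sorted(tasks, key=tasks.get, reverse=True)
```

The scoring makes the trade-off explicit: a task with huge raw savings but near-total re-verification can rank below a smaller win that needs only a glance.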
Most teams find that their top priorities cluster in three areas: (1) investigation query generation — high frequency, high time savings, moderate verification; (2) IR documentation — lower frequency but very high time savings per instance; (3) detection rule documentation — high time savings with low verification overhead because documentation quality is easily assessed on review.
Tasks that typically score low priority: real-time containment decisions (AI cannot assess business context), evidence integrity tasks (AI output must not enter the evidence chain), and regulatory communications (require human review regardless of AI quality).
Where AI must not be used
Some security operations tasks are inappropriate for AI assistance regardless of the model’s capability. These are not capability limitations — they are operational discipline decisions.
Autonomous containment during active incidents. AI can recommend containment actions. It must not execute them autonomously. The decision to disable a VIP account, isolate a production server, or trigger a major incident response requires business context, political judgment, and organizational awareness that AI cannot possess. AI generates the recommendation. A human analyst evaluates the blast radius and makes the decision.
Evidence chain of custody. AI-generated analysis must not be presented as forensic evidence. If investigation findings may be used in legal proceedings (employment tribunal, regulatory investigation, law enforcement referral), every analytical conclusion must be independently verifiable from the raw evidence. AI can draft the report structure and suggest findings. A human analyst must verify every finding against the original log data and document the verification.
Regulatory and legal communications. Communications to regulators (ICO, SEC), law enforcement, or legal counsel must be human-reviewed and approved before transmission, regardless of how well AI drafts them. The reputational and legal consequences of an error in these communications are severe and irreversible.
Classified or restricted material. Data classified above the level that the AI platform is approved to handle must not be processed, regardless of the platform’s data handling commitments. If your organization classifies incident data as restricted and the AI platform is approved only for internal data, the incident data stays out of the AI tool.
Check your understanding
1. Your team spends 3 hours per week writing detection rule documentation. An AI tool can generate documentation from a KQL rule in 2 minutes, with the analyst spending 5 minutes reviewing each output. There are 5 rules per week. What is the actual time savings, and what verification level is required?
2. During a live incident at 2am, AI recommends immediately disabling the compromised user's account. The user is the CFO who is presenting to the board in 6 hours. What do you do?
The Claude platform: capabilities beyond conversational AI
The capabilities matrix above maps what AI can do in a conversational context — you ask, it responds. As of March 2026, the Claude platform adds agentic capabilities that expand the matrix significantly:
Claude Code Security — launched February 2026. A reasoning-based vulnerability scanner that reads and reasons about codebases the way a human security researcher would, tracing data flows, mapping component interactions, and identifying context-dependent vulnerabilities that rule-based SAST tools miss. During internal testing, Anthropic found over 500 previously unknown high-severity vulnerabilities in production open-source codebases. This adds Application Security to the capabilities matrix — a domain where conversational AI was limited to ad-hoc code review. Module 5.3 covers this in production depth.
Claude Code with scheduled tasks — recurring security operations that run without manual prompting. Weekly inbox rule audits, daily sign-in anomaly checks, PR security reviews, monthly compliance report generation. This transforms detection monitoring and security hygiene from “tasks we perform when we remember” to “tasks that execute automatically with human review of findings.” Module 5.4 covers implementation.
MCP Connectors — 38+ integrations (Gmail, Google Drive, Slack, GitHub, Calendar, Salesforce, and more) that give Claude direct, permissioned access to your tools within the conversation. Investigation workflows no longer require tab-switching. Evidence gathering, document access, and team communication happen inside the investigation context. Module 2.6 covers Connector-powered investigation workflows.
Cowork with Computer Use — a desktop agent that executes multi-step tasks autonomously. Evidence folder organization, report compilation from multiple sources, bulk data processing. Computer Use extends this to GUI-based security portals that lack APIs. Module 5.5 covers security-specific Cowork workflows.
These capabilities do not change the fundamental principle from the matrix: AI assists, humans decide. They expand the scope of what AI can assist with — from conversational analysis into autonomous detection, application security scanning, cross-tool investigation, and delegated document processing. Every module in this course uses at least two Claude surfaces. The days of treating AI as a chat box are over.
You're reading the free modules of this course
The full course continues with advanced topics, production detection rules, worked investigation scenarios, and deployable artifacts. Premium subscribers get access to all courses.