Security Automation with Claude
Claude generates code. For security teams, that means PowerShell scripts, KQL queries, Logic App workflows, and Python automation. This module covers the patterns that produce safe, production-ready automation — and the human-in-the-loop discipline that prevents Claude-generated code from creating the security incidents you are trying to prevent.
Workflow 1: PowerShell script generation
PowerShell is the operational language of Microsoft 365 security. Claude generates PowerShell competently — but with important caveats for production use.
The generation prompt:
<task>Write a PowerShell script that audits all inbox rules
across the tenant for rules with financial keyword targeting.</task>
<requirements>
- Connect using Microsoft Graph PowerShell SDK (not legacy Exchange Online)
- Search all mailboxes for inbox rules containing: invoice, payment,
bank, wire, transfer, remittance, account details
- For each match: output the mailbox, rule name, conditions, actions
(forward, redirect, delete, move)
- Export results to CSV
- Include error handling for mailboxes that cannot be accessed
- Include a dry-run mode (list matches without taking action)
</requirements>
<environment>
- Microsoft 365 E5 tenant
- PowerShell 7.x
- Microsoft.Graph PowerShell module installed
- Running as a Global Admin or Exchange Administrator
</environment>
Claude produces a complete script with connection logic, mailbox enumeration, rule scanning, keyword matching, and CSV export. The script is typically 80-90% production-ready.
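A representative core of such a script is sketched below. Treat the cmdlet and property names (Get-MgUserMailFolderMessageRule and the Conditions/Actions fields) as assumptions to verify against your installed SDK version, which is exactly the review work described next.

```powershell
# Sketch only: verify cmdlet and property names against your Microsoft.Graph version.
# Requires appropriate Graph permissions (e.g. Mail.Read, User.Read.All).
param([switch]$DryRun = $true)

Connect-MgGraph -Scopes 'User.Read.All','Mail.Read'

$keywords = @('invoice','payment','bank','wire','transfer','remittance','account details')

$results = foreach ($user in Get-MgUser -All -Filter 'accountEnabled eq true') {
    try {
        # GET /users/{id}/mailFolders/inbox/messageRules
        $rules = Get-MgUserMailFolderMessageRule -UserId $user.Id -MailFolderId 'inbox' -ErrorAction Stop
        foreach ($rule in $rules) {
            # Flatten the keyword-bearing condition fields into one searchable string
            $text = (@($rule.Conditions.SubjectContains) + @($rule.Conditions.BodyContains)) -join ' '
            if ($keywords | Where-Object { $text -match [regex]::Escape($_) }) {
                [pscustomobject]@{
                    Mailbox    = $user.UserPrincipalName
                    RuleName   = $rule.DisplayName
                    ForwardsTo = ($rule.Actions.ForwardTo.EmailAddress.Address -join ';')
                    Deletes    = $rule.Actions.Delete
                }
            }
        }
    }
    catch { Write-Warning "Cannot read rules for $($user.UserPrincipalName): $_" }
}

$results | Export-Csv -Path .\InboxRuleAudit.csv -NoTypeInformation
if ($DryRun) { Write-Host "Dry run: $(@($results).Count) matching rules listed, no action taken." }
```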
The 10-20% that needs human review:
- Module and cmdlet names. Verify that Get-MgUserMailFolderMessageRule (or whatever Claude generates) is the actual current cmdlet name. Microsoft Graph cmdlet names change between SDK versions.
- Permissions. Claude may assume permissions that your service account does not have. Verify the Graph API permissions required (Mail.Read, MailboxSettings.Read) and ensure your app registration has them.
- Error handling. Claude generates error handling, but it may not account for: rate limiting (Graph API throttles at ~10,000 requests per 10 minutes for mail operations), shared mailboxes with different access patterns, or disconnected/archived mailboxes.
- Edge cases. Rules created with non-English keywords if you have a multilingual workforce. Rules using regex patterns rather than keyword lists. Rules on shared mailboxes vs user mailboxes.
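The throttling gap in particular is worth checking for explicitly. One defensive pattern to look for, or to ask Claude to add, is a retry wrapper around each Graph call; the backoff values in this sketch are illustrative, not tuned for your tenant:

```powershell
# Sketch: retry wrapper for Graph calls that hit 429 throttling.
function Invoke-WithRetry {
    param([scriptblock]$Call, [int]$MaxRetries = 5)
    for ($attempt = 0; $attempt -le $MaxRetries; $attempt++) {
        try { return & $Call }
        catch {
            # Re-throw anything that is not throttling, or when retries are exhausted
            if ($attempt -eq $MaxRetries -or $_.Exception.Message -notmatch '429|throttl') { throw }
            $delay = [math]::Pow(2, $attempt) * 5   # 5s, 10s, 20s, ...
            Write-Warning "Throttled; retrying in $delay seconds"
            Start-Sleep -Seconds $delay
        }
    }
}

# Usage: wrap each Graph call
$rules = Invoke-WithRetry { Get-MgUserMailFolderMessageRule -UserId $userId -MailFolderId 'inbox' }
```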
The testing protocol: Run the script against your dev tenant first. Not production. Check: does it connect? Does it enumerate mailboxes? Does it find the test inbox rules you created? Does the CSV output format match your expectations? Only after dev tenant validation: run against production in dry-run mode first.
Workflow 2: Automated report generation
Combine Claude with PowerShell to generate recurring security reports automatically.
The pattern: PowerShell collects data → formats as text → sends to Claude API → Claude generates the narrative → script writes the report.
<task>Write a PowerShell script that generates a weekly
security metrics report.</task>
<data_sources>
1. Sentinel API: incident count, severity breakdown, MTTR
2. Defender for Endpoint API: active alerts, device compliance rate
3. Entra ID: failed sign-in count, MFA registration rate
</data_sources>
<output>
The script should:
1. Query each data source via API
2. Compile the metrics into a structured text block
3. (Optional) Send the metrics to Claude API for narrative generation
4. Output a markdown report with: metrics table, trend comparison
to previous week, and 3 recommended actions
Include: error handling, credential management (use Azure Key Vault
references, not hardcoded credentials), and logging.
</output>
The human-in-the-loop requirement: If the script sends data to the Claude API for narrative generation, the data passes through an external service. Apply the same sanitisation discipline from Module F5: no PII, no credentials, no sensitive identifiers in the API payload. The metrics (counts, rates, trends) are typically safe. The details (specific user names, specific IPs) are not.
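As a minimal sketch of the narrative-generation step, assuming the API key lives in Azure Key Vault (the vault, secret, and model names here are placeholders), with only aggregate metrics in the payload:

```powershell
# Sketch: send sanitised metrics (counts and rates only, no PII) to the Claude API.
$apiKey  = Get-AzKeyVaultSecret -VaultName 'kv-secops' -Name 'anthropic-api-key' -AsPlainText

$metrics = @"
Incidents: 42 (High: 5, Medium: 17, Low: 20) | MTTR: 3.2h
Failed sign-ins: 1,240 | MFA registration: 94%
"@

$body = @{
    model      = 'claude-sonnet-4-5'   # placeholder: use your approved model
    max_tokens = 1024
    messages   = @(@{ role = 'user'; content = "Write a short executive narrative for these weekly security metrics:`n$metrics" })
} | ConvertTo-Json -Depth 5

$resp = Invoke-RestMethod -Uri 'https://api.anthropic.com/v1/messages' -Method Post `
    -Headers @{ 'x-api-key' = $apiKey; 'anthropic-version' = '2023-06-01'; 'content-type' = 'application/json' } `
    -Body $body

$resp.content[0].text | Out-File -FilePath .\weekly-report.md
```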
Workflow 3: Logic App / Power Automate integration
Claude generates the Logic App workflow JSON or Power Automate flow logic for security automation.
<task>Design a Logic App that automatically enriches
Sentinel incidents with IOC reputation data.</task>
<trigger>When a Sentinel incident is created with severity High or Critical</trigger>
<actions>
1. Extract IP addresses and domains from incident entities
2. For each IP: query AbuseIPDB API for reputation score
3. For each domain: query VirusTotal API for domain report
4. Add a comment to the Sentinel incident with the enrichment results
5. If any IOC has a reputation score > 80% malicious: escalate severity to Critical
</actions>
<output>
Logic App ARM template (deployable JSON) with:
- Sentinel trigger configuration
- HTTP action blocks for API calls
- Parse JSON actions for response handling
- Sentinel action for incident update
- Error handling for API failures
Include: API key references as Key Vault secrets (not inline).
</output>
Review every step before deployment. Logic Apps execute automatically — a bug in a Claude-generated Logic App that incorrectly escalates every incident to Critical creates operational chaos. Test: create test incidents in your dev Sentinel workspace and trigger the Logic App. Verify each step produces the correct output.
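For orientation, the escalation condition (action 5) inside the workflow definition might look roughly like this fragment. The action names and the AbuseIPDB response path are assumptions, not a deployable template:

```json
"Escalate_if_malicious": {
  "type": "If",
  "expression": {
    "greater": [
      "@body('Parse_AbuseIPDB_response')?['data']?['abuseConfidenceScore']",
      80
    ]
  },
  "actions": {
    "Update_incident_severity": {
      "type": "ApiConnection",
      "inputs": { "comment": "Sentinel incident update action setting severity to Critical" }
    }
  },
  "runAfter": { "Parse_AbuseIPDB_response": [ "Succeeded" ] }
}
```

This is the step most worth testing in isolation: an inverted comparison here is exactly the kind of bug that escalates every incident.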
The human-in-the-loop principle
If a Claude-generated PowerShell script deletes mailbox data because the error handling was wrong, you are accountable — not Claude. If a Claude-generated Logic App escalates every alert to Critical because the condition logic had a bug, the SOC absorbs the alert fatigue — not Claude. AI-generated code carries no different liability from human-written code. Review it with the same rigour you would apply to code from a junior team member — because that is the appropriate trust level.
Every Claude-generated automation script must pass through:
- Code review. Read every line. Understand the logic. Verify cmdlet names, API endpoints, and permissions.
- Dev tenant testing. Run against a non-production environment. Verify input, processing, and output.
- Dry-run in production. Run in read-only or report-only mode. Verify it handles production data volumes and edge cases.
- Monitored deployment. Deploy with alerting on failures. Review the first 24 hours of output.
Never deploy Claude-generated code directly to production. The code that looks correct may: use a deprecated API endpoint, assume permissions it does not have, mishandle error conditions, or produce unexpected results at scale. The review cycle is not optional — it is the difference between automation that helps and automation that creates incidents.
Workflow 4: KQL function and workbook generation
Beyond individual queries, Claude generates reusable KQL functions and Sentinel workbook tile definitions.
<task>Create a KQL function that I can save in Sentinel
as a reusable function.</task>
<function_purpose>
Takes a UserPrincipalName as input. Returns a consolidated
view of the user's sign-in activity across both SigninLogs
and AADNonInteractiveUserSignInLogs for the last 7 days.
Includes: daily sign-in count, unique IPs, unique apps,
authentication requirements, and a flag for any non-corporate IP usage.
</function_purpose>
<function_name>UserSignInProfile</function_name>
Claude generates the function with parameters, union of both tables, and the summarisation logic. Save it as a Sentinel function and call it from any query: UserSignInProfile("j.morrison@northgateeng.com").
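The generated function typically resembles this sketch. The corporate IP range is a placeholder, and column names should be verified against your workspace schema; the let-statement form shown here is convenient for testing before you save it as a workspace function:

```kusto
// Sketch of UserSignInProfile — verify columns against your workspace.
let UserSignInProfile = (UPN:string) {
    union SigninLogs, AADNonInteractiveUserSignInLogs
    | where TimeGenerated > ago(7d)
    | where UserPrincipalName =~ UPN
    // Placeholder corporate range: replace with your egress ranges
    | extend NonCorpIP = not(ipv4_is_in_any_range(IPAddress, "203.0.113.0/24"))
    | summarize
        SignIns      = count(),
        UniqueIPs    = dcount(IPAddress),
        UniqueApps   = dcount(AppDisplayName),
        AuthReqs     = make_set(AuthenticationRequirement),
        NonCorpIPUse = countif(NonCorpIP)
      by Day = bin(TimeGenerated, 1d)
};
UserSignInProfile("j.morrison@northgateeng.com")
```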
Try it yourself
Run the Workflow 1 prompt against your own tenant and compare. The script structure will be sound and the logic correct. The cmdlet names may need verification against your installed module version. Error handling will cover the obvious cases but miss environment-specific edge cases. Production readiness: 80-90% — the remaining 10-20% is your environment knowledge applied during review.
Knowledge checks
Check your understanding
1. Claude generates a PowerShell script that connects to Microsoft Graph and audits all mailboxes. The script runs perfectly in your dev tenant. Should you deploy to production?
Key takeaways
Claude generates 80-90% production-ready code. The remaining 10-20% is cmdlet verification, permission checking, and edge case handling — your job.
Human-in-the-loop is non-negotiable for automation. Code review → dev testing → production dry-run → monitored deployment. No shortcuts.
Sanitise data before sending to the Claude API. Metrics are safe. User names, IPs, and identifiers require sanitisation.
Reusable functions are high-ROI. Claude generates Sentinel functions you save once and use everywhere. Ask for functions, not just one-off queries.
Workflow 5: Code review with Claude
Claude is as effective at reviewing code as at writing it. When you write a PowerShell script, Claude identifies bugs, security issues, and improvement opportunities.
The code review prompt:
<task>Review this PowerShell script for security issues
and operational robustness.</task>
<script>
[paste your PowerShell script]
</script>
<review_criteria>
1. Security: credentials handling, injection vulnerabilities,
unnecessary permissions, data exposure
2. Error handling: are failure modes covered? Will it fail
gracefully or crash silently?
3. Logging: does it log what it does? Can you audit its actions?
4. Scale: will it work against 500 mailboxes? 5,000? Where
will it break?
5. Maintainability: is the code readable? Are there hardcoded
values that should be parameters?
</review_criteria>
<output>
For each finding:
- Line number or section
- Issue (what is wrong)
- Severity (Critical/High/Medium/Low)
- Fix (specific code change)
</output>
Claude produces a structured code review. Common findings: hardcoded credentials (should use Key Vault), missing error handling on API calls (Graph API returns 429 rate-limit responses), no logging (should write to a log file for audit), and hardcoded thresholds (should be parameters).
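The first of those findings usually reduces to a two-line change; the vault and secret names below are placeholders:

```powershell
# Before (Critical finding): secret lives in the script and in version control
$clientSecret = 'REDACTED-SECRET-VALUE'

# After: retrieve at runtime from Azure Key Vault (requires the Az.KeyVault module)
$clientSecret = Get-AzKeyVaultSecret -VaultName 'kv-secops' -Name 'graph-client-secret' -AsPlainText
```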
The bidirectional pattern: Claude writes code → you review → you find issues → Claude fixes them. OR: you write code → Claude reviews → Claude finds issues → you fix them. Both directions are valuable. The second is particularly useful for scripts you wrote quickly during an incident and need to harden for production use.
Workflow 6: Scheduled security report automation
Build a recurring report that runs weekly without manual intervention.
<task>Write a PowerShell script that generates a weekly
email security metrics report.</task>
<metrics>
1. Total phishing emails received (EmailEvents, ThreatTypes has "Phish")
2. Phishing delivery rate (delivered / total)
3. User click rate (UrlClickEvents / delivered phishing)
4. ZAP remediation count (EmailPostDeliveryEvents, ActionType "ZAP")
5. User-reported phishing count
</metrics>
<output>
Markdown report with: metrics table, week-over-week comparison,
and 3 recommended actions based on the data.
Save to a shared folder. Send summary via email.
</output>
<constraints>
- Use Microsoft Graph API for Sentinel/Defender queries
- Use Resend or SMTP for email delivery
- Include error handling for API failures
- Include a dry-run parameter
- Log all actions to a file
- Do NOT include any PII in the email — aggregate metrics only
</constraints>
Claude generates a complete script with: API authentication (using certificate or managed identity), 5 KQL queries executed via the Sentinel API, a markdown report generator, email delivery, error handling, and logging. You review, test in dev, then schedule via Windows Task Scheduler or Azure Automation.
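On a Windows host, the scheduling step can be sketched as follows; the path, task name, and time are placeholders, and Azure Automation is the better home for fully unattended runs:

```powershell
# Sketch: register the reviewed script as a weekly task (Windows ScheduledTasks module)
$action  = New-ScheduledTaskAction -Execute 'pwsh.exe' `
    -Argument '-NoProfile -File C:\Scripts\Weekly-EmailSecurityReport.ps1'
$trigger = New-ScheduledTaskTrigger -Weekly -DaysOfWeek Monday -At 6am
Register-ScheduledTask -TaskName 'Weekly Email Security Report' -Action $action -Trigger $trigger
```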
The operational discipline for scheduled scripts:
- Run manually for 2 weeks first — verify the output each time
- Enable the schedule only after 2 clean manual runs
- Set up alerting on script failures (the script should email you if it fails)
- Review the output monthly — metrics drift, query performance degrades, APIs change
- Store the script in version control (Git) — not on someone’s desktop
Check your understanding
2. You ask Claude to write a PowerShell script that connects to Microsoft Graph. Claude generates code that stores the API client secret in a variable at the top of the script. What is the security issue?