Security Automation with Claude
Claude generates code. For security teams, that means PowerShell scripts, KQL queries, Logic App workflows, and Python automation. This module covers the patterns that produce safe, production-ready automation — and the human-in-the-loop discipline that prevents Claude-generated code from creating the security incidents you are trying to prevent.
Workflow 1: PowerShell script generation
PowerShell is the operational language of Microsoft 365 security. Claude generates PowerShell competently — but with important caveats for production use.
The generation prompt:
<task>Write a PowerShell script that audits all inbox rules
across the tenant for rules with financial keyword targeting.</task>
<requirements>
- Connect using Microsoft Graph PowerShell SDK (not legacy Exchange Online)
- Search all mailboxes for inbox rules containing: invoice, payment,
bank, wire, transfer, remittance, account details
- For each match: output the mailbox, rule name, conditions, actions
(forward, redirect, delete, move)
- Export results to CSV
- Include error handling for mailboxes that cannot be accessed
- Include a dry-run mode (list matches without taking action)
</requirements>
<environment>
- Microsoft 365 E5 tenant
- PowerShell 7.x
- Microsoft.Graph PowerShell module installed
- Running as a Global Admin or Exchange Administrator
</environment>
Claude produces a complete script with connection logic, mailbox enumeration, rule scanning, keyword matching, and CSV export. The script is typically 80-90% production-ready.
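A representative core of such a script is sketched below. Treat the cmdlet and property names (Get-MgUserMailFolderMessageRule and the Conditions/Actions fields) as assumptions to verify against your installed SDK version, which is exactly the review work described next.

```powershell
# Sketch only: verify cmdlet and property names against your Microsoft.Graph version.
# Requires appropriate Graph permissions (e.g. Mail.Read, User.Read.All).
param([switch]$DryRun = $true)

Connect-MgGraph -Scopes 'User.Read.All','Mail.Read'

$keywords = @('invoice','payment','bank','wire','transfer','remittance','account details')

$results = foreach ($user in Get-MgUser -All -Filter 'accountEnabled eq true') {
    try {
        # GET /users/{id}/mailFolders/inbox/messageRules
        $rules = Get-MgUserMailFolderMessageRule -UserId $user.Id -MailFolderId 'inbox' -ErrorAction Stop
        foreach ($rule in $rules) {
            # Flatten the keyword-bearing condition fields into one searchable string
            $text = (@($rule.Conditions.SubjectContains) + @($rule.Conditions.BodyContains)) -join ' '
            if ($keywords | Where-Object { $text -match [regex]::Escape($_) }) {
                [pscustomobject]@{
                    Mailbox    = $user.UserPrincipalName
                    RuleName   = $rule.DisplayName
                    ForwardsTo = ($rule.Actions.ForwardTo.EmailAddress.Address -join ';')
                    Deletes    = $rule.Actions.Delete
                }
            }
        }
    }
    catch { Write-Warning "Cannot read rules for $($user.UserPrincipalName): $_" }
}

$results | Export-Csv -Path .\InboxRuleAudit.csv -NoTypeInformation
if ($DryRun) { Write-Host "Dry run: $(@($results).Count) matching rules listed, no action taken." }
```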
The 10-20% that needs human review:
- Module and cmdlet names. Verify that Get-MgUserMailFolderMessageRule (or whatever Claude generates) is the actual current cmdlet name. Microsoft Graph cmdlet names change between SDK versions.
- Permissions. Claude may assume permissions that your service account does not have. Verify the Graph API permissions required (Mail.Read, MailboxSettings.Read) and ensure your app registration has them.
- Error handling. Claude generates error handling, but it may not account for: rate limiting (Graph API throttles at ~10,000 requests per 10 minutes for mail operations), shared mailboxes with different access patterns, or disconnected/archived mailboxes.
- Edge cases. Rules created with non-English keywords if you have a multilingual workforce. Rules using regex patterns rather than keyword lists. Rules on shared mailboxes vs user mailboxes.
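The throttling gap in particular is worth checking for explicitly. One defensive pattern to look for, or to ask Claude to add, is a retry wrapper around each Graph call; the backoff values in this sketch are illustrative, not tuned for your tenant:

```powershell
# Sketch: retry wrapper for Graph calls that hit 429 throttling.
function Invoke-WithRetry {
    param([scriptblock]$Call, [int]$MaxRetries = 5)
    for ($attempt = 0; $attempt -le $MaxRetries; $attempt++) {
        try { return & $Call }
        catch {
            # Re-throw anything that is not throttling, or when retries are exhausted
            if ($attempt -eq $MaxRetries -or $_.Exception.Message -notmatch '429|throttl') { throw }
            $delay = [math]::Pow(2, $attempt) * 5   # 5s, 10s, 20s, ...
            Write-Warning "Throttled; retrying in $delay seconds"
            Start-Sleep -Seconds $delay
        }
    }
}

# Usage: wrap each Graph call
$rules = Invoke-WithRetry { Get-MgUserMailFolderMessageRule -UserId $userId -MailFolderId 'inbox' }
```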
The testing protocol: Run the script against your dev tenant first. Not production. Check: does it connect? Does it enumerate mailboxes? Does it find the test inbox rules you created? Does the CSV output format match your expectations? Only after dev tenant validation: run against production in dry-run mode first.
Workflow 2: Automated report generation
Combine Claude with PowerShell to generate recurring security reports automatically.
The pattern: PowerShell collects data → formats as text → sends to Claude API → Claude generates the narrative → script writes the report.
<task>Write a PowerShell script that generates a weekly
security metrics report.</task>
<data_sources>
1. Sentinel API: incident count, severity breakdown, MTTR
2. Defender for Endpoint API: active alerts, device compliance rate
3. Entra ID: failed sign-in count, MFA registration rate
</data_sources>
<output>
The script should:
1. Query each data source via API
2. Compile the metrics into a structured text block
3. (Optional) Send the metrics to Claude API for narrative generation
4. Output a markdown report with: metrics table, trend comparison
to previous week, and 3 recommended actions
Include: error handling, credential management (use Azure Key Vault
references, not hardcoded credentials), and logging.
</output>
The human-in-the-loop requirement: If the script sends data to the Claude API for narrative generation, the data passes through an external service. Apply the same sanitisation discipline from Module F5: no PII, no credentials, no sensitive identifiers in the API payload. The metrics (counts, rates, trends) are typically safe. The details (specific user names, specific IPs) are not.
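As a minimal sketch of the narrative-generation step, assuming the API key lives in Azure Key Vault (the vault, secret, and model names here are placeholders), with only aggregate metrics in the payload:

```powershell
# Sketch: send sanitised metrics (counts and rates only, no PII) to the Claude API.
$apiKey  = Get-AzKeyVaultSecret -VaultName 'kv-secops' -Name 'anthropic-api-key' -AsPlainText

$metrics = @"
Incidents: 42 (High: 5, Medium: 17, Low: 20) | MTTR: 3.2h
Failed sign-ins: 1,240 | MFA registration: 94%
"@

$body = @{
    model      = 'claude-sonnet-4-5'   # placeholder: use your approved model
    max_tokens = 1024
    messages   = @(@{ role = 'user'; content = "Write a short executive narrative for these weekly security metrics:`n$metrics" })
} | ConvertTo-Json -Depth 5

$resp = Invoke-RestMethod -Uri 'https://api.anthropic.com/v1/messages' -Method Post `
    -Headers @{ 'x-api-key' = $apiKey; 'anthropic-version' = '2023-06-01'; 'content-type' = 'application/json' } `
    -Body $body

$resp.content[0].text | Out-File -FilePath .\weekly-report.md
```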
Workflow 3: Logic App / Power Automate integration
Claude generates the Logic App workflow JSON or Power Automate flow logic for security automation.
<task>Design a Logic App that automatically enriches
Sentinel incidents with IOC reputation data.</task>
<trigger>When a Sentinel incident is created with severity High or Critical</trigger>
<actions>
1. Extract IP addresses and domains from incident entities
2. For each IP: query AbuseIPDB API for reputation score
3. For each domain: query VirusTotal API for domain report
4. Add a comment to the Sentinel incident with the enrichment results
5. If any IOC has a reputation score > 80% malicious: escalate severity to Critical
</actions>
<output>
Logic App ARM template (deployable JSON) with:
- Sentinel trigger configuration
- HTTP action blocks for API calls
- Parse JSON actions for response handling
- Sentinel action for incident update
- Error handling for API failures
Include: API key references as Key Vault secrets (not inline).
</output>
Review every step before deployment. Logic Apps execute automatically — a bug in a Claude-generated Logic App that incorrectly escalates every incident to Critical creates operational chaos. Test: create test incidents in your dev Sentinel workspace and trigger the Logic App. Verify each step produces the correct output.
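For orientation, the escalation condition (action 5) inside the workflow definition might look roughly like this fragment. The action names and the AbuseIPDB response path are assumptions, not a deployable template:

```json
"Escalate_if_malicious": {
  "type": "If",
  "expression": {
    "greater": [
      "@body('Parse_AbuseIPDB_response')?['data']?['abuseConfidenceScore']",
      80
    ]
  },
  "actions": {
    "Update_incident_severity": {
      "type": "ApiConnection",
      "inputs": { "comment": "Sentinel incident update action setting severity to Critical" }
    }
  },
  "runAfter": { "Parse_AbuseIPDB_response": [ "Succeeded" ] }
}
```

This is the step most worth testing in isolation: an inverted comparison here is exactly the kind of bug that escalates every incident.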
The human-in-the-loop principle
If a Claude-generated PowerShell script deletes mailbox data because the error handling was wrong, you are accountable — not Claude. If a Claude-generated Logic App escalates every alert to Critical because the condition logic had a bug, the SOC absorbs the alert fatigue — not Claude. AI-generated code carries no different liability from human-written code. Review it with the same rigour you would apply to code from a junior team member — because that is the appropriate trust level.
Every Claude-generated automation script must pass through:
- Code review. Read every line. Understand the logic. Verify cmdlet names, API endpoints, and permissions.
- Dev tenant testing. Run against a non-production environment. Verify input, processing, and output.
- Dry-run in production. Run in read-only or report-only mode. Verify it handles production data volumes and edge cases.
- Monitored deployment. Deploy with alerting on failures. Review the first 24 hours of output.
Never deploy Claude-generated code directly to production. The code that looks correct may: use a deprecated API endpoint, assume permissions it does not have, mishandle error conditions, or produce unexpected results at scale. The review cycle is not optional — it is the difference between automation that helps and automation that creates incidents.
Workflow 4: KQL function and workbook generation
Beyond individual queries, Claude generates reusable KQL functions and Sentinel workbook tile definitions.
<task>Create a KQL function that I can save in Sentinel
as a reusable function.</task>
<function_purpose>
Takes a UserPrincipalName as input. Returns a consolidated
view of the user's sign-in activity across both SigninLogs
and AADNonInteractiveUserSignInLogs for the last 7 days.
Includes: daily sign-in count, unique IPs, unique apps,
authentication requirements, and a flag for any non-corporate IP usage.
</function_purpose>
<function_name>UserSignInProfile</function_name>
Claude generates the function with parameters, union of both tables, and the summarisation logic. Save it as a Sentinel function and call it from any query: UserSignInProfile("j.morrison@northgateeng.com").
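The generated function typically resembles this sketch. The corporate IP range is a placeholder, and column names should be verified against your workspace schema; the let-statement form shown here is convenient for testing before you save it as a workspace function:

```kusto
// Sketch of UserSignInProfile — verify columns against your workspace.
let UserSignInProfile = (UPN:string) {
    union SigninLogs, AADNonInteractiveUserSignInLogs
    | where TimeGenerated > ago(7d)
    | where UserPrincipalName =~ UPN
    // Placeholder corporate range: replace with your egress ranges
    | extend NonCorpIP = not(ipv4_is_in_any_range(IPAddress, "203.0.113.0/24"))
    | summarize
        SignIns      = count(),
        UniqueIPs    = dcount(IPAddress),
        UniqueApps   = dcount(AppDisplayName),
        AuthReqs     = make_set(AuthenticationRequirement),
        NonCorpIPUse = countif(NonCorpIP)
      by Day = bin(TimeGenerated, 1d)
};
UserSignInProfile("j.morrison@northgateeng.com")
```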
Try it yourself
Run the Workflow 1 prompt against your own tenant and compare. The script structure will be sound and the logic correct. The cmdlet names may need verification against your installed module version. Error handling will cover the obvious cases but miss environment-specific edge cases. Production readiness: 80-90% — the remaining 10-20% is your environment knowledge applied during review.
Knowledge checks
Check your understanding
1. Claude generates a PowerShell script that connects to Microsoft Graph and audits all mailboxes. The script runs perfectly in your dev tenant. Should you deploy to production?
Key takeaways
Claude generates 80-90% production-ready code. The remaining 10-20% is cmdlet verification, permission checking, and edge case handling — your job.
Human-in-the-loop is non-negotiable for automation. Code review → dev testing → production dry-run → monitored deployment. No shortcuts.
Sanitise data before sending to the Claude API. Metrics are safe. User names, IPs, and identifiers require sanitisation.
Reusable functions are high-ROI. Claude generates Sentinel functions you save once and use everywhere. Ask for functions, not just one-off queries.
Workflow 5: Code review with Claude
Claude is as effective at reviewing code as at writing it. When you write a PowerShell script, Claude identifies bugs, security issues, and improvement opportunities.
The code review prompt:
<task>Review this PowerShell script for security issues
and operational robustness.</task>
<script>
[paste your PowerShell script]
</script>
<review_criteria>
1. Security: credentials handling, injection vulnerabilities,
unnecessary permissions, data exposure
2. Error handling: are failure modes covered? Will it fail
gracefully or crash silently?
3. Logging: does it log what it does? Can you audit its actions?
4. Scale: will it work against 500 mailboxes? 5,000? Where
will it break?
5. Maintainability: is the code readable? Are there hardcoded
values that should be parameters?
</review_criteria>
<output>
For each finding:
- Line number or section
- Issue (what is wrong)
- Severity (Critical/High/Medium/Low)
- Fix (specific code change)
</output>
Claude produces a structured code review. Common findings: hardcoded credentials (should use Key Vault), missing error handling on API calls (Graph API returns 429 rate-limit responses), no logging (should write to a log file for audit), and hardcoded thresholds (should be parameters).
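The first of those findings usually reduces to a two-line change; the vault and secret names below are placeholders:

```powershell
# Before (Critical finding): secret lives in the script and in version control
$clientSecret = 'REDACTED-SECRET-VALUE'

# After: retrieve at runtime from Azure Key Vault (requires the Az.KeyVault module)
$clientSecret = Get-AzKeyVaultSecret -VaultName 'kv-secops' -Name 'graph-client-secret' -AsPlainText
```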
The bidirectional pattern: Claude writes code → you review → you find issues → Claude fixes them. OR: you write code → Claude reviews → Claude finds issues → you fix them. Both directions are valuable. The second is particularly useful for scripts you wrote quickly during an incident and need to harden for production use.
Workflow 6: Scheduled security report automation
Build a recurring report that runs weekly without manual intervention.
<task>Write a PowerShell script that generates a weekly
email security metrics report.</task>
<metrics>
1. Total phishing emails received (EmailEvents, ThreatTypes has "Phish")
2. Phishing delivery rate (delivered / total)
3. User click rate (UrlClickEvents / delivered phishing)
4. ZAP remediation count (EmailPostDeliveryEvents, ActionType "ZAP")
5. User-reported phishing count
</metrics>
<output>
Markdown report with: metrics table, week-over-week comparison,
and 3 recommended actions based on the data.
Save to a shared folder. Send summary via email.
</output>
<constraints>
- Use Microsoft Graph API for Sentinel/Defender queries
- Use Resend or SMTP for email delivery
- Include error handling for API failures
- Include a dry-run parameter
- Log all actions to a file
- Do NOT include any PII in the email — aggregate metrics only
</constraints>
Claude generates a complete script with: API authentication (using certificate or managed identity), 5 KQL queries executed via the Sentinel API, a markdown report generator, email delivery, error handling, and logging. You review, test in dev, then schedule via Windows Task Scheduler or Azure Automation.
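On a Windows host, the scheduling step can be sketched as follows; the path, task name, and time are placeholders, and Azure Automation is the better home for fully unattended runs:

```powershell
# Sketch: register the reviewed script as a weekly task (Windows ScheduledTasks module)
$action  = New-ScheduledTaskAction -Execute 'pwsh.exe' `
    -Argument '-NoProfile -File C:\Scripts\Weekly-EmailSecurityReport.ps1'
$trigger = New-ScheduledTaskTrigger -Weekly -DaysOfWeek Monday -At 6am
Register-ScheduledTask -TaskName 'Weekly Email Security Report' -Action $action -Trigger $trigger
```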
The operational discipline for scheduled scripts:
- Run manually for 2 weeks first — verify the output each time
- Enable the schedule only after 2 clean manual runs
- Set up alerting on script failures (the script should email you if it fails)
- Review the output monthly — metrics drift, query performance degrades, APIs change
- Store the script in version control (Git) — not on someone’s desktop
Check your understanding
2. You ask Claude to write a PowerShell script that connects to Microsoft Graph. Claude generates code that stores the API client secret in a variable at the top of the script. What is the security issue?