SA1.8 Testing Automation Safely

5 hours · Module 1 · Free

Figure SA1.8 — The testing pyramid for security automation. Start with manual verification, add unit tests, then integration, then scenario testing.

Operational Objective

Deploying untested automation to production is how the CEO's account gets disabled at 02:00. Testing automation safely requires test targets that can be contained without harm, test incidents that trigger the playbook without affecting real operations, and a methodology that validates both the happy path (everything works) and the failure path (what happens when an API call fails). This subsection teaches the testing methodology for each automation tier, from Tier 1 enrichment (test in production with manual incidents) through Tier 3 containment (dedicated test accounts and a staging workspace).

Deliverable: A testing methodology applicable to every playbook in this course, including test target creation for containment playbooks and the dry-run pattern for validating containment logic without executing containment.

⏱ Estimated completion: 25 minutes

Testing by tier

The required testing rigor scales with the blast radius. Tier 1 (enrichment) can be tested in the production workspace with zero risk. Tier 3 (containment) requires dedicated test infrastructure.

Tier 1 testing (enrichment). Create a manual test incident in Sentinel. Trigger the playbook. Check: did the enrichment comment appear? Is the data correct? Did the Teams notification post? This takes 5 minutes and can be done in production because enrichment does not change users, devices, or configuration; its only write is a comment on the incident. The worst case: an enrichment comment on a test incident that you close immediately.

To create a manual test incident: navigate to Sentinel → Incidents → Create incident (preview). Set the title to match your automation rule conditions. Add at least one entity (Account or IP) so entity extraction has data to work with. Create the incident and watch the automation fire.

Tier 2 testing (notification). Test in production but verify the notification routing. Create a test incident at each severity level and confirm the correct notification fires: High → SOC Teams channel, Critical → CISO email, Medium → no notification. The risk is notification noise — the SOC channel receives a test message. Prefix test notifications with “[TEST]” to distinguish them.
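The routing check above can be sketched as a small lookup that you assert against, one test incident per severity. This is a minimal sketch, not the playbook itself: the target labels (`teams:soc-channel`, `email:ciso`) and the function name `route_notification` are hypothetical, and treating unlisted severities as "no notification" is an assumption.

```python
from __future__ import annotations

from typing import Optional

# Routing from the text: High -> SOC Teams channel, Critical -> CISO email,
# Medium -> no notification. Target strings are illustrative labels only.
ROUTES: dict[str, Optional[str]] = {
    "High": "teams:soc-channel",
    "Critical": "email:ciso",
    "Medium": None,
}


def route_notification(
    severity: str, title: str, is_test: bool = False
) -> Optional[tuple[str, str]]:
    """Return (target, message), or None when no notification should fire."""
    # Assumption: severities missing from the table also produce no notification.
    target = ROUTES.get(severity)
    if target is None:
        return None
    prefix = "[TEST] " if is_test else ""
    return target, f"{prefix}{severity} incident: {title}"


if __name__ == "__main__":
    for sev in ("High", "Critical", "Medium"):
        print(sev, "->", route_notification(sev, "Routing check", is_test=True))
```

Keeping the "[TEST]" prefix inside the routing function means a test message cannot reach the SOC channel unlabelled.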

Tier 3 testing (containment). Never test containment against production users or systems. Create dedicated test infrastructure:

Test accounts: create test-containment-01@northgateeng.com and test-containment-02@northgateeng.com in Entra ID. These accounts are not used by real people. They exist solely for automation testing. Assign them to a “Test Accounts” group. Ensure they have MFA methods configured (so MFA reset testing works) and sessions active (so session revocation testing works).

Test devices: if possible, enroll a test VM in Defender for Endpoint. The VM runs as an endpoint but is not used for production work. Isolation testing isolates this VM without affecting any real user.
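One way to enforce "never test containment against production users" is a guard that refuses any target outside the test allow-list. The sketch below assumes the allow-list is a hard-coded set mirroring the "Test Accounts" group; in practice it would more likely be read from the group or a watchlist. The names `assert_test_target` and `NotATestTargetError` are hypothetical.

```python
from __future__ import annotations

# Hypothetical allow-list mirroring the "Test Accounts" group from the text.
TEST_ACCOUNTS = frozenset({
    "test-containment-01@northgateeng.com",
    "test-containment-02@northgateeng.com",
})


class NotATestTargetError(Exception):
    """Raised when a test run is pointed at an account outside the allow-list."""


def assert_test_target(upn: str, allowed: frozenset[str] = TEST_ACCOUNTS) -> str:
    """Return the normalized UPN if it is a designated test account, else raise."""
    normalized = upn.strip().lower()
    if normalized not in allowed:
        raise NotATestTargetError(f"{upn!r} is not a designated test account")
    return normalized


if __name__ == "__main__":
    print(assert_test_target("test-containment-01@northgateeng.com"))
    try:
        assert_test_target("d.chen@northgateeng.com")
    except NotATestTargetError as exc:
        print("blocked:", exc)
```

Calling the guard as the first step of every test run makes a mistyped or copied production UPN fail loudly before any containment action is attempted.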

The test sequence for containment: create a test incident targeting the test account → playbook fires → playbook extracts the account entity → playbook evaluates conditions (VIP check, blast radius) → playbook executes containment on the test account → verify the account is actually disabled/sessions revoked → run the rollback playbook → verify the account is restored. Both containment and rollback must be tested before production deployment.
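The contain, verify, rollback, verify sequence can be expressed as a single test routine. The sketch below runs against an in-memory stand-in (`FakeAccount`), not Entra ID; `contain` and `rollback` are placeholders for the real playbooks, and every name here is hypothetical. It only illustrates the order of checks.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FakeAccount:
    """In-memory stand-in for a test account; not an Entra ID object."""
    upn: str
    enabled: bool = True
    active_sessions: int = 1


def contain(account: FakeAccount) -> None:
    """Placeholder for the containment playbook: disable and revoke sessions."""
    account.enabled = False
    account.active_sessions = 0


def rollback(account: FakeAccount) -> None:
    """Placeholder for the rollback playbook: re-enable the account."""
    account.enabled = True


def run_containment_test(account: FakeAccount) -> list[str]:
    """Run contain -> verify -> rollback -> verify; return the checks that passed."""
    passed: list[str] = []
    contain(account)
    if account.enabled or account.active_sessions != 0:
        raise AssertionError("containment did not take effect")
    passed.append("containment verified")
    rollback(account)
    if not account.enabled:
        raise AssertionError("rollback did not restore the account")
    passed.append("rollback verified")
    return passed


if __name__ == "__main__":
    print(run_containment_test(FakeAccount("test-containment-01@northgateeng.com")))
```

The verification steps read the account state after each action instead of trusting the action's return value, which matches the requirement to confirm the account is actually disabled and actually restored.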

The dry-run pattern

For containment playbooks that are not yet ready for live testing, use the dry-run pattern. Add a configuration parameter (a Sentinel watchlist row or a Logic App parameter) called “DryRun” with value “true” or “false.”

In the containment decision point, add a condition: if DryRun equals “true,” skip the containment action and add an incident comment: “DRY RUN: Would have executed [session revocation] on [d.chen@northgateeng.com]. Confidence: 97%. VIP check: not VIP. Blast radius: Low.” If DryRun equals “false,” execute the containment action normally.

The dry-run validates the entire decision chain — entity extraction, enrichment queries, confidence evaluation, VIP check, blast radius assessment — without the containment impact. Run in dry-run mode for 1-2 weeks, reviewing every dry-run comment to confirm the playbook would have made the correct decision. When satisfied, change DryRun to “false” and the playbook begins executing containment.
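The dry-run branch described above can be sketched in a few lines. This is an illustration of the pattern, not Logic App code: `Decision`, `decide_and_act`, and `parse_dry_run` are hypothetical names, and the callbacks stand in for the containment action and the incident-comment action. One assumption worth noting: the parser treats anything other than an explicit "false" as dry-run, so a missing or mistyped parameter fails safe.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class Decision:
    action: str
    target: str
    confidence: int  # percent
    is_vip: bool
    blast_radius: str


def parse_dry_run(value: str) -> bool:
    """Assumption: only an explicit 'false' enables live containment."""
    return value.strip().lower() != "false"


def decide_and_act(
    decision: Decision,
    dry_run: bool,
    execute: Callable[[Decision], None],
    add_comment: Callable[[str], None],
) -> bool:
    """Return True if containment executed, False if only a comment was written."""
    if dry_run:
        vip = "VIP" if decision.is_vip else "not VIP"
        add_comment(
            f"DRY RUN: Would have executed [{decision.action}] on "
            f"[{decision.target}]. Confidence: {decision.confidence}%. "
            f"VIP check: {vip}. Blast radius: {decision.blast_radius}."
        )
        return False
    execute(decision)
    return True


if __name__ == "__main__":
    d = Decision("session revocation", "d.chen@northgateeng.com", 97, False, "Low")
    decide_and_act(d, parse_dry_run("true"), execute=print, add_comment=print)
```

Because the branch sits at the very end of the decision chain, everything upstream (entity extraction, enrichment, VIP check) runs identically in both modes, which is what makes the dry-run comments a fair preview of live behavior.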

The staging workspace

For teams that want complete isolation, deploy a second Sentinel workspace for automation testing. The staging workspace has minimal data ingestion (import sample data via a script), manually created test incidents, and the same analytics rules as production but with lower-cost data retention.

Staging workspace cost: minimal. A Log Analytics workspace with no active data connectors and manual data import costs less than £5/month in log storage. The Logic Apps deployed to the staging workspace cost nothing until they execute (Consumption plan — pay per action).

Deploy playbooks to the staging workspace first. Test against sample incidents. Validate behavior. Then export the ARM template and deploy to the production workspace. This is the safest approach for complex containment playbooks.

SA11 covers staging workspace architecture in comprehensive detail, including data import scripts, test incident generation, and the promotion process from staging to production.

⚠ Compliance Myth: "Testing in production is never acceptable"

The myth: All testing must happen in non-production environments. Any test execution in the production workspace is a policy violation.

The reality: Tier 1 enrichment playbooks are read-only. Testing them in production creates an enrichment comment on a test incident that is immediately closed. No production data is modified. No user is affected. The risk is effectively zero. Requiring a staging workspace for read-only enrichment testing delays deployment by days or weeks for no security benefit. Test enrichment in production. Test containment in staging or against test accounts. Match the testing rigor to the blast radius.

Decision point: Your identity containment playbook has been running in dry-run mode for 2 weeks and has processed 28 incidents. 27 dry-run comments show correct decisions (analysts confirmed all 27 as true positives). 1 dry-run comment shows the playbook would have disabled a service account. The VIP watchlist check should have excluded it, but the service account was not on the watchlist. Do you switch dry-run to live? Not yet. First, add the service account to the VIP/exclusion watchlist. Run dry-run for another week. When the next week shows 100% correct decisions, switch to live. The dry-run period does more than validate the playbook logic: it surfaces the watchlist gaps that only production-volume data reveals.
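The go-live rule in this decision point (100% correct decisions over the review window) can be written down as a simple check over the analyst reviews. This is a minimal sketch under assumptions: `DryRunReview` and `ready_for_live` are hypothetical names, the service account UPN in the usage example is invented for illustration, and an empty review window is treated as "not ready".

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DryRunReview:
    incident_id: str
    target: str
    analyst_agrees: bool  # analyst confirmed the playbook's proposed decision


def ready_for_live(reviews: list[DryRunReview]) -> tuple[bool, list[str]]:
    """Return (ready, targets whose decisions need a watchlist or logic fix)."""
    gaps = [r.target for r in reviews if not r.analyst_agrees]
    # Assumption: with no reviews at all there is no evidence, so stay in dry-run.
    return (len(reviews) > 0 and not gaps), gaps


if __name__ == "__main__":
    # 27 correct decisions plus one wrongly targeted (hypothetical) service account.
    window = [
        DryRunReview(f"INC-{i}", f"user{i}@northgateeng.com", True)
        for i in range(27)
    ]
    window.append(DryRunReview("INC-27", "svc-backup@northgateeng.com", False))
    print(ready_for_live(window))
```

Returning the list of disputed targets alongside the verdict gives you the watchlist additions to make before the next dry-run week.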

Try it: Create test infrastructure

1. Create two test accounts in Entra ID: test-containment-01@yourdomain.com and test-containment-02@yourdomain.com
2. Assign them to a group called “Automation Test Accounts”
3. Configure MFA on both accounts (register an authenticator; you will remove it during testing)
4. Sign in as each test account from a browser to create active sessions
5. Create a manual Sentinel incident with test-containment-01 as the Account entity

When you build the containment playbook in SA5, you will test against these accounts.

Document the test accounts in your automation runbook: account names, purpose, owner, and the warning “DO NOT USE FOR PRODUCTION — automation testing only.”

Your enrichment playbook is ready for production. A colleague insists it must be tested in the staging workspace first. The playbook reads SigninLogs and adds an incident comment. What is the correct response?

Test in production with a manual test incident. The playbook is read-only (Tier 1) with zero blast radius. An enrichment comment on a test incident has no impact. Requiring staging workspace testing for a read-only playbook delays deployment without improving safety. Create a test incident, trigger the playbook, verify the output, close the incident.

Always test in staging regardless of tier. This is a blanket rule that ignores risk assessment. The testing rigor should match the blast radius. Tier 1 = test in production. Tier 3 = test in staging or against test accounts. Applying maximum testing to minimum-risk automation wastes time.

Deploy directly without testing — enrichment cannot cause harm. While enrichment cannot cause harm, testing validates that it WORKS. An untested playbook that fails silently is worse than no playbook — the analyst assumes enrichment is running when it is not. A 5-minute test confirms the playbook produces correct output.

Test by running the playbook against all existing incidents to see enrichment across a large dataset. Running the playbook against all existing incidents adds enrichment comments to hundreds of incidents that have already been triaged. This creates noise and may confuse analysts reviewing historical incidents. Test against one or two manual incidents, not the entire queue.

Where this goes deeper. SA11 is dedicated to automation testing and governance — staging workspace architecture, the testing pyramid implementation, containment test account design, CI/CD for playbooks, and the promotion process from staging to production.
