8.5 Common Event Format (CEF) Connectors

14-18 hours · Module 8

Introduction

CEF (Common Event Format) is a standardised log format used by network security devices — firewalls, intrusion detection/prevention systems, web proxies, and security appliances. When a Palo Alto firewall blocks a connection, when a Fortinet IDS detects a signature match, or when a Cisco ASA denies a packet, the event is often formatted as CEF. Ingesting CEF data into Sentinel extends your visibility beyond the Microsoft ecosystem to the network perimeter.

CEF architecture: the log forwarder model

CEF devices cannot send data directly to a Sentinel workspace — they send Syslog messages over the network. A Linux-based log forwarder sits between the CEF device and Sentinel.

The architecture: CEF device → sends Syslog over TCP/UDP port 514 → Linux log forwarder VM (running AMA + Syslog daemon) → AMA parses CEF fields and sends structured data → Sentinel workspace → CommonSecurityLog table.

The log forwarder is a Linux VM (Ubuntu 20.04+ or RHEL 8+) running: the Syslog daemon (rsyslog or syslog-ng) to receive Syslog messages, and AMA to parse CEF-formatted messages and send them to the workspace.

Deployment steps:

Step 1: Deploy a Linux VM in Azure (or on-premises with Azure Arc). Size: Standard_D2s_v3 is sufficient for up to 10,000 events per second; use Standard_D4s_v3 for 10,000 to 25,000 (see the sizing guidelines later in this section).

Step 2: Install AMA on the Linux VM. For Azure VMs, AMA can be deployed through the Azure portal or Azure Policy. For Arc-connected on-premises VMs, install through the Arc agent.

Step 3: Create a DCR for CEF collection. Navigate to Azure portal → Monitor → Data Collection Rules → Create. Select the Linux VM as the data source, configure Syslog collection with the LOG_LOCAL facilities used by your CEF devices (typically LOG_LOCAL0 through LOG_LOCAL7 at DEBUG level), and set the destination to your Sentinel workspace.

Step 4: Configure your CEF devices to forward Syslog to the log forwarder VM’s IP address on port 514 (TCP preferred for reliability).

Step 5: Verify data flow.

// Verify CEF data is arriving
CommonSecurityLog
| where TimeGenerated > ago(1h)
| summarize Count = count() by DeviceVendor, DeviceProduct
| order by Count desc

The CommonSecurityLog table

CEF events land in the CommonSecurityLog table — a structured table where CEF fields are mapped to named columns.

Key columns: TimeGenerated, DeviceVendor (e.g., “Palo Alto Networks”), DeviceProduct (e.g., “PAN-OS”), DeviceAction (e.g., “allow”, “deny”, “drop”, “alert”), Activity (event description), SourceIP, DestinationIP, DestinationPort, Protocol, RequestURL, LogSeverity (the severity assigned by the device), Message (event message text from the msg key), AdditionalExtensions (CEF extension fields that have no dedicated column).

// Find firewall deny events from the last hour — potential scanning or blocked attacks
CommonSecurityLog
| where TimeGenerated > ago(1h)
| where DeviceAction in ("deny", "drop", "block", "reject")
| summarize AttemptCount = count() by SourceIP, DestinationPort
| where AttemptCount > 50
| order by AttemptCount desc

CEF vs Syslog: understanding the difference

CEF is a format. Syslog is a transport protocol. CEF events are transmitted over Syslog — they are Syslog messages with a specific structured payload.

When a device sends a CEF-formatted Syslog message, AMA recognises the CEF header (CEF:0|Vendor|Product|Version|...) and parses it into the CommonSecurityLog table with structured columns.

When a device sends a non-CEF Syslog message (plain text), AMA writes it to the Syslog table with the raw message in the SyslogMessage column — unstructured and harder to query.
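The routing decision described in the two paragraphs above can be sketched in a few lines of Python. This is an illustration only, not AMA's actual implementation: the helper name `target_table`, the regular expression, and the sample Syslog lines are assumptions made for this example.

```python
import re

# Hypothetical helper for illustration; AMA's real detection logic is not documented here.
# A CEF payload starts with "CEF:", a version number, and a pipe.
CEF_MARKER = re.compile(r"CEF:\d+\|")

def target_table(syslog_line: str) -> str:
    """Return the table a message would be expected to land in."""
    return "CommonSecurityLog" if CEF_MARKER.search(syslog_line) else "Syslog"

# Made-up sample lines: one with a CEF payload, one plain-text Syslog message.
cef_line = "<134>Jan 10 12:00:00 fw01 CEF:0|TestVendor|TestProduct|1.0|100|Test Event|5|src=192.168.1.100"
plain_line = "<134>Jan 10 12:00:00 fw01 Connection denied from 192.168.1.100"

print(target_table(cef_line))
print(target_table(plain_line))
```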

The distinction matters for query complexity and performance. CEF data in CommonSecurityLog has named, typed columns that support efficient KQL queries (filter on DeviceAction, join on SourceIP). Non-CEF Syslog data requires string parsing with KQL’s parse operator or extract() function to pull out useful fields, which is more complex and slower.

If your device supports CEF output, always use it. If it only supports plain Syslog, use the Syslog connector (subsection 8.6) and build KQL parsers for the specific log format.

The CEF message format in detail

Understanding the CEF message structure helps you troubleshoot parsing issues and write more precise KQL queries.

A CEF message has two parts: the Syslog header and the CEF payload.

Syslog header: <priority>timestamp hostname — standard Syslog fields. AMA extracts the timestamp and hostname.

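CEF payload: CEF:Version|Device Vendor|Device Product|Device Version|Device Event Class ID|Name|Severity|Extensions. After the CEF version, the pipe-separated header fields map directly to CommonSecurityLog columns: DeviceVendor, DeviceProduct, DeviceVersion, DeviceEventClassID (SignatureID), Activity (Name), LogSeverity (Severity).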

The Extensions section contains key-value pairs: src=192.168.1.100 dst=10.0.0.1 dpt=443 act=deny proto=TCP. These map to named columns: SourceIP (src), DestinationIP (dst), DestinationPort (dpt), DeviceAction (act), Protocol (proto).
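To make the structure concrete, here is a minimal Python sketch that splits a CEF payload into its header fields and extension key-value pairs. It is not the parser AMA uses: the function name `parse_cef`, the regular expressions, and the sample message are assumptions for illustration, and it handles only the simple cases (escaped pipes in the header, spaces inside values).

```python
import re

# Header names follow the column mapping described above; "Version" is the CEF version field.
HEADER_FIELDS = [
    "Version", "DeviceVendor", "DeviceProduct", "DeviceVersion",
    "DeviceEventClassID", "Activity", "LogSeverity",
]

def parse_cef(payload: str) -> dict:
    """Split a CEF payload into header fields and an extensions dict (illustrative only)."""
    if not payload.startswith("CEF:"):
        raise ValueError("not a CEF payload")
    # Split on pipes that are not escaped with a backslash: 7 header fields, then extensions.
    parts = re.split(r"(?<!\\)\|", payload[len("CEF:"):], maxsplit=7)
    if len(parts) < 7:
        raise ValueError("incomplete CEF header")
    header = dict(zip(HEADER_FIELDS, parts[:7]))
    ext_text = parts[7] if len(parts) > 7 else ""
    # Each value runs until the next " key=" or the end of the string, so values may contain spaces.
    pairs = re.findall(r"(\w+)=(.*?)(?=\s+\w+=|$)", ext_text)
    return {"header": header, "extensions": dict(pairs)}

# Made-up sample message using the extension values from the paragraph above.
sample = "CEF:0|TestVendor|TestProduct|1.0|100|Test Event|5|src=192.168.1.100 dst=10.0.0.1 dpt=443 act=deny proto=TCP"
parsed = parse_cef(sample)
print(parsed["header"])
print(parsed["extensions"])
```

Note that extension values come back as strings; the typed columns in CommonSecurityLog (for example an integer DestinationPort) are produced by the ingestion pipeline, not by this sketch.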

Common CEF extension keys and their CommonSecurityLog column mappings:

CEF Key	CommonSecurityLog Column	Description
src	SourceIP	Source IP address
dst	DestinationIP	Destination IP address
spt	SourcePort	Source port
dpt	DestinationPort	Destination port
act	DeviceAction	Action taken (allow, deny, drop)
proto	Protocol	Network protocol
request	RequestURL	Requested URL
msg	Message	Event message text
cs1-cs6	DeviceCustomString1-6	Vendor-specific custom fields
cn1-cn3	DeviceCustomNumber1-3	Vendor-specific numeric fields

The custom fields (cs1-cs6, cn1-cn3) are where vendors put device-specific data that does not fit the standard CEF keys. Each vendor uses them differently: Palo Alto might put the firewall rule name in cs1, while Fortinet puts the policy ID in cs1. Check your vendor’s CEF implementation guide to understand which custom fields they populate.
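The key-to-column mapping in the table above can be expressed as a small lookup, sketched here in Python. The function name `to_columns` and the sample input are hypothetical; returning unrecognised keys separately is a simplification for this example rather than a description of how the ingestion pipeline treats them.

```python
from typing import Dict, Tuple

# Mapping taken from the table above; cs1-cs6 and cn1-cn3 are expanded programmatically.
KEY_TO_COLUMN: Dict[str, str] = {
    "src": "SourceIP", "dst": "DestinationIP", "spt": "SourcePort",
    "dpt": "DestinationPort", "act": "DeviceAction", "proto": "Protocol",
    "request": "RequestURL", "msg": "Message",
}
KEY_TO_COLUMN.update({f"cs{i}": f"DeviceCustomString{i}" for i in range(1, 7)})
KEY_TO_COLUMN.update({f"cn{i}": f"DeviceCustomNumber{i}" for i in range(1, 4)})

def to_columns(extensions: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Rename known CEF keys to column names; return (mapped, unmapped)."""
    mapped, unmapped = {}, {}
    for key, value in extensions.items():
        if key in KEY_TO_COLUMN:
            mapped[KEY_TO_COLUMN[key]] = value
        else:
            unmapped[key] = value
    return mapped, unmapped

# Hypothetical input: "vendorKey" stands in for any key that is not in the table.
mapped, unmapped = to_columns(
    {"src": "192.168.1.100", "dpt": "443", "cs1": "example-rule", "vendorKey": "x"}
)
print(mapped)
print(unmapped)
```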

Vendor-specific CEF configuration

Palo Alto Networks (PAN-OS). Configure Syslog forwarding in PAN-OS: Device → Server Profiles → Syslog → create a profile pointing to your log forwarder IP on port 514 (TCP). Set the format to CEF. Under the Objects → Log Forwarding tab, create a profile that forwards Traffic, Threat, URL, and WildFire logs. Attach the forwarding profile to your security policies.

Key PAN-OS CEF fields: DeviceAction maps to allow/deny/drop. Activity contains the threat name for Threat logs. cs1 contains the firewall rule name. cs2 contains the application name. RequestURL contains the URL for URL filtering events.

// Palo Alto firewall — blocked threats in the last 24 hours
CommonSecurityLog
| where TimeGenerated > ago(24h)
| where DeviceVendor == "Palo Alto Networks"
| where DeviceAction in ("deny", "drop", "reset-both", "reset-client", "reset-server")
| summarize ThreatCount = count() by Activity, SourceIP, DestinationIP, DestinationPort
| order by ThreatCount desc

Fortinet (FortiGate). Configure Syslog in FortiGate: Log & Report → Log Settings → Remote Logging and Archiving → enable Syslog. Set the server IP to your log forwarder. Set the format to CEF (under CLI: config log syslogd setting → set format cef).

Cisco ASA. Configure Syslog: logging host <interface> <forwarder-ip> and logging trap informational. Cisco ASA uses a slightly non-standard CEF implementation — some fields may require ASIM parsers for correct mapping.

Investigation patterns with CEF data

CEF data enables network-layer investigation patterns that complement the identity and endpoint data from Microsoft connectors.

Pattern 1: External scanning detection. Identify external IPs that are scanning your perimeter.

CommonSecurityLog
| where TimeGenerated > ago(1h)
| where DeviceAction in ("deny", "drop", "block")
| where isnotempty(SourceIP) and not(ipv4_is_private(SourceIP))
| summarize
    TargetPorts = make_set(DestinationPort, 20),
    TargetIPs = dcount(DestinationIP),
    AttemptCount = count()
    by SourceIP
| where TargetPorts has_any ("22", "3389", "445", "1433", "3306") or TargetIPs > 5
| order by AttemptCount desc

Pattern 2: Outbound connection to unusual ports. Detect potential C2 communication.

CommonSecurityLog
| where TimeGenerated > ago(24h)
| where DeviceAction in ("allow", "accept")
| where not(DestinationPort in (80, 443, 53, 25, 587, 993, 995))
| where not(ipv4_is_private(DestinationIP))
| summarize ConnectionCount = count(), UniqueHosts = dcount(SourceIP)
    by DestinationIP, DestinationPort
| where ConnectionCount > 10
| order by ConnectionCount desc

Pattern 3: Cross-source correlation — identity + network. Correlate a suspicious sign-in (from SigninLogs) with firewall activity from the same IP.

let SuspiciousIPs = SigninLogs
| where TimeGenerated > ago(1h)
| where RiskLevelDuringSignIn in ("medium", "high")
| distinct IPAddress;
CommonSecurityLog
| where TimeGenerated > ago(1h)
| where SourceIP in (SuspiciousIPs) or DestinationIP in (SuspiciousIPs)
| project TimeGenerated, SourceIP, DestinationIP, DestinationPort,
    DeviceAction, DeviceVendor, Activity

This cross-source query finds network activity from the same IPs that triggered risky sign-in alerts — bridging identity and network visibility in a single investigation.

High availability and scaling

For production environments, a single log forwarder is a single point of failure. If the forwarder VM goes down, all CEF data stops flowing to Sentinel.

HA approach: Deploy two log forwarder VMs behind an Azure Load Balancer (or on-premises load balancer). CEF devices send to the load balancer VIP. Both forwarders run AMA and send data to the same workspace. If one forwarder fails, the load balancer routes traffic to the surviving forwarder.

Scaling: Each log forwarder can handle approximately 25,000 events per second (EPS). For environments exceeding this: add additional forwarders behind the load balancer, or deploy regional forwarders (one per site) to reduce network latency and bandwidth consumption.

Log forwarder sizing and configuration

VM sizing guidelines:

For up to 10,000 EPS: Standard_D2s_v3 (2 vCPU, 8 GB RAM). Sufficient for a single firewall and a few IDS/IPS devices.

For 10,000-25,000 EPS: Standard_D4s_v3 (4 vCPU, 16 GB RAM). Handles multiple firewalls, IDS, and proxy devices.

For 25,000+ EPS: Deploy multiple forwarders behind a load balancer. Scale horizontally rather than vertically.
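The sizing tiers above can be written as a small decision function. The following Python sketch is a planning aid under stated assumptions: the function name `recommend_forwarder` is hypothetical, the VM size returned for the 25,000+ tier is an assumption (the guideline only says to scale horizontally), and the forwarder count does not include the extra node recommended for high availability.

```python
import math

EPS_PER_FORWARDER = 25_000  # approximate per-forwarder ceiling quoted in this section

def recommend_forwarder(eps: int) -> dict:
    """Map an expected events-per-second rate to the tiers listed above."""
    if eps < 0:
        raise ValueError("eps must be non-negative")
    if eps <= 10_000:
        return {"vm_size": "Standard_D2s_v3", "forwarders": 1}
    if eps <= EPS_PER_FORWARDER:
        return {"vm_size": "Standard_D4s_v3", "forwarders": 1}
    # Above 25,000 EPS: scale horizontally. The VM size here is an assumption.
    return {"vm_size": "Standard_D4s_v3", "forwarders": math.ceil(eps / EPS_PER_FORWARDER)}

for rate in (8_000, 20_000, 60_000):
    print(rate, recommend_forwarder(rate))
```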

Disk sizing: The Syslog daemon (rsyslog) buffers messages on disk when AMA ingestion is slower than the incoming rate. Size the OS disk at 128 GB minimum. For high-volume environments (>10,000 EPS), use a Premium SSD for the Syslog buffer directory (/var/spool/rsyslog) to handle burst traffic without message loss.

rsyslog configuration for CEF forwarding. The default rsyslog configuration receives Syslog on UDP port 514. For production reliability, configure TCP reception and increase the queue size:

# rsyslog.conf additions
# Enable TCP reception on port 514
module(load="imtcp")
input(type="imtcp" port="514")
#
# Increase queue size for burst handling
main_queue(
    queue.type="LinkedList"
    queue.size="100000"
    queue.dequeueBatchSize="1000"
    queue.workerThreads="4"
)

Log rotation on the forwarder. If rsyslog writes local copies of forwarded messages (for debugging or backup), configure logrotate to prevent disk fill: rotate daily, compress, and keep 7 days. Separately, set up monitoring that alerts when disk usage exceeds 80%.

Multi-vendor CEF deployment

Production environments typically have multiple CEF sources from different vendors — a Palo Alto firewall, a Fortinet IDS, a Zscaler proxy, and a Cisco ASA. All send CEF to the same forwarder(s), and all land in the same CommonSecurityLog table.

Identifying data by vendor: Use the DeviceVendor and DeviceProduct columns to filter data from specific devices in KQL:

// Volume breakdown by vendor — understand your CEF data composition
CommonSecurityLog
| where TimeGenerated > ago(24h)
| summarize
    EventCount = count(),
    DailyGB = count() * 0.0005 / 1024  // Rough estimate: assumes about 0.0005 MB (~0.5 KB) per event
    by DeviceVendor, DeviceProduct
| order by EventCount desc
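The DailyGB expression in the query is plain arithmetic: event count times an assumed average size of 0.0005 MB per event, divided by 1024 to convert MB to GB. A short Python sketch of the same calculation follows; the function name `estimate_daily_gb` and the 1,000 EPS example rate are illustrative, and the per-event size is the same rough assumption the query makes, not a measured value.

```python
def estimate_daily_gb(event_count: int, mb_per_event: float = 0.0005) -> float:
    """Mirror the query's estimate: events * MB per event / 1024 = GB."""
    return event_count * mb_per_event / 1024

# Example: a steady 1,000 events per second for 24 hours (86,400 seconds).
events_per_day = 1_000 * 86_400
daily_gb = estimate_daily_gb(events_per_day)
print(f"{events_per_day:,} events is roughly {daily_gb:.1f} GB per day")
```

Real event sizes vary widely by vendor and log type, so treat the result as an order-of-magnitude figure for capacity and cost planning.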

Vendor-specific analytics rules. While some analytics rules work across all CEF vendors (e.g., “external IP scanning detection” based on deny events from any vendor), others are vendor-specific. A rule that checks Palo Alto’s cs1 field for the firewall rule name does not work with Fortinet data (which uses cs1 differently). Create vendor-specific rules that reference DeviceVendor in the where clause, and cross-vendor rules that use only standard CEF fields (SourceIP, DestinationIP, DeviceAction).

ASIM normalisation for CEF. Deploy the ASIM Network Session parser from Content Hub. It normalises CEF data from multiple vendors into a standardised schema (SrcIpAddr, DstIpAddr, DstPortNumber, DvcAction). Analytics rules written against the ASIM schema work across all vendors without vendor-specific logic — the parser handles the translation.

Web proxy and WAF CEF integration

Web proxies (Zscaler, Squid, Blue Coat) and Web Application Firewalls (F5, Imperva, AWS WAF) commonly support CEF output. These sources provide URL-level visibility that complements the firewall’s IP/port-level visibility.

Web proxy CEF data includes: the full requested URL (RequestURL field), the HTTP method (GET, POST), the response code, the user agent, and the authenticated user. This enables: detection of connections to known-malicious URLs, data exfiltration via HTTP uploads, and shadow IT discovery (users accessing unapproved SaaS applications).

// Summarise web proxy traffic by domain, excluding a few known-good domains
// (a starting point for phishing/C2 or shadow IT review; it does not check domain registration age)
CommonSecurityLog
| where TimeGenerated > ago(24h)
| where DeviceVendor has_any ("Zscaler", "Squid", "BlueCoat", "Cisco")
| where isnotempty(RequestURL)
| extend Domain = extract(@"https?://([^/]+)", 1, RequestURL)
| where not(Domain has_any (".microsoft.com", ".google.com", ".amazonaws.com"))
| summarize AccessCount = count(), UniqueUsers = dcount(SourceUserName)
    by Domain
| where AccessCount > 5
| order by AccessCount desc

WAF CEF data includes: the attack type (SQL injection, XSS, path traversal), the rule that triggered, the attacker IP, and the targeted URL. This is high-fidelity detection data — WAF alerts have low false positive rates because they match specific attack signatures.

// WAF attack detection — high-fidelity alerts
CommonSecurityLog
| where TimeGenerated > ago(24h)
| where DeviceVendor has_any ("F5", "Imperva", "AWS")
| where DeviceAction in ("blocked", "alerted", "dropped")
| where Activity has_any ("SQL", "XSS", "traversal", "injection", "scanner")
| summarize AttackCount = count(),
    AttackTypes = make_set(Activity, 10)
    by SourceIP, RequestURL
| where AttackCount > 3
| order by AttackCount desc

CEF troubleshooting quick reference

Symptom	Likely Cause	Fix
No events in CommonSecurityLog	Forwarder AMA not running	Check `systemctl status azuremonitoragent`
Events arrive but all fields empty	Device not sending CEF format	Configure device to output CEF
Events in Syslog instead of CommonSecurityLog	CEF sent on wrong facility	Configure device to use LOCAL0-7
High latency (>30 min)	Forwarder VM overloaded	Scale up VM or add second forwarder
Events arrive but SourceIP is empty	Device not including `src=` key	Check vendor CEF implementation guide
Duplicate events	Two forwarders receiving from same device	Check device config and LB settings

CEF connectors extend Sentinel's visibility to the network layer

Microsoft connectors cover identity, endpoint, email, and cloud. CEF connectors cover the network perimeter — firewall deny logs, IDS alerts, proxy logs, and VPN authentication events. Without CEF, an attacker scanning your perimeter or attempting lateral movement through network segments is invisible to Sentinel. The combination of Microsoft connectors (identity + endpoint) and CEF connectors (network) provides the multi-layer visibility that comprehensive threat detection requires.

Try it yourself

If you have a Linux VM available (Azure VM or local VM with Azure Arc), deploy AMA and create a DCR for CEF collection. If no CEF devices are available, you can test with a simulated CEF message using the logger command on the Linux VM:
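The command itself is not reproduced in this section. As a minimal sketch, assuming the standard `logger` utility is available on the forwarder, the Python snippet below assembles a test CEF payload from the values listed under "What you should observe" and prints a `logger` command line that can be pasted into the VM's shell. The facility and severity (`local4.warn`), the tag (`CEF`), and the header values for device version, event class ID, name, and severity are placeholders chosen for illustration; use a facility that your DCR actually collects.

```python
import shlex

# Vendor, product, and extension values mirror the expected result described in this exercise.
# The last four header values (1.0, 100, Test Event, 5) are placeholders.
header = ["CEF:0", "TestVendor", "TestProduct", "1.0", "100", "Test Event", "5"]
extensions = "src=192.168.1.100 dst=10.0.0.1 dpt=443"
cef_message = "|".join(header) + "|" + extensions

# -p sets facility.severity and -t sets the Syslog tag; both are illustrative choices.
command = ["logger", "-p", "local4.warn", "-t", "CEF", cef_message]
print(shlex.join(command))
```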

This sends a test CEF message through the local Syslog daemon. If AMA is configured correctly, the event should appear in CommonSecurityLog within 5 minutes.

What you should observe

The test event appears in CommonSecurityLog with DeviceVendor="TestVendor", DeviceProduct="TestProduct", SourceIP="192.168.1.100", DestinationIP="10.0.0.1", DestinationPort=443. This confirms the entire pipeline: Syslog daemon → AMA → workspace → CommonSecurityLog table.

Knowledge check

Check your understanding

1. What is the role of the Linux log forwarder VM in the CEF architecture?

The log forwarder receives Syslog messages from CEF devices, runs AMA which parses the CEF-formatted messages into structured fields, and sends the structured data to the Sentinel workspace. It acts as an intermediary because CEF devices cannot send data directly to a Sentinel workspace — they output Syslog, not Azure API calls. The forwarder bridges the protocol gap.

It stores CEF data for long-term retention

It filters malicious events before they reach Sentinel

It replaces the need for AMA

The forwarder bridges the protocol gap between Syslog (what devices speak) and the Azure Monitor data pipeline (what Sentinel consumes). AMA on the forwarder does the parsing and transmission.
