6.3 Third-Party Connectors: Syslog and CEF

75 minutes · Module 6


By the end of this subsection, you will understand the Syslog and CEF ingestion paths and know how to deploy and size a Linux forwarder, configure rsyslog, and design a high-availability architecture for production environments.

Most third-party security devices — firewalls, IDS/IPS, proxies, VPN concentrators, switches — send logs via Syslog or Common Event Format (CEF). Getting this data into Sentinel requires a Linux forwarder running the Azure Monitor Agent (AMA).

Syslog vs CEF — choose CEF when available

Attribute	Syslog	CEF
Structure	Unstructured text with facility/severity header	Structured key-value pairs with a defined schema
Table	`Syslog`	`CommonSecurityLog`
KQL queryability	Requires `parse` or `extract` to get usable fields	Fields are pre-parsed: `SourceIP`, `DestinationIP`, `DeviceAction` are columns
Query performance	Slower — regex at query time	Faster — direct column filtering
Best for	Linux servers, custom apps, devices that only output raw Syslog	Firewalls, IDS/IPS, WAFs, security appliances

CEF is always preferred when the device supports it

A KQL query filtering `where SourceIP == "203.0.113.45"` against `CommonSecurityLog` runs in seconds because it filters a pre-parsed column. The same filter against raw Syslog requires `where SyslogMessage has "203.0.113.45"` or a `parse` operation, which is slower, less precise, and more error-prone. Configure your devices for CEF output whenever the option exists.

Forwarder architecture

Forwarder sizing

Daily CEF/Syslog volume	vCPU	RAM	Disk	Notes
Under 10 GB/day	2	4 GB	30 GB	Lab or small environment
10-50 GB/day	4	8 GB	64 GB	Most production environments
50-100 GB/day	8	16 GB	128 GB	High-volume with multiple sources
Over 100 GB/day	Multiple forwarders behind load balancer	16 GB each	128 GB each	Distribute load across VMs
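The table can be read as a simple lookup. The following is a minimal sketch that maps a daily volume in whole GB to the matching row. The function name `forwarder_tier` is illustrative, and treating exactly 50 GB/day as part of the 10-50 tier is an assumption, since the table's ranges touch at that boundary.

```shell
# Map a daily CEF/Syslog volume (whole GB) to a row of the sizing table.
# forwarder_tier is a hypothetical helper, not part of any Azure tooling.
forwarder_tier() {
  if [ "$1" -lt 10 ]; then
    echo "2 vCPU, 4 GB RAM, 30 GB disk"
  elif [ "$1" -le 50 ]; then       # assumption: exactly 50 stays in this tier
    echo "4 vCPU, 8 GB RAM, 64 GB disk"
  elif [ "$1" -le 100 ]; then
    echo "8 vCPU, 16 GB RAM, 128 GB disk"
  else
    echo "multiple forwarders behind a load balancer, 16 GB RAM and 128 GB disk each"
  fi
}

forwarder_tier 38   # prints: 4 vCPU, 8 GB RAM, 64 GB disk
```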

Disk space is the silent killer

When the AMA cannot send data (network issue, workspace throttling), rsyslog buffers to disk. A 30 GB disk fills in hours under high volume, and a full disk causes data loss. Size the disk for at least 24 hours of buffer: daily volume in GB + 50% overhead. Monitor disk usage with an Azure alert at 80% threshold.
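The sizing rule is plain arithmetic, so it can be sketched as two small shell helpers. The function names are illustrative, and the second helper assumes that log volume arrives at a constant rate and that buffered data occupies roughly the same space as the ingested volume, which will not hold exactly in practice.

```shell
# Disk sizing helpers for the buffering rule above (names are illustrative).

# Minimum disk in GB: 24 hours of volume plus 50% overhead, rounded up.
# Usage: buffer_disk_gb <daily_gb>
buffer_disk_gb() {
  echo $(( ($1 * 3 + 1) / 2 ))
}

# Whole hours until the given free space fills if nothing is forwarded,
# assuming the daily volume arrives at a constant rate.
# Usage: hours_until_full <free_gb> <daily_gb>
hours_until_full() {
  echo $(( $1 * 24 / $2 ))
}

buffer_disk_gb 38         # prints 57
hours_until_full 30 100   # prints 7
```

The second call illustrates the warning above: at 100 GB/day, 30 GB of free space lasts only about seven hours.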

Step-by-step deployment

Step 1: Deploy the VM. Ubuntu 22.04 LTS is recommended. Use an Azure VM for cloud-native environments, or an on-premises VM when the data sources are on-premises (this reduces egress bandwidth). Place the VM in the same network segment as the source devices if possible.

Step 2: Configure rsyslog to receive on port 514. SSH into the forwarder:

# These are rsyslog configuration lines and a shell command, not KQL.
# In /etc/rsyslog.conf, uncomment the UDP and TCP listeners:
module(load="imudp")
input(type="imudp" port="514")
module(load="imtcp")
input(type="imtcp" port="514")
# Then restart rsyslog from the shell:
sudo systemctl restart rsyslog

Enable both UDP and TCP. UDP is the Syslog default but drops packets under load. TCP provides reliable delivery. Configure your source devices for TCP when available.
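Once rsyslog has restarted, one way to confirm that the listener accepts traffic is to send it a synthetic CEF event. The sketch below builds a CEF line by hand and sends it with `logger` from util-linux. The vendor, product, and extension values are made-up test data, and the function names are illustrative. Whether the event reaches `CommonSecurityLog` depends on the remaining steps being complete.

```shell
# Build a synthetic CEF event. Every field value here is made-up test data.
# Layout: CEF:Version|Vendor|Product|DeviceVersion|EventClassID|Name|Severity|Extension
cef_test_message() {
  echo "CEF:0|TestVendor|TestProduct|1.0|100|Forwarder connectivity test|3|src=203.0.113.45 dst=10.0.0.5 act=blocked"
}

# Send the event to a forwarder on port 514 over TCP (-T) or UDP (-d).
# Usage: send_test_cef <forwarder-ip> [tcp|udp]
send_test_cef() {
  proto_flag="-T"
  [ "${2:-tcp}" = "udp" ] && proto_flag="-d"
  logger -n "$1" -P 514 "$proto_flag" -p local4.warning "$(cef_test_message)"
}

# Example, run on the forwarder itself:
#   send_test_cef 127.0.0.1 tcp
```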

Step 3: Install the AMA. In Sentinel, navigate to the CEF connector page and run the installation command shown there. It is a single shell command that downloads and registers the agent with your workspace.

Step 4: Create a Data Collection Rule. The DCR associates the forwarder VM with your workspace and defines the log facility/severity filters and any KQL transformations. Configuration details are in subsection 6.4.

Step 5: Configure source devices. On each firewall/device, set the Syslog output destination to the forwarder VM’s IP, port 514, TCP preferred. Set format to CEF if supported.

Verification

CommonSecurityLog
| where TimeGenerated > ago(1h)
| summarize EventCount = count(), LastEvent = max(TimeGenerated)
    by DeviceVendor, DeviceProduct
| sort by EventCount desc

Expected Output

DeviceVendor	DeviceProduct	EventCount	LastEvent
Palo Alto Networks	PAN-OS	4,521	14:31
Fortinet	FortiGate	2,847	14:32

What to look for: Your device vendor and product appear with recent events. If the result is empty, check: (1) Is the device sending? On the forwarder, run `sudo tcpdump -i any port 514 -c 10`. (2) Is rsyslog running? Run `systemctl status rsyslog`. (3) Is AMA connected? In the Azure portal, go to Monitor > Data Collection Rules and verify that the VM is listed.

High availability for production

A single forwarder is a single point of failure. If it goes down, all third-party data stops.

Production HA architecture: Deploy two forwarders behind an Azure Load Balancer (or on-premises load balancer). Configure source devices to send to the load balancer VIP. Both forwarders run AMA and send to the same workspace. The load balancer distributes traffic; if one forwarder fails, the other handles all traffic.
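A load balancer usually decides whether a backend is healthy by probing a port. The check below imitates a TCP probe against port 514 from any Linux host. It is a sketch that assumes bash (for `/dev/tcp`) and the coreutils `timeout` command are available; the function names and the addresses in the usage comment are placeholders. It says nothing about the UDP listener, which cannot be probed this way.

```shell
# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
# Relies on bash's /dev/tcp pseudo-device, so bash is invoked explicitly.
# Usage: port_open <host> <port>
port_open() {
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Report the TCP listener state of each forwarder passed as an argument.
check_forwarders() {
  for host in "$@"; do
    if port_open "$host" 514; then
      echo "$host: listening on 514/tcp"
    else
      echo "$host: NOT reachable on 514/tcp"
    fi
  done
}

# Example (placeholder addresses):
#   check_forwarders 10.0.1.4 10.0.1.5
```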

Test failover before you need it

After deploying HA, shut down one forwarder and verify data continues flowing through the other. Then bring it back up, shut down the second forwarder, and verify data flows through the first. Document the failover behavior. The worst time to discover your HA does not work is during an incident when you need the firewall data.

Try it yourself

Your organization has 3 Palo Alto firewalls generating a combined 35 GB/day of CEF data, and 5 Linux servers generating 3 GB/day of Syslog. Design the forwarder architecture: how many VMs, what specs, and should you use one forwarder for both or separate them?

Recommended: Two forwarders behind a load balancer.

Combined volume: 38 GB/day. This is in the 10-50 GB range, so each forwarder needs 4 vCPU and 8 GB RAM. For disk, the 24-hour buffer rule gives 38 GB + 50% = 57 GB at full load per forwarder, so the tier's 64 GB is the minimum; 128 GB adds headroom.

Both CEF (firewalls) and Syslog (Linux servers) can go through the same forwarders — rsyslog handles both protocols and the AMA routes them to the correct tables (CommonSecurityLog for CEF, Syslog for raw). Separate forwarders for each protocol are unnecessary unless you have specific isolation requirements.

The load balancer distributes across both VMs. Under normal conditions, each handles ~19 GB/day. During a failover, the surviving forwarder handles 38 GB/day — within spec for the 4 vCPU / 8 GB configuration.
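The failover arithmetic above can be checked with a few lines of shell. This is a sketch that assumes the load balancer spreads traffic evenly across the forwarders that are up; the function and variable names are illustrative.

```shell
# Per-forwarder load in GB/day when the total volume is spread evenly
# across the forwarders that are still up (rounded up).
# Usage: per_forwarder_gb <total_gb> <forwarders_up>
per_forwarder_gb() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

total_gb=38        # 35 GB/day CEF + 3 GB/day Syslog
tier_max_gb=50     # upper bound of the 4 vCPU / 8 GB tier

normal=$(per_forwarder_gb "$total_gb" 2)     # both forwarders up
failover=$(per_forwarder_gb "$total_gb" 1)   # one forwarder down

echo "normal: ${normal} GB/day each, failover: ${failover} GB/day"
if [ "$failover" -le "$tier_max_gb" ]; then
  echo "a single forwarder stays within the tier during failover"
fi
```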

Check your understanding

1. Why configure TCP instead of UDP for Syslog transport when both are available?

TCP provides reliable, ordered delivery. UDP drops packets when the forwarder is under load or the network is congested — dropped packets are dropped log events, which creates gaps in your security data. TCP retransmits lost packets.

TCP is faster

UDP does not work with CEF

UDP is fire-and-forget. Under peak load (DDoS event, incident generating high log volume), UDP packet loss means lost security events, exactly when you need them most. TCP acknowledges and retransmits lost packets at the cost of slightly more overhead. For security data, reliability beats speed.

2. Your forwarder VM disk is 92% full. What is happening and what is the risk?

rsyslog is buffering data to disk because the AMA cannot forward to the workspace (network issue, AMA crash, or throttling). At 100%, rsyslog cannot buffer and begins dropping events — data loss. Fix the AMA connection immediately, then the buffer drains.

Too many log files — safe to ignore

The VM needs a larger disk as a permanent fix

Full disk = data loss. This is an emergency. The root cause is that AMA cannot send data upstream. Restarting AMA, fixing the network path, or clearing the workspace throttle resolves it. A larger disk only buys time — it does not fix the delivery problem.
