Event Hubs

Microsoft Azure

Synopsis

Creates a collector that connects to Azure Event Hubs and consumes messages from specified event hubs. Supports multiple authentication methods, TLS encryption, and multiple workers for high-throughput scenarios.

Schema

- id: <numeric>
  name: <string>
  description: <string>
  type: eventhubs
  tags: <string[]>
  pipelines: <pipeline[]>
  status: <boolean>
  properties:
    event_hub: <string>
    client_connection_string: <string>
    tenant_id: <string>
    client_id: <string>
    client_secret: <string>
    namespace: <string>
    consumer_group: <string>
    container_connection_string: <string>
    container_url: <string>
    container_name: <string>
    reuse: <boolean>
    workers: <numeric>
    tls:
      status: <boolean>
      cert_name: <string>
      key_name: <string>
      insecure_skip_verify: <boolean>

Configuration

The following fields are used to define the device:

Device

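| Field | Required | Default | Description |
|---|---|---|---|
| id | Y | | Unique identifier |
| name | Y | | Device name |
| description | N | - | Optional description |
| type | Y | | Must be eventhubs |
| tags | N | - | Optional tags |
| pipelines | N | - | Optional pre-processor pipelines |
| status | N | true | Enable/disable the device |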
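
As a rough illustration of the device-level fields above, the fragment below sets the optional description, tags, and status on an otherwise minimal device. The id, name, tag values, and environment variable names are placeholders chosen for this sketch, and status: false is assumed to leave the device defined but disabled, as the table suggests.

```yaml
# Hypothetical device definition; all values are placeholders.
- id: 10
  name: staging_eventhubs
  description: "Staging collector, kept disabled until credentials are provisioned"
  type: eventhubs
  tags:
    - azure
    - staging
  status: false            # defined but disabled (default is true)
  properties:
    client_connection_string: "${EVENTHUBS_CONNECTION_STRING}"
    event_hub: "logs"
    container_connection_string: "${STORAGE_CONNECTION_STRING}"
    container_name: "checkpoints"
```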

Connection

Event Hubs supports two authentication methods:

Method 1: Connection String Authentication

| Field | Required | Default | Description |
|---|---|---|---|
| client_connection_string | Y* | | Event Hubs connection string (required if not using service principal) |
| event_hub | Y | | Event hub name to consume from |

Method 2: Service Principal Authentication

| Field | Required | Default | Description |
|---|---|---|---|
| tenant_id | Y* | | Azure tenant ID (required if not using connection string) |
| client_id | Y* | | Azure service principal client ID |
| client_secret | Y* | | Azure service principal client secret |
| namespace | Y* | | Event Hubs namespace (required if not using connection string) |
| event_hub | Y | | Event hub name to consume from |

Consumer Configuration

| Field | Required | Default | Description |
|---|---|---|---|
| consumer_group | N | "$Default" | Consumer group name |

Storage Configuration

The Event Hubs device requires checkpoint storage. Choose one of the following methods:

Method 1: Storage Account Connection String

| Field | Required | Default | Description |
|---|---|---|---|
| container_connection_string | Y* | | Azure Storage connection string |
| container_name | Y* | | Blob container name for checkpoints |

Method 2: Storage Account URL

| Field | Required | Default | Description |
|---|---|---|---|
| container_url | Y* | | Azure Storage container URL |

* = Conditionally required (see authentication and storage methods above)

Performance

| Field | Required | Default | Description |
|---|---|---|---|
| reuse | N | true | Enable multi-worker mode |
| workers | N | 4 | Number of worker processes when reuse is enabled |
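
Because reuse defaults to true, a low-volume hub may not need multi-worker mode at all. The fragment below is a sketch of switching it off; it assumes that workers is ignored when reuse is false, which the table implies but does not state outright.

```yaml
properties:
  # Connection and storage fields as described in the sections above.
  reuse: false   # disable multi-worker mode (default is true)
```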

TLS

| Field | Required | Default | Description |
|---|---|---|---|
| tls.status | N | false | Enable TLS encryption |
| tls.cert_name | N* | | TLS certificate file name (required if TLS enabled) |
| tls.key_name | N* | | TLS private key file name (required if TLS enabled) |
| tls.insecure_skip_verify | N | false | Skip server certificate verification |

* = Conditionally required (only when tls.status: true)
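
To make the conditional requirement concrete, here is a minimal sketch of a tls block, assuming it nests under properties as in the schema. The file names are placeholders. Setting insecure_skip_verify to true turns off server certificate verification, so it is shown only as something one might use temporarily in a test environment.

```yaml
properties:
  # Connection and storage fields omitted for brevity.
  tls:
    status: true                  # makes cert_name and key_name required
    cert_name: "eventhubs.crt"    # placeholder file name
    key_name: "eventhubs.key"     # placeholder file name
    insecure_skip_verify: true    # test environments only; default is false
```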

Details

IAM Permissions

When using service principal authentication, the following Azure RBAC roles are required:

| Azure Role | Scope | Purpose |
|---|---|---|
| Azure Event Hubs Data Receiver | Event Hubs Namespace or Event Hub | Consume events and read hub properties |
| Storage Blob Data Contributor | Storage Account or Container | Read, write, and list checkpoint blobs |

The checkpoint storage requires Contributor (not just Reader) because the device writes checkpoint state and manages ownership blobs for partition load balancing.

When using connection string authentication, Azure RBAC roles are not needed for Event Hubs access. The Shared Access Policy embedded in the connection string governs access (typically the Listen claim for consumers). Checkpoint storage still requires either a storage connection string or an RBAC role assignment.

Multiple Workers

When reuse is enabled, the collector runs the number of workers set by workers. Each worker maintains its own Event Hubs consumer and processes messages independently, and message volume is balanced across the workers automatically.

Messages

The collector supports automatic checkpoint management, consumer group load balancing, multiple Event Hub subscriptions, TLS-encrypted connections, both connection string and service principal authentication, and custom message-processing pipelines.

Examples

Basic with Connection String

A simple Event Hubs consumer using connection string authentication:

- id: 1
  name: basic_eventhubs
  type: eventhubs
  properties:
    client_connection_string: "Endpoint=sb://mynamespace.servicebus.windows.net/;SharedAccessKeyName=mykey;SharedAccessKey=myvalue"
    event_hub: "logs"
    container_connection_string: "DefaultEndpointsProtocol=https;AccountName=mystorage;AccountKey=mykey"
    container_name: "checkpoints"

Service Principal Authentication

Connecting with a service principal and a checkpoint container URL:

- id: 2
  name: sp_eventhubs
  type: eventhubs
  properties:
    tenant_id: "12345678-1234-1234-1234-123456789012"
    client_id: "87654321-4321-4321-4321-210987654321"
    client_secret: "${AZURE_CLIENT_SECRET}"
    namespace: "mynamespace"
    event_hub: "security-logs"
    consumer_group: "datastream-group"
    container_url: "https://mystorage.blob.core.windows.net/checkpoints"

High-Volume Processing

Optimizing for throughput with multiple workers:

- id: 3
  name: performant_eventhubs
  type: eventhubs
  properties:
    client_connection_string: "${EVENTHUBS_CONNECTION_STRING}"
    event_hub: "high-volume-logs"
    consumer_group: "processing-group"
    container_connection_string: "${STORAGE_CONNECTION_STRING}"
    container_name: "checkpoints"
    reuse: true
    workers: 8

Secure Connection

Securing the Event Hubs connection with TLS:

- id: 4
  name: secure_eventhubs
  type: eventhubs
  properties:
    tenant_id: "${AZURE_TENANT_ID}"
    client_id: "${AZURE_CLIENT_ID}"
    client_secret: "${AZURE_CLIENT_SECRET}"
    namespace: "secure-namespace"
    event_hub: "secure-logs"
    consumer_group: "secure-group"
    container_url: "${STORAGE_CONTAINER_URL}"
    tls:
      status: true
      cert_name: "eventhubs.crt"
      key_name: "eventhubs.key"

Pipeline Processing

Applying pre-processor pipelines to Event Hubs messages:

- id: 5
  name: pipeline_eventhubs
  type: eventhubs
  pipelines:
    - json_parser
    - field_extractor
    - normalize_timestamps
  properties:
    client_connection_string: "${EVENTHUBS_CONNECTION_STRING}"
    event_hub: "application-logs"
    consumer_group: "processing-group"
    container_connection_string: "${STORAGE_CONNECTION_STRING}"
    container_name: "checkpoints"

Multiple Consumer Groups

Assigning an instance its own consumer group and checkpoint container:

- id: 6
  name: distributed_eventhubs
  type: eventhubs
  properties:
    client_connection_string: "${EVENTHUBS_CONNECTION_STRING}"
    event_hub: "distributed-logs"
    consumer_group: "instance-1"
    container_connection_string: "${STORAGE_CONNECTION_STRING}"
    container_name: "checkpoints-instance1"
    reuse: true
    workers: 4
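
The example above shows a single instance. A second instance would plausibly differ only in its consumer group and checkpoint container, as sketched below. The id, name, instance-2 group, and checkpoints-instance2 container are hypothetical, and the consumer group is assumed to already exist on the event hub.

```yaml
# Hypothetical second instance; names and values are placeholders.
- id: 7
  name: distributed_eventhubs_2
  type: eventhubs
  properties:
    client_connection_string: "${EVENTHUBS_CONNECTION_STRING}"
    event_hub: "distributed-logs"
    consumer_group: "instance-2"               # assumed to exist on the hub
    container_connection_string: "${STORAGE_CONNECTION_STRING}"
    container_name: "checkpoints-instance2"    # separate checkpoint state
    reuse: true
    workers: 4
```

Note that in Event Hubs each consumer group reads the full event stream independently, so two instances configured this way would each receive every event rather than splitting them.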