Datagen

Test Data Generator

Synopsis

Generates synthetic log traffic and emits it to a target host and port using the selected protocol. Intended for pipeline testing, target connectivity validation, and load characterization in development and staging environments. Not for production data ingestion.

Warning

This device is an outbound traffic generator. Misconfiguration (e.g., very high count with very low interval) can saturate a downstream target. Resource bounds are applied automatically and out-of-range values are clamped with a log message.

Schema

- id: <numeric>
  name: <string>
  description: <string>
  type: datagen
  tags: <string[]>
  pipelines: <pipeline[]>
  status: <boolean>
  properties:
    protocol: <string>
    address: <string>
    port: <numeric>
    file_path: <string>
    message: <string>
    severity: <string>
    count: <numeric>
    interval: <numeric>
    duration: <numeric>
    now: <boolean>

Configuration

Device

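| Field | Required | Default | Description |
|---|---|---|---|
| id | Y | - | Unique numeric identifier |
| name | Y | - | Device name |
| description | N | - | Optional description |
| type | Y | - | Must be datagen |
| tags | N | - | Optional tags |
| pipelines | N | - | Optional pre-processor pipelines |
| status | N | true | Enable/disable the device |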

Transport

| Field | Required | Default | Description |
|---|---|---|---|
| protocol | N | "syslog" | Transport protocol. One of syslog (UDP), syslog-udp, syslog-tcp, tcp, udp, http, netflow |
| address | N | "127.0.0.1" | Destination host (IPv4, IPv6, or hostname) |
| port | N | 514 | Destination port (range 1-65535) |

Payload

| Field | Required | Default | Description |
|---|---|---|---|
| message | N | - | Inline message body emitted on each cycle (max 64 KB; longer values are truncated and logged) |
| file_path | N | - | Path to a file containing one or more messages to emit. Must resolve under the service data directory (absolute paths or symlinks that point outside it are rejected). When set, takes precedence over message |
| severity | N | "Error" | Syslog severity label applied to generated messages |

Generation

| Field | Required | Default | Description |
|---|---|---|---|
| count | N | 1000 | Number of messages to emit per cycle (clamped to 1-1000000) |
| interval | N | 1 | Seconds between emission cycles (minimum 1) |
| duration | N | 300 | Total run duration in seconds before the device stops emitting (clamped to 1-86400) |
| now | N | true | When true, start emitting immediately on enable; when false, wait for the next minute boundary before starting |
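
The combined effect of count, interval, and duration can be estimated before enabling a device. The sketch below is illustrative only: emission_estimate is a hypothetical helper, and it assumes one cycle of count messages fires every interval seconds for the whole duration, so the real total may differ by a cycle.

```python
from __future__ import annotations


def emission_estimate(count: int, interval: int, duration: int) -> tuple[float, int]:
    """Return (messages per second, approximate total messages).

    Assumption: one cycle of `count` messages fires every `interval`
    seconds for the whole `duration`.
    """
    rate = count / interval
    cycles = duration // interval
    return rate, count * cycles


# Defaults from the Generation table: count=1000, interval=1, duration=300.
print(emission_estimate(1000, 1, 300))  # (1000.0, 300000)
# Values from the "TCP with Custom Severity" example.
print(emission_estimate(500, 5, 600))   # (100.0, 60000)
```

Running the numbers this way makes it easier to judge whether a target can absorb the load before the device is enabled.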

Details

Resource Bounds

Out-of-range values for count (1-1000000), interval (minimum 1), and duration (1-86400) are clamped to safe bounds rather than rejected, so a misconfigured device still starts. Each clamp emits an Information-level log line of the form: datagen "count" clamped from 5000000 to 1000000 for device <name>. This gives operators a grep-able signal that the values they provided were adjusted.
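
The clamping behavior described above can be sketched as follows. This is not the service's implementation: clamp_field, BOUNDS, and the logger name are hypothetical, and only the bounds and the log message format come from this page.

```python
import logging

logger = logging.getLogger("datagen")  # logger name is an assumption

# Bounds taken from the Generation table; interval has no documented maximum.
BOUNDS = {
    "count": (1, 1_000_000),
    "interval": (1, None),
    "duration": (1, 86_400),
}


def clamp_field(field: str, value: int, device: str) -> int:
    """Clamp value into the documented range and log when it was adjusted."""
    low, high = BOUNDS[field]
    clamped = max(value, low)
    if high is not None:
        clamped = min(clamped, high)
    if clamped != value:
        logger.info('datagen "%s" clamped from %d to %d for device %s',
                    field, value, clamped, device)
    return clamped


print(clamp_field("count", 5_000_000, "datagen_load"))  # 1000000
print(clamp_field("interval", 0, "datagen_load"))       # 1
print(clamp_field("duration", 600, "datagen_load"))     # 600
```

In-range values pass through unchanged and produce no log line, so the presence of the message always indicates an adjustment.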

File Path Sandboxing

The file_path value is confined to the service data directory. Relative paths are resolved against that directory; absolute paths must already point inside it. Symlinks are resolved before the prefix check, so a link inside the data directory pointing to an external location (e.g., /etc/hostname) is rejected.
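
One way to picture the check is the sketch below, which resolves symlinks first and then compares the result against the data directory. resolve_in_data_dir is a hypothetical helper, and a temporary directory stands in for the service data directory, whose real location is not stated here.

```python
import os
import tempfile


def resolve_in_data_dir(data_dir: str, file_path: str) -> str:
    """Return the real path of file_path, or raise if it escapes data_dir."""
    root = os.path.realpath(data_dir)
    # os.path.join discards root when file_path is absolute, so absolute
    # paths are accepted only if they already point inside the directory.
    candidate = os.path.realpath(os.path.join(root, file_path))
    # realpath follows symlinks, so the check runs on the final target.
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError(f"file_path escapes data directory: {file_path}")
    return candidate


# Demo against a throwaway directory standing in for the data directory.
with tempfile.TemporaryDirectory() as data_dir:
    print(resolve_in_data_dir(data_dir, "samples/firewall.log"))
    for bad in ("../outside.log", "/etc/hostname"):
        try:
            resolve_in_data_dir(data_dir, bad)
        except ValueError as err:
            print("rejected:", err)
```

Comparing with os.path.commonpath rather than a plain string prefix avoids accepting a sibling directory whose name merely starts with the data directory's name.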

Protocol Aliases

syslog is the historical name for the UDP transport and is equivalent to syslog-udp. The explicit syslog-udp and syslog-tcp aliases make the UDP/TCP pair symmetric, so operators do not have to guess which transport a bare syslog uses.
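
The alias relationship can be expressed as a simple lookup. The sketch below is illustrative: CANONICAL and normalize_protocol are hypothetical names, and it assumes every protocol other than syslog maps to itself.

```python
# Only the syslog -> syslog-udp equivalence is stated on this page; the
# remaining entries are assumed to be their own canonical names.
CANONICAL = {
    "syslog": "syslog-udp",
    "syslog-udp": "syslog-udp",
    "syslog-tcp": "syslog-tcp",
    "tcp": "tcp",
    "udp": "udp",
    "http": "http",
    "netflow": "netflow",
}


def normalize_protocol(name: str = "syslog") -> str:
    """Map a configured protocol value to its canonical name."""
    try:
        return CANONICAL[name]
    except KeyError:
        raise ValueError(f"unsupported protocol: {name!r}") from None


print(normalize_protocol("syslog"))      # syslog-udp
print(normalize_protocol("syslog-tcp"))  # syslog-tcp
print(normalize_protocol())              # syslog-udp
```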

Examples

Basic Syslog Generator

Emitting 1000 syslog UDP messages per second to a local target, relying on the default count (1000), interval (1), and duration (300)...

- id: 1
  name: datagen_basic
  type: datagen
  properties:
    protocol: syslog
    address: "127.0.0.1"
    port: 514
    message: "Test event from datagen"

TCP with Custom Severity

Generating TCP-delivered messages at warning severity to a remote target...

- id: 2
  name: datagen_tcp_warn
  type: datagen
  properties:
    protocol: syslog-tcp
    address: "10.0.0.50"
    port: 1514
    severity: "Warning"
    message: "Synthetic warning"
    count: 500
    interval: 5
    duration: 600

File-Sourced Messages

Replaying messages from a file under the service data directory...

- id: 3
  name: datagen_replay
  type: datagen
  properties:
    protocol: syslog
    address: "127.0.0.1"
    port: 514
    file_path: "samples/firewall.log"
    count: 100
    interval: 2
    duration: 1800

High-Volume Load Test

Saturating a target for load characterization (5 min duration cap)...

- id: 4
  name: datagen_load
  type: datagen
  properties:
    protocol: tcp
    address: "192.168.1.100"
    port: 9000
    message: "Load test event"
    count: 100000
    interval: 1
    duration: 300
    now: true

Delayed Start

Aligning emission to the next minute boundary instead of starting immediately...

- id: 5
  name: datagen_aligned
  type: datagen
  properties:
    protocol: udp
    address: "127.0.0.1"
    port: 5140
    message: "Aligned test event"
    now: false
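
To show what aligning to the next minute boundary means in practice, the sketch below computes the start delay for a given clock time. seconds_until_start is a hypothetical helper. It assumes that a device enabled exactly on a boundary waits a full minute, which this page does not specify.

```python
from datetime import datetime, timedelta


def seconds_until_start(now_flag: bool, current: datetime) -> float:
    """Delay before the first cycle: zero when now is true, otherwise the
    time remaining until the next minute boundary."""
    if now_flag:
        return 0.0
    next_minute = current.replace(second=0, microsecond=0) + timedelta(minutes=1)
    return (next_minute - current).total_seconds()


enabled_at = datetime(2024, 1, 1, 12, 0, 42)
print(seconds_until_start(False, enabled_at))  # 18.0
print(seconds_until_start(True, enabled_at))   # 0.0
```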