File
Synopsis
Director polls glob patterns on its own filesystem (or any mounted path) at a configured interval and forwards matched log lines through an optional pipeline. No Agent is required.
Schema
- id: <numeric>
  name: <string>
  description: <string>
  type: file
  tags: <string[]>
  pipelines: <pipeline[]>
  status: <boolean>
  properties:
    path: <string>
    pipeline_name: <string>
    poll_interval: <numeric>
    file_log_concurrency: <numeric>
    start_date: <numeric>
    ignore_cache: <boolean>
    ignore_old_date: <boolean>
    ignore_retention: <boolean>
    ignore_time: <boolean>
    date_format: <string>
    line_parser: <string|map>
    encoding: <string>
    filter_mode: <string>
    filter_rules: <map[]|string[]>
Configuration
Device
| Field | Required | Default | Description |
|---|---|---|---|
| id | Y | - | Unique numeric identifier |
| name | Y | - | Device name |
| description | N | - | Optional description |
| type | Y | - | Must be file |
| tags | N | - | Optional tags |
| status | N | true | Enable/disable the device |
File Source
| Field | Required | Default | Description |
|---|---|---|---|
| path | Y | - | Comma-separated glob pattern(s) to scan. Each entry is treated as an independent glob; whitespace around commas is trimmed and empty segments are dropped. Supports ** for recursive directory matches. |
| pipeline_name | N | - | Name of the pipeline that pre-processes matched lines. Empty string passes lines through unprocessed. |
Polling
| Field | Required | Default | Description |
|---|---|---|---|
| poll_interval | N | 60 | Polling cadence in seconds. Must be greater than 0. Changing this value restarts the collector. |
| file_log_concurrency | N | 1 | Maximum number of files read in parallel per poll tick. Higher values increase throughput at the cost of memory. Changing this value restarts the collector. |
| start_date | N | 300 | Lookback window in seconds applied against file modification time. 0 falls back to a 1-second window. -1 (or any negative value) disables time-based filtering entirely. |
Reader Options
| Field | Required | Default | Description |
|---|---|---|---|
| ignore_cache | N | false | Skip the persisted file-position cache and re-read from the beginning of each file. |
| ignore_old_date | N | false | Skip the reader's old-date filter. |
| ignore_retention | N | false | Skip retention-based filtering. |
| ignore_time | N | false | Skip per-line time filtering. |
| date_format | N | - | Custom log timestamp format. Uses Go's reference time layout (2006-01-02T15:04:05Z07:00), not strftime. |
All reader options are hot-reloaded on the next poll tick without restarting the collector.
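As an illustration of the Go layout convention mentioned for date_format, the fragment below is a minimal sketch for log lines stamped like 2024-03-15 08:30:00. The path value is a hypothetical placeholder, and the layout shown is only one possible choice.

```yaml
properties:
  path: "/var/log/app/*.log"          # hypothetical path
  # Go reference-time layout: the digits spell out the fixed reference date
  # (year 2006, month 01, day 02, hour 15, minute 04, second 05).
  # The strftime spelling of the same format would be %Y-%m-%d %H:%M:%S,
  # which is not the notation this field expects.
  date_format: "2006-01-02 15:04:05"
```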
Line Parser
Controls how individual lines are grouped into log entries.
| Field | Required | Default | Description |
|---|---|---|---|
| line_parser | N | - | Line parser definition. Accepts either a map (preferred) or a bare string shorthand. |
| line_parser.type | N* | - | Parser mode: regex, newline (alias new_line), string, or prefix. Numeric aliases: 1 = regex, 2 = newline, 3 = string. |
| line_parser.regex | N* | - | Regex pattern that detects the start of a new log entry. Alias: value. |
| line_parser.date_based | N | false | Use date-based multiline merging. |
| line_parser.has_space | N | false | Treat leading whitespace as a line-continuation marker. |
* Required when using the map form with type: regex.
A bare string value for line_parser is treated as a regex pattern, equivalent to the map form with type: regex and that pattern.
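The two forms can be compared side by side. The sketch below shows the bare string shorthand and the map form that the text describes as equivalent; the date-prefix pattern itself is a hypothetical example, not a required value.

```yaml
# Bare string shorthand: the string is taken as the regex pattern.
line_parser: '^\d{4}-\d{2}-\d{2}'
---
# Map form described as equivalent to the shorthand above.
line_parser:
  type: regex
  regex: '^\d{4}-\d{2}-\d{2}'
```

Single-quoted YAML strings keep backslashes literal, which avoids double-escaping the regex.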
Encoding
| Field | Required | Default | Description |
|---|---|---|---|
| encoding | N | - | Character encoding of the source files. Accepts an alias (case-insensitive; -, _, spaces, and dots are stripped) or a numeric decoder ID. |
Supported aliases:
| Alias | Encoding |
|---|---|
| utf8 | UTF-8 |
| utf8bom | UTF-8 with BOM |
| utf16be | UTF-16 Big Endian |
| utf16le | UTF-16 Little Endian |
| utf16bebom | UTF-16 BE with BOM |
| utf16lebom | UTF-16 LE with BOM |
| gbk | GBK (Simplified Chinese) |
| latin1, iso88591 | ISO 8859-1 / Latin-1 |
| windows1250, cp1250 | Windows-1250 (Central European) |
| windows1251, cp1251 | Windows-1251 (Cyrillic) |
| windows1252, cp1252 | Windows-1252 (Western European) |
| windows1256, cp1256 | Windows-1256 (Arabic) |
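Because aliases are matched case-insensitively after stripping -, _, spaces, and dots, several spellings should resolve to the same entry in the table. The fragment below is a sketch of that rule; the path is a hypothetical placeholder, and the alternative spellings in the comments are derived from the stated normalization, not from a tested list.

```yaml
properties:
  path: "/var/log/legacy/*.log"   # hypothetical path
  # "Windows-1252" -> lowercased, "-" stripped -> windows1252
  encoding: "Windows-1252"
  # Under the same rule, these spellings would all normalize to utf16le:
  #   "UTF-16 LE", "utf_16_le", "utf16le"
```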
Filtering
| Field | Required | Default | Description |
|---|---|---|---|
| filter_mode | N | - | Filter direction: include keeps only matching lines; exclude drops matching lines. |
| filter_rules | N | - | List of filter rules. Accepts map form or a bare list of strings (treated as regex rules). |
| filter_rules[].type | N* | - | Rule type: regex or string. |
| filter_rules[].regex | N* | - | Regex pattern to match against each line. Required when type: regex. |
| filter_rules[].source | N* | - | Substring or wildcard pattern to match. Alias: value. Required when type: string. |
* Required for each rule entry.
A bare list of strings is accepted as shorthand and treated as regex rules.
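The map form can mix both rule types. The fragment below is a minimal sketch in exclude mode; the path, the substring, and the pattern are hypothetical, and it assumes a line counts as matching when any one rule matches, which the tables above do not state explicitly.

```yaml
properties:
  path: "/var/log/app/*.log"      # hypothetical path
  filter_mode: exclude            # drop lines that match
  filter_rules:
    - type: string
      source: "healthcheck"       # substring match (hypothetical value)
    - type: regex
      regex: '^\s*$'              # blank or whitespace-only lines
```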
Details
Path Resolution
path accepts a single string that may contain comma-separated glob expressions. Each entry is processed as an independent glob after whitespace trimming; empty segments (e.g., trailing commas) are discarded. Each path is normalized via filepath.Clean before globbing. The ** double-star pattern matches recursively across directory levels.
Hot Reload vs Restart
Most configuration changes take effect on the next poll tick without interrupting the collector:
- Hot-reload (no restart): path, start_date, ignore_cache, ignore_old_date, ignore_retention, ignore_time, date_format, line_parser, encoding, filter_mode, filter_rules
- Restart required: poll_interval, file_log_concurrency
Time-Based Filtering
start_date is applied against each file's modification time before the file is read:
- Positive value (e.g., 300): Only files modified within the last N seconds are processed.
- 0: Falls back to a 1-second lookback window.
- Negative value (e.g., -1): Disables time-based filtering; all matched files are processed regardless of modification time.
Startup Behavior
At startup the collector sleeps for a random interval of 0–20 seconds to spread load when multiple file devices start simultaneously. The first collection run begins immediately after this delay, then repeats at poll_interval.
A heartbeat monitor checks that the collector reports progress within 120 seconds. If the heartbeat threshold is exceeded, the collector is stopped and the device connection state is set to error.
Security
Symlink containment and allow-listed root path enforcement are not implemented. The Director follows symlinks without restriction.
Operators are responsible for ensuring that configured paths do not expose unintended parts of the filesystem.
Examples
Single Glob
Collecting all ...
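A minimal sketch of a single-glob device, using only fields from the schema above. The id, name, and path values are hypothetical placeholders; all other properties are left at their documented defaults.

```yaml
- id: 1                           # hypothetical identifier
  name: app_logs                  # hypothetical name
  type: file
  properties:
    path: "/var/log/app/*.log"    # hypothetical glob
```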
Multiple Globs with Pipeline
Scanning two directory trees with a single comma-separated path ...
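A sketch of one device covering two directory trees. The id, name, directories, and pipeline name are all hypothetical; the pipeline is assumed to be defined elsewhere in the configuration.

```yaml
- id: 2                           # hypothetical identifier
  name: app_and_web_logs          # hypothetical name
  type: file
  properties:
    # Two independent globs in one string; whitespace around the comma is trimmed.
    path: "/var/log/app/**/*.log, /var/log/nginx/**/*.log"
    pipeline_name: normalize_logs # hypothetical pipeline
    poll_interval: 30             # seconds
```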
Multiline Log Entries
Merging Java-style stack traces into single log entries using a date-prefix regex to detect the start of each entry...
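A sketch of such a device, assuming each log entry begins with a timestamp like 2024-03-15 08:30:00. The id, name, path, and pattern are hypothetical and would need to match the actual log format.

```yaml
- id: 3                           # hypothetical identifier
  name: java_app_logs             # hypothetical name
  type: file
  properties:
    path: "/opt/app/logs/*.log"   # hypothetical glob
    line_parser:
      type: regex
      # A line starting with a date and time opens a new entry; lines that do
      # not match (such as stack-trace frames) belong to the current entry.
      regex: '^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}'
```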
Filter Rules
Including only ...
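A sketch of an include filter using the bare-list shorthand. The id, name, path, and pattern are hypothetical; the single string is treated as a regex rule, so only lines containing ERROR or WARN would be kept.

```yaml
- id: 4                           # hypothetical identifier
  name: app_errors_only           # hypothetical name
  type: file
  properties:
    path: "/var/log/app/*.log"    # hypothetical glob
    filter_mode: include          # keep only matching lines
    filter_rules:
      - 'ERROR|WARN'              # bare string, treated as a regex rule
```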
Full Historical Scan
Re-reading all matched files from the beginning by disabling time filtering and resetting the position cache, useful for reprocessing after a pipeline change...
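A sketch of a full rescan, combining a negative start_date with ignore_cache as described in the tables above. The id, name, path, and pipeline name are hypothetical.

```yaml
- id: 5                                  # hypothetical identifier
  name: reprocess_archive                # hypothetical name
  type: file
  properties:
    path: "/var/log/archive/**/*.log"    # hypothetical glob
    pipeline_name: normalize_logs        # hypothetical pipeline
    start_date: -1                       # disable modification-time filtering
    ignore_cache: true                   # re-read each file from the beginning
```

Both settings are hot-reloaded, so they can be reverted after the rescan without restarting the collector.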