Amazon Kinesis
Synopsis
Creates a target that writes log messages to Amazon Kinesis Data Streams, with support for batching and AWS authentication. Messages are delivered in batches whose size limit is configurable. Amazon Kinesis Data Streams is a fully managed streaming data service for real-time data processing at scale.
Schema
- name: <string>
  description: <string>
  type: amazonkinesis
  pipelines: <pipeline[]>
  status: <boolean>
  properties:
    key: <string>
    secret: <string>
    session: <string>
    region: <string>
    endpoint: <string>
    stream: <string>
    partition_key: <string>
    max_events: <numeric>
    timeout: <numeric>
    field_format: <string>
    debug:
      status: <boolean>
      dont_send_logs: <boolean>
Configuration
The following fields are used to define the target:
| Field | Required | Default | Description |
|---|---|---|---|
| name | Y | - | Target name |
| description | N | - | Optional description |
| type | Y | - | Must be amazonkinesis |
| pipelines | N | - | Optional post-processor pipelines |
| status | N | true | Enable/disable the target |
AWS Credentials
| Field | Required | Default | Description |
|---|---|---|---|
| key | N* | - | AWS access key ID for authentication |
| secret | N* | - | AWS secret access key for authentication |
| session | N | - | Optional session token for temporary credentials |
| region | Y | - | AWS region (e.g., us-east-1, eu-west-1) |
| endpoint | N | - | Custom Kinesis endpoint URL (for testing or local development) |
* = Conditionally required. AWS credentials (key and secret) are required unless using IAM role-based authentication on AWS infrastructure.
Stream Configuration
| Field | Required | Default | Description |
|---|---|---|---|
| stream | Y | - | Kinesis Data Stream name |
| partition_key | N | "default" | Partition key for distributing records across shards |
| max_events | N | 500 | Maximum number of events per batch (1-500) |
| timeout | N | 30 | Connection timeout in seconds |
| field_format | N | - | Data normalization format. See applicable Normalization section |
Amazon Kinesis Data Streams supports a maximum of 500 records per PutRecords request. The max_events parameter must be between 1 and 500.
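The constraints in the tables above can be checked before a configuration is deployed. The following Python sketch is only an illustration, not the target's own validation logic: the function name `validate_properties` is hypothetical, and the rule that `key` and `secret` must be supplied together is an assumption drawn from the footnote on conditional requirements.

```python
from typing import Any, Dict, List


def validate_properties(props: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a target's `properties` mapping."""
    errors: List[str] = []

    # region and stream are marked as required in the tables above.
    for field in ("region", "stream"):
        if not props.get(field):
            errors.append(f"{field} is required")

    # Assumption: static credentials only make sense as a pair.
    if bool(props.get("key")) != bool(props.get("secret")):
        errors.append("key and secret must be set together")

    # max_events defaults to 500 and must stay within the PutRecords limit.
    max_events = props.get("max_events", 500)
    if not isinstance(max_events, int) or not 1 <= max_events <= 500:
        errors.append("max_events must be an integer between 1 and 500")

    return errors


ok = validate_properties({"region": "us-east-1", "stream": "application-logs"})
bad = validate_properties(
    {"region": "us-east-1", "key": "AKIAIOSFODNN7EXAMPLE", "max_events": 501}
)
print(ok)   # []
print(bad)  # three problems: missing stream, unpaired key, max_events out of range
```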
Scheduling
See Scheduling and Pool Behavior for interval and cron fields shared by all targets.
Debug Options
| Field | Required | Default | Description |
|---|---|---|---|
| debug.status | N | false | Enable debug logging |
| debug.dont_send_logs | N | false | Process logs but don't send to target (testing) |
Details
Amazon Kinesis Data Streams is a fully managed streaming data service that captures and stores data in real time. This target allows you to send log messages to Kinesis streams for processing by downstream applications.
Authentication Methods
The target supports static credentials (access key and secret key), with an optional session token for temporary credentials. When deployed on AWS infrastructure, it can use IAM role-based authentication without explicit credentials.
All authentication methods call sts:GetCallerIdentity during initialization to validate credentials before proceeding.
IAM Permissions
When using IAM role-based authentication, the following permissions are required:
| IAM Action | Purpose |
|---|---|
| sts:GetCallerIdentity | Validate credentials at initialization |
| kinesis:PutRecords | Send batch of records to stream |
Minimum IAM policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "STSIdentity",
      "Effect": "Allow",
      "Action": "sts:GetCallerIdentity",
      "Resource": "*"
    },
    {
      "Sid": "KinesisWrite",
      "Effect": "Allow",
      "Action": "kinesis:PutRecords",
      "Resource": "arn:aws:kinesis:REGION:ACCOUNT_ID:stream/STREAM_NAME"
    }
  ]
}
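The REGION, ACCOUNT_ID, and STREAM_NAME placeholders in the policy must be replaced with real values. As a convenience, the policy can be generated; the sketch below is one possible way to do it in Python. The helper name `minimum_policy` and the account ID shown are illustrative, not part of the product.

```python
import json


def minimum_policy(region: str, account_id: str, stream: str) -> dict:
    """Build the minimum IAM policy shown above for one specific stream."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "STSIdentity",
                "Effect": "Allow",
                "Action": "sts:GetCallerIdentity",
                "Resource": "*",
            },
            {
                "Sid": "KinesisWrite",
                "Effect": "Allow",
                "Action": "kinesis:PutRecords",
                # Stream ARN format: arn:aws:kinesis:<region>:<account>:stream/<name>
                "Resource": f"arn:aws:kinesis:{region}:{account_id}:stream/{stream}",
            },
        ],
    }


# 123456789012 is a placeholder account ID used only for illustration.
policy = minimum_policy("us-east-1", "123456789012", "application-logs")
print(json.dumps(policy, indent=2))
```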
Stream and Shard Architecture
Kinesis Data Streams uses shards as the base throughput unit. Each shard provides:
- Write capacity: up to 1 MB/second and up to 1,000 records per second (both limits apply)
- Read capacity: 2 MB/second
Records are distributed across shards based on the partition key. A well-distributed partition key ensures even load across shards.
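The per-shard write limits can be turned into a rough sizing estimate. The following Python sketch treats both write limits as applying at the same time and assumes perfectly even key distribution; the function name `min_shards_for_writes` is hypothetical, and 1 MB is taken as 1,024 KB for simplicity.

```python
import math

SHARD_WRITE_MB_PER_SEC = 1.0        # per-shard write throughput limit
SHARD_WRITE_RECORDS_PER_SEC = 1000  # per-shard write record-rate limit


def min_shards_for_writes(records_per_sec: float, avg_record_kb: float) -> int:
    """Estimate the fewest shards that satisfy both write limits."""
    mb_per_sec = records_per_sec * avg_record_kb / 1024
    by_bytes = math.ceil(mb_per_sec / SHARD_WRITE_MB_PER_SEC)
    by_count = math.ceil(records_per_sec / SHARD_WRITE_RECORDS_PER_SEC)
    return max(1, by_bytes, by_count)


# Many small records: the record-rate limit dominates.
print(min_shards_for_writes(2500, 0.5))  # 3
# Fewer, larger records: the throughput limit dominates.
print(min_shards_for_writes(800, 4))     # 4
```

Real streams rarely distribute load evenly, so an estimate like this is a lower bound rather than a recommendation.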
Partition Key Strategy
The partition_key parameter determines how records are distributed across shards:
Static Partition Key (default: "default")
- All records go to the same shard
- Simple but can create hot shards
- Suitable for low-volume streams or testing
Dynamic Partition Key
- Use different keys to distribute load
- Records with the same key go to the same shard
- Maintains ordering for records with the same key
- Better performance for high-volume streams
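To see why a static key concentrates traffic, it helps to look at how Kinesis places records: it takes the MD5 hash of the partition key as a 128-bit integer and assigns the record to the shard whose hash key range contains that value. The Python sketch below imitates this under the simplifying assumption that the shards divide the hash space evenly; the function `shard_index` and the generated `server-NN` keys are illustrative only.

```python
import hashlib
from collections import Counter

HASH_SPACE = 2 ** 128  # MD5 produces 128-bit values


def shard_index(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard, assuming evenly split hash key ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    return hash_value * shard_count // HASH_SPACE


# Static key: every record hashes to the same value, so one shard takes all traffic.
static = Counter(shard_index("default", 4) for _ in range(1000))
# Varying keys: records spread over the available shards.
dynamic = Counter(shard_index(f"server-{i:02d}", 4) for i in range(1000))

print(len(static))   # 1
print(sorted(dynamic.items()))
```

The same property that creates hot shards also preserves ordering: records that share a key always map to the same shard.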
Batch Processing
The target accumulates messages in memory and sends them in batches using the PutRecords API. Batches are sent when the event count limit (max_events) is reached or during finalization. The maximum batch size is 500 records per request (Amazon Kinesis limit).
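The accumulate-and-flush behavior can be sketched as follows. This is a minimal Python illustration of the pattern, not the target's actual implementation; the `Batcher` class, its `send` callback, and its method names are hypothetical.

```python
from typing import Callable, List


class Batcher:
    """Buffers messages and hands them to `send` in batches of max_events."""

    def __init__(self, send: Callable[[List[bytes]], None], max_events: int = 500):
        self._send = send
        self._max_events = max_events
        self._buffer: List[bytes] = []

    def add(self, message: bytes) -> None:
        self._buffer.append(message)
        # Flush as soon as the event count limit is reached.
        if len(self._buffer) >= self._max_events:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._send(batch)

    def close(self) -> None:
        # Finalization: send whatever is still buffered.
        self.flush()


sent_batches: List[List[bytes]] = []
batcher = Batcher(sent_batches.append, max_events=3)
for i in range(7):
    batcher.add(f"event-{i}".encode())
batcher.close()
print([len(batch) for batch in sent_batches])  # [3, 3, 1]
```

In a real sender, the `send` callback would issue one PutRecords request per batch.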
Data Retention
Kinesis Data Streams retains data for 24 hours by default, with the option to extend retention up to 365 days. Data is available for consumption by multiple applications simultaneously.
Encryption
Kinesis Data Streams encrypts data at rest using AWS KMS when server-side encryption is enabled on the stream. Data in transit is encrypted using TLS, and connections to the AWS Kinesis service endpoints use HTTPS.
Error Handling
If any record in a batch fails to be written, the entire batch operation returns an error. The failed records can be identified in the PutRecords response and retried. Common failure reasons include:
- Throttling due to exceeding shard limits
- Invalid partition key
- Record size exceeding limits (1 MB per record)
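A PutRecords response reports `FailedRecordCount` and a `Records` array in the same order as the request; entries that failed carry `ErrorCode` and `ErrorMessage` instead of a sequence number. The Python sketch below shows one way to pick out the records to retry. The helper `failed_entries` is hypothetical, and the response shown is hand-written sample data rather than real service output.

```python
from typing import Any, Dict, List


def failed_entries(
    request_records: List[Dict[str, Any]], response: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Return the request records whose matching response entry has an ErrorCode."""
    if response.get("FailedRecordCount", 0) == 0:
        return []
    return [
        record
        for record, result in zip(request_records, response["Records"])
        if "ErrorCode" in result
    ]


request = [
    {"Data": b"a", "PartitionKey": "default"},
    {"Data": b"b", "PartitionKey": "default"},
    {"Data": b"c", "PartitionKey": "default"},
]
# Sample response: the second record was throttled.
response = {
    "FailedRecordCount": 1,
    "Records": [
        {"SequenceNumber": "1", "ShardId": "shardId-000000000000"},
        {"ErrorCode": "ProvisionedThroughputExceededException", "ErrorMessage": "Rate exceeded"},
        {"SequenceNumber": "2", "ShardId": "shardId-000000000000"},
    ],
}

retry = failed_entries(request, response)
print([record["Data"] for record in retry])  # [b'b']
```

Retrying only the failed entries avoids writing duplicates of the records that already succeeded.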
Integration with AWS Services
Kinesis Data Streams integrates with other AWS services:
- AWS Lambda for serverless processing
- Amazon Data Firehose (formerly Amazon Kinesis Data Firehose) for delivery to data stores
- Amazon Managed Service for Apache Flink (formerly Amazon Kinesis Data Analytics) for stream processing
- Amazon CloudWatch for monitoring and alarms
Examples
Basic Configuration
The minimum configuration for a Kinesis target:
targets:
  - name: basic_kinesis
    type: amazonkinesis
    properties:
      key: "AKIAIOSFODNN7EXAMPLE"
      secret: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
      region: "us-east-1"
      stream: "application-logs"
With IAM Role
Configuration using IAM role authentication (no explicit credentials):
targets:
  - name: iam_kinesis
    type: amazonkinesis
    properties:
      region: "us-east-1"
      stream: "application-logs"
When using IAM role authentication, ensure the EC2 instance, ECS task, or Lambda function has an IAM role with appropriate Kinesis permissions attached.
With Custom Partition Key
Configuration with a custom partition key for better distribution:
targets:
  - name: distributed_kinesis
    type: amazonkinesis
    properties:
      key: "AKIAIOSFODNN7EXAMPLE"
      secret: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
      region: "us-east-1"
      stream: "distributed-logs"
      partition_key: "server-01"