Azure Blob Storage
Synopsis
The Azure Blob Storage device reads files from an Azure Blob Storage container. It learns about new blobs by dequeuing notification messages from an Azure Storage Queue.
Schema
- id: <numeric>
  name: <string>
  description: <string>
  type: azblob
  tags: <string[]>
  pipelines: <pipeline[]>
  status: <boolean>
  properties:
    container_name: <string>
    connection_string: <string>
    tenant_id: <string>
    client_id: <string>
    client_secret: <string>
    account: <string>
    blob_size: <numeric>
Configuration
Device
| Field | Required | Default | Description |
|---|---|---|---|
| id | Y | - | Unique numeric identifier |
| name | Y | - | Device name |
| description | N | - | Optional description |
| type | Y | - | Device type identifier (must be azblob) |
| tags | N | - | Array of labels for categorization |
| pipelines | N | - | Array of preprocessing pipeline references |
| status | N | true | Enable/disable the device |
Connection
The device authenticates with either a connection string or a service principal combined with the storage account name. container_name is required in both cases.
| Field | Required | Default | Description |
|---|---|---|---|
| container_name | Y | - | Azure Storage Queue name used for blob notifications |
| connection_string | Y* | - | Azure Storage connection string for authentication |
| tenant_id | Y* | - | Azure tenant ID for service principal authentication |
| client_id | Y* | - | Azure client ID for service principal authentication |
| client_secret | Y* | - | Azure client secret for service principal authentication |
| account | Y* | - | Azure storage account name for service principal authentication |
| blob_size | N | 100000000 | Maximum blob size in bytes to process (100 MB) |
* = Conditionally required: either connection_string or the combination of tenant_id, client_id, client_secret, and account must be provided.
Details
IAM Permissions
When using service principal authentication, the following Azure RBAC roles are required:
| Azure Role | Scope | Purpose |
|---|---|---|
| Storage Blob Data Reader | Storage Account or Container | Read blobs and blob properties |
| Storage Queue Data Message Processor | Storage Account or Queue | Dequeue and delete queue messages |
The device validates connectivity at startup by reading blob service properties and queue metadata.
Queue-Based Notification
The device uses an Azure Storage Queue as a notification mechanism. The queue name is specified by container_name. When a blob lands in the storage account, Azure emits a queue message pointing to the blob. The device dequeues these messages and downloads the referenced blobs for processing.
Authentication
When connection_string is provided, the device builds both the blob and queue clients directly from the connection string. When account is provided instead, the device constructs clients using the service principal credentials and builds URLs of the form https://<account>.blob.core.windows.net/ and https://<account>.queue.core.windows.net/<container_name>/.
Examples
Connection String Authentication
Connecting with an Azure Storage connection string...
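A minimal sketch of such a configuration, following the schema above. The id, name, queue name, and the account name and key inside the connection string are placeholder assumptions, not values from a real deployment:

```yaml
# Hypothetical device entry; every value below is a placeholder.
- id: 1
  name: azblob_connection_string
  type: azblob
  properties:
    # Name of the Azure Storage Queue that carries blob notifications
    container_name: "blob-notifications"
    # Connection string used to build both the blob and queue clients
    connection_string: "DefaultEndpointsProtocol=https;AccountName=<account-name>;AccountKey=<account-key>;EndpointSuffix=core.windows.net"
```

Because connection_string is set, the service principal fields (tenant_id, client_id, client_secret, account) are left out.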
Service Principal Authentication
Connecting with service principal credentials...
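A minimal sketch of a service principal configuration, following the schema above. The id, name, GUIDs, secret, account name, and queue name are placeholder assumptions:

```yaml
# Hypothetical device entry; every value below is a placeholder.
- id: 2
  name: azblob_service_principal
  type: azblob
  properties:
    # Name of the Azure Storage Queue that carries blob notifications
    container_name: "blob-notifications"
    # All four fields are needed when connection_string is not set
    tenant_id: "00000000-0000-0000-0000-000000000000"
    client_id: "11111111-1111-1111-1111-111111111111"
    client_secret: "<client-secret>"
    account: "mystorageaccount"
```

With these placeholder values, the URL forms described under Authentication would resolve to https://mystorageaccount.blob.core.windows.net/ and https://mystorageaccount.queue.core.windows.net/blob-notifications/. The service principal would also need the roles listed under IAM Permissions.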
Large Blob Processing
Increasing the blob size limit for large files...
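A minimal sketch of a configuration that raises blob_size above its 100000000-byte default. The 500 MB limit, along with the id, name, queue name, and connection string contents, is a placeholder assumption chosen for illustration:

```yaml
# Hypothetical device entry; every value below is a placeholder.
- id: 3
  name: azblob_large_files
  type: azblob
  properties:
    container_name: "blob-notifications"
    connection_string: "DefaultEndpointsProtocol=https;AccountName=<account-name>;AccountKey=<account-key>;EndpointSuffix=core.windows.net"
    # Maximum blob size in bytes to process: 500000000 bytes = 500 MB
    blob_size: 500000000
```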