Azure Blob Storage

Microsoft Azure Long-Term Storage

Synopsis

The Azure Blob Storage device reads files from an Azure Blob Storage container. It learns about new blobs by dequeuing notification messages from an Azure Storage Queue.

Schema

- id: <numeric>
  name: <string>
  description: <string>
  type: azblob
  tags: <string[]>
  pipelines: <pipeline[]>
  status: <boolean>
  properties:
    container_name: <string>
    connection_string: <string>
    tenant_id: <string>
    client_id: <string>
    client_secret: <string>
    account: <string>
    blob_size: <numeric>

Configuration

Device

| Field | Required | Default | Description |
| --- | --- | --- | --- |
| id | Y | | Unique numeric identifier |
| name | Y | | Device name |
| description | N | - | Optional description |
| type | Y | | Device type identifier (must be azblob) |
| tags | N | - | Array of labels for categorization |
| pipelines | N | - | Array of preprocessing pipeline references |
| status | N | true | Enable/disable the device |

Connection

The device requires either a connection string or service principal credentials together with the storage account name. container_name is required in all cases.

| Field | Required | Default | Description |
| --- | --- | --- | --- |
| container_name | Y | | Azure Storage Queue name used for blob notifications |
| connection_string | Y* | | Azure Storage connection string for authentication |
| tenant_id | Y* | | Azure tenant ID for service principal authentication |
| client_id | Y* | | Azure client ID for service principal authentication |
| client_secret | Y* | | Azure client secret for service principal authentication |
| account | Y* | | Azure storage account name for service principal authentication |
| blob_size | N | 100000000 | Maximum blob size in bytes to process (100 MB) |

* = Conditionally required: either connection_string or the combination of tenant_id, client_id, client_secret, and account must be provided.
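
The conditional requirement above can be expressed as a small pre-flight check. The following is a minimal sketch, not the device's actual validation logic: the function name `check_connection_properties` is hypothetical, and giving `connection_string` precedence when both credential sets are present is an assumption.

```python
from typing import Mapping

# Fields that must all be present for service principal authentication.
SERVICE_PRINCIPAL_FIELDS = ("tenant_id", "client_id", "client_secret", "account")


def check_connection_properties(props: Mapping[str, object]) -> str:
    """Return the authentication mode implied by props, or raise ValueError.

    Hypothetical helper for illustration only.
    """
    if not props.get("container_name"):
        raise ValueError("container_name is required in all cases")

    # Assumption: a connection string takes precedence when both are given.
    if props.get("connection_string"):
        return "connection_string"

    missing = [name for name in SERVICE_PRINCIPAL_FIELDS if not props.get(name)]
    if missing:
        raise ValueError(
            "provide connection_string or all service principal fields; missing: "
            + ", ".join(missing)
        )
    return "service_principal"
```

Running a check like this against the `properties` mapping before deployment may catch a half-filled service principal configuration early.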

Details

IAM Permissions

When using service principal authentication, the following Azure RBAC roles are required:

| Azure Role | Scope | Purpose |
| --- | --- | --- |
| Storage Blob Data Reader | Storage Account or Container | Read blobs and blob properties |
| Storage Queue Data Message Processor | Storage Account or Queue | Dequeue and delete queue messages |

The device validates connectivity at startup by reading blob service properties and queue metadata.

Queue-Based Notification

The device uses an Azure Storage Queue as a notification mechanism. The queue name is specified by container_name. When a blob lands in the storage account, Azure emits a queue message pointing to the blob. The device dequeues these messages and downloads the referenced blobs for processing.
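
To make the notification flow concrete, here is a rough sketch of how a consumer might turn one queue message into a blob URL. This page does not specify the message format, so the sketch assumes an Event Grid `Microsoft.Storage.BlobCreated` event (with `data.url` and `data.contentLength`), optionally base64-encoded. The function name `blob_url_from_message` and the choice to skip oversized blobs are illustrative assumptions, not documented device behavior.

```python
import base64
import json
from typing import Optional


def blob_url_from_message(message_text: str, max_blob_size: int = 100_000_000) -> Optional[str]:
    """Return the blob URL referenced by a queue message, or None to skip it.

    Hypothetical helper; assumes an Event Grid BlobCreated payload.
    """
    try:
        # Queue messages are often base64-encoded; try that first.
        payload = base64.b64decode(message_text, validate=True).decode("utf-8")
    except ValueError:
        # Not valid base64 (or not UTF-8): assume the message is plain JSON text.
        payload = message_text

    event = json.loads(payload)
    if event.get("eventType") != "Microsoft.Storage.BlobCreated":
        return None

    data = event.get("data", {})
    # Assumption: blobs larger than the configured blob_size are skipped.
    if data.get("contentLength", 0) > max_blob_size:
        return None
    return data.get("url")
```

The returned URL would then be handed to a blob client for download, after which the queue message can be deleted.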

Authentication

When connection_string is provided, the device builds both the blob and queue clients directly from the connection string. When account is provided instead, the device constructs clients using the service principal credentials and builds URLs of the form https://<account>.blob.core.windows.net/ and https://<account>.queue.core.windows.net/<container_name>/.
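
The URL construction described above can be sketched as follows. The function name `build_service_urls` is hypothetical; only the two URL patterns come from this page.

```python
from typing import Tuple


def build_service_urls(account: str, container_name: str) -> Tuple[str, str]:
    """Build the blob service URL and queue URL for service principal mode.

    Hypothetical helper that mirrors the URL forms given in the text.
    """
    blob_url = f"https://{account}.blob.core.windows.net/"
    queue_url = f"https://{account}.queue.core.windows.net/{container_name}/"
    return blob_url, queue_url
```

For example, `build_service_urls("enterprisestorage", "blob-notifications")` yields the two endpoints that the service principal example further down would resolve to.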

Examples

Connection String Authentication

Connecting with an Azure Storage connection string...

- id: 1
  name: blob-reader
  type: azblob
  properties:
    connection_string: "DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=key123;EndpointSuffix=core.windows.net"
    container_name: "blob-notifications"

Service Principal Authentication

Connecting with service principal credentials...

- id: 2
  name: enterprise-blob-reader
  type: azblob
  properties:
    tenant_id: "12345678-1234-1234-1234-123456789abc"
    client_id: "87654321-4321-4321-4321-cba987654321"
    client_secret: "your-client-secret"
    account: "enterprisestorage"
    container_name: "blob-notifications"

Large Blob Processing

Increasing the blob size limit for large files...

- id: 3
  name: large-blob-reader
  type: azblob
  properties:
    connection_string: "DefaultEndpointsProtocol=https;AccountName=datawarehouse;AccountKey=key456"
    container_name: "blob-notifications"
    blob_size: 536870912
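
Because blob_size is given in bytes, it can help to compute the value rather than type it by hand. A minimal sketch follows; the helper name `mib_to_bytes` is hypothetical. Note that the example value above is 512 MiB in binary units, while the documented default of 100000000 is 100 MB in decimal units.

```python
# Documented default for blob_size: 100 MB expressed in decimal bytes.
DEFAULT_BLOB_SIZE = 100_000_000


def mib_to_bytes(mib: int) -> int:
    """Convert mebibytes (1024 * 1024 bytes each) to bytes."""
    return mib * 1024 * 1024


# 512 MiB, the value used in the large blob example.
LARGE_BLOB_SIZE = mib_to_bytes(512)
```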