apache_beam.io.azure.blobstorageio module

Azure Blob Storage client.

apache_beam.io.azure.blobstorageio.parse_azfs_path(azfs_path, blob_optional=False, get_account=False)

Return the storage account, the container and blob names of the given azfs:// path.
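
To make the shape of an azfs:// path concrete, here is a minimal sketch of such a split. It is not the library's implementation: the helper name split_azfs_path and the simplified regular expression are assumptions for illustration, and real Azure naming rules are stricter.

```python
import re

# Simplified pattern, for illustration only. Azure's actual naming rules
# for storage accounts and containers are stricter than this.
_AZFS_PATTERN = re.compile(r"^azfs://([^/]+)/([^/]+)/(.*)$")


def split_azfs_path(azfs_path, blob_optional=False):
    """Split an azfs:// path into (storage_account, container, blob).

    Hypothetical helper, not part of apache_beam.
    """
    match = _AZFS_PATTERN.match(azfs_path)
    # Reject paths that do not match, or that lack a blob name when one
    # is required.
    if match is None or (not blob_optional and not match.group(3)):
        raise ValueError(
            f"Path {azfs_path!r} must be "
            "azfs://<storage-account>/<container>/<path>.")
    return match.group(1), match.group(2), match.group(3)


parts = split_azfs_path("azfs://myaccount/mycontainer/dir/file.txt")
print(parts)
```

With blob_optional=True, the sketch accepts a container-only path such as azfs://myaccount/mycontainer/ and returns an empty blob name.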

apache_beam.io.azure.blobstorageio.get_azfs_url(storage_account, container, blob='')

Returns the url in the form of https://account.blob.core.windows.net/container/blob-name
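
The URL form described above can be sketched with plain string formatting. The helper name build_blob_url is hypothetical and stands in for illustration only. It does not reproduce the library's code and performs no escaping or validation.

```python
def build_blob_url(storage_account, container, blob=""):
    """Build an https URL for a blob (hypothetical helper, no validation)."""
    return (
        f"https://{storage_account}.blob.core.windows.net/"
        f"{container}/{blob}")


url = build_blob_url("myaccount", "mycontainer", "dir/file.txt")
print(url)
```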

class apache_beam.io.azure.blobstorageio.Blob(etag, name, last_updated, size, mime_type)

Bases: object

A Blob in Azure Blob Storage.

exception apache_beam.io.azure.blobstorageio.BlobStorageIOError

Bases: OSError, PermanentException

Blob Storage I/O error that should not be retried.

exception apache_beam.io.azure.blobstorageio.BlobStorageError(message=None, code=None)

Bases: Exception

Blob Storage client error.

class apache_beam.io.azure.blobstorageio.BlobStorageIO(client=None, pipeline_options=None)

Bases: object

Azure Blob Storage I/O client.

open(filename, mode='r', read_buffer_size=16777216, mime_type='application/octet-stream')

Open an Azure Blob Storage file path for reading or writing.

  • filename (str) – Azure Blob Storage file path in the form azfs://<storage-account>/<container>/<path>.

  • mode (str) – 'r' for reading or 'w' for writing.

  • read_buffer_size (int) – Buffer size to use during read operations.

  • mime_type (str) – Mime type to set for write operations.


Returns: Azure Blob Storage file object.

Raises: ValueError – Invalid open file mode.

copy(src, dest)

Copies a single Azure Blob Storage blob from src to dest.

  • src – Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].

  • dest – Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].


Raises: TimeoutError – on timeout.

copy_tree(src, dest)

Copies the given Azure Blob Storage directory and its contents recursively from src to dest.

  • src – Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].

  • dest – Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].


Returns: List of tuples of (src, dest, exception), where exception is None if the operation succeeded or the relevant exception if the operation failed.


Copies the given Azure Blob Storage blobs from src to dest. This can handle directory or file paths.


src_dest_pairs – List of (src, dest) tuples of azfs://<storage-account>/<container>/[name] file paths to copy from src to dest.


Returns: List of tuples of (src, dest, exception) in the same order as the src_dest_pairs argument, where exception is None if the operation succeeded or the relevant exception if the operation failed.
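
Because failures are reported in the returned list, callers need to inspect each tuple. The following is a minimal sketch of separating successes from failures. The helper name split_results and the sample data are made up for illustration, and no Azure call is made.

```python
def split_results(results):
    """Partition (src, dest, exception) tuples into successes and failures.

    Hypothetical helper for illustration; not part of apache_beam.
    """
    succeeded, failed = [], []
    for src, dest, exception in results:
        if exception is None:
            succeeded.append((src, dest))
        else:
            failed.append((src, dest, exception))
    return succeeded, failed


# Made-up results in the documented (src, dest, exception) shape.
sample_results = [
    ("azfs://acct/in/a.txt", "azfs://acct/out/a.txt", None),
    ("azfs://acct/in/b.txt", "azfs://acct/out/b.txt", OSError("copy failed")),
]
succeeded, failed = split_results(sample_results)
for src, dest, exception in failed:
    print(f"{src} -> {dest} failed: {exception}")
```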

rename(src, dest)

Renames the given Azure Blob Storage blob from src to dest.

  • src – Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].

  • dest – Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].


Renames the given Azure Blob Storage blobs from src to dest.


src_dest_pairs – List of (src, dest) tuples of azfs://<storage-account>/<container>/[name] file paths to rename from src to dest.

Returns: List of tuples of (src, dest, exception) in the same order as the src_dest_pairs argument, where exception is None if the operation succeeded or the relevant exception if the operation failed.


Returns whether the given Azure Blob Storage blob exists.


path – Azure Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].


Returns the size of a single Blob Storage blob.

This method does not perform glob expansion. Hence the given path must be for a single Blob Storage blob.

Returns: size of the Blob Storage blob in bytes.


Returns the last updated epoch time of a single Azure Blob Storage blob.

This method does not perform glob expansion. Hence the given path must be for a single Azure Blob Storage blob.

Returns: last updated time of the Azure Blob Storage blob in seconds.


Looks up the checksum of an Azure Blob Storage blob.


path – Azure Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].


Deletes a single blob at the given Azure Blob Storage path.


path – Azure Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].


Deletes the given Azure Blob Storage blobs. This can handle directory or file paths.


paths – list of Azure Blob Storage paths in the form azfs://<storage-account>/<container>/[name] that give the file blobs to be deleted.


Returns: List of tuples of (path, exception) for the paths argument, where exception is None if the operation succeeded or the relevant exception if the operation failed.


Deletes all blobs under the given Azure Blob Storage virtual directory.


path – Azure Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name] (ending with a “/”).


Returns: List of tuples of (path, exception), where each path is a blob under the given root. exception is None if the operation succeeded or the relevant exception if the operation failed.


Deletes the given Azure Blob Storage blobs.


paths – list of Azure Blob Storage paths in the form azfs://<storage-account>/<container>/[name] that give the file blobs to be deleted.


Returns: List of tuples of (path, exception) for the paths argument, where exception is None if the operation succeeded or the relevant exception if the operation failed.

list_prefix(path, with_metadata=False)

Lists files matching the prefix.

  • path – Azure Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].

  • with_metadata – Experimental. Specifies whether file metadata is returned.


Returns: If with_metadata is False: dict of file name -> size; if with_metadata is True: dict of file name -> tuple(size, timestamp).
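
Since the value type of the returned dict depends on with_metadata, code that consumes a listing may want to handle both shapes. The sketch below sums sizes from either form. The helper name total_size and the sample listings are assumptions for illustration, and no Azure call is made.

```python
def total_size(listing):
    """Sum blob sizes from a listing dict of either documented shape.

    Accepts name -> size, or name -> (size, timestamp).
    Hypothetical helper for illustration; not part of apache_beam.
    """
    total = 0
    for value in listing.values():
        # With metadata, the size is the first element of the tuple.
        size = value[0] if isinstance(value, tuple) else value
        total += size
    return total


# Made-up listings in the two documented shapes.
sizes_only = {"azfs://acct/c/a.txt": 10, "azfs://acct/c/b.txt": 32}
with_meta = {
    "azfs://acct/c/a.txt": (10, 1700000000.0),
    "azfs://acct/c/b.txt": (32, 1700000100.0),
}
print(total_size(sizes_only), total_size(with_meta))
```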

list_files(path, with_metadata=False)

Lists files matching the prefix.

  • path – Azure Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].

  • with_metadata – Experimental. Specifies whether file metadata is returned.


Returns: If with_metadata is False: generator of tuple(file name, size); if with_metadata is True: generator of tuple(file name, tuple(size, timestamp)).

class apache_beam.io.azure.blobstorageio.BlobStorageDownloader(client, path, buffer_size)

Bases: Downloader

property size

get_range(start, end)
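
A downloader that exposes size and get_range(start, end) is typically read in buffer-sized chunks. The sketch below only computes the byte ranges such a reader might request. It assumes end is exclusive, which this page does not state, and the helper name iter_ranges is hypothetical.

```python
def iter_ranges(size, buffer_size):
    """Yield (start, end) byte ranges covering [0, size).

    Assumes end is exclusive. Hypothetical helper for illustration;
    not part of apache_beam.
    """
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    for start in range(0, size, buffer_size):
        # The last range is clipped so it never runs past the blob size.
        yield start, min(start + buffer_size, size)


ranges = list(iter_ranges(10, 4))
print(ranges)
```

Each yielded pair could then be passed to a get_range-style call, subject to the exclusive-end assumption above.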

class apache_beam.io.azure.blobstorageio.BlobStorageUploader(client, path, mime_type='application/octet-stream')

Bases: Uploader
