apache_beam.io.azure.blobstorageio module
Azure Blob Storage client.
- apache_beam.io.azure.blobstorageio.parse_azfs_path(azfs_path, blob_optional=False, get_account=False)[source]
Return the storage account, the container and blob names of the given azfs:// path.
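As a rough illustration of how an azfs:// path decomposes, the sketch below splits a path into its three parts with a regular expression. It is a standalone approximation, not Beam's implementation: the helper name `split_azfs_path` and its simplified pattern are assumptions made for this example, and the real function applies its own validation and honours `blob_optional` and `get_account`.

```python
import re

# Hypothetical, simplified pattern: azfs://<storage-account>/<container>/<blob>.
# The real parse_azfs_path may enforce stricter naming rules.
_AZFS_PATTERN = re.compile(r"^azfs://([^/]+)/([^/]+)/(.*)$")


def split_azfs_path(azfs_path):
    """Return (storage_account, container, blob) for an azfs:// path."""
    match = _AZFS_PATTERN.match(azfs_path)
    if match is None:
        raise ValueError(f"Not a valid azfs:// path: {azfs_path!r}")
    return match.group(1), match.group(2), match.group(3)


account, container, blob = split_azfs_path(
    "azfs://myaccount/mycontainer/dir/file.txt")
```

Here the blob name keeps any "/" separators it contains, since Blob Storage directories are virtual and form part of the blob name.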
- apache_beam.io.azure.blobstorageio.get_azfs_url(storage_account, container, blob='')[source]
Returns the URL in the form https://account.blob.core.windows.net/container/blob-name
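The URL form described above can be sketched with a plain format string. The helper name `build_blob_url` is hypothetical and used only for illustration; it mirrors the documented URL shape and does not call into Beam or Azure.

```python
def build_blob_url(storage_account, container, blob=""):
    """Build an https URL of the documented form for a blob (illustrative only)."""
    return f"https://{storage_account}.blob.core.windows.net/{container}/{blob}"


url = build_blob_url("myaccount", "mycontainer", "dir/file.txt")
```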
- class apache_beam.io.azure.blobstorageio.Blob(etag, name, last_updated, size, mime_type)[source]
Bases: object
A Blob in Azure Blob Storage.
- exception apache_beam.io.azure.blobstorageio.BlobStorageIOError[source]
Bases: OSError, PermanentException
Blob Storage I/O error that should not be retried.
- exception apache_beam.io.azure.blobstorageio.BlobStorageError(message=None, code=None)[source]
Bases: Exception
Blob Storage client error.
- class apache_beam.io.azure.blobstorageio.BlobStorageIO(client=None, pipeline_options=None)[source]
Bases: object
Azure Blob Storage I/O client.
- open(filename, mode='r', read_buffer_size=16777216, mime_type='application/octet-stream')[source]
Open an Azure Blob Storage file path for reading or writing.
- Parameters:
- Returns:
Azure Blob Storage file object.
- Raises:
ValueError – Invalid open file mode.
- copy(src, dest)[source]
Copies a single Azure Blob Storage blob from src to dest.
- Parameters:
src – Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
dest – Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
- Raises:
TimeoutError – on timeout.
- copy_tree(src, dest)[source]
Copies the given Azure Blob Storage directory and its contents recursively from src to dest.
- Parameters:
src – Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
dest – Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
- Returns:
List of tuples of (src, dest, exception) where exception is None if the operation succeeded or the relevant exception if the operation failed.
- copy_paths(src_dest_pairs)[source]
Copies the given Azure Blob Storage blobs from src to dest. This can handle directory or file paths.
- Parameters:
src_dest_pairs – List of (src, dest) tuples of azfs://<storage-account>/<container>/[name] file paths to copy from src to dest.
- Returns:
List of tuples of (src, dest, exception) in the same order as the src_dest_pairs argument, where exception is None if the operation succeeded or the relevant exception if the operation failed.
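Because failures are reported in the returned list rather than raised, callers usually need to inspect each tuple. The sketch below shows one way to do that; the helper `split_results` and the sample data are made up for illustration, and no Azure call is made.

```python
def split_results(results):
    """Partition (src, dest, exception) tuples into successes and failures."""
    succeeded = [(src, dest) for src, dest, exc in results if exc is None]
    failed = [(src, dest, exc) for src, dest, exc in results if exc is not None]
    return succeeded, failed


# Made-up results in the documented (src, dest, exception) shape.
sample_results = [
    ("azfs://acct/in/a.txt", "azfs://acct/out/a.txt", None),
    ("azfs://acct/in/b.txt", "azfs://acct/out/b.txt", OSError("copy failed")),
]
succeeded, failed = split_results(sample_results)
```

The same pattern applies to any method in this module that returns (src, dest, exception) tuples.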
- rename(src, dest)[source]
Renames the given Azure Blob Storage blob from src to dest.
- Parameters:
src – Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
dest – Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
- rename_files(src_dest_pairs)[source]
Renames the given Azure Blob Storage blobs from src to dest.
- Parameters:
src_dest_pairs – List of (src, dest) tuples of azfs://<storage-account>/<container>/[name] file paths to rename from src to dest.
- Returns:
List of tuples of (src, dest, exception) in the same order as the src_dest_pairs argument, where exception is None if the operation succeeded or the relevant exception if the operation failed.
- exists(path)[source]
Returns whether the given Azure Blob Storage blob exists.
- Parameters:
path – Azure Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
- size(path)[source]
Returns the size of a single Blob Storage blob.
This method does not perform glob expansion. Hence the given path must be for a single Blob Storage blob.
Returns: size of the Blob Storage blob in bytes.
- last_updated(path)[source]
Returns the last updated epoch time of a single Azure Blob Storage blob.
This method does not perform glob expansion. Hence the given path must be for a single Azure Blob Storage blob.
Returns: last updated time of the Azure Blob Storage blob in seconds.
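Since the value is an epoch time in seconds, it can be converted to a timezone-aware datetime with the standard library. This is a minimal sketch; the helper name `epoch_seconds_to_utc` is hypothetical and the input value is a stand-in for what last_updated would return.

```python
from datetime import datetime, timezone


def epoch_seconds_to_utc(seconds):
    """Convert seconds since the Unix epoch to an aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# 86400 seconds is exactly one day after the epoch.
stamp = epoch_seconds_to_utc(86400)
```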
- checksum(path)[source]
Looks up the checksum of an Azure Blob Storage blob.
- Parameters:
path – Azure Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
- delete(path)[source]
Deletes a single blob at the given Azure Blob Storage path.
- Parameters:
path – Azure Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
- delete_paths(paths)[source]
Deletes the given Azure Blob Storage blobs. This can handle directory or file paths.
- Parameters:
paths – list of Azure Blob Storage paths in the form azfs://<storage-account>/<container>/[name] that give the file blobs to be deleted.
- Returns:
List of tuples of (path, exception) for the paths argument, where exception is None if the operation succeeded or the relevant exception if the operation failed.
- delete_tree(root)[source]
Deletes all blobs under the given Azure Blob Storage virtual directory.
- Parameters:
root – Azure Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name] (ending with a “/”).
- Returns:
List of tuples of (path, exception), where each path is a blob under the given root. exception is None if the operation succeeded or the relevant exception if the operation failed.
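A short sketch of the two conventions described above: the root must end with a "/", and the result is a list of (path, exception) tuples. The helpers `as_directory` and `failed_paths` and the sample results are assumptions for illustration only; nothing here contacts Azure.

```python
def as_directory(path):
    """Append a trailing "/" so the path denotes a virtual directory."""
    return path if path.endswith("/") else path + "/"


def failed_paths(results):
    """Return the paths whose exception entry is not None."""
    return [path for path, exc in results if exc is not None]


root = as_directory("azfs://acct/container/logs")

# Made-up results in the documented (path, exception) shape.
sample = [
    (root + "a.log", None),
    (root + "b.log", OSError("delete failed")),
]
failures = failed_paths(sample)
```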
- delete_files(paths)[source]
Deletes the given Azure Blob Storage blobs.
- Parameters:
paths – list of Azure Blob Storage paths in the form azfs://<storage-account>/<container>/[name] that give the file blobs to be deleted.
- Returns:
List of tuples of (path, exception) in the same order as the paths argument, where exception is None if the operation succeeded or the relevant exception if the operation failed.
- list_prefix(path, with_metadata=False)[source]
Lists files matching the prefix.
- Parameters:
path – Azure Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
with_metadata – Experimental. Whether to return file metadata.
- Returns:
If with_metadata is False: dict of file name -> size; if with_metadata is True: dict of file name -> tuple(size, timestamp).
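The two result shapes can be handled as sketched below, here by summing blob sizes. The helper `total_size` and both sample mappings are hypothetical stand-ins for what list_prefix would return; the timestamps are arbitrary values.

```python
def total_size(listing, with_metadata=False):
    """Sum blob sizes from a list_prefix-style mapping."""
    if with_metadata:
        # Values are (size, timestamp) tuples.
        return sum(size for size, _timestamp in listing.values())
    # Values are plain sizes.
    return sum(listing.values())


# Made-up listings in the two documented shapes.
plain = {"azfs://acct/c/a.txt": 10, "azfs://acct/c/b.txt": 32}
detailed = {
    "azfs://acct/c/a.txt": (10, 1700000000.0),
    "azfs://acct/c/b.txt": (32, 1700000100.0),
}
```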
- list_files(path, with_metadata=False)[source]
Lists files matching the prefix.
- Parameters:
path – Azure Blob Storage file path pattern in the form azfs://<storage-account>/<container>/[name].
with_metadata – Experimental. Whether to return file metadata.
- Returns:
If with_metadata is False: generator of tuple(file name, size); if with_metadata is True: generator of tuple(file name, tuple(size, timestamp)).
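Because the result is a generator, entries can be processed one at a time without building the full listing in memory. The sketch below picks the most recently updated file from entries in the with_metadata=True shape; `newest_file` and `fake_listing` are hypothetical and stand in for a real call.

```python
def newest_file(entries):
    """Return the file name with the largest timestamp, or None if empty."""
    newest_name, newest_time = None, None
    for name, (_size, timestamp) in entries:
        if newest_time is None or timestamp > newest_time:
            newest_name, newest_time = name, timestamp
    return newest_name


def fake_listing():
    """Stand-in generator yielding (file name, (size, timestamp)) tuples."""
    yield ("azfs://acct/c/a.txt", (10, 100.0))
    yield ("azfs://acct/c/b.txt", (32, 300.0))
    yield ("azfs://acct/c/c.txt", (5, 200.0))


latest = newest_file(fake_listing())
```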
- class apache_beam.io.azure.blobstorageio.BlobStorageDownloader(client, path, buffer_size)[source]
Bases: Downloader
- property size