apache_beam.ml.gcp.visionml module
A connector for sending API requests to the GCP Vision API.
- class apache_beam.ml.gcp.visionml.AnnotateImage(features, retry=None, timeout=120, max_batch_size=None, min_batch_size=None, client_options=None, context_side_input=None, metadata=None)[source]
Bases: PTransform
A PTransform for annotating images using the GCP Vision API. ref: https://cloud.google.com/vision/docs/
Batches elements together using the util.BatchElements PTransform and sends each batch of elements to the GCP Vision API. Each element is a Union[str, bytes]: either a URI (e.g. a GCS URI) or base64-encoded image data as bytes. Accepts an AsDict side input that maps each image to an image context.
- Parameters:
features – (List[vision.Feature]) Required. The Vision API features to detect.
retry – (google.api_core.retry.Retry) Optional. A retry object used to retry requests. If None is specified (default), requests will not be retried.
timeout – (float) Optional. The time in seconds to wait for the response from the Vision API. Default is 120.
max_batch_size – (int) Optional. Maximum number of images to batch in the same request to the Vision API. Default is 5 (which is also the Vision API max). This parameter is primarily intended for testing.
min_batch_size – (int) Optional. Minimum number of images to batch in the same request to the Vision API. Default is None. This parameter is primarily intended for testing.
client_options – (Union[dict, google.api_core.client_options.ClientOptions]) Optional. Client options used to set user options on the client. API Endpoint should be set through client_options.
context_side_input – (beam.pvalue.AsDict) Optional. An AsDict of a PCollection to be passed to the _ImageAnnotateFn as the image context mapping containing additional image context and/or feature-specific parameters. Example usage:

    # Each value is either a dict or a vision.ImageContext.
    image_contexts = [
        ('gs://cloud-samples-data/vision/ocr/sign.jpg', vision.ImageContext()),
        ('gs://cloud-samples-data/vision/ocr/sign.jpg', vision.ImageContext()),
    ]

    context_side_input = (
        p
        | "Image contexts" >> beam.Create(image_contexts)
    )

    visionml.AnnotateImage(
        features,
        context_side_input=beam.pvalue.AsDict(context_side_input))

metadata – (Optional[Sequence[Tuple[str, str]]]): Optional. Additional metadata that is provided to the method.
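The element and side-input semantics described above can be illustrated without the Vision client. The following is a minimal pure-Python sketch, not the connector's implementation: build_request and the plain-dict request layout are hypothetical stand-ins for the request objects the transform builds internally. It assumes that a str element is treated as a URI, that a bytes element is treated as image content, and that the image context is looked up by element in the side-input mapping. The language_hints value is only illustrative.

```python
from typing import Dict, List, Optional, Union

ImageElement = Union[str, bytes]


def build_request(
    element: ImageElement,
    features: List[str],
    context_map: Optional[Dict[ImageElement, dict]] = None,
) -> dict:
    """Hypothetical helper: assemble one annotation request as a plain dict."""
    # Assumption: a str element is a URI, a bytes element is image data.
    if isinstance(element, str):
        image = {"source": {"image_uri": element}}
    elif isinstance(element, bytes):
        image = {"content": element}
    else:
        raise TypeError("element must be str or bytes")
    # Elements that are missing from the mapping get no image context.
    context = context_map.get(element) if context_map else None
    return {"image": image, "features": features, "image_context": context}


uri = "gs://cloud-samples-data/vision/ocr/sign.jpg"
contexts = {uri: {"language_hints": ["en"]}}  # illustrative context
request = build_request(uri, ["TEXT_DETECTION"], contexts)
```

Keying the mapping by the element itself is what lets a single side input serve every image in the PCollection; images without an entry are simply annotated without a context in this sketch.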
- MAX_BATCH_SIZE = 5
- MIN_BATCH_SIZE = 1
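To make the role of these two constants concrete, here is a rough pure-Python sketch of how batch sizes might resolve and how elements end up grouped. It is an illustration under stated assumptions, not Beam's code: resolve_batch_sizes and chunk are hypothetical helpers, the fallback to the class constants and the range check are assumed behavior, and the real util.BatchElements transform may choose batch sizes anywhere between the minimum and the maximum rather than always filling a batch.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

MAX_BATCH_SIZE = 5  # mirrors AnnotateImage.MAX_BATCH_SIZE
MIN_BATCH_SIZE = 1  # mirrors AnnotateImage.MIN_BATCH_SIZE


def resolve_batch_sizes(
    min_batch_size: Optional[int] = None,
    max_batch_size: Optional[int] = None,
) -> Tuple[int, int]:
    """Hypothetical helper: fall back to the class constants when unset."""
    low = MIN_BATCH_SIZE if min_batch_size is None else min_batch_size
    high = MAX_BATCH_SIZE if max_batch_size is None else max_batch_size
    # Assumed validation: stay within the Vision API limit of 5 images.
    if not 1 <= low <= high <= MAX_BATCH_SIZE:
        raise ValueError(f"invalid batch sizes: min={low}, max={high}")
    return low, high


def chunk(elements: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive batches holding at most `size` elements."""
    batch: List[str] = []
    for element in elements:
        batch.append(element)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


_, max_size = resolve_batch_sizes()
uris = [f"gs://my-bucket/image-{i}.jpg" for i in range(12)]  # hypothetical URIs
batch_lengths = [len(b) for b in chunk(uris, max_size)]
```

With twelve inputs and the default maximum of 5, this sketch produces two full batches and one batch of two, so three requests would be sent.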
- class apache_beam.ml.gcp.visionml.AnnotateImageWithContext(features, retry=None, timeout=120, max_batch_size=None, min_batch_size=None, client_options=None, metadata=None)[source]
Bases: AnnotateImage
A PTransform for annotating images using the GCP Vision API. ref: https://cloud.google.com/vision/docs/
Batches elements together using the util.BatchElements PTransform and sends each batch of elements to the GCP Vision API. Each element is a tuple of:
(Union[str, bytes], Optional[vision.ImageContext])
where the former is either a URI (e.g. a GCS URI) or base64-encoded image data as bytes.
- Parameters:
features – (List[vision.Feature]) Required. The Vision API features to detect.
retry – (google.api_core.retry.Retry) Optional. A retry object used to retry requests. If None is specified (default), requests will not be retried.
timeout – (float) Optional. The time in seconds to wait for the response from the Vision API. Default is 120.
max_batch_size – (int) Optional. Maximum number of images to batch in the same request to the Vision API. Default is 5 (which is also the Vision API max). This parameter is primarily intended for testing.
min_batch_size – (int) Optional. Minimum number of images to batch in the same request to the Vision API. Default is None. This parameter is primarily intended for testing.
client_options – (Union[dict, google.api_core.client_options.ClientOptions]) Optional. Client options used to set user options on the client. API Endpoint should be set through client_options.
metadata – (Optional[Sequence[Tuple[str, str]]]): Optional. Additional metadata that is provided to the method.
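The practical difference from AnnotateImage is the element shape: the context travels with each image instead of arriving through a side input. As a small illustration, the sketch below converts a list of images plus a context mapping into the tuple elements described above. The helper with_contexts is hypothetical and not part of the module, and plain dicts stand in for vision.ImageContext so the example runs without the Vision client installed.

```python
from typing import Dict, List, Optional, Tuple, Union

ImageElement = Union[str, bytes]
# Stand-in for (Union[str, bytes], Optional[vision.ImageContext]).
ContextElement = Tuple[ImageElement, Optional[dict]]


def with_contexts(
    images: List[ImageElement],
    context_map: Dict[ImageElement, dict],
) -> List[ContextElement]:
    """Hypothetical helper: pair each image with its context, or None."""
    return [(image, context_map.get(image)) for image in images]


uri = "gs://cloud-samples-data/vision/ocr/sign.jpg"
raw = b"raw-image-bytes"  # placeholder bytes, not a real image
elements = with_contexts(
    [uri, raw],
    {uri: {"language_hints": ["en"]}},  # illustrative context
)
```

Tuples built this way could then be fed to the transform with something like beam.Create(elements); images without a context carry None in the second position.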