apache_beam.ml.transforms.embeddings.tensorflow_hub module

class apache_beam.ml.transforms.embeddings.tensorflow_hub.TensorflowHubTextEmbeddings(columns: list[str], hub_url: str, preprocessing_url: str | None = None, **kwargs)

Bases: EmbeddingsManager

Embedding config for TensorFlow Hub models. This config can be used with MLTransform to embed text data. Models are loaded using the RunInference PTransform with the help of a ModelHandler.

Parameters:
  • columns – The columns containing the text to be embedded.

  • hub_url – The URL of the TensorFlow Hub model.

  • preprocessing_url – Optional. The URL of a preprocessing model. If provided, the preprocessing model is applied to the text before the text is fed to the main model.

  • min_batch_size – The minimum batch size to be used for inference.

  • max_batch_size – The maximum batch size to be used for inference.

  • large_model – Whether to share the model across processes.
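
A minimal sketch of how this config might be wired into MLTransform, assuming apache_beam is installed together with the TensorFlow packages this module depends on. The helper names make_rows and embed_text, the "text" column name, and the temporary artifact directory are illustrative choices, not part of this module. hub_url is left as an argument so that no particular model is implied.

```python
import tempfile


def make_rows(texts, column="text"):
    """Wrap raw strings in dicts keyed by the column to be embedded."""
    return [{column: t} for t in texts]


def embed_text(texts, hub_url, column="text"):
    """Embed `texts` with a TensorFlow Hub model and print each output element.

    Hypothetical helper: the Beam imports live inside the function so this
    file can be loaded even when Beam or TensorFlow is not installed.
    """
    import apache_beam as beam
    from apache_beam.ml.transforms.base import MLTransform
    from apache_beam.ml.transforms.embeddings.tensorflow_hub import (
        TensorflowHubTextEmbeddings,
    )

    config = TensorflowHubTextEmbeddings(columns=[column], hub_url=hub_url)
    # MLTransform needs a location for its artifacts; using a temporary
    # directory is an assumption made for this sketch.
    artifact_location = tempfile.mkdtemp()

    with beam.Pipeline() as pipeline:
        _ = (
            pipeline
            | "CreateRows" >> beam.Create(make_rows(texts, column))
            | "Embed" >> MLTransform(
                write_artifact_location=artifact_location
            ).with_transform(config)
            | "Print" >> beam.Map(print)
        )
```

Calling embed_text(["hello world"], hub_url=...) with the URL of a text-embedding model would run the pipeline locally; the model is downloaded at run time, so network access is needed.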

get_model_handler() → ModelHandler
get_ptransform_for_processing(**kwargs) → PTransform

Returns a RunInference object that is used to run inference on the text input using _TextEmbeddingHandler.
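
The following sketch shows one way the optional parameters listed above might be combined and the resulting PTransform obtained directly. The helpers text_embedding_kwargs and build_text_ptransform are hypothetical names introduced for illustration. The sketch assumes that min_batch_size and max_batch_size are accepted as keyword arguments, as the parameter list suggests, and that apache_beam and its TensorFlow dependencies are installed when build_text_ptransform is called.

```python
def text_embedding_kwargs(columns, hub_url, preprocessing_url=None,
                          min_batch_size=None, max_batch_size=None):
    """Collect constructor arguments, leaving out any option that is unset."""
    kwargs = {"columns": list(columns), "hub_url": hub_url}
    optional = {
        "preprocessing_url": preprocessing_url,
        "min_batch_size": min_batch_size,
        "max_batch_size": max_batch_size,
    }
    for name, value in optional.items():
        if value is not None:
            kwargs[name] = value
    return kwargs


def build_text_ptransform(**kwargs):
    """Build the config and return the PTransform it uses for processing.

    Hypothetical helper: the import is local so that this file can be
    loaded without Beam or TensorFlow installed.
    """
    from apache_beam.ml.transforms.embeddings.tensorflow_hub import (
        TensorflowHubTextEmbeddings,
    )

    config = TensorflowHubTextEmbeddings(**kwargs)
    return config.get_ptransform_for_processing()
```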

class apache_beam.ml.transforms.embeddings.tensorflow_hub.TensorflowHubImageEmbeddings(columns: list[str], hub_url: str, **kwargs)

Bases: EmbeddingsManager

Embedding config for TensorFlow Hub models. This config can be used with MLTransform to embed image data. Models are loaded using the RunInference PTransform with the help of a ModelHandler.

Parameters:
  • columns – The columns containing the images to be embedded.

  • hub_url – The URL of the TensorFlow Hub model.

  • min_batch_size – The minimum batch size to be used for inference.

  • max_batch_size – The maximum batch size to be used for inference.

  • large_model – Whether to share the model across processes.
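
A minimal sketch of embedding images with this config, under the same assumptions as any MLTransform pipeline: apache_beam and its TensorFlow dependencies are installed. The helpers make_image_rows and embed_images and the "image" column name are hypothetical. The sketch also assumes each image is already an array in the shape and value range the chosen model expects, which depends on the model behind hub_url.

```python
import tempfile


def make_image_rows(images, column="image"):
    """Wrap each image in a dict keyed by the column to be embedded."""
    return [{column: image} for image in images]


def embed_images(images, hub_url, column="image"):
    """Embed `images` with a TensorFlow Hub model and print each output element.

    Hypothetical helper: the Beam imports are local so this file can be
    loaded without Beam or TensorFlow installed.
    """
    import apache_beam as beam
    from apache_beam.ml.transforms.base import MLTransform
    from apache_beam.ml.transforms.embeddings.tensorflow_hub import (
        TensorflowHubImageEmbeddings,
    )

    config = TensorflowHubImageEmbeddings(columns=[column], hub_url=hub_url)
    # A temporary artifact directory is an assumption made for this sketch.
    artifact_location = tempfile.mkdtemp()

    with beam.Pipeline() as pipeline:
        _ = (
            pipeline
            | "CreateRows" >> beam.Create(make_image_rows(images, column))
            | "Embed" >> MLTransform(
                write_artifact_location=artifact_location
            ).with_transform(config)
            | "Print" >> beam.Map(print)
        )
```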

get_model_handler() → ModelHandler
get_ptransform_for_processing(**kwargs) → PTransform

Returns a RunInference object that is used to run inference on the image input using _ImageEmbeddingHandler.