apache_beam.runners.interactive.sql.utils module
Module of utilities for SQL magics.
For internal use only; no backward-compatibility guarantees.
- apache_beam.runners.interactive.sql.utils.register_coder_for_schema(schema: NamedTuple, verbose: bool = False) None [source]
Registers a RowCoder for the given schema if hasn’t.
Notifies the user of what code has been implicitly executed.
- apache_beam.runners.interactive.sql.utils.find_pcolls(sql: str, pcolls: Dict[str, PCollection], verbose: bool = False) Dict[str, PCollection] [source]
Finds all PCollections used in the given sql query.
It does a simple word by word match and calls ib.collect for each PCollection found.
- apache_beam.runners.interactive.sql.utils.replace_single_pcoll_token(sql: str, pcoll_name: str) str [source]
Replaces the pcoll_name used in the sql with ‘PCOLLECTION’.
For sql query using only a single PCollection, the PCollection needs to be referred to as ‘PCOLLECTION’ instead of its variable/tag name.
- apache_beam.runners.interactive.sql.utils.pformat_namedtuple(schema: NamedTuple) str [source]
- class apache_beam.runners.interactive.sql.utils.OptionsEntry(label: str, help: str, cls: Type[PipelineOptions], arg_builder: str | Dict[str, Callable | None], default: str | None = None)[source]
Bases:
object
An entry of PipelineOptions that can be visualized through ipywidgets to take inputs in IPython notebooks interactively.
- cls
The PipelineOptions class/subclass the options belong to.
- arg_builder
Builds the argument/option. If it’s a str, this entry assigns the input ipywidget’s value directly to the argument. If it’s a Dict, use the corresponding Callable to assign the input value to each argument. If Callable is None, fallback to assign the input value directly. This allows building multiple similar PipelineOptions arguments from a single input, such as staging_location and temp_location in GoogleCloudOptions.
- cls: Type[PipelineOptions]
- class apache_beam.runners.interactive.sql.utils.OptionsForm[source]
Bases:
object
A form visualized to take inputs from users in IPython Notebooks and generate PipelineOptions to run pipelines.
- add(entry: OptionsEntry) OptionsForm [source]
Adds an OptionsEntry to the form.
- to_options() PipelineOptions [source]
Builds the PipelineOptions based on user inputs.
Can only be invoked after display_for_input.
- display_for_input() OptionsForm [source]
Displays the widgets to take user inputs.
- class apache_beam.runners.interactive.sql.utils.DataflowOptionsForm(output_name: str, output_pcoll: PCollection, verbose: bool = False)[source]
Bases:
OptionsForm
A form to take inputs from users in IPython Notebooks to build PipelineOptions to run pipelines on Dataflow.
Only contains minimum fields needed.
Inits the OptionsForm for setting up Dataflow jobs.