apache_beam.runners.interactive.sql.beam_sql_magics module

Module of beam_sql cell magic that executes a Beam SQL.

Only works within an IPython kernel.

class apache_beam.runners.interactive.sql.beam_sql_magics.BeamSqlParser[source]

Bases: object

A parser to parse beam_sql inputs.

parse(args: List[str]) Namespace | None[source]

Parses a list of string inputs.

The parsed namespace contains these attributes:

output_name: Optional[str], the output variable name. verbose: bool, whether to display more details of the magic execution. query: Optional[List[str]], the beam SQL query to execute.

Returns:

The parsed args or None if fail to parse.

print_help() None[source]
apache_beam.runners.interactive.sql.beam_sql_magics.on_error(error_msg, *args)[source]

Logs the error and the usage example.

class apache_beam.runners.interactive.sql.beam_sql_magics.BeamSqlMagics(**kwargs: Any)[source]

Bases: Magics

beam_sql(line: str, cell: str | None = None) PValue | None[source]

The beam_sql line/cell magic that executes a Beam SQL.

Parameters:
  • line – the string on the same line after the beam_sql magic.

  • cell – everything else in the same notebook cell as a string. If None, beam_sql is used as line magic. Otherwise, cell magic.

Returns None if running into an error or waiting for user input (running on a selected runner remotely), otherwise a PValue as if a SqlTransform is applied.

magics = {'cell': {'beam_sql': 'beam_sql'}, 'line': {'beam_sql': 'beam_sql'}}
registered = True
apache_beam.runners.interactive.sql.beam_sql_magics.collect_data_for_local_run(query: str, found: Dict[str, PCollection])[source]
apache_beam.runners.interactive.sql.beam_sql_magics.apply_sql(query: str, output_name: str | None, found: Dict[str, PCollection], run: bool = True) Tuple[str, PValue | SqlNode, SqlChain][source]

Applies a SqlTransform with the given sql and queried PCollections.

Parameters:
  • query – The SQL query executed in the magic.

  • output_name – (optional) The output variable name in __main__ module.

  • found – The PCollections with variable names found to be used in the query.

  • run – Whether to prepare the SQL pipeline for a local run or not.

Returns:

A tuple of values. First str value is the output variable name in __main__ module, auto-generated if not provided. Second value: if run, it’s a PValue; otherwise, a SqlNode tracks the SQL without applying it or executing it. Third value: SqlChain is a chain of SqlNodes that have been applied.

apache_beam.runners.interactive.sql.beam_sql_magics.pcolls_from_streaming_cache(user_pipeline: Pipeline, query_pipeline: Pipeline, name_to_pcoll: Dict[str, PCollection]) Dict[str, PCollection][source]

Reads PCollection cache through the TestStream.

Parameters:
  • user_pipeline – The beam.Pipeline object defined by the user in the notebook.

  • query_pipeline – The beam.Pipeline object built by the magic to execute the SQL query.

  • name_to_pcoll – PCollections with variable names used in the SQL query.

Returns:

A Dict[str, beam.PCollection], where each PCollection is tagged with their PCollection variable names, read from the cache.

When the user_pipeline has unbounded sources, we force all cache reads to go through the TestStream even if they are bounded sources.

apache_beam.runners.interactive.sql.beam_sql_magics.cache_output(output_name: str, output: PValue) None[source]
apache_beam.runners.interactive.sql.beam_sql_magics.load_ipython_extension(ipython)[source]

Marks this module as an IPython extension.

To load this magic in an IPython environment, execute: %load_ext apache_beam.runners.interactive.sql.beam_sql_magics.