apache_beam.runners.interactive.sql.sql_chain module
Module for tracking the chain of beam_sql magics that have been applied.
For internal use only; no backwards-compatibility guarantees.
- class apache_beam.runners.interactive.sql.sql_chain.SqlNode(output_name: str, source: Pipeline | Set[str], query: str, schemas: Set[Any] | None = None, evaluated: Set[Pipeline] | None = None, next: SqlNode | None = None, execution_count: int = 0)
Bases:
object
Each SqlNode represents a beam_sql magic applied.
- output_name
the watched unique name of the beam_sql output. Can be used as an identifier.
- Type:
str
- source
the inputs consumed by this node. Can be a pipeline or a set of PCollections, represented by their watched variable names. When it is a pipeline, the node computes from raw values in the query, so the output can be consumed by any SqlNode in any SqlChain.
- Type:
apache_beam.pipeline.Pipeline | Set[str]
- schemas
the schemas (NamedTuple classes) used by this node.
- Type:
Set[Any]
- evaluated
the pipelines this node has been evaluated for.
- Type:
Set[apache_beam.pipeline.Pipeline]
- next
the next SqlNode applied chronologically.
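The linked structure described by these attributes can be illustrated with a small stand-alone sketch. It does not import Beam: `ToyNode`, `FakePipeline`, `reads_raw_values`, and `walk` are hypothetical names invented for this example and are not part of the module's API.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Set, Union


class FakePipeline:
    """Hypothetical placeholder for apache_beam.pipeline.Pipeline."""


@dataclass
class ToyNode:
    """Illustrative stand-in that mirrors a few SqlNode attributes."""
    output_name: str
    source: Union[FakePipeline, Set[str]]
    query: str
    next: Optional["ToyNode"] = None

    def reads_raw_values(self) -> bool:
        # A pipeline source means the query computes from raw values only.
        return isinstance(self.source, FakePipeline)


def walk(node: Optional[ToyNode]) -> Iterator[ToyNode]:
    """Yield nodes in the order they were applied, following `next`."""
    while node is not None:
        yield node
        node = node.next


# The first node computes from literals; the second reads the first's output.
first = ToyNode("data", FakePipeline(), "SELECT 1 AS id")
second = ToyNode("doubled", {"data"}, "SELECT id * 2 AS id FROM data")
first.next = second

order = [n.output_name for n in walk(first)]
```

Here `source` is either a pipeline placeholder or a set of watched variable names, which is the distinction the `source` attribute draws.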
- class apache_beam.runners.interactive.sql.sql_chain.SchemaLoadedSqlTransform(output_name, query, schemas, execution_count)
Bases:
PTransform
PTransform that loads schema before executing SQL.
When a pipeline is submitted to a remote runner for execution, schemas defined in the main module are not available without save_main_session. However, save_main_session might fail when anything in the main session is unpicklable. This transform makes sure that only the schemas needed are pickled locally and restored later on the workers.
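The general idea of shipping only the needed schemas, rather than the whole main session, can be sketched with the standard library alone. This is not Beam's implementation, whose pickling mechanism is not described here; `Person`, `dump_schema`, and `load_schema` are hypothetical names, and the sketch assumes a schema can be described by its class name and field types.

```python
import pickle
from typing import Any, List, NamedTuple, Tuple, Type


class Person(NamedTuple):
    """Example schema, standing in for one defined in the main module."""
    name: str
    age: int


def dump_schema(schema: Type[Any]) -> bytes:
    """Serialize a NamedTuple class by value: its name and field types."""
    fields: List[Tuple[str, type]] = list(schema.__annotations__.items())
    return pickle.dumps((schema.__name__, fields))


def load_schema(payload: bytes) -> Type[Any]:
    """Rebuild an equivalent NamedTuple class from the serialized spec."""
    name, fields = pickle.loads(payload)
    return NamedTuple(name, fields)


# "Local" side: serialize just the one schema the query needs.
payload = dump_schema(Person)

# "Worker" side: restore the schema without access to the main module.
Restored = load_schema(payload)
row = Restored(name="Ada", age=36)
```

The restored class is a new, structurally equivalent type rather than the original object, which is enough for code that only relies on field names and types.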
- class apache_beam.runners.interactive.sql.sql_chain.SqlChain(nodes: Dict[str, SqlNode] | None = None, root: SqlNode | None = None, current: SqlNode | None = None, user_pipeline: Pipeline | None = None)
Bases:
object
A chain of SqlNodes.
- nodes
all nodes by their output_names.
- Type:
Dict[str, apache_beam.runners.interactive.sql.sql_chain.SqlNode]
- root
the first SqlNode applied chronologically.
- current
the last node applied.
- user_pipeline
the user-defined pipeline this chain originates from. If None, the whole chain computes only from raw values in queries. Otherwise, at least some of the nodes in the chain have queried against PCollections.
- Type:
apache_beam.pipeline.Pipeline | None
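How `nodes`, `root`, and `current` relate to one another can be shown with a minimal stand-alone sketch. `ToyNode`, `ToyChain`, and the `append` method below are hypothetical illustrations written for this example; they are not the module's classes, and the bookkeeping shown is an assumption about how such a chain could be maintained.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ToyNode:
    """Illustrative stand-in for a node in the chain."""
    output_name: str
    query: str
    next: Optional["ToyNode"] = None


@dataclass
class ToyChain:
    """Illustrative stand-in that mirrors the nodes/root/current attributes."""
    nodes: Dict[str, ToyNode] = field(default_factory=dict)
    root: Optional[ToyNode] = None
    current: Optional[ToyNode] = None

    def append(self, node: ToyNode) -> "ToyChain":
        if self.root is None:
            # The first node applied becomes the root.
            self.root = node
        else:
            # Link the previously last node to the new one.
            self.current.next = node
        self.current = node
        # Index every node by its output name for direct lookup.
        self.nodes[node.output_name] = node
        return self


chain = ToyChain()
chain.append(ToyNode("a", "SELECT 1 AS x")).append(ToyNode("b", "SELECT x FROM a"))
```

After the two appends, `root` is the node named `a`, `current` is the node named `b`, and both are reachable by name through `nodes`.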