apache_beam.runners.direct.sdf_direct_runner module
This module contains Splittable DoFn logic that is specific to DirectRunner.
- class apache_beam.runners.direct.sdf_direct_runner.SplittableParDoOverride[source]
Bases:
PTransformOverride
A transform override for ParDo transformss of SplittableDoFns.
Replaces the ParDo transform with a SplittableParDo transform that performs SDF specific logic.
- class apache_beam.runners.direct.sdf_direct_runner.SplittableParDo(ptransform)[source]
Bases:
PTransform
A transform that processes a PCollection using a Splittable DoFn.
- class apache_beam.runners.direct.sdf_direct_runner.ElementAndRestriction(element, restriction, watermark_estimator_state)[source]
Bases:
object
A holder for an element, restriction, and watermark estimator state.
- class apache_beam.runners.direct.sdf_direct_runner.PairWithRestrictionFn(do_fn)[source]
Bases:
DoFn
A transform that pairs each element with a restriction.
- class apache_beam.runners.direct.sdf_direct_runner.SplitRestrictionFn(do_fn)[source]
Bases:
DoFn
A transform that perform initial splitting of Splittable DoFn inputs.
- class apache_beam.runners.direct.sdf_direct_runner.ExplodeWindowsFn(*unused_args, **unused_kwargs)[source]
Bases:
DoFn
A transform that forces the runner to explode windows.
This is done to make sure that Splittable DoFn proceses an element for each of the windows that element belongs to.
- class apache_beam.runners.direct.sdf_direct_runner.RandomUniqueKeyFn(*unused_args, **unused_kwargs)[source]
Bases:
DoFn
A transform that assigns a unique key to each element.
- class apache_beam.runners.direct.sdf_direct_runner.ProcessKeyedElements(sdf, element_coder, restriction_coder, windowing_strategy, ptransform_args, ptransform_kwargs, ptransform_side_inputs)[source]
Bases:
PTransform
A primitive transform that performs SplittableDoFn magic.
Input to this transform should be a PCollection of keyed ElementAndRestriction objects.
- class apache_beam.runners.direct.sdf_direct_runner.ProcessKeyedElementsViaKeyedWorkItemsOverride[source]
Bases:
PTransformOverride
A transform override for ProcessElements transform.
- class apache_beam.runners.direct.sdf_direct_runner.ProcessKeyedElementsViaKeyedWorkItems(process_keyed_elements_transform)[source]
Bases:
PTransform
A transform that processes Splittable DoFn input via KeyedWorkItems.
- class apache_beam.runners.direct.sdf_direct_runner.ProcessElements(process_keyed_elements_transform)[source]
Bases:
PTransform
A primitive transform for processing keyed elements or KeyedWorkItems.
Will be evaluated by runners.direct.transform_evaluator._ProcessElementsEvaluator.
- class apache_beam.runners.direct.sdf_direct_runner.ProcessFn(sdf, args_for_invoker, kwargs_for_invoker)[source]
Bases:
DoFn
A DoFn that executes machineary for invoking a Splittable DoFn.
Input to the ParDo step that includes a ProcessFn will be a PCollection of ElementAndRestriction objects.
This class is mainly responsible for following. (1) setup environment for properly invoking a Splittable DoFn. (2) invoke process() method of a Splittable DoFn. (3) after the process() invocation of the Splittable DoFn, determine if a re-invocation of the element is needed. If this is the case, set state and a timer for a re-invocation and hold output watermark till this re-invocation. (4) after the final invocation of a given element clear any previous state set for re-invoking the element and release the output watermark.
- property step_context
- class apache_beam.runners.direct.sdf_direct_runner.SDFProcessElementInvoker(max_num_outputs, max_duration)[source]
Bases:
object
A utility that invokes SDF process() method and requests checkpoints.
This class is responsible for invoking the process() method of a Splittable DoFn and making sure that invocation terminated properly. Based on the input configuration, this class may decide to request a checkpoint for a process() execution so that runner can process current output and resume the invocation at a later time.
More specifically, when initializing a SDFProcessElementInvoker, caller may specify the number of output elements or processing time after which a checkpoint should be requested. This class is responsible for properly requesting a checkpoint based on either of these criteria. When the process() call of Splittable DoFn ends, this class performs validations to make sure that processing ended gracefully and returns a SDFProcessElementInvoker.Result that contains information which can be used by the caller to perform another process() invocation for the residual.
A process() invocation may decide to give up processing voluntarily by returning a ProcessContinuation object (see documentation of ProcessContinuation for more details). So if a ‘ProcessContinuation’ is produced this class ends the execution and performs steps to finalize the current invocation.
- class Result(residual_restriction=None, process_continuation=None, future_output_watermark=None)[source]
Bases:
object
Returned as a result of a invoke_process_element() invocation.
- Parameters:
residual_restriction – a restriction for the unprocessed part of the element.
process_continuation – a ProcessContinuation if one was returned as the last element of the SDF process() invocation.
future_output_watermark – output watermark of the results that will be produced when invoking the Splittable DoFn for the current element with residual_restriction.
- invoke_process_element(sdf_invoker, output_processor, element, restriction, watermark_estimator_state, *args, **kwargs)[source]
Invokes process() method of a Splittable DoFn for a given element.
- Parameters:
sdf_invoker – a DoFnInvoker for the Splittable DoFn.
element – the element to process
- Returns:
a SDFProcessElementInvoker.Result object.