apache_beam.runners.dask.dask_runner module
DaskRunner, executing remote jobs on Dask.distributed.
The DaskRunner is a runner implementation that executes a graph of transformations across processes and workers via Dask distributed’s scheduler.
- class apache_beam.runners.dask.dask_runner.DaskOptions(flags: Sequence[str] | None = None, **kwargs)[source]
- Bases: - PipelineOptions- Initialize an options class. - The initializer will traverse all subclasses, add all their argparse arguments and then parse the command line specified by flags or by default the one obtained from sys.argv. - The subclasses of PipelineOptions do not need to redefine __init__. - Parameters:
- flags – An iterable of command line arguments to be used. If not specified then sys.argv will be used as input for parsing arguments. 
- **kwargs – Add overrides for arguments passed in flags. For overrides of arguments, please pass the option names instead of flag names. Option names: These are defined as dest in the parser.add_argument() for each flag. Passing flags like {no_use_public_ips: True}, for which the dest is defined to a different flag name in the parser, would be discarded. Instead, pass the dest of the flag (dest of no_use_public_ips is use_public_ips). 
 
 
- class apache_beam.runners.dask.dask_runner.DaskRunnerResult(client: dask.distributed.Client, futures: Sequence[dask.distributed.Future])[source]
- Bases: - PipelineResult- client: dask.distributed.Client