apache_beam.transforms.error_handling module
Utilities for gracefully handling errors and excluding bad elements.
- class apache_beam.transforms.error_handling.ErrorHandler(consumer)
Bases: object
ErrorHandlers are used to skip and otherwise process bad records.
Error handlers allow one to implement the “dead letter queue” pattern in a fluent manner, disaggregating the error processing specification from the main processing chain.
This is typically used as follows:
    with error_handling.ErrorHandler(WriteToSomewhere(...)) as error_handler:
      result = pcoll | SomeTransform().with_error_handler(error_handler)
in which case errors encountered by SomeTransform() in processing pcoll will be written by the PTransform WriteToSomewhere(…) and excluded from result rather than failing the pipeline.
To implement with_error_handler on a PTransform, cache the provided error handler for use in expand(). During expand(), invoke error_handler.add_error_pcollection(…) any number of times with PCollections containing error records to be processed by the given error handler, or (if applicable) simply invoke with_error_handler(…) on any subtransforms.
The with_error_handler method should accept None to indicate that error handling is not enabled (this also makes implementation by forwarding error handlers easier). In this case, any non-recoverable errors should fail the pipeline (e.g. by propagating exceptions from process methods) rather than being silently ignored.
- close()
Indicates all error-producing operations have reported any errors.
Invokes the provided error consuming PTransform on any provided error PCollections.
- class apache_beam.transforms.error_handling.CollectingErrorHandler
Bases: ErrorHandler
An ErrorHandler that simply collects all errors for further processing.
This ErrorHandler requires that the set of errors be retrieved via output() and then consumed (or explicitly discarded).