Apache Beam 2.38.0

We are happy to present the new 2.38.0 release of Beam. This release includes both improvements and new functionality. See the download page for this release.

For more information on changes in 2.38.0, check out the detailed release notes.

I/Os

  • Introduce projection pushdown optimizer to the Java SDK (BEAM-12976). The optimizer currently only works on the BigQuery Storage API, but more I/Os will be added in future releases. If you encounter a bug with the optimizer, please file a JIRA and disable the optimizer using pipeline option --experiments=disable_projection_pushdown.
  • A new IO for Neo4j graph databases was added (BEAM-1857). It can update nodes and relationships using UNWIND statements and read data using Cypher statements with parameters.
  • amazon-web-services2 has reached feature parity and is finally recommended over the earlier amazon-web-services and kinesis modules (Java). The earlier modules will be deprecated in one of the next releases (BEAM-13174).

New Features / Improvements

  • (Python) Pipeline dependencies supplied through --requirements_file will now be staged to the runner using binary distributions (wheels) of the PyPI packages for the linux_x86_64 platform (BEAM-4032). To restore the previous behavior of using source distributions, set pipeline option --requirements_cache_only_sources. To skip staging the packages at submission time, set pipeline option --requirements_cache=skip.
  • The Flink runner now supports Flink 1.14.x (BEAM-13106).
  • Interactive Beam now supports remotely executing Flink pipelines on Dataproc (Python) (BEAM-14071).
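
As a rough illustration of the requirements staging options above, the following Python sketch assembles the relevant command-line flags. The helper requirements_flags and its staging parameter are hypothetical names introduced here for illustration only; the flag strings themselves are the ones named in the note. With Beam installed, a list like this would typically be passed to PipelineOptions, which is assumed rather than shown.

```python
from typing import List


def requirements_flags(requirements_file: str, staging: str = "wheels") -> List[str]:
    """Build the requirements-related pipeline flags described above.

    `staging` is a hypothetical convenience parameter, not a Beam option:
      "wheels"  - the 2.38.0 default (binary distributions), no extra flag
      "sources" - restore staging of source distributions
      "skip"    - do not stage packages at submission time
    """
    flags = [f"--requirements_file={requirements_file}"]
    if staging == "sources":
        flags.append("--requirements_cache_only_sources")
    elif staging == "skip":
        flags.append("--requirements_cache=skip")
    elif staging != "wheels":
        raise ValueError(f"unknown staging mode: {staging!r}")
    return flags


# Example: flags for a submission that skips staging entirely.
example_flags = requirements_flags("requirements.txt", staging="skip")
print(example_flags)
```

Keeping the default case free of extra flags mirrors the note: wheels are used unless one of the two options is set explicitly.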

Breaking Changes

  • (Python) Previously DoFn.infer_output_type was expected to return Iterable[element_type], where element_type is the PCollection element type. It is now expected to return element_type. Take care if you have overridden infer_output_type in a DoFn (this is not common). See BEAM-13860.
  • (amazon-web-services2) The types of awsRegion / endpoint in AwsOptions changed from String to Region / URI (BEAM-13563).
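
The Python change above can be pictured with a small sketch. To keep it runnable without Beam installed, the DoFn class below is a stand-in and not the real apache_beam.DoFn; ParseInt and legacy_return_type are hypothetical names used only for illustration. The sketch assumes an override that takes the input type as its single argument.

```python
from typing import Iterable


class DoFn:
    """Stand-in for apache_beam.DoFn so this sketch runs without Beam."""

    def process(self, element):
        raise NotImplementedError


class ParseInt(DoFn):
    """Hypothetical DoFn that turns strings into ints."""

    def process(self, element):
        yield int(element)

    def infer_output_type(self, input_type):
        # From 2.38.0 on, an override returns the element type itself.
        return int


def legacy_return_type(element_type):
    # What an override was expected to return before 2.38.0:
    # the element type wrapped in Iterable.
    return Iterable[element_type]


new_style = ParseInt().infer_output_type(str)
old_style = legacy_return_type(int)
print(new_style, old_style)
```

An existing override written in the older style would need the Iterable wrapper removed from its return value.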

Deprecations

  • Beam 2.38.0 will be the last minor release to support Flink 1.11.
  • (amazon-web-services2) Client providers (withXYZClientProvider()) as well as IO-specific RetryConfigurations are deprecated. Instead, use withClientConfiguration() or AwsOptions to configure AWS IOs / clients. Custom implementations of client providers should be replaced with a respective ClientBuilderFactory and configured through AwsOptions (BEAM-13563).

Bugfixes

  • Fix S3 copy for large objects (Java) (BEAM-14011)
  • Fix quadratic behavior of pipeline canonicalization (Go) (BEAM-14128)
    • This caused unnecessarily long pre-processing times before job submission for large complex pipelines.
  • Fix pyarrow version parsing (Python) (BEAM-14235)

Known Issues

List of Contributors

According to git shortlog, the following people contributed to the 2.38.0 release. Thank you to all contributors!

abhijeet-lele Ahmet Altay akustov Alexander Alexander Zhuravlev Alexey Romanenko AlikRodriguez Anand Inguva andoni-guzman andreukus Andy Ye Ankur Goenka ansh0l Artur Khanin Aydar Farrakhov Aydar Zainutdinov Benjamin Gonzalez Brian Hulette brucearctor bulat safiullin bullet03 Carl Mastrangelo Chamikara Jayalath Chun Yang Daniela Martín Daniel Oliveira Danny McCormick daria.malkova David Cavazos David Huntsperger dmitryor Dmytro Sadovnychyi dpcollins-google egalpin Elias Segundo Antonio emily Etienne Chauchot Hengfeng Li Ismaël Mejía Israel Herraiz Jack McCluskey Jakub Kukul Janek Bevendorff Jeff Klukas Johan Sternby Kamil Breguła Kenneth Knowles Ke Wu Kiley Kyle Weaver laraschmidt Lara Schmidt LE QUELLEC Olivier Luka Kalinovcic Luke Cwik Marcin Kuthan masahitojp Masato Nakamura Matt Casters Melissa Pashniak Michael Li Miguel Hernandez Moritz Mack mosche nancyxu123 Nathan J Mehl Niel Markwick Ning Kang Pablo Estrada paul-tlh Pavel Avilov Rahul Iyer Reuven Lax Ritesh Ghorse Robert Bradshaw Robert Burke Ryan Skraba Ryan Thompson Sam Whittle Seth Vargo sp029619 Steven Niemitz Thiago Nunes Udi Meiri Valentyn Tymofieiev Victor vitaly.terentyev Yichi Zhang Yi Hu yirutang Zachary Houfek Zoe