Apache Beam 2.14.0

We are happy to present the new 2.14.0 release of Beam. This release includes both improvements and new functionality. See the download page for this release.

For more information on changes in 2.14.0, check out the detailed release notes.

Highlights

  • Python 3 support is extended to Python 3.6 and 3.7, in addition to various other Python 3 improvements.
  • Spark portable runner (batch) is now available for Java, Python, and Go.
  • Added new runner: Hazelcast Jet Runner. (BEAM-7305)

I/Os

  • Schema support added to BigQuery reads. (Java) (BEAM-6673)
  • Schema support added to JDBC source. (Java) (BEAM-6674)
  • BigQuery support for bytes is fixed. (Python 3) (BEAM-6769)
  • Added DynamoDB IO. (Java) (BEAM-7043)
  • Added support for unbounded reads with HCatalogIO. (Java) (BEAM-7450)
  • Added BoundedSource wrapper for SDF. (Python) (BEAM-7443)
  • Added support for INCRBY/DECRBY operations in RedisIO. (BEAM-7286)
  • Added support for ValueProvider-defined GCS location for WriteToBigQuery with File Loads. (Java) (BEAM-7603)

New Features / Improvements

  • Python SDK adds support for DoFn setup and teardown methods. (BEAM-562)
  • Python SDK adds new transforms: ApproximateUnique, Latest, Reify, ToString, WithKeys.
  • Added hook for user-defined JVM initialization in workers. (BEAM-6872)
  • Added support for SQL Row Estimation for BigQueryTable. (BEAM-7513)
  • Auto sharding of streaming sinks in FlinkRunner. (BEAM-5865)
  • Removed the Hadoop dependency from the external sorter. (BEAM-7268)
  • Added option to expire portable SDK worker environments. (BEAM-7348)
  • Beam does not relocate Guava anymore and depends only on its own vendored version of Guava. (BEAM-6620)

Breaking Changes

  • Deprecated set/getClientConfiguration in Jdbc IO. (BEAM-7263)

Bugfixes

  • Fixed reading of concatenated compressed files. (Python) (BEAM-6952)
  • Fixed re-scaling issues on Flink versions 1.6 and later. (BEAM-7144)
  • Fixed SQL EXCEPT DISTINCT behavior. (BEAM-7194)
  • Fixed OOM issues with bounded Reads for Flink Runner. (BEAM-7442)
  • Fixed HdfsFileSystem to correctly match directories. (BEAM-7561)
  • Upgraded Spark runner to use Spark version 2.4.3. (BEAM-7265)
  • Upgraded Jackson to version 2.9.9. (BEAM-7465)
  • Various other bug fixes and performance improvements.

Known Issues

  • Do NOT use Python MongoDB source in this release. Python MongoDB source added in this release has a known issue that can result in data loss. See (BEAM-7866) for details.
  • The Python SDK cannot be installed on macOS 10.15. See (BEAM-8368) for details.

List of Contributors

According to git shortlog, the following people contributed to the 2.14.0 release. Thank you to all contributors!

Ahmet Altay, Aizhamal Nurmamat kyzy, Ajo Thomas, Alex Amato, Alexey Romanenko, Alexey Strokach, Alex Van Boxel, Alireza Samadian, Andrew Pilloud, Ankit Jhalaria, Ankur Goenka, Anton Kedin, Aryan Naraghi, Bartok Jozsef, Bora Kaplan, Boyuan Zhang, Brian Hulette, Cam Mach, Chamikara Jayalath, Charith Ellawala, Charles Chen, Colm O hEigeartaigh, Cyrus Maden, Daniel Mills, Daniel Oliveira, David Cavazos, David Moravek, David Yan, Daniel Lescohier, Elwin Arens, Etienne Chauchot, Fábio Franco Uechi, Finch Keung, Frederik Bode, Gregory Kovelman, Graham Polley, Hai Lu, Hannah Jiang, Harshit Dwivedi, Harsh Vardhan, Heejong Lee, Henry Suryawirawan, Ismaël Mejía, Jan Lukavský, Jean-Baptiste Onofré, Jozef Vilcek, Juta, Kai Jiang, Kamil Wu, Kasia Kucharczyk, Kenneth Knowles, Kyle Weaver, Lara Schmidt, Łukasz Gajowy, Luke Cwik, Manu Zhang, Mark Liu, Matthias Baetens, Maximilian Michels, Melissa Pashniak, Michael Luckey, Michal Walenia, Mikhail Gryzykhin, Ming Liang, Neville Li, Pablo Estrada, Paul Suganthan, Peter Backx, Rakesh Kumar, Rasmi Elasmar, Reuven Lax, Reza Rokni, Robbe Sneyders, Robert Bradshaw, Robert Burke, Rose Nguyen, Rui Wang, Ruoyun Huang, Shoaib Zafar, Slava Chernyak, Steve Niemitz, Tanay Tummalapalli, Thomas Weise, Tim Robertson, Tim van der Lippe, Udi Meiri, Valentyn Tymofieiev, Varun Dhussa, Viktor Gerdin, Yichi Zhang, Yifan Mai, Yifan Zou, Yueyang Qiu.