2021/04/29
Apache Beam 2.29.0
Kenneth Knowles [@KennKnowles]
We are happy to present the new 2.29.0 release of Beam. This release includes both improvements and new functionality. See the download page for this release.
For more information on changes in 2.29.0, check out the detailed release notes.
Highlights
- Spark Classic and Portable runners officially support Spark 3 (BEAM-7093).
- Official Java 11 support for most runners (Dataflow, Flink, Spark) (BEAM-2530).
- DataFrame API now supports GroupBy.apply (BEAM-11628).
I/Os
- Added support for S3 filesystem on AWS SDK V2 (Java) (BEAM-7637)
- GCP BigQuery sink (file loads) uses runner-determined sharding for unbounded data (BEAM-11772)
- KafkaIO now recognizes the partition property in writing records (BEAM-11806)
- Support for Hadoop configuration on ParquetIO (BEAM-11913)
New Features / Improvements
- DataFrame API now supports pandas 1.2.x (BEAM-11531).
- Multiple DataFrame API bugfixes (BEAM-12071, BEAM-11929)
- DDL supported in SQL transforms (BEAM-11850)
- Upgrade Flink runner to Flink version 1.12.2 (BEAM-11941)
Breaking Changes
- Deterministic coding enforced for GroupByKey and Stateful DoFns. Previously non-deterministic coding was allowed, resulting in keys not properly being grouped in some cases. (BEAM-11719)
  To restore the old behavior, one can register FakeDeterministicFastPrimitivesCoder with beam.coders.registry.register_fallback_coder(beam.coders.coders.FakeDeterministicFastPrimitivesCoder()) or use the allow_non_deterministic_key_coders pipeline option.
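To illustrate why non-deterministic key coding can split a group, here is a minimal sketch that uses only the Python standard library and does not call Beam. It assumes that grouping compares keys by their encoded bytes. The helper names (group_by_encoded_key, pickle_encode, sorted_json_encode) are hypothetical and exist only for this illustration; pickle stands in for a non-deterministic coder and sorted JSON for a deterministic one.

```python
import json
import pickle
from collections import defaultdict


def group_by_encoded_key(pairs, encode):
    """Group (key, value) pairs by the encoded bytes of each key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[encode(key)].append(value)
    return list(groups.values())


def pickle_encode(key):
    # Stand-in for a non-deterministic coder: pickle keeps dict insertion
    # order, so two equal dicts can encode to different bytes.
    return pickle.dumps(key)


def sorted_json_encode(key):
    # Stand-in for a deterministic coder: sorting the dict keys gives
    # equal dicts the same byte encoding.
    return json.dumps(key, sort_keys=True).encode("utf-8")


# Two keys that compare equal in Python but were built in a different order.
pairs = [
    ({"user": "a", "region": "eu"}, 1),
    ({"region": "eu", "user": "a"}, 2),
]

split_groups = group_by_encoded_key(pairs, pickle_encode)
merged_groups = group_by_encoded_key(pairs, sorted_json_encode)

print(len(split_groups))   # equal keys land in separate groups
print(len(merged_groups))  # equal keys share one group
```

With the pickle-based encoding the two equal keys end up in separate groups, which is the kind of incorrect grouping the new enforcement is meant to surface early.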
Deprecations
- Support for Flink 1.8 and 1.9 will be removed in the next release (2.30.0) (BEAM-11948).
Known Issues
- See a full list of open issues that affect this version.
List of Contributors
According to git shortlog, the following people contributed to the 2.29.0 release. Thank you to all contributors!
Ahmet Altay, Alan Myrvold, Alex Amato, Alexander Chermenin, Alexey Romanenko, Allen Pradeep Xavier, Amy Wu, Anant Damle, Andreas Bergmeier, Andrei Balici, Andrew Pilloud, Andy Xu, Ankur Goenka, Bashir Sadjad, Benjamin Gonzalez, Boyuan Zhang, Brian Hulette, Chamikara Jayalath, Chinmoy Mandayam, Chuck Yang, dandy10, Daniel Collins, Daniel Oliveira, David Cavazos, David Huntsperger, David Moravek, Dmytro Kozhevin, Emily Ye, Esun Kim, Evgeniy Belousov, Filip Popić, Fokko Driesprong, Gris Cuevas, Heejong Lee, Ihor Indyk, Ismaël Mejía, Jakub-Sadowski, Jan Lukavský, John Edmonds, Juan Sandoval, 谷口恵輔, Kenneth Jung, Kenneth Knowles, KevinGG, Kiley Sok, Kyle Weaver, MabelYC, Mackenzie Clark, Masato Nakamura, Milena Bukal, Miltos, Minbo Bae, Miraç Vuslat Başaran, mynameborat, Nahian-Al Hasan, Nam Bui, Niel Markwick, Niels Basjes, Ning Kang, Nir Gazit, Pablo Estrada, Ramazan Yapparov, Raphael Sanamyan, Reuven Lax, Rion Williams, Robert Bradshaw, Robert Burke, Rui Wang, Sam Rohde, Sam Whittle, Shehzaad Nakhoda, Shehzaad Nakhoda, Siyuan Chen, Sonam Ramchand, Steve Niemitz, sychen, Sylvain Veyrié, Tim Robertson, Tobias Kaymak, Tomasz Szerszeń, Tomasz Szerszeń, Tomo Suzuki, Tyson Hamilton, Udi Meiri, Valentyn Tymofieiev, Yichi Zhang, Yifan Mai, Yixing Zhang, Yoshiki Obata