blog & release
2024/08/15
Apache Beam 2.58.1
Danny McCormick
We are happy to present the new 2.58.1 release of Beam. This release includes both improvements and new functionality. See the download page for this release.
New Features / Improvements
- Fixed an issue where KafkaIO records read with ReadFromKafkaViaSDF are redistributed and may contain duplicates regardless of the configuration. This affects Java pipelines using the Dataflow v2 runner and xlang pipelines reading from Kafka (#32196).
Known Issues
- Large Dataflow graphs using runner v2, or pipelines explicitly enabling the upload_graph experiment, will fail at construction time (#32159).
- Python pipelines that run with 2.53.0-2.58.0 SDKs and read data from GCS might be affected by a data corruption issue (#32169). The issue will be fixed in 2.59.0 (#32135). To work around this, update the google-cloud-storage package to version 2.18.2 or newer.
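The GCS workaround above can be checked in an existing environment with a small script. This is a minimal sketch, not an official Beam tool: the helper names (parse_version, needs_gcs_upgrade, installed_version) are hypothetical, and it assumes every SDK from 2.53.0 up to, but not including, 2.59.0 is affected (which includes 2.58.1, since the fix is slated for 2.59.0).

```python
from importlib import metadata


def parse_version(text):
    """Turn '2.18.2' into (2, 18, 2); non-numeric suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def needs_gcs_upgrade(beam_version, gcs_version):
    """True if the SDK is in the assumed affected range and
    google-cloud-storage is older than 2.18.2."""
    beam = parse_version(beam_version)
    # Assumption: affected range is 2.53.0 <= SDK < 2.59.0 (fix lands in 2.59.0).
    affected = (2, 53, 0) <= beam < (2, 59, 0)
    return affected and parse_version(gcs_version) < (2, 18, 2)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    beam = installed_version("apache-beam")
    gcs = installed_version("google-cloud-storage")
    if beam and gcs and needs_gcs_upgrade(beam, gcs):
        print("Upgrade recommended: pip install 'google-cloud-storage>=2.18.2'")
    else:
        print("No upgrade indicated by the installed versions.")
```

Running the script in the same environment as the pipeline workers gives a quick indication of whether the google-cloud-storage upgrade applies.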
For the most up to date list of known issues, see https://github.com/apache/beam/blob/master/CHANGES.md
List of Contributors
According to git shortlog, the following people contributed to the 2.58.1 release. Thank you to all contributors!
Danny McCormick
Sam Whittle