Apache Beam 2.48.0

We are happy to present the new 2.48.0 release of Beam. This release includes both improvements and new functionality. See the download page for this release.

For more information on changes in 2.48.0, check out the detailed release notes.

Note: The release tag for Go SDK for this release is sdks/v2.48.2 instead of sdks/v2.48.0 because of incorrect commit attached to the release tag sdks/v2.48.0.

Highlights

  • “Experimental” annotation cleanup: the annotation and concept have been removed from Beam to avoid the misperception of code as “not ready”. Any proposed breaking changes will be subject to case-by-case pro/con decision making (and generally avoided) rather than using the “Experimental” to allow them.

I/Os

  • Added rename for GCS and copy for local filesystem (Go) (#25779).
  • Added support for enhanced fan-out in KinesisIO.Read (Java) (#19967).
    • This change is not compatible with Flink savepoints created by Beam 2.46.0 applications which had KinesisIO sources.
  • Added textio.ReadWithFilename transform (Go) (#25812).
  • Added fileio.MatchContinuously transform (Go) (#26186).

New Features / Improvements

  • Allow passing service name for google-cloud-profiler (Python) (#26280).
  • Dead letter queue support added to RunInference in Python (#24209).
  • Support added for defining pre/postprocessing operations on the RunInference transform (#26308)
  • Adds a Docker Compose based transform service that can be used to discover and use portable Beam transforms (#26023).

Breaking Changes

  • Passing a tag into MultiProcessShared is now required in the Python SDK (#26168).
  • CloudDebuggerOptions is removed (deprecated in Beam v2.47.0) for Dataflow runner as the Google Cloud Debugger service is shutting down. (Java) (#25959).
  • AWS 2 client providers (deprecated in Beam v2.38.0) are finally removed (#26681).
  • AWS 2 SnsIO.writeAsync (deprecated in Beam v2.37.0 due to risk of data loss) was finally removed (#26710).
  • AWS 2 coders (deprecated in Beam v2.43.0 when adding Schema support for AWS Sdk Pojos) are finally removed (#23315).

Bugfixes

  • Fixed Java bootloader failing with Too Long Args due to long classpaths, with a pathing jar. (Java) (#25582).

Known Issues

  • PubsubIO writes will throw SizeLimitExceededException for any message above 100 bytes, when used in batch (bounded) mode. (Java) (#27000).
  • Long-running Python pipelines might experience a memory leak: #28246.

List of Contributors

According to git shortlog, the following people contributed to the 2.48.0 release. Thank you to all contributors!

Abzal Tuganbay

Ahmed Abualsaud

Alexey Romanenko

Anand Inguva

Andrei Gurau

Andrey Devyatkin

Balázs Németh

Bazyli Polednia

Bruno Volpato

Chamikara Jayalath

Clay Johnson

Damon

Daniel Arn

Danny McCormick

Darkhan Nausharipov

Dip Patel

Dmitry Repin

George Novitskiy

Israel Herraiz

Jack Dingilian

Jack McCluskey

Jan Lukavský

Jasper Van den Bossche

Jeff Zhang

Jeremy Edwards

Johanna Öjeling

John Casey

Katie Liu

Kenneth Knowles

Kerry Donny-Clark

Kuba Rauch

Liam Miller-Cushon

MakarkinSAkvelon

Mattie Fu

Michel Davit

Moritz Mack

Nick Li

Oleh Borysevych

Pablo Estrada

Pranav Bhandari

Pranjal Joshi

Rebecca Szper

Reuven Lax

Ritesh Ghorse

Robert Bradshaw

Robert Burke

Rouslan

RuiLong J

RyujiTamaki

Sam Whittle

Sanil Jain

Svetak Sundhar

Timur Sultanov

Tony Tang

Udi Meiri

Valentyn Tymofieiev

Vishal Bhise

Vitaly Terentyev

Xinyu Liu

Yi Hu

bullet03

darshan-sj

kellen

liferoad

mokamoka03210120

psolomin