Java transform catalog overview

Element-wise

| Transform | Description |
| --- | --- |
| Filter | Given a predicate, filters out all elements that don't satisfy the predicate. |
| FlatMapElements | Applies a function that returns a collection to every element in the input and outputs all resulting elements. |
| Keys | Extracts the key from each element in a collection of key-value pairs. |
| KvSwap | Swaps the key and value of each element in a collection of key-value pairs. |
| MapElements | Applies a function to every element in the input and outputs the result. |
| ParDo | The most general mechanism for applying a user-defined DoFn to every element in the input collection. |
| Partition | Routes each input element to a specific output collection based on a partition function. |
| Regex | Filters input string elements based on a regex. May also transform them based on the matching groups. |
| Reify | Transforms for converting between explicit and implicit forms of various Beam values. |
| ToString | Transforms every element in an input collection to a string. |
| WithKeys | Produces a collection containing each element from the input collection converted to a key-value pair, with a key selected by applying a function to the input element. |
| WithTimestamps | Applies a function to each input element to determine its timestamp, and updates the implicit timestamp associated with that element. Note that it is only safe to adjust timestamps forwards. |
| Values | Extracts the value from each element in a collection of key-value pairs. |
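
To make the per-element semantics of a few of these transforms concrete, the sketch below models Filter, MapElements, FlatMapElements, WithKeys, and KvSwap on ordinary in-memory lists. It is plain Java (9 or later), not Beam code: the class `ElementWiseSketch` and its method names are hypothetical stand-ins, `Map.Entry` stands in for Beam's key-value pair, and a real pipeline would apply the corresponding transforms to a PCollection instead.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Plain-Java stand-ins for a few element-wise transforms (hypothetical names, not the Beam API). */
final class ElementWiseSketch {

    // Filter: keep only the elements that satisfy the predicate.
    static <T> List<T> filter(List<T> input, Predicate<T> predicate) {
        return input.stream().filter(predicate).collect(Collectors.toList());
    }

    // MapElements: exactly one output per input element.
    static <T, R> List<R> mapElements(List<T> input, Function<T, R> fn) {
        return input.stream().map(fn).collect(Collectors.toList());
    }

    // FlatMapElements: the function returns a collection; all results are emitted.
    static <T, R> List<R> flatMapElements(List<T> input, Function<T, List<R>> fn) {
        return input.stream().flatMap(e -> fn.apply(e).stream()).collect(Collectors.toList());
    }

    // WithKeys: pair each element with a key computed from that element.
    static <K, V> List<Map.Entry<K, V>> withKeys(List<V> input, Function<V, K> keyFn) {
        return input.stream().map(v -> Map.entry(keyFn.apply(v), v)).collect(Collectors.toList());
    }

    // KvSwap: exchange the key and value of each pair.
    static <K, V> List<Map.Entry<V, K>> kvSwap(List<Map.Entry<K, V>> input) {
        return input.stream().map(e -> Map.entry(e.getValue(), e.getKey())).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = List.of("to be", "or not", "to be");

        // One line can yield several words, so this is a flat-map, not a map.
        List<String> words = flatMapElements(lines, line -> Arrays.asList(line.split(" ")));
        System.out.println(words);                               // [to, be, or, not, to, be]

        System.out.println(filter(words, w -> w.length() > 2));  // [not]
        System.out.println(mapElements(words, String::length));  // [2, 2, 2, 3, 2, 2]

        // Key each word by its length, then swap so the word becomes the key.
        System.out.println(kvSwap(withKeys(words, String::length)));
    }
}
```

The split between `mapElements` and `flatMapElements` mirrors the table: the former always produces one output per input, while the latter may produce zero, one, or many.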

Aggregation

| Transform | Description |
| --- | --- |
| ApproximateQuantiles | Uses an approximation algorithm to estimate the data distribution within each aggregation using a specified number of quantiles. |
| ApproximateUnique | Uses an approximation algorithm to estimate the number of unique elements within each aggregation. |
| CoGroupByKey | Similar to GroupByKey, but takes several keyed collections as input and, for each key, groups the values associated with that key from all of the inputs. |
| Combine | Transforms to combine elements according to a provided CombineFn. |
| CombineWithContext | An extended version of Combine which allows accessing side inputs and other context. |
| Count | Counts the number of elements within each aggregation. |
| Distinct | Produces a collection containing distinct elements from the input collection. |
| GroupByKey | Takes a keyed collection of elements and produces a collection where each element consists of a key and all values associated with that key. |
| GroupIntoBatches | Batches values associated with keys into Iterable batches of some size. Each batch contains elements associated with a specific key. |
| HllCount | Estimates the number of distinct elements and creates re-aggregatable sketches using the HyperLogLog++ algorithm. |
| Latest | Selects the latest element within each aggregation according to the implicit timestamp. |
| Max | Outputs the maximum element within each aggregation. |
| Mean | Computes the average within each aggregation. |
| Min | Outputs the minimum element within each aggregation. |
| Sample | Randomly selects some number of elements from each aggregation. |
| Sum | Computes the sum of elements in each aggregation. |
| Top | Computes the largest element(s) in each aggregation. |
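
GroupByKey, CoGroupByKey, and GroupIntoBatches are easy to confuse, so the following sketch models what each one produces for small in-memory inputs. It is plain Java (9 or later) rather than Beam code: the class `GroupingSketch` and its methods are hypothetical, `Map.Entry` stands in for a key-value pair, only two inputs are joined in the CoGroupByKey model, and batches are assumed to hold at most `batchSize` values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java models of three grouping transforms (hypothetical names, not the Beam API). */
final class GroupingSketch {

    // GroupByKey: one output entry per key, holding all values seen for that key.
    static <K, V> Map<K, List<V>> groupByKey(List<Map.Entry<K, V>> input) {
        Map<K, List<V>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, V> kv : input) {
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        return grouped;
    }

    // CoGroupByKey (two inputs): per key, the values from the left input and from the right input.
    // A key present in only one input gets an empty list for the other side.
    static <K, A, B> Map<K, Map.Entry<List<A>, List<B>>> coGroupByKey(
            List<Map.Entry<K, A>> left, List<Map.Entry<K, B>> right) {
        Map<K, List<A>> l = groupByKey(left);
        Map<K, List<B>> r = groupByKey(right);
        Map<K, Map.Entry<List<A>, List<B>>> joined = new LinkedHashMap<>();
        for (K key : l.keySet()) {
            joined.put(key, Map.entry(l.get(key), r.getOrDefault(key, List.<B>of())));
        }
        for (K key : r.keySet()) {
            joined.putIfAbsent(key, Map.entry(List.<A>of(), r.get(key)));
        }
        return joined;
    }

    // GroupIntoBatches: split the values of each key into batches of at most batchSize.
    static <K, V> List<Map.Entry<K, List<V>>> groupIntoBatches(
            List<Map.Entry<K, V>> input, int batchSize) {
        List<Map.Entry<K, List<V>>> batches = new ArrayList<>();
        for (Map.Entry<K, List<V>> group : groupByKey(input).entrySet()) {
            List<V> values = group.getValue();
            for (int i = 0; i < values.size(); i += batchSize) {
                int end = Math.min(i + batchSize, values.size());
                List<V> batch = new ArrayList<>(values.subList(i, end));
                batches.add(Map.entry(group.getKey(), batch));
            }
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> emails =
                List.of(Map.entry("amy", "amy@example.com"), Map.entry("carl", "carl@example.com"));
        List<Map.Entry<String, String>> phones =
                List.of(Map.entry("amy", "111"), Map.entry("james", "222"), Map.entry("amy", "333"));

        System.out.println(groupByKey(phones));             // all phone numbers per name
        System.out.println(coGroupByKey(emails, phones));   // emails and phones joined per name
        System.out.println(groupIntoBatches(phones, 1));    // one phone number per batch
    }
}
```

The distinction the model highlights: CoGroupByKey joins several keyed inputs on their key, whereas GroupIntoBatches works on a single input and caps how many values are delivered together.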

Other

| Transform | Description |
| --- | --- |
| Create | Creates a collection from an in-memory list. |
| Flatten | Given multiple input collections, produces a single output collection containing all elements from all of the input collections. |
| PAssert | A transform to assert the contents of a PCollection, used as part of testing a pipeline either locally or with a runner. |
| View | Operations for turning a collection into a view that may be used as a side input to a ParDo. |
| Window | Logically divides up or groups the elements of a collection into finite windows according to a provided WindowFn. |
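
As a rough illustration of Flatten and Window, the sketch below merges two in-memory lists and then assigns timestamps to fixed-size, non-overlapping windows. It is plain Java, not Beam code: the class `FlattenAndWindowSketch` and its methods are hypothetical, timestamps are assumed to be milliseconds held in a `long`, and fixed windows aligned to zero are only one possible windowing strategy among those a WindowFn can express.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java models of Flatten and fixed windowing (hypothetical names, not the Beam API). */
final class FlattenAndWindowSketch {

    // Flatten: merge several collections into one that contains all of their elements.
    @SafeVarargs
    static <T> List<T> flatten(List<T>... inputs) {
        List<T> out = new ArrayList<>();
        for (List<T> input : inputs) {
            out.addAll(input);
        }
        return out;
    }

    // Start of the fixed window containing the timestamp.
    // floorMod keeps the result correct for negative timestamps as well.
    static long windowStart(long timestampMillis, long windowSizeMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, windowSizeMillis);
    }

    // Group timestamps by the start of the fixed window each one falls into.
    static Map<Long, List<Long>> assignFixedWindows(List<Long> timestamps, long windowSizeMillis) {
        Map<Long, List<Long>> windows = new TreeMap<>();
        for (long ts : timestamps) {
            windows.computeIfAbsent(windowStart(ts, windowSizeMillis), k -> new ArrayList<>()).add(ts);
        }
        return windows;
    }

    public static void main(String[] args) {
        List<Long> merged = flatten(List.of(1L, 61_000L), List.of(59_999L, 120_000L));
        System.out.println(merged);  // [1, 61000, 59999, 120000]

        // One-minute windows: [0, 60000), [60000, 120000), [120000, 180000)
        System.out.println(assignFixedWindows(merged, 60_000L));
        // {0=[1, 59999], 60000=[61000], 120000=[120000]}
    }
}
```

Every element lands in exactly one window here, which is what makes a later per-window aggregation finite even when the input is unbounded.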