apache_beam.io.filebasedsink module
File-based sink.
- class apache_beam.io.filebasedsink.FileBasedSink(file_path_prefix, coder, file_name_suffix='', num_shards=0, shard_name_template=None, mime_type='application/octet-stream', compression_type='auto', *, max_records_per_shard=None, max_bytes_per_shard=None, skip_if_empty=False)
Bases: Sink
A sink to GCS or local files.
To implement a file-based sink, extend this class and override either write_record() or write_encoded_record().
If needed, also override open() and/or close() to customize the file handling or to write headers and footers.
The output of this write is a PCollection of all written shards.
- Raises:
TypeError – if file path parameters are not a str or ValueProvider, or if compression_type is not a member of CompressionTypes.
ValueError – if shard_name_template is not of the expected format.
- open(temp_path)
Opens temp_path, returning an opaque file handle object.
The returned file handle is passed to write_[encoded_]record and close.
- write_record(file_handle, value)
Writes a single record to the file handle returned by open().
By default, calls write_encoded_record after encoding the record with this sink’s Coder.
- write_encoded_record(file_handle, encoded_value)
Writes a single encoded record to the file handle returned by open().
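The relationship between open(), write_record(), and write_encoded_record() can be modeled without Beam. The sketch below is a stand-in written for illustration only: SketchSink is a hypothetical class, not Beam's implementation, and an in-memory buffer replaces the real file handle. It mirrors only the documented behavior that write_record() encodes a value and then delegates to write_encoded_record().

```python
import io


class SketchSink:
    """Stand-in for illustration only; NOT Beam's FileBasedSink."""

    def __init__(self, encode):
        # `encode` plays the role of the sink's Coder (an assumption here).
        self._encode = encode

    def open(self, temp_path):
        # A real sink would open temp_path; a buffer stands in for the file.
        return io.BytesIO()

    def write_record(self, file_handle, value):
        # Documented default: encode the record, then delegate.
        self.write_encoded_record(file_handle, self._encode(value))

    def write_encoded_record(self, file_handle, encoded_value):
        # The handle is opaque to callers; only the sink knows how to use it.
        file_handle.write(encoded_value + b'\n')


sink = SketchSink(encode=lambda text: text.encode('utf-8'))
handle = sink.open('ignored-temp-path')
sink.write_record(handle, 'alice')         # encoded, then delegated
sink.write_encoded_record(handle, b'bob')  # already encoded
result = handle.getvalue()
```

A subclass that needs the unencoded value (for example, to serialize it in a custom format) would override write_record() instead, in which case the coder step is skipped.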