go/writer: column-oriented parquet write buffer by jacobmarble · Pull Request #4485 · estuary/connectors

jacobmarble · 2026-05-15T16:53:37Z

Description:

Switch the Parquet write buffer from row-major [][]any to column-major []any, where each element is a pointer to a typed slice (*[]int64, *[]parquet.ByteArray, etc.), with a parallel []int16 definition-level slice. Storing slice pointers avoids boxing a 3-word slice header on every append.
Values are now converted to their Parquet physical types in the Write method instead of during the flush loop. This eliminates the transposition step and reduces GC pressure. It also surfaces type conversion errors from getFooVal(val) at Write time rather than at flush.
Parquet Variant types require two physical columns. This change prepares the internal API to support that more cleanly.
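
For readers skimming the description, here is a minimal sketch of the column-major layout, not the code in this PR. The names columnBuffer, newColumnBuffer, and checkValue are hypothetical, ByteArray is a local stand-in for parquet.ByteArray so the example has no dependencies, and keeping one []int16 of definition levels per column is an assumption about how the parallel slice is organized.

```go
package main

import "fmt"

// ByteArray is a local stand-in for parquet.ByteArray (assumption, to avoid
// an external dependency in this sketch).
type ByteArray []byte

// columnBuffer is a hypothetical column-major buffer. Each element of cols is
// a pointer to a typed slice, so appending never re-boxes a slice header.
type columnBuffer struct {
	cols      []any     // *[]int64, *[]ByteArray, ...
	defLevels [][]int16 // per column: 0 = null, 1 = value present
}

func newColumnBuffer(kinds ...string) (*columnBuffer, error) {
	b := &columnBuffer{defLevels: make([][]int16, len(kinds))}
	for _, k := range kinds {
		switch k {
		case "int64":
			b.cols = append(b.cols, &[]int64{})
		case "bytes":
			b.cols = append(b.cols, &[]ByteArray{})
		default:
			return nil, fmt.Errorf("unsupported column kind %q", k)
		}
	}
	return b, nil
}

// checkValue reports whether v can be stored in the given column.
func checkValue(col, v any) error {
	if v == nil {
		return nil // nulls are recorded only as a definition level
	}
	switch col.(type) {
	case *[]int64:
		if _, ok := v.(int64); !ok {
			return fmt.Errorf("cannot convert %T to int64", v)
		}
	case *[]ByteArray:
		if _, ok := v.(string); !ok {
			return fmt.Errorf("cannot convert %T to byte array", v)
		}
	}
	return nil
}

// Write converts values to their physical types immediately, so conversion
// errors surface here instead of at flush time.
func (b *columnBuffer) Write(row []any) error {
	if len(row) != len(b.cols) {
		return fmt.Errorf("row has %d values, want %d", len(row), len(b.cols))
	}
	// Validate first so a failed Write leaves all columns unchanged.
	for i, v := range row {
		if err := checkValue(b.cols[i], v); err != nil {
			return fmt.Errorf("column %d: %w", i, err)
		}
	}
	for i, v := range row {
		if v == nil {
			b.defLevels[i] = append(b.defLevels[i], 0)
			continue
		}
		b.defLevels[i] = append(b.defLevels[i], 1)
		switch col := b.cols[i].(type) {
		case *[]int64:
			*col = append(*col, v.(int64)) // append through the pointer
		case *[]ByteArray:
			*col = append(*col, ByteArray(v.(string)))
		}
	}
	return nil
}

func main() {
	b, err := newColumnBuffer("int64", "bytes")
	if err != nil {
		panic(err)
	}
	fmt.Println(b.Write([]any{int64(7), "a"}))
	fmt.Println(b.Write([]any{"oops", "b"}))
}
```

Because each column already holds values of its physical type, a flush can hand the typed slice and its definition levels to a column writer without transposing rows first.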

Workflow steps:

n/a

Documentation links affected:

n/a

Notes for reviewers:

This is the simple part of #4471

go/writer: column-oriented parquet write buffer

a77f7de

jacobmarble requested review from a team and removed request for a team May 15, 2026 16:53

jacobmarble marked this pull request as draft May 15, 2026 17:05

jacobmarble mentioned this pull request May 15, 2026

go/writer: add variant parquet type #4487

Draft
