Let's add a materialize-iceberg task-level option to use Iceberg timestamptz_ns column types instead of timestamptz.
Current Landscape
Iceberg v3
Iceberg table format v3 adds support for nanosecond timestamps, serialized as "8-byte little-endian long".
The range of values is narrower than the more common microsecond timestamps:
- microsecond timestamps: 290,308 BCE -> 294,246 CE
- nanosecond timestamps: 1677 CE -> 2262 CE
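The nanosecond bounds above follow directly from storing Unix nanoseconds in a signed 64-bit integer. A minimal sketch of that arithmetic, assuming the Unix epoch as the zero point; the helper names `ns_to_rfc3339` and `micros_span_years` are illustrative only and come from neither Iceberg nor Flow:

```python
from datetime import datetime, timedelta

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
EPOCH = datetime(1970, 1, 1)  # naive value, interpreted as UTC


def ns_to_rfc3339(ns: int) -> str:
    """Render an int64 Unix-nanosecond count as an RFC3339 string with 9 fractional digits."""
    # divmod floors, so frac stays in [0, 1e9) even for negative inputs.
    seconds, frac = divmod(ns, 1_000_000_000)
    whole = EPOCH + timedelta(seconds=seconds)
    return f"{whole.isoformat()}.{frac:09d}Z"


def micros_span_years() -> float:
    """Approximate number of Gregorian years an int64 microsecond count covers on each side of 1970."""
    return INT64_MAX / 1_000_000 / 86_400 / 365.2425


print(ns_to_rfc3339(INT64_MIN))
print(ns_to_rfc3339(INT64_MAX))
print(int(micros_span_years()))
```

Python's `datetime` cannot represent the microsecond bounds themselves (its years stop at 1 and 9999), so the sketch reports only the width of that range.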
Blocker for the materialize-s3-iceberg path: pyiceberg ≤0.11.1 (latest) cannot write to v3 tables. write_manifest_list has no V3 writer, and data_file_statistics_from_parquet_metadata rejects ns-precision parquet column stats, so Transaction.add_files fails at append time.
Flow
Timestamps are serialized and transferred as RFC3339 strings, which range from 1 CE to 9999 CE and technically have no precision limit. The following connectors explicitly handle nanosecond precision in code:
- source-oracle: parses TIMESTAMP WITH TIME ZONE with 9-digit fractional seconds (replication.go:29-34, main.go:60-63)
- source-kafka: handles Avro TimestampNanos and LocalTimestampNanos logical types (pull.rs:514-536)
- materialize-snowflake: persists to TIMESTAMP_LTZ/NTZ/TZ at scale 9 via nanosecond-scaled binary decimal encoding (bdec.go:819-832)
Note that "parses nanos from source" (Oracle, Kafka) and "persists nanos to a ns-precision column" (Snowflake) are distinct claims. Most other materializers can carry a 9-digit RFC3339 string through opaquely, but they truncate it when writing to the destination column type.
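To make the "carry opaquely, truncate on write" distinction concrete, here is a small sketch of what truncating to a microsecond-precision column does to a 9-digit RFC3339 string. The function `truncate_fraction` is hypothetical and is not taken from any connector; it only illustrates the loss of the last three fractional digits.

```python
import re

# Date-time head, optional fractional seconds, then a zone designator.
_RFC3339 = re.compile(
    r"^(?P<head>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})"
    r"(?:\.(?P<frac>\d+))?"
    r"(?P<zone>Z|[+-]\d{2}:\d{2})$"
)


def truncate_fraction(ts: str, digits: int) -> str:
    """Cut (or zero-pad) the fractional seconds of an RFC3339 string to `digits` places."""
    m = _RFC3339.match(ts)
    if m is None:
        raise ValueError(f"not an RFC3339 timestamp: {ts!r}")
    if digits == 0:
        return f"{m['head']}{m['zone']}"
    frac = (m["frac"] or "")[:digits].ljust(digits, "0")
    return f"{m['head']}.{frac}{m['zone']}"


wire_value = "2024-05-01T12:34:56.123456789Z"
print(truncate_fraction(wire_value, 9))  # what a ns-precision column could keep
print(truncate_fraction(wire_value, 6))  # what a microsecond column keeps
```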
Notes
Out-of-range timestamp semantics
Flow's wire format allows year 0001–9999, but timestamptz_ns (int64 Unix nanos) is only valid 1677-09-21 → 2262-04-11. Out-of-range values are clamped to the ns min/max.
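The clamping rule can be sketched as follows. This is an illustration under stated assumptions, not the connector's implementation: the name `rfc3339_to_clamped_ns` is hypothetical, fractional digits beyond the ninth are dropped, and leap seconds (`:60`) are rejected because Python's `datetime` does not accept them.

```python
import re
from datetime import datetime

NS_MIN = -(2**63)     # 1677-09-21T00:12:43.145224192Z
NS_MAX = 2**63 - 1    # 2262-04-11T23:47:16.854775807Z
_EPOCH = datetime(1970, 1, 1)

_RFC3339 = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})[Tt](\d{2}):(\d{2}):(\d{2})"
    r"(?:\.(\d+))?(Z|z|[+-]\d{2}:\d{2})$"
)


def rfc3339_to_clamped_ns(ts: str) -> int:
    """Convert an RFC3339 string to int64 Unix nanoseconds, clamping out-of-range values."""
    m = _RFC3339.match(ts)
    if m is None:
        raise ValueError(f"not an RFC3339 timestamp: {ts!r}")
    year, month, day, hour, minute, second = (int(g) for g in m.groups()[:6])
    frac_ns = int((m.group(7) or "")[:9].ljust(9, "0"))

    zone = m.group(8)
    offset_s = 0
    if zone not in ("Z", "z"):
        sign = 1 if zone[0] == "+" else -1
        offset_s = sign * (int(zone[1:3]) * 3600 + int(zone[4:6]) * 60)

    # datetime is used only to count days and seconds; everything after that is
    # integer arithmetic, so no precision is lost and no intermediate overflows.
    delta = datetime(year, month, day, hour, minute, second) - _EPOCH
    seconds = delta.days * 86_400 + delta.seconds - offset_s
    ns = seconds * 1_000_000_000 + frac_ns
    return max(NS_MIN, min(NS_MAX, ns))


print(rfc3339_to_clamped_ns("0001-01-01T00:00:00Z") == NS_MIN)            # clamped low
print(rfc3339_to_clamped_ns("9999-12-31T23:59:59.999999999Z") == NS_MAX)  # clamped high
```

Clamping is silent in this sketch; whether to also log or count clamped values is a separate decision that this section leaves open.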
Reader compatibility
timestamptz_ns requires Iceberg v3 readers. Validate writes with a DuckDB-based unit test.
Schema evolution
Iceberg doesn't allow in-place promotion between timestamptz and timestamptz_ns (different physical encodings). The implementation must support bidirectional migration — both timestamptz → timestamptz_ns and timestamptz_ns → timestamptz.
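Whatever mechanism the migration uses at the table level, the values themselves must be converted between the two encodings. A hedged sketch of that per-value step, assuming the migration reuses the clamping rule from the out-of-range note and floors toward negative infinity when dropping precision; both function names are hypothetical:

```python
NS_MIN = -(2**63)
NS_MAX = 2**63 - 1


def micros_to_nanos(us: int) -> int:
    """timestamptz -> timestamptz_ns: exact inside the ns range, clamped outside it."""
    return max(NS_MIN, min(NS_MAX, us * 1_000))


def nanos_to_micros(ns: int) -> int:
    """timestamptz_ns -> timestamptz: always in range, but drops sub-microsecond digits."""
    # Floor division, so pre-1970 values round toward earlier instants.
    return ns // 1_000


print(micros_to_nanos(1_700_000_000_123_456))
print(nanos_to_micros(1_700_000_000_123_456_789))
```

The two directions are not symmetric: widening to nanoseconds can clamp values outside 1677 to 2262, while narrowing to microseconds always fits but loses the last three digits.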