materialize-s3-iceberg: s/python/iceberg-go/ #4347
93f26fc to a645fa8
@jacobmarble do we still want to do this?
I still want to do it, sooner is better IMHO. The Python Iceberg implementation is more mature than the Golang impl, but we don't push it very hard. The PR hasn't been updated recently because no one had reviewed it yet. If you support the effort (with the caveat that we mitigate regression risks) then I'll rebase and squash to make the review easier.
@jacobmarble yeah I think this is worth doing if we can mitigate the risk 👍🏽
21068fe to 68612bf
68612bf to ccfd351
a5c2c4d to 4d0993d
Description:
Replaces the embedded Python iceberg-ctl subprocess with the native Go iceberg-go library. The connector now creates and writes to Iceberg tables in-process.
New tables are created with metadata retention defaults (gzip metadata, drop prior versions on commit, expire snapshots older than 24h) to keep metadata files under object-store size limits.
As a nice side-effect, this also works around a date range issue in Python. When Iceberg metadata contains dates outside of the years 1 through 9999, the Python implementation fails to parse them because the corresponding Python date type is limited to that range. Now that Python is removed from the code path, these dates are parsed without issue.
Workflow steps:
No user-visible changes.
Documentation links affected:
None.
Notes for reviewers:
iceberg-ctl/ Python tree is removed wholesale.
go/schema-gen/generate.go now traverses oneOf/anyOf branches so fixups apply to nullable-wrapped fields emitted by invopop.
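The oneOf/anyOf traversal mentioned in the notes can be sketched as a recursive walk over a decoded JSON schema. This is not the code in generate.go: walkSchema and markDateTime are hypothetical names, the schema is modelled as plain maps rather than invopop types, and the nullable shape (a oneOf of the real type plus null) is an assumption about what the generator emits.

```go
package main

import "fmt"

// walkSchema applies fix to node and to every subschema reachable through
// "properties", "items", "oneOf", and "anyOf". Hypothetical sketch.
func walkSchema(node map[string]any, fix func(map[string]any)) {
	fix(node)

	if props, ok := node["properties"].(map[string]any); ok {
		for _, p := range props {
			if child, ok := p.(map[string]any); ok {
				walkSchema(child, fix)
			}
		}
	}
	if items, ok := node["items"].(map[string]any); ok {
		walkSchema(items, fix)
	}
	// Without this loop, a fixup never sees fields wrapped for nullability.
	for _, key := range []string{"oneOf", "anyOf"} {
		branches, ok := node[key].([]any)
		if !ok {
			continue
		}
		for _, b := range branches {
			if child, ok := b.(map[string]any); ok {
				walkSchema(child, fix)
			}
		}
	}
}

// markDateTime is a hypothetical fixup: it tags date-time string schemas.
func markDateTime(n map[string]any) {
	if n["format"] == "date-time" {
		n["x-fixed"] = true
	}
}

func main() {
	// A nullable date-time field, wrapped in oneOf (assumed shape).
	inner := map[string]any{"type": "string", "format": "date-time"}
	schema := map[string]any{
		"type": "object",
		"properties": map[string]any{
			"updated_at": map[string]any{
				"oneOf": []any{inner, map[string]any{"type": "null"}},
			},
		},
	}

	walkSchema(schema, markDateTime)
	fmt.Println("inner branch fixed:", inner["x-fixed"] == true)
}
```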