
Support Iceberg nested and binary GPU writes #14611

Open
res-life wants to merge 15 commits into NVIDIA:main from res-life:iceberg-nested

Conversation

@res-life
Collaborator

@res-life res-life commented Apr 15, 2026

Fixes #14239.

Depends on:

Description

  • Preserve Iceberg field ids and the original write schema so nested arrays, maps, and binary columns keep the Parquet layout and metrics Iceberg expects.
  • Extend Iceberg integration coverage for append, CTAS, RTAS, and overwrite flows, including focused regression coverage for binary and array writes.
  • Update integration tests (ITs) accordingly.

Checklists

Documentation

  • Updated for new or modified user-facing features or behaviors
  • No user-facing change

Testing

  • Added or modified tests to cover new code paths
  • Covered by existing tests
    (Please provide the names of the existing tests in the PR description.)
  • Not required

Performance

  • Tests ran and results are added in the PR description
  • Issue filed with a link in the PR description
  • Not required

@res-life res-life marked this pull request as draft April 15, 2026 06:33
@greptile-apps
Contributor

greptile-apps Bot commented Apr 15, 2026

Greptile Summary

This PR enables GPU-accelerated Iceberg writes for nested types (arrays, maps, structs) and binary columns by propagating Iceberg field IDs through the Spark schema metadata and into cuDF's ColumnWriterOptions. The core mechanism uses reflection to set private cuDF fields (hasParquetFieldId, parquetFieldId, isBinary) as a documented interim workaround until rapidsai/cudf#22347 lands public setters. Integration tests are substantially strengthened — the previous "expected fallback" tests are replaced with GPU-correctness checks using pre-materialized Parquet sources and assert_no_cpu_project_exec, and new focused binary/array-of-binary CTAS tests are added. The previous P1 concern (accidental removal of Hex support) is not present in this version; expr[Hex] is in GpuOverrides.scala.
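
The summary names the private cuDF fields being set through reflection. A minimal sketch of that interim workaround, assuming the fields are declared directly on ai.rapids.cudf.ColumnWriterOptions and typed boolean/int (the PR's actual helper and its exact signature are not shown here):

```scala
import ai.rapids.cudf.ColumnWriterOptions

// Hedged sketch: sets the private hasParquetFieldId / parquetFieldId fields via
// java.lang.reflect, as the summary describes. Field names come from the PR summary;
// the declaring class and field types are assumptions until cuDF#22347 adds setters.
def setParquetFieldIdViaReflection(options: ColumnWriterOptions, fieldId: Int): Unit = {
  val clazz = classOf[ColumnWriterOptions]
  val hasId = clazz.getDeclaredField("hasParquetFieldId")
  hasId.setAccessible(true)
  hasId.setBoolean(options, true)
  val id = clazz.getDeclaredField("parquetFieldId")
  id.setAccessible(true)
  id.setInt(options, fieldId)
}
```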

Confidence Score: 5/5

Safe to merge; no P0/P1 issues found — the two comments are P2 style suggestions about a deliberately temporary reflection workaround.

The previous P1 (accidental Hex removal) is absent in this revision. No resource leaks, no broken field-ID round-trip logic, and no correctness gaps were found. The reflection-based cuDF field mutation is explicitly acknowledged with a TODO + tracking issue (rapidsai/cudf#22347). Test coverage is substantively improved. Score stays at the P2-only ceiling of 5.

sql-plugin/src/main/scala/com/nvidia/spark/rapids/SchemaUtils.scala — contains the reflection layer; should be revisited once cuDF#22347 ships public setters.

Important Files Changed

  • sql-plugin/src/main/scala/com/nvidia/spark/rapids/SchemaUtils.scala: Introduces reflection-based mutation of private cuDF ColumnWriterOptions fields (hasParquetFieldId, parquetFieldId, isBinary) as a workaround for missing public setters; adds nested field-ID propagation for ArrayType/MapType and rewrites BinaryType as a marked list; flagged with TODO + cuDF tracking issue.
  • iceberg/common/src/main/scala/org/apache/iceberg/spark/GpuTypeToSparkType.scala: Adds appendNestedFieldIdMetadata and nestedMetadataJson to encode Iceberg nested field IDs into Spark StructField metadata; iterates ListType/MapType children correctly via fields().asScala (see the metadata sketch after this list).
  • iceberg/common/src/main/scala/org/apache/iceberg/spark/source/GpuSparkFileWriterFactory.scala: Converts the single lazy TaskAttemptContext to a per-call newTaskAttemptContext(sparkType) that stamps ParquetWriteSupport.setSchema into the Hadoop conf; both data and delete writers now get their correct schema.
  • iceberg/common/src/main/scala/org/apache/iceberg/spark/source/GpuSparkPositionDeltaWrite.scala: Replaces hard-coded positionDeleteSparkType/Map.empty in prepareWrite with dataSparkTypeWithFieldIds/writeProps; adds dataWriterSchemaFor to correctly pick DELETE's path-pos schema vs the Iceberg data schema for other commands.
  • sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala: Expands Iceberg cudfWrite sig to nested STRUCT/ARRAY/MAP/BINARY; allows map-of-map in MapFromArrays; adds BINARY to ExpandExec.
  • integration_tests/src/main/python/iceberg/__init__.py: Adds materialize_parquet_source helper, assert_no_cpu_project_exec, a new iceberg_nested_write_gens_list, and parquet_write_corrected_conf; all follow existing patterns.
  • integration_tests/src/main/python/iceberg/iceberg_ctas_test.py: Converts fallback-only tests to GPU-correctness tests using pre-materialized Parquet sources; adds focused binary/array CTAS coverage.
  • integration_tests/src/main/python/iceberg/iceberg_append_test.py: Replaces fallback-only append tests with GPU-correctness tests; promotes _fallback variants to positive GPU tests with assert_no_cpu_project_exec check.
  • tests/src/test/spark350/scala/org/apache/iceberg/spark/GpuTypeToSparkTypeSuite.scala: New Scala unit tests covering nestedMetadataJson/appendNestedFieldIdMetadata for primitives, flat lists/maps, nested list-of-list, map-of-list, struct-of-struct, and the full toSparkType round-trip.
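
The GpuTypeToSparkType entry above describes how nested Iceberg field IDs ride along on Spark StructField metadata. A minimal sketch of that mechanism, assuming a list column whose element ID is stored under the "rapids.parquet.list.element.field.id" key named in the sequence diagram (the function name and exact key layout here are illustrative, not the PR's code):

```scala
import org.apache.spark.sql.types.{MetadataBuilder, StructField}

// Hedged sketch: attach the column's own Parquet field id plus its list element's id
// to the parent StructField. "parquet.field.id" is the standard Spark/Parquet key;
// the nested-element key mirrors the one mentioned in the sequence diagram.
def withListFieldIds(field: StructField, fieldId: Int, elementFieldId: Int): StructField = {
  val metadata = new MetadataBuilder()
    .withMetadata(field.metadata)
    .putLong("parquet.field.id", fieldId.toLong)
    .putLong("rapids.parquet.list.element.field.id", elementFieldId.toLong)
    .build()
  field.copy(metadata = metadata)
}
```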

Sequence Diagram

sequenceDiagram
    participant IcebergSchema as Iceberg Schema
    participant GpuTypeToSparkType
    participant StructField as Spark StructField (metadata)
    participant writerOptionsFromField as SchemaUtils.writerOptionsFromField
    participant setParquetFieldId as setParquetFieldId (reflection)
    participant ColumnWriterOptions as cuDF ColumnWriterOptions
    participant ParquetFile as Parquet File

    IcebergSchema->>GpuTypeToSparkType: toSparkType(schema)
    GpuTypeToSparkType->>GpuTypeToSparkType: appendNestedFieldIdMetadata(fieldType)
    GpuTypeToSparkType->>StructField: attach {parquet.field.id, rapids.parquet.list.element.field.id, ...}
    Note over StructField: Top-level field carries nested element/key/value IDs as side-channel metadata
    StructField->>writerOptionsFromField: fieldMeta with nested IDs
    writerOptionsFromField->>writerOptionsFromField: nestedFieldMetadata() → child Metadata
    writerOptionsFromField->>ColumnWriterOptions: build() element/key/value options
    writerOptionsFromField->>setParquetFieldId: setParquetFieldId(listOptions/mapOptions, parentId)
    setParquetFieldId->>ColumnWriterOptions: reflection: hasParquetFieldId=true, parquetFieldId=N
    ColumnWriterOptions->>ParquetFile: Write with Parquet field IDs preserved

Reviews (8): Last reviewed commit: "Update docs"

Signed-off-by: Chong Gao <res_life@163.com>
@res-life res-life marked this pull request as ready for review April 16, 2026 10:05
Chong Gao added 2 commits April 17, 2026 15:55
Signed-off-by: Chong Gao <res_life@163.com>
Signed-off-by: Chong Gao <res_life@163.com>
@sameerz sameerz added the task Work required that improves the product but is not user facing label Apr 21, 2026
Copy link
Copy Markdown
Collaborator

@liurenjie1024 liurenjie1024 left a comment


Thanks @res-life for this PR. One missing point: please do a careful check of the DML tests, since some of the fallbacks are no longer necessary.

Comment thread sql-plugin/src/main/scala/com/nvidia/spark/rapids/SchemaUtils.scala
Comment thread integration_tests/src/main/python/iceberg/iceberg_append_test.py
Comment thread integration_tests/src/main/python/iceberg/iceberg_ctas_test.py
Comment thread integration_tests/src/main/python/iceberg/iceberg_delete_test.py Outdated
Comment thread integration_tests/src/main/python/iceberg/iceberg_delete_test.py Outdated
Comment thread iceberg/common/src/main/scala/org/apache/iceberg/spark/GpuTypeToSparkType.scala Outdated
Chong Gao and others added 8 commits April 30, 2026 14:56
Use ParquetUtil.footerMetrics directly instead of routing the original
Iceberg schema through GpuParquetMetricsHelper. With field IDs correctly
written in the Parquet footer, the schema argument is redundant.

Addresses review comment from liurenjie1024 on PR NVIDIA#14611.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The two @allow_non_gpu("ColumnarToRowExec", "BatchScanExec") decorators
on Iceberg delete tests are no longer required.

Addresses review comment from liurenjie1024 on PR NVIDIA#14611.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Drops the queue-based sub-batch emission change that was bundled into
the nested/binary writes branch. It is unrelated to Iceberg field-id
preservation and should be pursued separately.

Addresses review comment from liurenjie1024 on PR NVIDIA#14611.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Drop the standalone IcebergParquetMetadataKeys file. The nested
list-element / map-key / map-value field-id keys are an internal
contract between the Iceberg writer (GpuTypeToSparkType) and
SchemaUtils, so define them where they're consumed and rename them
without the iceberg-specific prefix. Replace SchemaUtils' duplicate
private FIELD_ID_METADATA_KEY with the canonical
ParquetUtils.FIELD_ID_METADATA_KEY.

Addresses review comment from liurenjie1024 on PR NVIDIA#14611.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
cuDF already exposes withBinaryColumn(name, nullable, parquetFieldId)
on NestedBuilder, so the buildBinaryOptions / markAsBinary helpers and
the isBinary reflection are unnecessary. Switch the BinaryType writer
paths to call the public API directly. The remaining reflection on
hasParquetFieldId / parquetFieldId for list and map containers stays
until cuDF exposes builder variants that accept parquetFieldId; comment
updated to reflect that and drop the bogus issues/NEW placeholder.

Addresses review comment from liurenjie1024 on PR NVIDIA#14611.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
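
Editor's note: a minimal sketch of the public-API path this commit switches to. withBinaryColumn(name, nullable, parquetFieldId) is the NestedBuilder method the commit names; the surrounding builder call, column name, and field id are illustrative values.

```scala
import ai.rapids.cudf.ParquetWriterOptions

// Hedged sketch: build writer options for a binary column through the public builder
// instead of reflection. Values are placeholders.
def binaryWriterOptions(): ParquetWriterOptions =
  ParquetWriterOptions.builder()
    .withBinaryColumn("payload", true, 3) // column name, nullable, Parquet field id
    .build()
```
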
iceberg_full_gens_list already covers nested arrays, structs, and maps,
so the *_nested_types variants in iceberg_append_test.py and
iceberg_ctas_test.py duplicate the *_all_cols tests. Keep the
binary-focused CTAS test since it exercises a distinct code path.

Addresses review comments from liurenjie1024 on PR NVIDIA#14611.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add scaladoc to nestedMetadataJson / appendNestedFieldIdMetadata
explaining why nested Iceberg field ids are stored on the parent
StructField via flat keys plus JSON-serialized sub-Metadata, and how
SchemaUtils consumes them. Promote both helpers to package-private so
they can be exercised directly.

Add GpuTypeToSparkTypeSuite covering primitive, list, map, list-of-list,
map-of-(list, list), struct-of-struct, and an end-to-end toSparkType
round-trip that confirms the nested ids land on the parent field.

Addresses review comment from liurenjie1024 on PR NVIDIA#14611.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
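
Editor's note: an illustrative shape of the round-trip input this suite exercises, using real Iceberg schema-building APIs; the expectation in the trailing comment paraphrases the commit message rather than quoting the actual test code.

```scala
import org.apache.iceberg.Schema
import org.apache.iceberg.types.Types
import org.apache.iceberg.types.Types.NestedField

// Hedged sketch: a list column (field id 1) whose element carries its own field id (2).
def listSchema(): Schema = new Schema(
  NestedField.required(1, "ids", Types.ListType.ofRequired(2, Types.IntegerType.get()))
)
// toSparkType(listSchema()) is expected to yield a StructType whose "ids" field
// metadata carries parquet.field.id = 1 plus the element id (2) via the nested-id keys.
```
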
The branch's GpuOverrides.scala / mathExpressions.scala edits dropped
the Hex registration that landed on main in PR NVIDIA#14575, so hex(...) was
silently falling back to CPU. Reinstate:

- expr[Hex] entry in GpuOverrides
- GpuHex case class in mathExpressions
- test_hex coverage in arithmetic_ops_test.py
- Hex rows in operatorsScore.csv / supportedExprs.csv (3.3.0+)
- Hex docs in advanced_configs.md and supported_ops.md

Addresses Greptile P1 on PR NVIDIA#14611.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@res-life res-life marked this pull request as draft April 30, 2026 09:02
@res-life res-life marked this pull request as ready for review April 30, 2026 23:41
@res-life
Collaborator Author

build

Signed-off-by: Chong Gao <res_life@163.com>
@res-life
Collaborator Author

res-life commented May 1, 2026

build

Chong Gao added 2 commits May 1, 2026 10:47
Signed-off-by: Chong Gao <res_life@163.com>
@res-life
Collaborator Author

res-life commented May 1, 2026

build


Labels

task Work required that improves the product but is not user facing


Development

Successfully merging this pull request may close these issues.

[FEA] Support write nested data type for iceberg.

4 participants