splitTargetSizeInHalfGpu by data size if not target size by zpuller · Pull Request #14684 · NVIDIA/spark-rapids

zpuller · 2026-04-27T18:00:56Z

Description

splitTargetSizeInHalfGpu is used as the split policy in both the GPU shuffle coalesce path and the join gather path (AbstractGpuJoinIterator). When OOM occurs, it halves targetSize and retries — but if the actual data is already smaller than targetSize/2, this is a no-op and the retry will OOM again. The shuffle coalesce path had an element-count fallback for this case; the join path did not.

This change moves the fallback logic into the split policy so it can be handled in a consistent way for both shuffle coalesce and join.

In the join case, we cannot discretely divide by elements in the same way as we don't have the full collection ahead of time, so we take a different approach that still accomplishes approximately the same goal; if the target size / 2 is > the total estimated data size, when the split would have previously failed/been a no-op, we instead choose the new target to be the estimated total data size / 2 instead of target size / 2.

Checklists

Documentation

Updated for new or modified user-facing features or behaviors
No user-facing change

Testing

Added or modified tests to cover new code paths
Covered by existing tests
(Please provide the names of the existing tests in the PR description.)
Not required

Performance

Tests ran and results are added in the PR description
Issue filed with a link in the PR description
Not required

Signed-off-by: Zach Puller <zpuller@nvidia.com>

zpuller · 2026-04-27T18:21:26Z

build

greptile-apps · 2026-04-27T18:26:08Z

Greptile Summary

This PR fixes a no-op OOM-retry bug in both the GPU shuffle coalesce and join gather paths by adding a dataSize hint to AutoCloseableTargetSize and using dataSize/2 as the split target when the actual data is already smaller than targetSize/2. The logic is centralized in splitTargetSizeInHalfInternal and the dataSize field is preserved across splits so the optimization can fire consistently.

Confidence Score: 4/5

Safe to merge; the core fix is correct and the two issues are edge-case regressions unlikely to surface in production.

All findings are P2. The main concern is the removed element-count fallback in createSplitPolicyByTargetSize: for zero-byte tables, splitIndex stays 0 and concat([]) crashes rather than producing a handled OOM. The Long.MaxValue loop-exit trick is a minor style smell. Neither affects the primary happy-path fix.

sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuShuffleCoalesceExec.scala — specifically the createSplitPolicyByTargetSize split loop and the removed element-count fallback.

Important Files Changed

Filename	Overview
sql-plugin/src/main/scala/com/nvidia/spark/rapids/RmmRapidsRetryIterator.scala	Adds `dataSize` field to `AutoCloseableTargetSize` and uses it in `splitTargetSizeInHalfInternal` to avoid no-op retries; `dataSize` is now forwarded to the result so the optimization can apply on subsequent splits.
sql-plugin/src/main/scala/com/nvidia/spark/rapids/AbstractGpuJoinIterator.scala	Moves `targetSizeWrapper` creation inside the `gathererStore.map` block so `estimatedDataSize` (rows × per-row estimate) is passed to the retry wrapper, enabling the new `dataSize/2` optimization in the join gather path.
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuShuffleCoalesceExec.scala	Removes element-count fallback in `createSplitPolicyByTargetSize`, relying on `dataSize/2` to guarantee a valid byte-split; two P2 issues: zero-byte tables leave `splitIndex == 0` (crash on `concat([])`), and the `Long.MaxValue` sentinel relies on integer overflow semantics.
tests/src/test/scala/com/nvidia/spark/rapids/WithRetrySuite.scala	Adds a targeted test for `splitTargetSizeInHalfGpu` using `dataSize/2` when `dataSize < targetSize/2`; verifies the new split target is `dataSize/2 = 100` and not `targetSize/2 = 500`.
tests/src/test/scala/com/nvidia/spark/rapids/GpuShuffleCoalesceSuite.scala	Lowers `minSplitSize` from 1 KB to 1 byte in the forced-split-retry test to match smaller test data; comment explains the production default (10 MB).

Sequence Diagram

sequenceDiagram
    participant Caller
    participant withRetry
    participant splitPolicy
    participant splitTargetSizeInHalfInternal

    Caller->>withRetry: AutoCloseableTargetSize(targetSize, minSize, dataSize)
    withRetry->>Caller: attempt (targetSize)
    Caller-->>withRetry: GpuSplitAndRetryOOM
    withRetry->>splitPolicy: split(target)
    splitPolicy->>splitTargetSizeInHalfInternal: target
    alt dataSize > 0 AND dataSize <= targetSize/2
        splitTargetSizeInHalfInternal-->>splitPolicy: newTarget = dataSize/2
    else
        splitTargetSizeInHalfInternal-->>splitPolicy: newTarget = targetSize/2
    end
    splitPolicy-->>withRetry: AutoCloseableTargetSize(newTarget, minSize, dataSize)
    withRetry->>Caller: attempt (newTarget, smaller)
    Caller-->>withRetry: success
    withRetry-->>Caller: result

_{Reviews (2): Last reviewed commit: "pr comments" | Re-trigger Greptile}

Signed-off-by: Zach Puller <zpuller@nvidia.com>

zpuller · 2026-04-27T19:20:08Z

build

zpuller added 3 commits April 27, 2026 11:00

splitTargetSizeInHalfGpu by elements if not bytes

e3c7433

Signed-off-by: Zach Puller <zpuller@nvidia.com>

license header

367c1de

Signed-off-by: Zach Puller <zpuller@nvidia.com>

revert InternalRowToColumnarBatchIterator.java

f8aaa76

Signed-off-by: Zach Puller <zpuller@nvidia.com>

zpuller marked this pull request as ready for review April 27, 2026 18:21

zpuller requested a review from a team April 27, 2026 18:22

greptile-apps Bot reviewed Apr 27, 2026

View reviewed changes

Comment thread sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuShuffleCoalesceExec.scala

Comment thread sql-plugin/src/main/scala/com/nvidia/spark/rapids/RmmRapidsRetryIterator.scala Outdated

pr comments

f7dd670

Signed-off-by: Zach Puller <zpuller@nvidia.com>

sameerz added the bug Something isn't working label Apr 28, 2026

zpuller requested a review from abellina April 28, 2026 16:31

zpuller changed the title ~~splitTargetSizeInHalfGpu by elements if not bytes~~ splitTargetSizeInHalfGpu by data size if not target size May 1, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

splitTargetSizeInHalfGpu by data size if not target size#14684

splitTargetSizeInHalfGpu by data size if not target size#14684
zpuller wants to merge 4 commits intoNVIDIA:mainfrom
zpuller:split_target

zpuller commented Apr 27, 2026 •

edited

Loading

Uh oh!

zpuller commented Apr 27, 2026

Uh oh!

greptile-apps Bot commented Apr 27, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

zpuller commented Apr 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

zpuller commented Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Checklists

Uh oh!

zpuller commented Apr 27, 2026

Uh oh!

greptile-apps Bot commented Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

Uh oh!

Uh oh!

zpuller commented Apr 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

zpuller commented Apr 27, 2026 •

edited

Loading

greptile-apps Bot commented Apr 27, 2026 •

edited

Loading