fix: harden uring completion boundaries#14
Merged
Conversation
Signed-off-by: Robin He <git@hybscloud.com>
Signed-off-by: Robin He <git@hybscloud.com>
Signed-off-by: Robin He <git@hybscloud.com>
Signed-off-by: Robin He <git@hybscloud.com>
Signed-off-by: Robin He <git@hybscloud.com>
Signed-off-by: Robin He <git@hybscloud.com>
Signed-off-by: Robin He <git@hybscloud.com>
Signed-off-by: Robin He <git@hybscloud.com>
Signed-off-by: Robin He <git@hybscloud.com>
Signed-off-by: Robin He <git@hybscloud.com>
Signed-off-by: Robin He <git@hybscloud.com>
Signed-off-by: Robin He <git@hybscloud.com>
Signed-off-by: Robin He <git@hybscloud.com>
There was a problem hiding this comment.
Pull request overview
This PR hardens the uring completion boundary by introducing a unified completion-error decoder, making Wait/WaitDirect/WaitExtended perform an IOPOLL-aware empty-CQ visibility check, and adding MultishotSubscription.HandleCQE for in-loop multishot dispatch while keeping the pooled ExtSQE alive across cancel-submit. Documentation across the README variants and GoDocs is reorganized to clarify caller-side serialization, retired-route handling, and zero-copy/xattr buffer lifetimes.
Changes:
- New
CompletionErrorhelper plusErr()methods onCQEView/DirectCQE/ExtCQE, with all wait-family paths funneled through a sharedobserveCQEmptyLockedso IOPOLL rings poll before reporting empty. MultishotSubscriptiongainsHandleCQEfor same-loop dispatch and acancelSubmitguard that defers ExtSQE retirement until cancel submission no longer needs theuser_data.- README/GoDoc rewrites: clarify Wait/WaitDirect/WaitExtended serialization, multishot subscription lifecycle, ExtSQE GC obligations, xattr value lifetimes, ZCRX boundary contract; renamed
AsyncCancelFDtest toAsyncCancel.
Reviewed changes
Copilot reviewed 41 out of 42 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| errors.go / errors_darwin.go / errors_test.go | Adds CompletionError + Err() on completion view types and tests. |
| cqe_view.go / cqe_view_darwin.go / cqe_view_test.go | Doc updates; consolidates ExtCQE test from removed file. |
| cqe_extended_linux.go / cqe_extended_darwin.go / cqe_direct_linux.go / cqe_direct_darwin.go | Use observeCQEmptyLocked; doc updates. |
| cqe_extended_internal_test.go (deleted) | Test moved into cqe_view_test.go. |
| io_uring_linux.go | Splits enter() into wakeSQPollLocked + flushSubmitLocked; adds cqReady and observeCQEmptyLocked for IOPOLL visibility; wait/waitBatch loop on transient empty. |
| io_uring_linux_test.go | Adds tests for new helpers and consolidated CQE32 stride tests. |
| cq_stride_linux_test.go (deleted) | Tests folded into io_uring_linux_test.go. |
| io_uring_darwin_internal_test.go | Imports the SQE view test from removed darwin file. |
| sqe_view_darwin_test.go (deleted) | Test moved to io_uring_darwin_internal_test.go. |
| multishot.go | Adds HandleCQE; renames canceling → cancelSubmit; defers ExtSQE retirement until cancel submission completes; expanded thread-safety doc. |
| multishot_darwin.go | Adds Darwin HandleCQE stub returning false. |
| multishot_internal_linux_test.go | New tests for HandleCQE, cancel-submit ext retention, and benchmarks. |
| multishot_test.go | Switches integration test from AsyncCancelFD to AsyncCancel. |
| multishot_manual_dispatch_test.go (deleted) | Helper moved into testhelpers_linux_test.go. |
| testhelpers_linux_test.go | Adds dispatchSimplifiedMultishotCQE helper. |
| extended_sqe_reset_internal_linux_test.go (deleted) | Tests folded into interface_wrapper_internal_linux_test.go. |
| interface_wrapper_internal_linux_test.go | Adds advanced-registration failure tests, ResizeRings failure test, and the moved ExtSQE-reset tests. |
| interface_linux.go / interface_darwin.go | Doc updates around serialization, AsyncCancel, xattr, multicast/zero-copy buffer lifetime. |
| operations_linux.go | Doc-only clarification on asyncCancel. |
| ctx.go / ctx_darwin.go | ExtSQE doc note about GC not tracing UserData. |
| submit_state.go | Doc update naming Wait/WaitDirect/WaitExtended in serialization contract. |
| doc.go | Doc rewrite of multishot example and lifecycle. |
| README.md / README.zh-CN.md / README.ja.md / README.fr.md / README.es.md | Reflow paragraphs and add new wait-name/HandleCQE guidance. |
| examples/file_io_test.go / examples/testhelpers_test.go | Use new cqe.Err() API. |
| zerocopy_test.go | Bumps payload above multicast single-target threshold; adds skip on EOPNOTSUPP and stricter completion expectations. |
| cq_overflow_test.go | Adds non-nil params to test ring fixture. |
| go.mod / go.sum | Bumps internal code.hybscloud.com/* module versions. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR hardens the
uringcompletion boundary. CQE results now decode through one public helper, IOPOLL waits recheck completion visibility before empty-queue classification, and multishot completions route through their owning subscription.The boundary stays narrow:
uringpreserves kernel CQE facts while caller-side runtime code owns retained route state, retry cadence, scheduling, and higher-level policy.Changes
CompletionErrorplus.Err()helpers on copied CQE observations.Wait,WaitDirect, andWaitExtendeduse the IOPOLL-aware empty-CQ observation path before returningiox.ErrWouldBlock.MultishotSubscription.HandleCQE(CQEView) boolfor immediate same-loop multishot dispatch and retain the ExtSQE until cancel submission no longer needs itsuser_data.