[BUG FIX] Fix contact point pruning on Apple Metal. by duburcqa · Pull Request #2831 · Genesis-Embodied-AI/genesis-world

duburcqa · 2026-05-25T15:32:54Z

No description provided.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 087a4cbadd

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

…is-Embodied-AI#2831 Rebase of my cooperative-warp dedup work onto Alexis's Genesis-Embodied-AI#2831 (fix_contact_pruning_batched) which fixes the Quadrants Metal codegen bug that was blocking dedup on macOS. PR Genesis-Embodied-AI#2831's contact.py changes are incorporated verbatim into my cooperative kernel: 1. Memory-fence scratch write between the lower-hull and upper-hull passes of Andrew's monotone chain (contact_hull_stack[max_contact_pairs - 1, i_b] = 0). Without this, on Metal _B >= 2 the upper-hull reads of contact_hull_stack don't observe the lower-hull writes and every candidate is kept (hull size == bucket size, no pruning). Standalone repro: perso_hugh/prot/qd_metal_hull_chain_visibility_repro.py 2. New tunable prune_hull_collinear_tol (1e-3 default) separated from the depth coplanarity tol. Lets users tune the hull-popper cross- product slop independently of the depth gate. 3. Deep-penetration restore threshold changed from average to MAX of hull penetrations. A non-hull contact whose penetration substantially exceeds the deepest hull vertex represents a distinct physical support (deep middle of a long body) that the support-polygon argument doesn't authorize dropping. Side cleanups now possible because Genesis-Embodied-AI#2831 fixes the Metal bug: * Removed the gs.backend not in (gs.cuda, gs.amdgpu) guard from collider.py. Dedup is now enabled on Metal too; only CPU is still excluded (the cooperative kernel adds overhead without parallel speedup on a serial backend). * Removed the @pytest.mark.skipif(darwin) from test_contact_pruning; the test now runs on every GPU backend. The cooperative-warp structure (32 threads/env, qd.simt.subgroup reductions for the per-bucket mean-normal / centroid, coop projection and mark-drop, fused phase 3 compact+spatial-sort) is preserved from the previous branch. Note: this branch is currently based on duburcqa:fix_contact_pruning_batched (Genesis-Embodied-AI#2831). Once Genesis-Embodied-AI#2831 merges, the branch base should be moved back to origin/main and the Genesis-Embodied-AI#2831 commit will simply be absorbed. Backup of the pre-rebase tip: hp/dedup-alexis-perf-9d-stage3-r2-pre-2831-rebase-backup.

hughperkins · 2026-05-25T16:46:28Z

Hi @duburcqa — heads-up that this PR appears to introduce a regression in test_smooth_box_no_drift for the [False-capsule-prim-prim] and [False-capsule-prim-mesh] variants.

I verified this by running the failing tests on duburcqa/fix_contact_pruning_batched at 1957db6 directly (no other changes on top) via a cluster GPU node (--backend ndarray):

2 failed, 26 passed in 116.66s

FAILED tests/test_rigid_physics.py::test_smooth_box_no_drift[False-capsule-prim-prim]
FAILED tests/test_rigid_physics.py::test_smooth_box_no_drift[False-capsule-prim-mesh]

Max abs diff among violations: 0.01361053 (atol=0.001)
Mismatched elements: 8 / 32 (25%)

The [True-capsule-prim-prim] and [True-capsule-prim-mesh] variants pass on the same branch, so it's specific to gjk_collision=False + capsule + primitive entity.

The failing scenes are single-link single-geom (a primitive capsule on either a primitive box or a convex mesh box), so the link-pair pruning kernel isn't even invoked for them (link_pair_pruning_supported = False). My best guess is the new prune_hull_collinear_tol field on ColliderInfo is changing the dataclass shape and forcing other kernels that reference collider_info (e.g. func_clamp_and_sort_contacts) to recompile, producing tiny LLVM/codegen drift that compounds past the test's tight atol=1e-3 over 300 steps.

Same failure shows up identically in deskai6 PR #2830 (rebased on top of this PR), so my downstream cooperative-kernel changes aren't the cause.

Let me know if you'd like the full failure logs or a more targeted reproduction.

…backends + n_envs Adopt Alexis's fused func_clamp_prune_and_sort_contacts kernel (single per-env serial pass that does clamp + optional pruning + optional spatial sort, gated at compile time via qd.static). This supersedes the deskai6 cooperative warp-per-env split into func_prune_contacts + func_clamp_and_sort_contacts: the fusion removes one kernel launch + one full 11-field permute pass per step, which on net matches or beats the cooperative-warp approach without needing any backend-specific gating. Remove the deskai6 gates that previously disabled dedup on CPU and at n_envs > 8192. Both were tied to the cooperative-kernel approach (warp-cooperative loops serialise on CPU; n_envs > 8192 oversaturates the GPU grid). The fused per-env serial kernel has no such overhead, so dedup now runs unconditionally whenever the scene is eligible (link_pair_pruning_supported and not autodiff). Also remove _effective_pruning_tolerance / _DEDUP_DEFAULT_TOLERANCE / _DEDUP_MAX_N_ENVS: the kernel now reads contact_pruning_tolerance directly from the user option (upstream default 0.1 post Genesis-Embodied-AI#2831), matching the fused kernel's expectation. Remove the @pytest.mark.parametrize backend=gs.gpu restriction on test_contact_pruning so it runs on CPU + GPU per matrix.

duburcqa requested a review from YilingQiao as a code owner May 25, 2026 15:32

duburcqa force-pushed the fix_contact_pruning_batched branch from 087a4cb to 1957db6 Compare May 25, 2026 15:33

chatgpt-codex-connector Bot reviewed May 25, 2026

View reviewed changes

Comment thread genesis/engine/solvers/rigid/collider/contact.py Outdated

hughperkins mentioned this pull request May 25, 2026

[MISC] Enable link-level dedupe for gpu and n_nenvs <= 8192 and add optimizations. #2830

Draft

3 tasks

duburcqa force-pushed the fix_contact_pruning_batched branch 2 times, most recently from 7cc824d to c9a77ca Compare May 25, 2026 16:45

duburcqa force-pushed the fix_contact_pruning_batched branch 4 times, most recently from 7401b13 to dc0c216 Compare May 25, 2026 17:01

Fix contact point pruning in batched sim.

4ca455e

duburcqa force-pushed the fix_contact_pruning_batched branch from dc0c216 to 4ca455e Compare May 25, 2026 17:37

duburcqa changed the title ~~[BUG FIX] Fix contact point pruning in batched sim.~~ [BUG FIX] Fix contact point pruning on Apple Metal. May 25, 2026

duburcqa merged commit 7c6c531 into Genesis-Embodied-AI:main May 25, 2026
4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG FIX] Fix contact point pruning on Apple Metal.#2831

[BUG FIX] Fix contact point pruning on Apple Metal.#2831
duburcqa merged 1 commit into
Genesis-Embodied-AI:mainfrom
duburcqa:fix_contact_pruning_batched

duburcqa commented May 25, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

hughperkins commented May 25, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

duburcqa commented May 25, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

hughperkins commented May 25, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants