Skip to content

[BUG FIX] Fix contact point pruning on Apple Metal.#2831

Merged
duburcqa merged 1 commit into
Genesis-Embodied-AI:mainfrom
duburcqa:fix_contact_pruning_batched
May 25, 2026
Merged

[BUG FIX] Fix contact point pruning on Apple Metal.#2831
duburcqa merged 1 commit into
Genesis-Embodied-AI:mainfrom
duburcqa:fix_contact_pruning_batched

Conversation

@duburcqa
Copy link
Copy Markdown
Collaborator

No description provided.

@duburcqa duburcqa requested a review from YilingQiao as a code owner May 25, 2026 15:32
@duburcqa duburcqa force-pushed the fix_contact_pruning_batched branch from 087a4cb to 1957db6 Compare May 25, 2026 15:33
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 087a4cbadd

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread genesis/engine/solvers/rigid/collider/contact.py Outdated
hughperkins added a commit to hughperkins/Genesis that referenced this pull request May 25, 2026
…is-Embodied-AI#2831

Rebase of my cooperative-warp dedup work onto Alexis's Genesis-Embodied-AI#2831
(fix_contact_pruning_batched) which fixes the Quadrants Metal codegen
bug that was blocking dedup on macOS. PR Genesis-Embodied-AI#2831's contact.py changes are
incorporated verbatim into my cooperative kernel:

1. Memory-fence scratch write between the lower-hull and upper-hull
   passes of Andrew's monotone chain (contact_hull_stack[max_contact_pairs - 1, i_b] = 0).
   Without this, on Metal _B >= 2 the upper-hull reads of
   contact_hull_stack don't observe the lower-hull writes and every
   candidate is kept (hull size == bucket size, no pruning).
   Standalone repro: perso_hugh/prot/qd_metal_hull_chain_visibility_repro.py

2. New tunable prune_hull_collinear_tol (1e-3 default) separated from
   the depth coplanarity tol. Lets users tune the hull-popper cross-
   product slop independently of the depth gate.

3. Deep-penetration restore threshold changed from average to MAX of
   hull penetrations. A non-hull contact whose penetration substantially
   exceeds the deepest hull vertex represents a distinct physical
   support (deep middle of a long body) that the support-polygon
   argument doesn't authorize dropping.

Side cleanups now possible because Genesis-Embodied-AI#2831 fixes the Metal bug:

* Removed the gs.backend not in (gs.cuda, gs.amdgpu) guard from
  collider.py. Dedup is now enabled on Metal too; only CPU is still
  excluded (the cooperative kernel adds overhead without parallel
  speedup on a serial backend).
* Removed the @pytest.mark.skipif(darwin) from test_contact_pruning;
  the test now runs on every GPU backend.

The cooperative-warp structure (32 threads/env, qd.simt.subgroup
reductions for the per-bucket mean-normal / centroid, coop projection
and mark-drop, fused phase 3 compact+spatial-sort) is preserved from
the previous branch.

Note: this branch is currently based on duburcqa:fix_contact_pruning_batched (Genesis-Embodied-AI#2831).
Once Genesis-Embodied-AI#2831 merges, the branch base should be moved back to origin/main and
the Genesis-Embodied-AI#2831 commit will simply be absorbed.

Backup of the pre-rebase tip: hp/dedup-alexis-perf-9d-stage3-r2-pre-2831-rebase-backup.
@duburcqa duburcqa force-pushed the fix_contact_pruning_batched branch 2 times, most recently from 7cc824d to c9a77ca Compare May 25, 2026 16:45
@hughperkins
Copy link
Copy Markdown
Collaborator

Hi @duburcqa — heads-up that this PR appears to introduce a regression in test_smooth_box_no_drift for the [False-capsule-prim-prim] and [False-capsule-prim-mesh] variants.

I verified this by running the failing tests on duburcqa/fix_contact_pruning_batched at 1957db6 directly (no other changes on top) via a cluster GPU node (--backend ndarray):

2 failed, 26 passed in 116.66s

FAILED tests/test_rigid_physics.py::test_smooth_box_no_drift[False-capsule-prim-prim]
FAILED tests/test_rigid_physics.py::test_smooth_box_no_drift[False-capsule-prim-mesh]

Max abs diff among violations: 0.01361053 (atol=0.001)
Mismatched elements: 8 / 32 (25%)

The [True-capsule-prim-prim] and [True-capsule-prim-mesh] variants pass on the same branch, so it's specific to gjk_collision=False + capsule + primitive entity.

The failing scenes are single-link single-geom (a primitive capsule on either a primitive box or a convex mesh box), so the link-pair pruning kernel isn't even invoked for them (link_pair_pruning_supported = False). My best guess is the new prune_hull_collinear_tol field on ColliderInfo is changing the dataclass shape and forcing other kernels that reference collider_info (e.g. func_clamp_and_sort_contacts) to recompile, producing tiny LLVM/codegen drift that compounds past the test's tight atol=1e-3 over 300 steps.

Same failure shows up identically in deskai6 PR #2830 (rebased on top of this PR), so my downstream cooperative-kernel changes aren't the cause.

Let me know if you'd like the full failure logs or a more targeted reproduction.

@duburcqa duburcqa force-pushed the fix_contact_pruning_batched branch 4 times, most recently from 7401b13 to dc0c216 Compare May 25, 2026 17:01
@duburcqa duburcqa force-pushed the fix_contact_pruning_batched branch from dc0c216 to 4ca455e Compare May 25, 2026 17:37
@duburcqa duburcqa changed the title [BUG FIX] Fix contact point pruning in batched sim. [BUG FIX] Fix contact point pruning on Apple Metal. May 25, 2026
@duburcqa duburcqa merged commit 7c6c531 into Genesis-Embodied-AI:main May 25, 2026
4 checks passed
hughperkins added a commit to hughperkins/Genesis that referenced this pull request May 25, 2026
…backends + n_envs

Adopt Alexis's fused func_clamp_prune_and_sort_contacts kernel (single per-env
serial pass that does clamp + optional pruning + optional spatial sort, gated
at compile time via qd.static). This supersedes the deskai6 cooperative
warp-per-env split into func_prune_contacts + func_clamp_and_sort_contacts:
the fusion removes one kernel launch + one full 11-field permute pass per
step, which on net matches or beats the cooperative-warp approach without
needing any backend-specific gating.

Remove the deskai6 gates that previously disabled dedup on CPU and at
n_envs > 8192. Both were tied to the cooperative-kernel approach
(warp-cooperative loops serialise on CPU; n_envs > 8192 oversaturates the
GPU grid). The fused per-env serial kernel has no such overhead, so dedup
now runs unconditionally whenever the scene is eligible
(link_pair_pruning_supported and not autodiff).

Also remove _effective_pruning_tolerance / _DEDUP_DEFAULT_TOLERANCE /
_DEDUP_MAX_N_ENVS: the kernel now reads contact_pruning_tolerance directly
from the user option (upstream default 0.1 post Genesis-Embodied-AI#2831), matching the fused
kernel's expectation. Remove the @pytest.mark.parametrize backend=gs.gpu
restriction on test_contact_pruning so it runs on CPU + GPU per matrix.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants