
fix(validation/G4): emit simulator-resolved spawn z in Scenic so pose_tolerance compares apples-to-apples (Option A)#23

Merged: KE7 merged 4 commits into main from fix/g4-pose-frame-option-a on May 30, 2026


KE7 (Owner) commented on May 30, 2026

Problem

G4 family-C pose_tolerance compares the Scenic-sampled object pose against the post-reset MuJoCo pose, requiring agreement within 5 mm (validation plan §4). But the renderer emitted every object at a bare TABLE_Z placeholder z (renderer/scenic_renderer.py:401 "Use TABLE_Z as placeholder — simulator overrides with actual z") while the simulator resolved the real settled z. The two sides lived in different z-frames, so every movable failed on an 8–18 cm z mismatch → 0/97 pose_tolerance True, even though class_match and xy were correct.
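
To make the failure mode concrete, here is a minimal sketch of a 5 mm pose check. It assumes the tolerance is applied to the Euclidean distance between the two positions (the validation plan may define it per axis instead), and the function name, table height, and clearance are hypothetical values chosen for illustration, not taken from the codebase.

```python
import math

POSE_TOLERANCE_M = 0.005  # 5 mm, the tolerance quoted above


def pose_within_tolerance(sampled, settled, tol=POSE_TOLERANCE_M):
    """True when two (x, y, z) positions are within tol metres (Euclidean; an assumption)."""
    return math.dist(sampled, settled) <= tol


# Hypothetical numbers: table surface at 0.90 m, object settles 0.10 m above it.
table_z = 0.90
settled = (0.10, 0.20, table_z + 0.10)      # pose read back from the simulator
placeholder = (0.10, 0.20, table_z)         # bare placeholder z: xy agrees, z does not
resolved = (0.10, 0.20, table_z + 0.10)     # z resolved before emission

print(pose_within_tolerance(placeholder, settled))
print(pose_within_tolerance(resolved, settled))
```

With matching xy, the placeholder pose still fails because the whole error sits in z, which is the pattern described above.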

This is the 4th, independent defect surfaced after PR #22, root-caused in ~/.omar/ea/4/validation_run2/rca/stage1_g4_consistency_pose_frame_mismatch.md (Option A recommended).

Fix (Option A): one shared helper resolves spawn z; renderer emits it concretely

src/libero_infinity/asset_metadata.py is the single source of truth:

surface_spawn_z(surface_z, asset_class) = surface_z + spawn_clearance(asset_class)
  • Renderer calls it at codegen and emits the concrete float (surface_spawn_z(TABLE_SURFACE_Z, class)) for absolutely-placed LIBEROObjects — no symbolic call, no TABLE_Z placeholder. Relative/contained placements (offset by … 0.0) are unchanged (their z derives from the support relation, not the table surface).
  • Simulator _surface_spawn_z now delegates to the same helper, and _infer_root_surface_z inverts the same clearance, so the round trip is consistent (inferred surface ≈ TABLE_Z). The simulator's z-override becomes a no-op for agreeing objects.
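
The shared-helper arrangement can be sketched as follows. This is an illustration under stated assumptions, not the code in `asset_metadata.py`: the clearance values and table height are invented, and `infer_root_surface_z` here is a simplified stand-in for the simulator's `_infer_root_surface_z`.

```python
import math

# Hypothetical values for illustration; the real clearances are measured.
TABLE_SURFACE_Z = 0.90
SPAWN_CLEARANCES = {"bowl": 0.02, "plate": 0.01}


def spawn_clearance(asset_class):
    """Body-origin height above the supporting surface for one class."""
    return SPAWN_CLEARANCES[asset_class]


def surface_spawn_z(surface_z, asset_class):
    """Single place where spawn z is resolved (used by both sides)."""
    return surface_z + spawn_clearance(asset_class)


def infer_root_surface_z(body_z, asset_class):
    """Inverse of surface_spawn_z: recover the surface height from a body z."""
    return body_z - spawn_clearance(asset_class)


spawn_z = surface_spawn_z(TABLE_SURFACE_Z, "bowl")
recovered = infer_root_surface_z(spawn_z, "bowl")
print(f"{spawn_z:.4f}")
print(math.isclose(recovered, TABLE_SURFACE_Z, abs_tol=1e-9))
```

Because both directions read the same clearance, the round trip returns the original surface height up to floating-point error.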

Half-height plumbing (no hardcoded constants)

The old _surface_spawn_z used bbox_height / 2, which is wrong: the MuJoCo body origin is not the geometric centre, and objects settle onto collision geometry the bounding box doesn't capture (empirically surface + h/2 was 18–90 mm off the settled z). Analytical lowest-collision-point reconstruction was also unreliable (~20 mm error on scanned objects due to canonical-orientation effects).

So the per-class spawn clearance — the settled body-origin height above TABLE_Z — is measured from the authoritative LIBERO MuJoCo assets by scripts/measure_spawn_clearances.py (deterministic seed, median over many table-resting instances, kitchen-table frame) and stored in data/spawn_clearances.json (a generated registry, like the existing OBJECT_DIMENSIONS). Re-run the generator after asset upgrades. Classes absent from the registry fall back to the legacy bbox approximation so the helper stays total (e.g. unmeasured OOD object-axis variants).

15 classes measured, covering all SMOKE_TASKS movables (bowl, plate, wine_bottle, cream_cheese, frypan, moka_pot, mugs, butter, milk, ketchup, …).
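
As a rough sketch of the aggregation step only (not the MuJoCo measurement itself), the registry can be thought of as a per-class median of settled heights above the table. The function name, sample values, rounding, and JSON layout below are assumptions; the actual format of `data/spawn_clearances.json` is not shown in this PR.

```python
import json
import statistics


def build_clearance_registry(settled_z_by_class, table_z):
    """Median settled body-origin height above table_z, per class."""
    return {
        cls: round(statistics.median(z - table_z for z in zs), 4)
        for cls, zs in sorted(settled_z_by_class.items())
    }


# Hypothetical settled z samples (metres) for table-resting instances.
samples = {
    "bowl": [0.921, 0.919, 0.920],
    "plate": [0.911, 0.910, 0.930],  # one outlier; the median ignores it
}
registry = build_clearance_registry(samples, table_z=0.90)
print(json.dumps(registry, indent=2, sort_keys=True))
```

Using the median rather than the mean keeps a single badly settled instance from shifting the stored clearance.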

Tests

tests/test_spawn_z_consistency.py:

  • renderer-side z agrees with simulator _surface_spawn_z to within 1 mm for every measured class × surface (the core invariant; guards future divergence).
  • renderer emits the concrete resolved z and no LIBEROObject keeps the TABLE_Z placeholder.
  • TABLE_SURFACE_Z does not drift across asset_metadata / simulator.TABLE_Z / libero_model.scenic.
  • every SMOKE_TASKS movable class has a measured clearance.
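
The core invariant in the list above could be checked along these lines. This is a hedged sketch rather than the contents of `tests/test_spawn_z_consistency.py`: the helper names, clearances, bounding-box heights, and surface heights are all hypothetical, and the "legacy" function only imitates the old bbox h/2 model to show what a divergence looks like.

```python
import itertools

TOL_M = 0.001  # 1 mm agreement required between the two sides

# Hypothetical per-class data for illustration.
CLEARANCES = {"bowl": 0.02, "plate": 0.01}
BBOX_HEIGHTS = {"bowl": 0.06, "plate": 0.03}


def shared_spawn_z(surface_z, cls):
    """Stand-in for the shared helper used by renderer and simulator."""
    return surface_z + CLEARANCES[cls]


def legacy_bbox_spawn_z(surface_z, cls):
    """Stand-in for the old surface + bbox_height / 2 model."""
    return surface_z + BBOX_HEIGHTS[cls] / 2


def spawn_z_violations(renderer_z, simulator_z, classes, surfaces, tol=TOL_M):
    """Return (class, surface, delta) for every pair that disagrees by more than tol."""
    violations = []
    for cls, surface in itertools.product(classes, surfaces):
        delta = abs(renderer_z(surface, cls) - simulator_z(surface, cls))
        if delta > tol:
            violations.append((cls, surface, delta))
    return violations


surfaces = [0.41, 0.90]
print(spawn_z_violations(shared_spawn_z, shared_spawn_z, CLEARANCES, surfaces))
print(len(spawn_z_violations(shared_spawn_z, legacy_bbox_spawn_z, CLEARANCES, surfaces)))
```

When both sides call the same helper the violation list is empty; swapping one side for a different model makes every class and surface pair show up.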

PYTHONPATH=src .venv/bin/python -m pytest tests/ -k 'invariant or consistency or g4' -q → 156 passed. Settle/G5 suites (test_settle_correctness, test_g5_settle_drift_fixtures, test_simulator_param_consumers) → 48 passed (no regression from the simulator-side changes).

Smoke (non-scenic-only, through G6) — pose_tolerance

| Run | pose_tolerance True |
| --- | --- |
| before (per RCA) | 0 / 97 |
| after (quick, 3 conditions, 12 movables) | 11 / 12 (92%) |

Per-class after: akita_black_bowl 3T/0F, cream_cheese 3T/0F, plate 2T/1F, wine_bottle 3T/0F. The single False is post-settle xy drift on the plate — the secondary, asset-specific physics noise the RCA flagged as separate (not the z-frame defect this PR fixes). A full ≥20-condition run across {bowl, plate, bottle, cabinet+contained} × multiple subsets is in progress; numbers will be posted as a comment.

Scope / out of scope

  • Different-table-height suites (e.g. living-room tables ≈ 0.41 m) and elevated placements (object on a wine-rack/cabinet-top) are not in the kitchen smoke; their surface is not TABLE_Z, so those remain a known, separate limitation (the contained/elevated minority the RCA acknowledged). No tolerance widening, no try/except masking, no band-aids.

🤖 Generated with Claude Code

KE7 and others added 4 commits May 30, 2026 15:08
…_tolerance compares apples-to-apples (Option A)

The G4 family-C pose_tolerance invariant compared the Scenic-sampled object
pose against the post-reset MuJoCo pose, but the renderer emitted every object
at a bare TABLE_Z placeholder z while the simulator resolved the real settled
z. Every movable failed on an 8-18 cm z-frame mismatch (0/97 pose_tolerance
True). See rca/stage1_g4_consistency_pose_frame_mismatch.md (Option A).

Fix (Option A): a single shared pure helper resolves the spawn z; the renderer
calls it at codegen to emit a CONCRETE z, and the simulator delegates to it, so
both sides agree and the simulator's z-override becomes a no-op.

- asset_metadata.surface_spawn_z(surface_z, asset_class): the one source of
  truth. surface_z + per-class spawn clearance.
- Per-class clearance is the settled body-origin height above TABLE_Z, MEASURED
  from the authoritative LIBERO MuJoCo assets (not bbox h/2 -- the body origin
  is not the geometric centre and objects settle onto collision geometry).
  Stored in data/spawn_clearances.json; regenerate via
  scripts/measure_spawn_clearances.py. No hardcoded per-class constants.
- simulator._surface_spawn_z now delegates to the shared helper;
  _infer_root_surface_z inverts the same clearance so the round trip is
  consistent (inferred surface ~= TABLE_Z).
- renderer emits surface_spawn_z(TABLE_SURFACE_Z, class) as a concrete float for
  absolutely-placed LIBEROObjects (relative/contained placements unchanged).

Tests: tests/test_spawn_z_consistency.py asserts renderer-side and simulator
_surface_spawn_z agree <=1mm for every measured class/surface, the renderer
emits the concrete z (no TABLE_Z placeholder), and the TABLE_Z constant does not
drift across asset_metadata/simulator/scenic model.

Smoke (kitchen, through G6): pose_tolerance 0/97 -> 11/12 True on a 3-condition
quick run (full run in PR body).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…d diagnostics

The spawn-clearance fallback for classes absent from the measured registry
(object-axis OOD variants, distractor-pool objects) previously used bbox
height/2 — the exact model the z-frame RCA refuted, which under-places objects
~5-9 cm and re-introduces the pose_tolerance mismatch for every unmeasured
class. Replace it with DEFAULT_CLEARANCE = the median measured clearance
(~0.10 m), a data-derived prior in the correct table band. Measured classes are
unchanged.

Also adds scripts/measure_spawn_clearances coverage notes and the
smoke_categorized / diagnose_pose_failures diagnostics used to quantify
pose_tolerance by failure category (non-substituted vs object-axis variant vs
contained).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Isolates the z-frame fix's domain (excludes robot/distractor axes, which
physically displace objects post-settle — the separate settle-drift issue
documented as Finding B in the run2 RCA).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
KE7 (Owner, Author) commented on May 30, 2026

Categorized smoke results (shipped code)

Per-axis z verification (real envs): non-displacing subset position,lighting,texture,background,articulation → all movables Δ = 0.0 mm (x, y, z). The 8–18 cm z-frame error is gone.

SMOKE_TASKS × non-displacing subsets (6 subsets × 5 tasks = 30 conditions, 108 movables):

OVERALL: 55/108 (51%)
  non-substituted : 55/105 (52%)
  object-variant  :  0/3   (0%)   <- Finding A (runtime-sampled HARD/task-seated variants)

SMOKE_TASKS × random subsets (8, incl. robot/distractor; 40 conditions, 163 movables):

OVERALL: 76/163 (47%)
  subsets w/o robot/distractor : 15/18 (83%)   <- z-frame fix domain
  subsets w/  robot/distractor : 61/145 (42%)  <- Finding B settle-drift

Failure decomposition (per-axis Δ, bowl task):

  • object,robot,distractor,background — xy displacement 42–138 mm (z≈0): robot/distractor physically move objects.
  • position,robot,background — xy 45–262 mm (z≈0): position jitter + arm collision.
  • camera,distractor,background (old bbox fallback) — bowl/plate lifted +52 mm by mis-placed distractors; fixed by the median-clearance prior → 4/4 at 0.0 mm.
  • object substitution — pure-z ~50 mm when a HARD/task-seated variant is sampled.

Net: the z-frame defect is resolved; residual pose_tolerance failures are post-settle xy drift (Finding B) and object-axis variant z (Finding A), both escalated for an EA semantic decision.
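
One way to picture the decomposition is to bucket each movable by where its per-axis Δ lands. The sketch below is not the `diagnose_pose_failures` script; the function name, category labels, and the choice to apply the 5 mm threshold separately to xy and z are assumptions made for illustration.

```python
import math

TOL_M = 0.005  # 5 mm, assumed here to apply to xy and z separately


def classify_pose_delta(dx, dy, dz, tol=TOL_M):
    """Bucket a per-axis pose delta (metres) into a coarse failure category."""
    xy = math.hypot(dx, dy)
    z_ok = abs(dz) <= tol
    if xy <= tol and z_ok:
        return "pass"
    if z_ok:
        return "xy_drift"   # object moved in the plane, height unchanged
    if xy <= tol:
        return "z_offset"   # pure height mismatch
    return "mixed"


# Hypothetical deltas in the ranges reported above.
print(classify_pose_delta(0.042, 0.0, 0.0))
print(classify_pose_delta(0.0, 0.0, 0.050))
print(classify_pose_delta(0.0, 0.0, 0.0))
```

Under these assumptions the robot/distractor cases fall into the xy bucket and the substituted-variant case into the z bucket, matching the split between Finding B and Finding A.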

@KE7 KE7 merged commit 7447e5c into main May 30, 2026
3 checks passed
@KE7 KE7 deleted the fix/g4-pose-frame-option-a branch May 30, 2026 23:04