fix(validation/G4): emit simulator-resolved spawn z in Scenic so pose_tolerance compares apples-to-apples (Option A) by KE7 · Pull Request #23 · KE7/libero-infinity

KE7 · 2026-05-30T22:09:09Z

Problem

G4 family-C pose_tolerance compares the Scenic-sampled object pose against the post-reset MuJoCo pose, requiring agreement within 5 mm (validation plan §4). But the renderer emitted every object at a bare TABLE_Z placeholder z (renderer/scenic_renderer.py:401 "Use TABLE_Z as placeholder — simulator overrides with actual z") while the simulator resolved the real settled z. The two sides lived in different z-frames, so every movable failed on an 8–18 cm z mismatch → 0/97 pose_tolerance True, even though class_match and xy were correct.
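To make the failure mode concrete, here is a minimal sketch of a 5 mm pose check. The function name, the Euclidean-distance interpretation of the tolerance, and all numeric values (including the table height) are assumptions for illustration; the real invariant lives in the G4 validation code and may compare per-axis instead.

```python
import math

POSE_TOLERANCE_M = 0.005  # 5 mm, the tolerance quoted from validation plan §4


def pose_within_tolerance(sampled, settled, tol=POSE_TOLERANCE_M):
    """Return True when two (x, y, z) positions agree within `tol` metres.

    Assumption: the tolerance is applied to the Euclidean distance.
    """
    return math.dist(sampled, settled) <= tol


# Hypothetical values, chosen only to mirror the 8-18 cm mismatch described above.
TABLE_Z = 0.90                      # assumed table-surface height
settled = (0.10, -0.05, TABLE_Z + 0.12)   # pose after the simulator resolves z
placeholder = (0.10, -0.05, TABLE_Z)      # renderer emitting bare TABLE_Z
resolved = (0.10, -0.05, TABLE_Z + 0.12)  # renderer emitting the resolved z

placeholder_ok = pose_within_tolerance(placeholder, settled)
resolved_ok = pose_within_tolerance(resolved, settled)
```

With the placeholder z the check fails on z alone even though x and y match exactly, which is the pattern reported above (class_match and xy correct, pose_tolerance False).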

This is the 4th, independent defect surfaced after PR #22, root-caused in ~/.omar/ea/4/validation_run2/rca/stage1_g4_consistency_pose_frame_mismatch.md (Option A recommended).

Fix (Option A): one shared helper resolves spawn z; renderer emits it concretely

src/libero_infinity/asset_metadata.py — single source of truth:

surface_spawn_z(surface_z, asset_class) = surface_z + spawn_clearance(asset_class)

- Renderer calls it at codegen and emits the concrete float (surface_spawn_z(TABLE_SURFACE_Z, class)) for absolutely-placed LIBEROObjects: no symbolic call, no TABLE_Z placeholder. Relative/contained placements (offset by … 0.0) are unchanged (their z derives from the support relation, not the table surface).
- Simulator _surface_spawn_z now delegates to the same helper, and _infer_root_surface_z inverts the same clearance, so the round trip is consistent (inferred surface ≈ TABLE_Z). The simulator's z-override becomes a no-op for agreeing objects.
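The shared-helper arrangement can be sketched roughly as follows. Only the name and formula of surface_spawn_z come from this PR; the clearance values, the render_position function, and its output string are hypothetical stand-ins, not the actual renderer or simulator code.

```python
# Hypothetical clearances in metres; the real values are read from
# data/spawn_clearances.json.
_SPAWN_CLEARANCE = {"bowl": 0.03, "plate": 0.02}


def spawn_clearance(asset_class: str) -> float:
    """Settled body-origin height above the supporting surface."""
    return _SPAWN_CLEARANCE[asset_class]


def surface_spawn_z(surface_z: float, asset_class: str) -> float:
    """Single source of truth: surface height plus per-class clearance."""
    return surface_z + spawn_clearance(asset_class)


def infer_root_surface_z(body_z: float, asset_class: str) -> float:
    """Inverse of surface_spawn_z, so the round trip recovers the surface."""
    return body_z - spawn_clearance(asset_class)


def render_position(x: float, y: float, asset_class: str, table_surface_z: float) -> str:
    """Emit a concrete z at codegen time (illustrative output format only)."""
    z = surface_spawn_z(table_surface_z, asset_class)
    return f"at ({x:.4f}, {y:.4f}, {z:.4f})"


table_surface_z = 0.90  # assumed value for illustration
emitted = render_position(0.1, 0.2, "bowl", table_surface_z)
round_trip = infer_root_surface_z(surface_spawn_z(table_surface_z, "bowl"), "bowl")
```

Because both sides call the same function, the renderer's emitted z and the simulator's resolved z cannot diverge unless the registry itself changes between codegen and reset.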

Half-height plumbing (no hardcoded constants)

The old _surface_spawn_z used bbox_height / 2, which is wrong: the MuJoCo body origin is not the geometric centre, and objects settle onto collision geometry the bounding box doesn't capture (empirically surface + h/2 was 18–90 mm off the settled z). Analytical lowest-collision-point reconstruction was also unreliable (~20 mm error on scanned objects due to canonical-orientation effects).

So the per-class spawn clearance — the settled body-origin height above TABLE_Z — is measured from the authoritative LIBERO MuJoCo assets by scripts/measure_spawn_clearances.py (deterministic seed, median over many table-resting instances, kitchen-table frame) and stored in data/spawn_clearances.json (a generated registry, like the existing OBJECT_DIMENSIONS). Re-run the generator after asset upgrades. Classes absent from the registry fall back to the legacy bbox approximation so the helper stays total (e.g. unmeasured OOD object-axis variants).
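As a rough sketch of the measurement step, the per-class clearance can be computed as the median of (settled z minus table z) over several table-resting instances. The function name, sample values, and JSON layout below are assumptions; the actual scripts/measure_spawn_clearances.py and data/spawn_clearances.json may be structured differently.

```python
import json
import statistics


def build_clearance_registry(settled_z_by_class, table_z):
    """Map each class to the median settled body-origin height above table_z."""
    registry = {}
    for asset_class, settled_zs in sorted(settled_z_by_class.items()):
        clearances = [z - table_z for z in settled_zs]
        # The median is robust to an occasional badly settled instance.
        registry[asset_class] = round(statistics.median(clearances), 4)
    return registry


# Hypothetical settled z samples (metres), not measured data.
samples = {
    "bowl": [0.931, 0.930, 0.929],
    "plate": [0.921, 0.920, 0.940],  # one outlier, ignored by the median
}
registry = build_clearance_registry(samples, table_z=0.90)
registry_json = json.dumps(registry, indent=2, sort_keys=True)
```

Sorting the keys and rounding keeps the generated file stable across re-runs, which makes diffs after an asset upgrade easier to review.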

15 classes measured, covering all SMOKE_TASKS movables (bowl, plate, wine_bottle, cream_cheese, frypan, moka_pot, mugs, butter, milk, ketchup, …).

Tests

tests/test_spawn_z_consistency.py:

- renderer-side z and simulator _surface_spawn_z agree to within 1 mm for every measured class × surface (the core invariant; guards against future divergence).
- renderer emits the concrete resolved z and no LIBEROObject keeps the TABLE_Z placeholder.
- TABLE_SURFACE_Z does not drift across asset_metadata / simulator.TABLE_Z / libero_model.scenic.
- every SMOKE_TASKS movable class has a measured clearance.

PYTHONPATH=src .venv/bin/python -m pytest tests/ -k 'invariant or consistency or g4' -q → 156 passed. Settle/G5 suites (test_settle_correctness, test_g5_settle_drift_fixtures, test_simulator_param_consumers) → 48 passed (no regression from the simulator-side changes).

Smoke (non-scenic-only, through G6) — pose_tolerance

Run                                         pose_tolerance True
before (per RCA)                            0 / 97
after (quick, 3 conditions, 12 movables)    11 / 12 (92%)

Per-class after: akita_black_bowl 3T/0F, cream_cheese 3T/0F, plate 2T/1F, wine_bottle 3T/0F. The single False is post-settle xy drift on the plate — the secondary, asset-specific physics noise the RCA flagged as separate (not the z-frame defect this PR fixes). A full ≥20-condition run across {bowl, plate, bottle, cabinet+contained} × multiple subsets is in progress; numbers will be posted as a comment.

Scope / out of scope

Different-table-height suites (e.g. living-room tables ≈ 0.41 m) and elevated placements (object on a wine-rack/cabinet-top) are not in the kitchen smoke; their surface is not TABLE_Z, so those remain a known, separate limitation (the contained/elevated minority the RCA acknowledged). No tolerance widening, no try/except masking, no band-aids.

🤖 Generated with Claude Code

…_tolerance compares apples-to-apples (Option A)

The G4 family-C pose_tolerance invariant compared the Scenic-sampled object pose against the post-reset MuJoCo pose, but the renderer emitted every object at a bare TABLE_Z placeholder z while the simulator resolved the real settled z. Every movable failed on an 8-18 cm z-frame mismatch (0/97 pose_tolerance True). See rca/stage1_g4_consistency_pose_frame_mismatch.md (Option A).

Fix (Option A): a single shared pure helper resolves the spawn z; the renderer calls it at codegen to emit a CONCRETE z, and the simulator delegates to it, so both sides agree and the simulator's z-override becomes a no-op.

- asset_metadata.surface_spawn_z(surface_z, asset_class): the one source of truth. surface_z + per-class spawn clearance.
- Per-class clearance is the settled body-origin height above TABLE_Z, MEASURED from the authoritative LIBERO MuJoCo assets (not bbox h/2 -- the body origin is not the geometric centre and objects settle onto collision geometry). Stored in data/spawn_clearances.json; regenerate via scripts/measure_spawn_clearances.py. No hardcoded per-class constants.
- simulator._surface_spawn_z now delegates to the shared helper; _infer_root_surface_z inverts the same clearance so the round trip is consistent (inferred surface ~= TABLE_Z).
- renderer emits surface_spawn_z(TABLE_SURFACE_Z, class) as a concrete float for absolutely-placed LIBEROObjects (relative/contained placements unchanged).

Tests: tests/test_spawn_z_consistency.py asserts renderer-side and simulator _surface_spawn_z agree <=1mm for every measured class/surface, the renderer emits the concrete z (no TABLE_Z placeholder), and the TABLE_Z constant does not drift across asset_metadata/simulator/scenic model.

Smoke (kitchen, through G6): pose_tolerance 0/97 -> 11/12 True on a 3-condition quick run (full run in PR body).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…d diagnostics

The spawn-clearance fallback for classes absent from the measured registry (object-axis OOD variants, distractor-pool objects) previously used bbox height/2, the exact model the z-frame RCA refuted, which under-places objects ~5-9 cm and re-introduces the pose_tolerance mismatch for every unmeasured class. Replace it with DEFAULT_CLEARANCE = the median measured clearance (~0.10 m), a data-derived prior in the correct table band. Measured classes are unchanged.

Also adds scripts/measure_spawn_clearances coverage notes and the smoke_categorized / diagnose_pose_failures diagnostics used to quantify pose_tolerance by failure category (non-substituted vs object-axis variant vs contained).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
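A minimal sketch of the median-clearance fallback described in this commit, assuming the registry is a plain class-to-metres mapping. The factory function and the sample values are hypothetical; only the idea of DEFAULT_CLEARANCE being the median of the measured clearances comes from the commit message.

```python
import statistics


def make_spawn_clearance(measured):
    """Build a total lookup: measured classes use their own value,
    unmeasured classes fall back to the median measured clearance."""
    default_clearance = statistics.median(measured.values())

    def spawn_clearance(asset_class):
        return measured.get(asset_class, default_clearance)

    return spawn_clearance, default_clearance


# Hypothetical registry contents (metres), not the shipped data.
measured = {"class_a": 0.08, "class_b": 0.10, "class_c": 0.12}
spawn_clearance, DEFAULT_CLEARANCE = make_spawn_clearance(measured)

known = spawn_clearance("class_a")          # measured value is unchanged
unknown = spawn_clearance("unmeasured_ood")  # falls back to the median
```

The fallback keeps the helper total without reintroducing the bbox half-height model, at the cost of a coarse prior for classes whose true clearance is far from the median.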

Isolates the z-frame fix's domain (excludes robot/distractor axes, which physically displace objects post-settle: the separate settle-drift issue documented as Finding B in the run2 RCA).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

KE7 · 2026-05-30T22:38:38Z

Categorized smoke results (shipped code)

Per-axis z verification (real envs): non-displacing subset position,lighting,texture,background,articulation → all movables Δ = 0.0 mm (x, y, z). The 8–18 cm z-frame error is gone.

SMOKE_TASKS × non-displacing subsets (6 subsets × 5 tasks = 30 conditions, 108 movables):

OVERALL: 55/108 (51%)
  non-substituted : 55/105 (52%)
  object-variant  :  0/3   (0%)   <- Finding A (runtime-sampled HARD/task-seated variants)

SMOKE_TASKS × random subsets (8, incl. robot/distractor; 40 conditions, 163 movables):

OVERALL: 76/163 (47%)
  subsets w/o robot/distractor : 15/18 (83%)   <- z-frame fix domain
  subsets w/  robot/distractor : 61/145 (42%)  <- Finding B settle-drift

Failure decomposition (per-axis Δ, bowl task):

object,robot,distractor,background — xy displacement 42–138 mm (z≈0): robot/distractor physically move objects.
position,robot,background — xy 45–262 mm (z≈0): position jitter + arm collision.
camera,distractor,background (old bbox fallback) — bowl/plate lifted +52 mm by mis-placed distractors; fixed by the median-clearance prior → 4/4 at 0.0 mm.
object substitution — pure-z ~50 mm when a HARD/task-seated variant is sampled.
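The per-axis decomposition above can be approximated with a small classifier that separates xy drift from pure-z error. This is a sketch under assumptions: the category labels, the Euclidean pass criterion, and the reuse of the 5 mm tolerance per axis are illustrative and are not taken from the diagnose_pose_failures script.

```python
import math


def classify_pose_failure(sampled, settled, tol=0.005):
    """Label a pose comparison as pass, xy_drift, pure_z, or mixed.

    Assumption: `tol` (metres) is applied both to the total distance and
    to the xy / z components individually.
    """
    dx, dy, dz = (s - p for s, p in zip(settled, sampled))
    xy = math.hypot(dx, dy)
    if math.sqrt(dx * dx + dy * dy + dz * dz) <= tol:
        return "pass"
    if xy > tol and abs(dz) <= tol:
        return "xy_drift"   # pattern attributed to Finding B above
    if abs(dz) > tol and xy <= tol:
        return "pure_z"     # pattern attributed to Finding A above
    return "mixed"


# Hypothetical poses for illustration.
sampled = (0.0, 0.0, 0.90)
drifted = classify_pose_failure(sampled, (0.05, 0.0, 0.90))
lifted = classify_pose_failure(sampled, (0.0, 0.0, 0.95))
unchanged = classify_pose_failure(sampled, sampled)
```

Counting these labels per subset gives the kind of breakdown reported in the categorized results, with z-frame regressions showing up as pure_z rather than being hidden inside an overall pass rate.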

Net: the z-frame defect is resolved; residual pose_tolerance failures are post-settle xy drift (Finding B) and object-axis variant z (Finding A), both escalated for an EA semantic decision.

KE7 and others added 4 commits May 30, 2026 15:08

style: ruff/black format test_spawn_z_consistency

87b6de1

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

KE7 merged commit 7447e5c into main May 30, 2026
3 checks passed

KE7 deleted the fix/g4-pose-frame-option-a branch May 30, 2026 23:04

KE7 mentioned this pull request May 31, 2026

fix(scenic): close placement-clearance gaps — robot in require graph, distractor↔object AABB, per-(variant,surface) z #24

Merged

KE7 merged 4 commits into main from fix/g4-pose-frame-option-a
