Epic: shared helper module (displayxr-common) + unified capture API — kill the 6× Kooima/common drift #396

@dfattal

Why this epic

The same native helper code — above all the Kooima off-axis projection math (display3d_view.{c,h}, camera3d_view.{c,h}, leia_math.h) — is vendored by copy in at least six places across the org and is already drifting (the clip-plane fix in #393 had to be hand-applied to one copy with no way to sync the others). The #389 work (VK native compositor now reliably composites many window-space layers; new windowspace_handle_*_win test apps) turns the window-space-layer + HUD + input scaffolding into a genuinely reusable building block too. This epic makes the shared code a versioned dependency so editing-a-local-copy becomes structurally impossible.

#393 originally scoped this to the two C++ demos and excluded the engines. That was too narrow: the engine plugins compile the same math C files. This epic supersedes that framing; #393 is re-scoped to be layer 2 (the C++ scaffolding) of the design below.

Drift inventory — every current copy

| Copy location | Repo | Form | Notes |
|---|---|---|---|
| test_apps/common/ | displayxr-runtime | C/C++ | this repo's native test apps |
| common/ | displayxr-demo-modelviewer | C/C++ | Vulkan [0,1] projection fix landed here |
| common/ | displayxr-demo-gaussiansplat | C/C++ | + unmerged feat/zdp-clip-soft-fade-pick WIP (near/far, selftest, center_view) |
| include/ | kooima-projection | C | the intended math home; seeded, then abandoned (last push 2026-05-04) |
| Source/DisplayXRCore/Private/Native/ | displayxr-unreal | C | compiled into the UE module (see ADR-003 UE-native off-axis projection) |
| native~/ (display3d_view.h, displayxr_kooima.{h,cpp}) | displayxr-unity | C++ → P/Invoke | wraps the same Kooima math (see ADR-006 window-relative Kooima) |

~20 of the ~28 common/ files are byte-identical between the two demos; nearly all divergence is accidental.

Architecture — one repo, two CMake targets (no new repo)

Repurpose the dormant kooima-projection repo → rename to displayxr-common, exposing two targets so the math core is shared by everything while engines never link the C++/Win32 scaffolding:

displayxr-common   (renamed from kooima-projection — net new repos: 0)
  ├─ displayxr::math     pure C: display3d_view, camera3d_view, leia_math — ZERO deps
  │     └─ linked by: displayxr-unreal (UE module), displayxr-unity (native~)   ← engines, math only
  └─ displayxr::common   C++ scaffolding (HUD/D2D, input/Win32, window mgr,
        xr_session_common, window-space-layer UI, stb, view_params,
        manifest cmake) — depends on displayxr::math
        (NOTE: atlas/screenshot capture is NOT shared here — it is removed in
         favour of a runtime API; see W6.)
        └─ linked by: runtime test apps, modelviewer, gaussiansplat            ← C++ apps

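Consumed via CMake FetchContent pinned to a tag (same pattern already used for tinygltf/glm/OpenXR-loader). Local co-dev via FETCHCONTENT_SOURCE_DIR_DISPLAYXRCOMMON=../displayxr-common; that variable name assumes the dependency is declared to FetchContent as displayxrcommon, because CMake builds the suffix by upper-casing the declared name.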

Why one repo, not two: FetchContent clones the whole repo regardless of target, so a separate math repo wouldn't make engines pull less source — the real link-level isolation comes from the displayxr::math target carrying no transitive Win32/D2D deps. A second repo would only earn its keep if the math needed an independent release cadence or had an external (non-DisplayXR) consumer; neither holds today. One repo = one CI, one tag, one branch-protection/CODEOWNERS/versions.json entry.

Divergence policy: mechanism in the lib, policy at the call site. Parameterize instead of forking: for example, display3d_compute_view(..., near_offset, far_offset, ...) outputs both the projection matrix and the resolved near_z/far_z, so modelviewer can rely on hardware clipping while gauss feeds the same values to its software cull. Inject renderer-specific bits via callbacks, and keep genuinely app-local files app-local (e.g. stb_image_impl_macos.cpp). Never #ifdef MODELVIEWER / #ifdef GAUSS inside the lib.
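
The "one function, both outputs" idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the real display3d_compute_view signature (which is elided above): sketch_compute_view, sketch_view_out and sketch_depth_visible are hypothetical names, near_offset/far_offset are assumed to be distances from the display plane toward and away from the viewer, and no Vulkan Y flip is applied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical output bundle: one call yields both clip representations. */
typedef struct {
    float proj[16]; /* column-major, right-handed view space, depth in [0,1] */
    float near_z;   /* resolved near distance from the eye */
    float far_z;    /* resolved far distance from the eye */
} sketch_view_out;

/*
 * eye: position relative to the display centre; the display plane is z = 0
 *      and the viewer sits at eye[2] > 0.
 * near_offset / far_offset: ASSUMED to be distances from the display plane
 *      toward and away from the viewer.
 * Returns false when the inputs do not describe a valid frustum.
 */
bool sketch_compute_view(const float eye[3], float width, float height,
                         float near_offset, float far_offset,
                         sketch_view_out *out)
{
    assert(eye != NULL && out != NULL);

    const float d = eye[2];           /* eye-to-plane distance */
    const float n = d - near_offset;  /* resolved near distance */
    const float f = d + far_offset;   /* resolved far distance */
    if (width <= 0.0f || height <= 0.0f || d <= 0.0f || n <= 0.0f || f <= n)
        return false;

    /* Project the display edges onto the near plane (off-axis frustum). */
    const float s = n / d;
    const float l = (-0.5f * width - eye[0]) * s;
    const float r = (0.5f * width - eye[0]) * s;
    const float b = (-0.5f * height - eye[1]) * s;
    const float t = (0.5f * height - eye[1]) * s;

    memset(out->proj, 0, sizeof out->proj);
    out->proj[0] = 2.0f * n / (r - l);
    out->proj[5] = 2.0f * n / (t - b);
    out->proj[8] = (r + l) / (r - l);
    out->proj[9] = (t + b) / (t - b);
    out->proj[10] = f / (n - f);         /* maps [n, f] to depth [0, 1] */
    out->proj[11] = -1.0f;
    out->proj[14] = -(f * n) / (f - n);
    out->near_z = n;
    out->far_z = f;
    return true;
}

/* Call-site policy: a software renderer culls with the resolved range. */
bool sketch_depth_visible(const sketch_view_out *v, float eye_depth)
{
    assert(v != NULL);
    return eye_depth >= v->near_z && eye_depth <= v->far_z;
}
```

A hardware-clipping consumer would upload proj and ignore near_z/far_z, while a software-culling consumer would call something like sketch_depth_visible per primitive. Both read the same struct, so neither needs a fork of the math.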

Workstreams & ordering

W1 — Reconcile the math core (blocking prerequisite; highest drift risk).
Land gauss's feat/zdp-clip-soft-fade-pick, then unify display3d_view.{c,h} + camera3d_view.{c,h} into one superset: Vulkan [0,1] projection + near_offset/far_offset API + near_z/far_z outputs + Unity's window-relative Kooima (ADR-006) + any Unreal delta. Validate against each consumer before landing.

W2 — Stand up displayxr-common. Rename kooima-projection; add the displayxr::math target seeded from the reconciled core; add Win+Mac CI on a tiny consumer harness; tag v0.1.0 (math only).

W3 — Adopt the math core everywhere (engines included). Replace each vendored math copy with the pin:

  • displayxr-runtime test apps (*_handle_*_win, windowspace_handle_*_win)
  • displayxr-demo-modelviewer
  • displayxr-demo-gaussiansplat
  • displayxr-unreal (UE Build.cs references the fetched C core instead of Private/Native/)
  • displayxr-unity (native~ build compiles the fetched C core)

W4 — Extract the C++ scaffolding layer (this is re-scoped #393). Reconcile hud/input/xr_session deltas; add the displayxr::common target; migrate the three C++ consumers (delete each common/, add FetchContent). Tag v0.2.0.

W5 — Release discipline. displayxr-common gets its own tags; consumers bump the pin on their own cadence (same as the runtime pin today). Optional CI drift-guard: hash check if any app ever re-vendors a mirror.

W6 — Replace app-side capture with a runtime API (supersedes sharing atlas_capture via the lib).
The I-key / Ctrl+Shift+C screenshot is reimplemented 6× today — 5 per-API readbacks in test_apps/common/atlas_capture_{d3d11,d3d12,gl,vk,metal} (forked byte-for-byte into both demos' common/), plus Unity Runtime/DisplayXRScreenshot.cs and Unreal DisplayXRAtlasCapture.cpp. Rather than fold atlas_capture into displayxr::common, the runtime gains one official xrCaptureAtlasEXT (new XR_EXT_atlas_capture, PROJECTION_ONLY / POST_COMPOSE flag) and every app deletes its readback. Only the platform flash-overlay + filename-numbering UX stays app-local. Full design + per-repo deletion list: docs/roadmap/unified-atlas-capture.md.

Scope note: W6 expands this epic beyond the original Kooima/common math dedup — it adds a runtime OpenXR extension and engine capture reimplementation (the engines' capture code was never in the math drift inventory). Folded here per the shared common/ surface and migration coordination; the runtime extension itself is independently releasable and need not gate on W1–W5.

W7 — RAW inputs / rig generators / render-ready output (spike → design → implement).
A design-first workstream that reshapes how view geometry is served, and changes what displayxr::math is for. Full brainstorm + decisions to be written up as docs/roadmap/raw-vs-render-ready-views.md (next session); model summary:

  • Output is a fixed point: render-ready = standard XrView { pose, skewed XrFovf }, already converged on the plane — rig-agnostic. Legacy/non-aware apps and the runtime's weaver consume exactly this.
  • Rig is an input-side generator, not an output field. Display-centric and camera-centric are two paradigms that route raw inputs → the same output shape. They live in displayxr::math (re-roling W1: display3d_view = display-centric rig, camera3d_view = camera-centric rig — two intentional generators, not two drifting copies). Future rigs = new generators, no API/output change.
  • Raw = the generator inputs: eye positions, display-plane pose, effective canvas rect (handle = window, texture = subrect — no app-class branching), timestamp_ns, is_tracking. The escape hatch for any future modifier (IPD/parallax/convergence/ortho/clip) the runtime can't anticipate — promise complete inputs + a stable output, predict nothing.
  • Runtime links displayxr::math to compute render-ready, so "runtime's render-ready" ≡ "aware app's own computation from raw under the default rig" by construction — kills the "runtime view ≠ my view" bug class.

Session findings (code as of 2026-06-02 — verified, with file:line):

  • The runtime IS the suspected 7th copy. src/xrt/auxiliary/math/m_camera3d_view.{c,h} + m_display3d_view.{c,h} + m_multiview.{c,h} are FOV-only xrt-typed ports of app-side test_apps/common/camera3d_view.c / display3d_view.c (header: "Runtime-side port (xrt types, FOV-only — no matrices)"); consumed by oxr_session.c (#include "math/m_camera3d_view.h" :42). Hand-synced → real drift. ⇒ displayxr::math must expose two layers: FOV-only (runtime consumer) and FOV+matrix (app consumers).
  • Render-ready already ships — as a full TWO-rig system with live modifiers, default camera-centric @ 0.5 D (CORRECTED — the earlier "always display-centric" note was wrong). oxr_session.c:1384-1426 branches on view_state.camera_mode: camera3d_compute_views() (camera-centric) vs display3d_compute_views() (display-centric), mirrored server-side at ipc_server_handler.c:~504. Default is camera-centric, cam_convergence = 0.5 D / 2 m (qwerty_device.c:642,647) — i.e. the intended legacy default already holds. External-window apps (handle/texture, real HWND) are forced display-centric via the !sess->has_external_window gate (window = portal). WebXR-over-IPC gets render-ready (bridge runs on d3d11_service, so the normal path — not the narrow xc==NULL headless raw fast-path at ipc_server_handler.c:415-435).
  • The rig + tunables are driven ONLY by the qwerty debug device, not an app API; that is the actual W7 gap. qwerty_view_state (camera_mode + cam/disp spread/parallax/convergence, qwerty_device.h:51-60) is read by the compositors (comp_gl:872, comp_metal:1238, comp_multi_system:1925) and fed into the view computation. Rig toggle = P key (qwerty_win32.c:546-549 / qwerty_macos.m:431-434 → qwerty_toggle_camera_mode, qwerty_device.c:1066); convergence/spread on other keys. Apps cannot select rig or set convergence/IPD through OpenXR today.
  • Measured-vs-predicted is already plumbed at the DP. Single accessor xrt_display_processor::get_predicted_eye_positions() (xrt_display_processor.h:145) → xrt_eye_positions already carries timestamp_ns + is_tracking + valid (xrt_display_metrics.h:51-60); MANAGED/MANUAL (docs/specs/vendor/eye-tracking-modes.md) is the predict-vs-passthrough knob. ⇒ no new DP accessor; just surface timestamp_ns/is_tracking into the raw channel (stops at the compositor today).

Reframed W7 goal: render-ready + two rigs + modifiers + camera-centric-0.5D default already exist and are correct — but controllable only via the qwerty debug keyboard. W7 = promote that control to an app-facing OpenXR extension (app selects rig + sets convergence/spread/parallax via XrViewLocateRigEXT), add a raw channel for aware apps that bring their own generator, and keep qwerty as the dev/default driver. Not "add render-ready."

Open decisions for the design doc:

  • First impl-session check: do cube_handle_* (external-window ⇒ forced display-centric) actually consume XrView.fov as returned by xrLocateViews, or do they recompute it from .pose? This decides whether the raw channel is new vs "let aware apps opt out of the fov we already overwrite."
  • Default is settled (camera-centric @ 0.5 D, already in code). Open: should external-window apps stay force-display-centric, or be allowed to opt into a rig via the new extension?
  • API shape: request chains XrViewLocateRigEXT { rigModel, convergenceInvDiopters, spread/parallax… } (absent ⇒ current qwerty/default behavior); result chains XrViewDisplayRawEXT { eyePosDisplay[N], displayPlanePose, canvasRect, sampleTimeNs, trackingValid }. One xrLocateViews call, both channels, rig-selectable. The extension and the qwerty device write the same view_state.
  • Make the runtime a displayxr::math consumer (fold m_camera3d_view/m_display3d_view/m_multiview into the lib) — adds an inventory entry W1–W4 didn't list.
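
The "absent ⇒ current default" rule for the request chain can be sketched with OpenXR's usual type/next chaining pattern. Everything here is a stand-in: the sketch_* types and enum values are hypothetical and deliberately do not reuse real OpenXR identifiers, only rigModel and convergenceInvDiopters come from the API-shape bullet above, and reading convergenceInvDiopters as metres (so that the 0.5 D default is 2.0) is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure tags; real values would come from the extension. */
typedef enum {
    SKETCH_TYPE_UNKNOWN = 0,
    SKETCH_TYPE_VIEW_LOCATE_INFO = 1,
    SKETCH_TYPE_VIEW_LOCATE_RIG = 2
} sketch_structure_type;

/* Common prefix shared by every chainable struct (type, then next). */
typedef struct {
    sketch_structure_type type;
    const void *next;
} sketch_base_header;

typedef enum {
    SKETCH_RIG_CAMERA_CENTRIC = 0,
    SKETCH_RIG_DISPLAY_CENTRIC = 1
} sketch_rig_model;

/* Stand-in for the proposed XrViewLocateRigEXT request struct. */
typedef struct {
    sketch_structure_type type;
    const void *next;
    sketch_rig_model rigModel;
    float convergenceInvDiopters; /* ASSUMED to be metres (1 / dioptres) */
} sketch_view_locate_rig;

/* Walk the next chain. If no rig struct is present, fall back to the
 * default described in the findings: camera-centric at 0.5 D (2 m). */
void sketch_resolve_rig(const void *chain, sketch_view_locate_rig *out)
{
    assert(out != NULL);

    out->type = SKETCH_TYPE_VIEW_LOCATE_RIG;
    out->next = NULL;
    out->rigModel = SKETCH_RIG_CAMERA_CENTRIC;
    out->convergenceInvDiopters = 2.0f;

    for (const sketch_base_header *h = chain; h != NULL; h = h->next) {
        if (h->type == SKETCH_TYPE_VIEW_LOCATE_RIG) {
            const sketch_view_locate_rig *rig =
                (const sketch_view_locate_rig *)h;
            out->rigModel = rig->rigModel;
            out->convergenceInvDiopters = rig->convergenceInvDiopters;
            return;
        }
    }
}
```

The resolved struct would then be written into the same view_state the qwerty device writes, so an app that chains nothing sees today's behavior unchanged.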

Status: spike/design — do not implement before the design doc lands. Independent of W1–W6 mechanically, but shares displayxr::math ownership with W1.

Per-repo touch list (incl. native-build changes)

| Repo | Change | Native build impact |
|---|---|---|
| kooima-projection → displayxr-common | rename, two targets, CI, tags | |
| displayxr-runtime test apps | consume math + common pin | CMake only |
| displayxr-demo-modelviewer | delete common/, pin | CMake only |
| displayxr-demo-gaussiansplat | land WIP, delete common/, pin | CMake only |
| displayxr-unreal | drop Private/Native/ math, pin + W6: drop DisplayXRAtlasCapture readback | Build.cs |
| displayxr-unity | drop native~ math copy, pin + W6: gut DisplayXRScreenshot.cs readback | native~ CMake / build-win.bat |
| displayxr-runtime | W6: XR_EXT_atlas_capture + delete atlas_capture_* | runtime + CMake |
| displayxr-extensions | W6: header auto-sync | |
| displayxr-runtime | W7: fold m_multiview/m_display3d_view into displayxr::math (runtime becomes a math consumer); add raw-channel + rig-select to xrLocateViews | runtime + CMake |

Decisions taken

  • One repo, two CMake targets (not two repos) — avoids proliferation; link isolation via targets.
  • Repurpose existing kooima-projection (net zero new repos); it already holds the math files.
  • FetchContent-by-tag; independent per-consumer cadence preserved.
  • Capture is removed, not shared: the atlas_capture family is deleted in favour of a runtime xrCaptureAtlasEXT (W6), so it is intentionally excluded from displayxr::common.
  • The runtime is the 7th math consumer (W7): m_multiview/m_display3d_view are FOV-only ports of the app-side generators; fold into displayxr::math so render-ready ≡ app-from-raw by construction. display3d_view/camera3d_view are the two canonical rig generators, not drifting copies.

Open questions

  • Exact lib surface: which files are displayxr::common vs app-local (stb impl TU, platform glue).
  • Versioning: tag-per-change vs periodic; whether demo/engine releases gate on a minimum displayxr-common version (as they already do for the runtime).
  • Engine math consumption mechanics: FetchContent vs git-subtree of the C files for UE/Unity native builds (engines don't use top-level CMake the same way).

Related: #393 (re-scoped to W4), #389 (window-space-layer building block), docs/roadmap/unified-atlas-capture.md (W6 capture-API design + per-repo deletion list), docs/roadmap/raw-vs-render-ready-views.md (W7 view-model design — to be written next session), displayxr-unity ADR-006, displayxr-unreal ADR-003.
