Phased optimization roadmap (Phases 1–9): diagnostics, route contracts, M12 verification, binding/command contracts, FNA reliability, release gates #204

Merged: aaf2tbz merged 13 commits into main from codex/phased-optimization-roadmap on Jun 14, 2026
@aaf2tbz (Owner) commented Jun 14, 2026

Summary

Implements the full 9-phase MetalSharp Phased Optimization Roadmap
(docs/optimization-roadmap/) as one body of work. Each phase is a separate
commit whose proof gate is green, with M9/M10/M11 launch behavior and
artifact paths untouched.

Baseline before any work: 502 Rust tests passed, 0 failed.
Final: 579 Rust tests passed (+77), 0 failed, clippy + fmt clean,
validate-contracts.py → [PASS] (8), validate-probe-matrix.py → [PASS] (18).

The guiding principle is "moving the right work to Metal": bring
high-frequency graphics execution and translation products closer to durable
Metal objects, while leaving compatibility orchestration in Rust/Wine/DXMT.

Per-phase

| Phase | Commit | What landed |
|---|---|---|
| 1: Baseline observability | 9ebbfcb | diagnostics.rs stable launch JSON, LaunchTiming checkpoints, injection sha256, scan timing |
| 2: Bottle/route contract | 7f38675 | SteamRouteContract table, M11/M12 passive-refresh preservation, migration preserve/skip report |
| 3: M12 artifact verifier | cd11bdd | m12_verify_dry_run + pipeline_dry_run_for (same env builder as launch, no spawn) |
| 4: Shader/PSO/cache | 3f77c7c | read-only cache doctor, PsoDiagnosticManifest schema, M12 cache isolation |
| 5: Descriptor binding | 86ba2bf | binding_contract root-signature + reflection-ABI validator (Metal direct-binding limits) |
| 6: Command replay/barriers | abe432b | command_contract command-list/visibility/render-pass validator |
| 7: Runtime/migration cleanup | e4e2e14 | per-artifact report, WinebootState, stop-Wine-Steam target report |
| 8: Mono/FNA/XNA | 2220b4c | fna_profile signals, AssetReceipt staging, profile-explain, conservative classifier |
| 9: Release gates/docs | ab76f00 | local-gates.md, release-checklist.md, ci-gating-notes.md, roadmap README |

Scope decisions (honest)

  • vendor/dxmt is reference source, NOT compiled by this repo's CMake
    build (the shipped DXMT runtime is prebuilt under lib/). Phases 4/5/6
    therefore land the Rust-side contract models and validators that mirror
    what the existing SDK probes/audits parse, without touching shader-lowering
    or command-list semantics. Those typed validators are unit-tested in CI and
    also exposed as live introspection routes.
  • No readiness/launch logic was changed: stop_wine_steam,
    should_wait_for_prefix_idle, dxmt_runtime_ready, find_fna_profile,
    deploy_fna_assemblies are all untouched. Phase 7/8 add observational
    reports and explicit state enums only.
  • No new test mutates the process-global METALSHARP_HOME; all new
    diagnostics/migration tests use explicit-home (_for) variants for parallel
    safety.

New backend routes (14, all read-only)

GET  /diagnostics/launch                     (P1)
GET  /diagnostics/launch/timing              (P1)
GET  /bottles/route-contracts                (P2)
GET  /update/migrate/report                  (P2)
GET  /diagnostics/m12/dry-run                (P3)
GET  /diagnostics/pipeline/dry-run           (P3)
GET  /diagnostics/cache-doctor               (P4)
GET  /diagnostics/pso-manifests              (P4)
POST /diagnostics/binding-contract/validate  (P5)
POST /diagnostics/command-replay/validate    (P6)
GET  /diagnostics/runtime-artifacts          (P7)
GET  /diagnostics/wineboot-state             (P7)
GET  /steam/stop-targets                     (P7)
GET  /diagnostics/fna/signals|explain|classify (P8)

Proof gates

cargo fmt --all -- --check                       # clean
cargo clippy --all-targets -- -D warnings        # clean
cargo build --release                            # ok
cargo test                                       # 579 passed, 0 failed
python3 tools/d3d12-metal-sdk/scripts/validate-contracts.py    # [PASS] 8
python3 tools/d3d12-metal-sdk/scripts/validate-probe-matrix.py # [PASS] 18

Full per-phase detail, including new-test counts and boundary checks, is in
docs/optimization-roadmap/PR-SUMMARY.md.

Draft until reviewed. No M9/M10/M11 launch behavior or artifact path changed.

aaf2tbz and others added 13 commits June 14, 2026 01:02
Add a stable launch diagnostic surface (schema_version 1) that reports the
resolved pipeline, runtime profile, wine binary, prefix, artifact sources
with sha256 hashes, staged DLL hashes, and shader cache directories. Missing
required artifacts now produce a structured failure instead of a silent
fallback.

Thread LaunchTiming checkpoints through prepare_pipeline (pipeline
resolution, recipe build, runtime validation, DLL staging, prefix deploy)
and persist atomically per-bottle. Add Steam library and full-scan timing
persistence. Record sha256 + source sha256 in the injections manifest so
staged-vs-source drift is observable.
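
A minimal sketch of how checkpoint timing and staged-vs-source drift detection could look. The type `LaunchTimingSketch`, its fields, and `has_drift` are illustrative assumptions, not the definitions in this PR; only the checkpoint names come from the text above.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for `LaunchTiming`; the real field layout is not
/// shown in the PR text.
#[derive(Debug)]
pub struct LaunchTimingSketch {
    started: Instant,
    last: Instant,
    /// (checkpoint name, time spent since the previous checkpoint)
    pub checkpoints: Vec<(&'static str, Duration)>,
}

impl LaunchTimingSketch {
    pub fn start() -> Self {
        let now = Instant::now();
        Self { started: now, last: now, checkpoints: Vec::new() }
    }

    /// Record the time elapsed since the previous checkpoint under `name`.
    pub fn checkpoint(&mut self, name: &'static str) {
        let now = Instant::now();
        self.checkpoints.push((name, now.duration_since(self.last)));
        self.last = now;
    }

    /// Total wall-clock time since `start()`.
    pub fn total(&self) -> Duration {
        self.started.elapsed()
    }
}

/// Drift exists when the staged file's hash no longer matches the hash of
/// the source it was copied from (hex strings compared case-insensitively).
pub fn has_drift(staged_sha256: &str, source_sha256: &str) -> bool {
    !staged_sha256.eq_ignore_ascii_case(source_sha256)
}
```

In this sketch a caller would invoke `checkpoint("pipeline_resolution")`, `checkpoint("recipe_build")`, and so on at the end of each stage, then serialize `checkpoints` per bottle.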

New routes (no existing behavior changed):
  GET /diagnostics/launch?appid=&pipeline=
  GET /diagnostics/launch/timing?appid=

M9/M10/M11 artifact paths and launch behavior are untouched.

Tests: 513 passed, 0 failed (was 502; +11 new).
clippy + fmt clean.

Add a declarative Steam route contract (SteamRouteContract) that codifies,
per pipeline, the runtime profile, steam identity mode, launch route,
requires_wine, shared Steam prefix binding, prefix-idle wait policy, compat
tool name, and appid-scoped bottle id template. The contract is derived from
the same primitives the runtime uses so it cannot drift from launch behavior.
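
One way to read "derived from the same primitives" is that the contract table never stores its own copy of a policy; it calls the functions the launch path calls. The sketch below assumes that reading. Every name and value in it (`requires_wine`, `waits_for_prefix_idle`, the `steam-{appid}` template) is a placeholder chosen for illustration, not the PR's actual data.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Pipeline {
    M11,
    M12,
}

// Hypothetical shared primitives. In this sketch both the launch path and
// the contract table call these, so neither holds a separate answer.
pub fn requires_wine(_pipeline: Pipeline) -> bool {
    true // placeholder value
}

pub fn waits_for_prefix_idle(pipeline: Pipeline) -> bool {
    matches!(pipeline, Pipeline::M12) // placeholder policy
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RouteContractSketch {
    pub pipeline: Pipeline,
    pub requires_wine: bool,
    pub waits_for_prefix_idle: bool,
    pub bottle_id_template: &'static str,
}

/// Build one contract row by querying the shared primitives.
pub fn contract_for(pipeline: Pipeline) -> RouteContractSketch {
    RouteContractSketch {
        pipeline,
        requires_wine: requires_wine(pipeline),
        waits_for_prefix_idle: waits_for_prefix_idle(pipeline),
        bottle_id_template: "steam-{appid}", // placeholder template
    }
}

impl RouteContractSketch {
    /// Render the appid-scoped bottle id from the template.
    pub fn bottle_id(&self, appid: u32) -> String {
        self.bottle_id_template.replace("{appid}", &appid.to_string())
    }
}

/// The full table, one row per pipeline in this reduced enum.
pub fn route_contracts() -> Vec<RouteContractSketch> {
    [Pipeline::M11, Pipeline::M12]
        .iter()
        .copied()
        .map(contract_for)
        .collect()
}
```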

Add passive-refresh preservation tests for M11 and M12 (the M9 case already
existed), a data-driven route-contract test covering M9/M10/M11/M12/M13/
FnaArm64/WineBare/D3DMetal, and a deploy_steam_appid staging test.

Add a migration preserve/skip report (migrate::MigrationReport) that records
every preserved and skipped category with a reason during preserve/restore,
persisted atomically to logs/migration-report-latest.json. This is purely
observational — it does not change what is preserved.

New routes: GET /bottles/route-contracts, GET /update/migrate/report.

No new test mutates the process-global METALSHARP_HOME; all new diagnostics
and migration tests use explicit-home (_for) variants for parallel safety.

Tests: 521 passed, 0 failed (3 consecutive runs). clippy + fmt clean.
M9/M10/M11 launch behavior and artifact paths unchanged.

Add m12_verify_dry_run / pipeline_dry_run_for: a read-only verifier that runs
through the same environment builder (steam_pipeline_env_pairs) as
launch_dxmt_metal. It reports, without launching Steam or the game:

- the resolved lib/dxmt-m12/x86_64-windows dir + each deploy DLL with presence,
  sha256, and size
- the lib/dxmt-m12/x86_64-unix sidecars (winemetal.so, libc++.1.dylib,
  libc++abi.1.dylib, libunwind.1.dylib) for the M12/M13 lane
- the exact env pairs the launch path sets, with an env_keys_present map for
  WINEDLLOVERRIDES, DXMT_SHADER_CACHE_PATH, DYLD_FALLBACK_LIBRARY_PATH,
  SteamAppId, DXMT_WINEMETAL_UNIXLIB
- missing required artifacts as a structured ok:false + missing[] array
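
The report shape described in the list above could be assembled roughly as follows. This is a sketch under assumptions: `DryRunSketch` and `dry_run` are hypothetical names, and artifact presence is passed in as a boolean rather than checked on disk. The tracked env keys are the five named above.

```rust
use std::collections::BTreeMap;

/// Env keys the dry run reports on, as listed in the commit message.
pub const TRACKED_ENV_KEYS: [&str; 5] = [
    "WINEDLLOVERRIDES",
    "DXMT_SHADER_CACHE_PATH",
    "DYLD_FALLBACK_LIBRARY_PATH",
    "SteamAppId",
    "DXMT_WINEMETAL_UNIXLIB",
];

/// Hypothetical report type; the real schema is not shown in the PR text.
#[derive(Debug)]
pub struct DryRunSketch {
    pub ok: bool,
    pub missing: Vec<String>,
    pub env_keys_present: BTreeMap<&'static str, bool>,
}

/// `env_pairs` is what the env builder produced; `artifacts` pairs each
/// required artifact name with whether it was found.
pub fn dry_run(env_pairs: &[(String, String)], artifacts: &[(&str, bool)]) -> DryRunSketch {
    let env_keys_present: BTreeMap<&'static str, bool> = TRACKED_ENV_KEYS
        .iter()
        .map(|key| (*key, env_pairs.iter().any(|(name, _)| name.as_str() == *key)))
        .collect();

    let missing: Vec<String> = artifacts
        .iter()
        .filter(|(_, present)| !*present)
        .map(|(name, _)| (*name).to_string())
        .collect();

    // A structured failure: ok is false exactly when something is missing.
    DryRunSketch { ok: missing.is_empty(), missing, env_keys_present }
}
```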

New routes: GET /diagnostics/m12/dry-run, GET /diagnostics/pipeline/dry-run.

Contract tests: M12 deploys d3d12 from lib/dxmt-m12; M11 excludes d3d12 and
never touches lib/dxmt-m12; M12 dry-run includes d3d12 and M11 does not; M12
dry-run verifies unix sidecars and flags missing artifacts; M12 env sets
winemetal overrides and the isolated m12 shader cache.

docs/architecture/m12-pipeline-map.md now documents the verifier and marks
stability gap #1 (first-class M12 runtime verification) as addressed.

Tests: 526 passed, 0 failed. clippy + fmt clean.
M9/M10/M11 deploy lists and artifact paths unchanged. Verifier is read-only.

vendor/dxmt is reference source NOT compiled by this repo's CMake build (the
shipped DXMT runtime is prebuilt under lib/). Phase 4 therefore lands the
Rust-side observability layer that parses the runtime's on-disk products
without touching shader lowering semantics.

Add shader_cache::cache_doctor (read-only) that reports cache roots, per-DB
size/mtime, total cache_* entry count, newest/oldest mtime, the staged runtime
DLL sha256 (from injections.json) for staleness detection, and a stale_warning
when entries exist without a recorded hash.

Add shader_cache_family / primary_cache_subdir codifying the isolation
contract: M9/M10/M11 share dxmt-metal, M12/M13 use the isolated dxmt-metal12.
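
The isolation contract reduces to a two-way mapping. A minimal sketch, assuming the signatures below (the function names appear in the commit message, but their real parameter and return types are not shown there, and the `Pipeline` and `ShaderCacheFamily` enums are illustrative):

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Pipeline {
    M9,
    M10,
    M11,
    M12,
    M13,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ShaderCacheFamily {
    DxmtMetal,
    DxmtMetal12,
}

/// M9/M10/M11 share one family; M12/M13 use the isolated one.
pub fn shader_cache_family(pipeline: Pipeline) -> ShaderCacheFamily {
    match pipeline {
        Pipeline::M9 | Pipeline::M10 | Pipeline::M11 => ShaderCacheFamily::DxmtMetal,
        Pipeline::M12 | Pipeline::M13 => ShaderCacheFamily::DxmtMetal12,
    }
}

/// Directory name for the family, as named in the commit message.
pub fn primary_cache_subdir(pipeline: Pipeline) -> &'static str {
    match shader_cache_family(pipeline) {
        ShaderCacheFamily::DxmtMetal => "dxmt-metal",
        ShaderCacheFamily::DxmtMetal12 => "dxmt-metal12",
    }
}
```

Using an exhaustive `match` means adding a pipeline variant fails to compile until its cache family is decided, which is one way to keep the contract explicit.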

Add PsoDiagnosticManifest: the stable JSON schema for DXMT PSO trace sidecars
(DXIL/MSL/root-sig hashes, formats, sample count, uses_stage_in, async
compile, compile status, Metal error, ObjC exception). parse_pso_manifest and
recent_pso_manifests parse the trace JSON DXMT emits under DXMT_LOG_PATH.

New routes: GET /diagnostics/cache-doctor, GET /diagnostics/pso-manifests.

Tests: 534 passed, 0 failed (+8). clippy + fmt clean.
No shader lowering semantics changed; cache inspection is read-only.
M9/M10/M11 cache families remain shared; M12/M13 remain isolated.

…dening

Treat D3D12 root signatures and descriptor heaps as a formal ABI. New
binding_contract module mirrors what the existing SDK audits
(dxil-binding-manifest-audit.py, dxil-root-signature-audit.py) parse:

- RootSignatureManifest (version 1.0/1.1, parameters, static samplers,
  null-descriptor policy)
- RootParameter (DescriptorTable/Constants/CBV/SRV/UAV, shader visibility,
  register space/index, descriptor-table ranges)
- DescriptorRange, StaticSampler, NullDescriptorPolicy, ShaderVisibility
- ReflectionBinding (shader-declared bindings for the ABI check)

validate_root_signature[_with] enforces Metal direct-binding limits
(buffers<=31, textures<=8, samplers<=16, matching the Python audit defaults)
plus D3D12 ABI rules: limit violations, overlapping ranges within a space,
static-sampler register clashes, sparse root parameter indices, and
UINT_MAX unbounded ranges rejected without proven probe support.
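
A reduced sketch of three of these rules: the per-kind limits (31/8/16), overlap detection within a register space, and rejection of `UINT_MAX` unbounded ranges. The types are simplified assumptions: `SlotKind` collapses the D3D12 range types into the three Metal-side slot kinds, and `validate_ranges` is a hypothetical name, not the PR's `validate_root_signature`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SlotKind {
    Buffer,
    Texture,
    Sampler,
}

#[derive(Debug, Clone, Copy)]
pub struct RangeSketch {
    pub kind: SlotKind,
    pub space: u32,
    pub base: u32,
    /// `u32::MAX` marks an unbounded range.
    pub count: u32,
}

pub const MAX_BUFFERS: u64 = 31;
pub const MAX_TEXTURES: u64 = 8;
pub const MAX_SAMPLERS: u64 = 16;

/// Half-open interval overlap within the same kind and register space.
fn overlaps(a: &RangeSketch, b: &RangeSketch) -> bool {
    let (a0, a1) = (a.base as u64, a.base as u64 + a.count as u64);
    let (b0, b1) = (b.base as u64, b.base as u64 + b.count as u64);
    a.kind == b.kind && a.space == b.space && a0 < b1 && b0 < a1
}

pub fn validate_ranges(ranges: &[RangeSketch]) -> Vec<String> {
    let mut errors = Vec::new();
    let (mut buffers, mut textures, mut samplers) = (0u64, 0u64, 0u64);

    for (i, r) in ranges.iter().enumerate() {
        if r.count == u32::MAX {
            errors.push(format!("range {}: unbounded (UINT_MAX) range rejected", i));
            continue;
        }
        match r.kind {
            SlotKind::Buffer => buffers += r.count as u64,
            SlotKind::Texture => textures += r.count as u64,
            SlotKind::Sampler => samplers += r.count as u64,
        }
        // Compare against later ranges only, so each pair is reported once.
        for (j, other) in ranges.iter().enumerate().skip(i + 1) {
            if other.count != u32::MAX && overlaps(r, other) {
                errors.push(format!("ranges {} and {} overlap in space {}", i, j, r.space));
            }
        }
    }

    if buffers > MAX_BUFFERS {
        errors.push(format!("buffers: {} > {}", buffers, MAX_BUFFERS));
    }
    if textures > MAX_TEXTURES {
        errors.push(format!("textures: {} > {}", textures, MAX_TEXTURES));
    }
    if samplers > MAX_SAMPLERS {
        errors.push(format!("samplers: {} > {}", samplers, MAX_SAMPLERS));
    }
    errors
}
```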

Reflection-ABI check proves every reflection binding is covered by a root
parameter range or static sampler (visibility-aware), turning a shader-vs-
root-signature mismatch into a contract failure.
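
The coverage check can be sketched as a filter over reflection bindings that keeps those no root range reaches. The types below are simplified assumptions (only two shader stages, no static samplers), and `uncovered_bindings` is a hypothetical name.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Vertex,
    Pixel,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Visibility {
    All,
    Only(Stage),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RegisterClass {
    Cbv,
    Srv,
    Uav,
    Sampler,
}

#[derive(Debug, Clone, Copy)]
pub struct RootRangeSketch {
    pub class: RegisterClass,
    pub space: u32,
    pub base: u32,
    pub count: u32,
    pub visibility: Visibility,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ReflectionBindingSketch {
    pub stage: Stage,
    pub class: RegisterClass,
    pub space: u32,
    pub register: u32,
}

/// A range covers a binding when the stage can see it and the register
/// falls inside [base, base + count) of the same class and space.
fn covers(range: &RootRangeSketch, b: &ReflectionBindingSketch) -> bool {
    let visible = match range.visibility {
        Visibility::All => true,
        Visibility::Only(stage) => stage == b.stage,
    };
    let end = range.base as u64 + range.count as u64;
    visible
        && range.class == b.class
        && range.space == b.space
        && b.register >= range.base
        && (b.register as u64) < end
}

/// Every returned binding is a contract failure in this sketch.
pub fn uncovered_bindings(
    bindings: &[ReflectionBindingSketch],
    ranges: &[RootRangeSketch],
) -> Vec<ReflectionBindingSketch> {
    bindings
        .iter()
        .filter(|b| !ranges.iter().any(|r| covers(r, b)))
        .copied()
        .collect()
}
```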

New route: POST /diagnostics/binding-contract/validate.

Tests: 548 passed, 0 failed (+14). clippy + fmt clean.
No runtime binding behavior changed; M9/M10/M11 unaffected. Rust limits match
the Python binding audit defaults so the two gates agree.

…ntract

vendor/dxmt is reference source not compiled here. Phase 6 lands the Rust-side
contract model (typed rules the existing SDK probes validate) so encoder
lifetime is observable and a missing transition is a contract failure.

New command_contract module with:
- ResourceState (D3D12 subset, is_write/is_read)
- ResourceBarrier (incl. split barriers via split_begin)
- RenderPassBoundary (render targets + depth + sample count)
- CommandOp tagged enum: Reset, ResourceBarrier, BeginRenderPass/EndRenderPass,
  ClearRenderTargetView, Draw/DrawIndexed, Dispatch, CopyResource/Region,
  ResolveSubresource, Present, Close, Execute

validate_command_trace enforces visibility (state must permit the use), the
present gate (back buffer must be Present; never-transitioned = Common is
flagged), render-pass boundaries (BeginRenderPass while open flagged, RT set
change without boundary flagged), reset/reuse (reset inside render pass
flagged), and split barriers (un-ended BEGIN_ONLY flagged at end and at write).

Visibility summary counters track total/write->read/read->write/splits/render
passes. COMMON/GENERIC_READ permit implicit read access (D3D12 decay rules).
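
A reduced sketch of three of the rules above: the present gate, nested `BeginRenderPass`, and reset inside an open render pass. The `Op` and `State` enums are cut-down assumptions (the real `CommandOp` has many more variants), and `validate_trace` is a hypothetical name rather than the PR's `validate_command_trace`.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Common,
    RenderTarget,
    Present,
}

#[derive(Debug, Clone)]
pub enum Op {
    Reset,
    Barrier { resource: u32, after: State },
    BeginRenderPass,
    EndRenderPass,
    Present { resource: u32 },
}

pub fn validate_trace(ops: &[Op]) -> Vec<String> {
    let mut errors = Vec::new();
    // Resources with no recorded barrier are treated as Common.
    let mut states: HashMap<u32, State> = HashMap::new();
    let mut pass_open = false;

    for (i, op) in ops.iter().enumerate() {
        match op {
            Op::Reset => {
                if pass_open {
                    errors.push(format!("op {}: reset inside an open render pass", i));
                }
            }
            Op::Barrier { resource, after } => {
                states.insert(*resource, *after);
            }
            Op::BeginRenderPass => {
                if pass_open {
                    errors.push(format!("op {}: BeginRenderPass while a pass is open", i));
                }
                pass_open = true;
            }
            Op::EndRenderPass => {
                pass_open = false;
            }
            Op::Present { resource } => {
                let state = states.get(resource).copied().unwrap_or(State::Common);
                if state != State::Present {
                    errors.push(format!(
                        "op {}: resource {} presented in state {:?}",
                        i, resource, state
                    ));
                }
            }
        }
    }
    errors
}
```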

New route: POST /diagnostics/command-replay/validate.

Tests: 560 passed, 0 failed (+12). clippy + fmt clean.
No runtime command-list/barrier behavior changed; M9/M10/M11 unaffected.

Add runtime_artifact_report[_for] per-artifact verification that goes beyond
file_nonempty presence checks by recording sha256 + size for EACH required
file (M11 lib/dxmt and M12 lib/dxmt-m12, PE + unix sidecars). A missing M12
sidecar is now reported by name. missing_m12_sidecars[_for] provides the
explicit named-missing list for the regression gate.

Add bottles::WinebootState explicit state machine (Idle / PrefixUpdating /
Verifying / PrefixMissing) separating "prefix is updating" (Wine busy) from
"MetalSharp is verifying" (runtime-doctor/preflight), so the UI does not
double-poll or misrepresent a Steam update window.
steam_prefix_wineboot_state[_for] exposes the enum + derived booleans.
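
A sketch of the enum with derived booleans. The four variants are from the commit message; the `derive` constructor, its three inputs, and the priority order among them are assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WinebootState {
    Idle,
    PrefixUpdating,
    Verifying,
    PrefixMissing,
}

impl WinebootState {
    /// Hypothetical derivation. The real inputs and their precedence are
    /// not described in the PR text.
    pub fn derive(prefix_exists: bool, wine_busy: bool, doctor_running: bool) -> Self {
        if !prefix_exists {
            WinebootState::PrefixMissing
        } else if wine_busy {
            WinebootState::PrefixUpdating
        } else if doctor_running {
            WinebootState::Verifying
        } else {
            WinebootState::Idle
        }
    }

    /// Wine itself is busy with the prefix.
    pub fn is_prefix_updating(self) -> bool {
        self == WinebootState::PrefixUpdating
    }

    /// MetalSharp's own checks are running.
    pub fn is_verifying(self) -> bool {
        self == WinebootState::Verifying
    }

    pub fn is_idle(self) -> bool {
        self == WinebootState::Idle
    }
}
```

Because the state is a single enum value, the two "busy" conditions cannot both be reported at once, which is what lets a UI poll one field instead of two.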

Add steam::stop_wine_steam_targets, which makes the existing stop filter
observable: it lists what stop_wine_steam would target and the processes
explicitly excluded (macOS Steam client, MetalSharp's own rg/ps), proving the
stop is scoped to real Wine Steam helper processes. No behavior change to stop
itself.

New routes: GET /diagnostics/runtime-artifacts, /diagnostics/wineboot-state,
/steam/stop-targets.

All new functions have explicit-home _for variants; no new test mutates the
process-global METALSHARP_HOME (parallel-safe).

Tests: 566 passed, 0 failed (+6, 2 consecutive runs). clippy + fmt clean.
No readiness logic changed; no automatic restart behavior introduced.

New fna_profile module treating Mono/FNA/XNA as a first-class compatibility
family:

- detect_fna_signals: richer flavor detection layered on the existing
  detect_fna_flavor (FNA/MonoGame/XNA + Steamworks.NET, CSteamworks, FAudio,
  FMOD, OpenAL, XInput, x86-vs-native Mono signal, evidence files).
- AssetReceipt + AssetStagingReport: receipt-driven asset staging (filename,
  source path+sha256, dest path+sha256, required vs optional,
  overwrote-game-file flag, reason). Persists atomically to
  <game_dir>/.metalsharp/fna-staging.json so future runs skip re-copying when
  hashes match.
- explain_profile: "profile explain" diagnostic reporting WHY a game selected
  FNA ARM64 / x86 / XNA-MonoGame x86 / fallback.
- classify_unproven_fna_game: conservative classifier that does NOT claim
  compatibility, stages only reversible shims, offers Wine fallback. Pinned
  known-good app ids never reclassified.
- PINNED_FNA_APPIDS const codifies the protected known-good set.
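
The receipt-driven skip described in the AssetReceipt item can be sketched as a pure decision over recorded hashes. `AssetReceiptSketch` keeps only a subset of the listed fields, and `plan_staging` and `StagingAction` are hypothetical names; hashing and file I/O are left to the caller.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AssetReceiptSketch {
    pub filename: String,
    pub source_sha256: String,
    pub dest_sha256: String,
    pub required: bool,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StagingAction {
    Skip,
    Copy,
}

/// Skip only when a prior receipt exists for this file, the source hash is
/// unchanged, and the file currently on disk still has the recorded hash.
pub fn plan_staging(
    receipts: &[AssetReceiptSketch],
    filename: &str,
    source_sha256: &str,
    dest_sha256_on_disk: Option<&str>,
) -> StagingAction {
    let up_to_date = receipts.iter().any(|r| {
        r.filename == filename
            && r.source_sha256 == source_sha256
            && dest_sha256_on_disk == Some(r.dest_sha256.as_str())
    });
    if up_to_date {
        StagingAction::Skip
    } else {
        StagingAction::Copy
    }
}
```

Passing `None` for the on-disk hash (destination missing) always yields `Copy`, so a deleted file is restaged even when a receipt exists.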

New routes: GET /diagnostics/fna/signals, /explain, /classify; url_decode
helper for query params.

Tests: 579 passed, 0 failed (+15, 2 consecutive runs). clippy + fmt clean.
Pinned behavior for Terraria/Celeste/Stardew unchanged; deploy_fna_assemblies
and find_fna_profile are not overridden.
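
A sketch of the "pinned ids are never reclassified" rule. The contents of `PINNED_FNA_APPIDS` below are an assumption (Steam app ids for Terraria, Stardew Valley, and Celeste, chosen because those titles are named above); the `Classification` enum, including its `NotFna` variant, and the `classify` signature are also illustrative.

```rust
/// Assumed contents; the real const in the PR is not shown.
pub const PINNED_FNA_APPIDS: [u32; 3] = [105600, 413150, 504230];

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Classification {
    /// Known-good title: existing behavior is left alone.
    Pinned,
    /// FNA-like signals but no proof: conservative handling only.
    Unproven {
        claims_compatibility: bool,
        reversible_shims_only: bool,
        offers_wine_fallback: bool,
    },
    /// No FNA signals (variant added for this sketch).
    NotFna,
}

pub fn classify(appid: u32, has_fna_signals: bool) -> Classification {
    // The pinned check runs first so signals can never override it.
    if PINNED_FNA_APPIDS.contains(&appid) {
        return Classification::Pinned;
    }
    if !has_fna_signals {
        return Classification::NotFna;
    }
    Classification::Unproven {
        claims_compatibility: false,
        reversible_shims_only: true,
        offers_wine_fallback: true,
    }
}
```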
Add docs/optimization-roadmap/ with:
- local-gates.md: canonical local gates (Rust/TS/C++/SDK probes) + a table of
  all Phase 1-8 backend diagnostic routes.
- release-checklist.md: version sync (5 files), runtime artifact presence +
  hashes, M12 sidecar presence, legacy DXMT surface, local graphics gates,
  route gates, strict SDK doc gate.
- ci-gating-notes.md: explicit CI-proves vs local-only (graphics gates), with
  pointers to the Phase 4-6 Rust validators that ARE unit-tested in CI.
- README.md: index with per-phase commit map and baseline/final proof.

AGENTS.md now points to local-gates.md from the suggested-tests section.

Verified validate-contracts.py -> [PASS] (8) and validate-probe-matrix.py ->
[PASS] (18). Docs-only; no code behavior changed. Final Rust gate: 579
passed, 0 failed; clippy + fmt clean.

This completes the 9-phase optimization roadmap.
@aaf2tbz aaf2tbz marked this pull request as ready for review June 14, 2026 16:11
@aaf2tbz aaf2tbz merged commit ee51585 into main Jun 14, 2026
7 checks passed
@aaf2tbz aaf2tbz deleted the codex/phased-optimization-roadmap branch June 14, 2026 16:12