fix(build): rebuild onboard a2a3 host_runtime on pto-isa commit change #1194
doraemonmj wants to merge 1 commit
Code Review
This pull request updates the build cache invalidation logic to fold the resolved pto-isa commit into the build stamp and the CMake compile definitions for a2a3 onboard builds. This ensures that updates to pto-isa headers automatically invalidate stale cache objects and force a recompile. The review feedback suggests improving the CMake environment variable check by explicitly comparing against an empty string, and refactoring the Python resolution logic to prioritize PTO_ISA_ROOT over the commit environment variable, raise an error on failure, and avoid speculative .strip() validation.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@simpler_setup/runtime_builder.py`:
- Around line 188-192: The PTO-ISA fallback path in get_pto_isa_commit() returns
a resolved commit from PTO_ISA_ROOT, but the CMake ccache-busting logic in the
runtime builder only keys off SIMPLER_RUN_PTO_ISA_COMMIT. Update the environment
propagation in runtime_builder so the fallback commit also feeds the same CMake
env variable used by the cache stamp, and ensure the PTO-ISA resolution path and
the ccache-busting block both reference the same commit source.
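A minimal sketch of the single-source idea in this comment, assuming the explicit environment variable is consulted first and `PTO_ISA_ROOT` is the fallback. The helpers `git_head`, `resolve_pto_isa_commit`, and `cmake_env` are hypothetical names for illustration and are not taken from `runtime_builder.py`; only the two environment variable names come from the review.

```python
import subprocess
from typing import Callable, Dict, Mapping, Optional

ENV_COMMIT = "SIMPLER_RUN_PTO_ISA_COMMIT"  # name quoted in the review comment
ENV_ROOT = "PTO_ISA_ROOT"                  # name quoted in the review comment


def git_head(repo: str) -> Optional[str]:
    """Return the HEAD commit of `repo`, or None if git cannot resolve it."""
    try:
        out = subprocess.run(
            ["git", "-C", repo, "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return out.stdout.strip() or None


def resolve_pto_isa_commit(
    env: Mapping[str, str],
    head_of: Callable[[str], Optional[str]] = git_head,
) -> Optional[str]:
    """Hypothetical resolver: explicit env var first, PTO_ISA_ROOT as fallback."""
    commit = env.get(ENV_COMMIT)
    if commit:
        return commit
    root = env.get(ENV_ROOT)
    return head_of(root) if root else None


def cmake_env(
    env: Mapping[str, str],
    head_of: Callable[[str], Optional[str]] = git_head,
) -> Dict[str, str]:
    """Copy `env` and export the resolved commit, so the cache stamp and the
    CMake cache-busting define both read the same value."""
    out = dict(env)
    commit = resolve_pto_isa_commit(env, head_of)
    if commit:
        out[ENV_COMMIT] = commit
    return out
```

With this shape, a commit found through the fallback path is written back into the variable that CMake reads, so the two consumers cannot drift apart.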
📒 Files selected for processing (3)
- docs/developer-guide.md
- simpler_setup/runtime_builder.py
- src/a2a3/platform/onboard/host/CMakeLists.txt
d4d82a9 to e98795d
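Review
Overall conclusion: the fix is correct, clearly scoped, and minimal. I traced the core mechanism and the ordering contract, and both hold. The only substantive gap is the missing regression test; see Should fix below.
Mechanism confirmation
The two cache layers are chained, and each has its own "nothing changed" blind spot. A pto-isa-only change has to break through both layers at once, and this PR does that:
Lock-step invariant checked: the stamp and the cmake macro must carry the same commit. In the normal case … Edge cases are also OK: the old format …
Should fix: add a regression test
The core logic of the fix (…
The "Local test results" in the PR description are a one-off manual verification and do not amount to a regression guard checked into the repository.
Consider
Verdict
Recommend merging, with the unit test (Should fix) added before merge. The two Consider items are polish.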
pto-isa is a header-only dependency baked into the a2a3 onboard host_runtime.so at compile time (e.g. kSdmaMaxChan). A pto-isa bump that leaves the runtime repo HEAD untouched was invisible to both build caches, so a reinstall served a stale binary -> SDMA query failure -> 507899.

- cmake cache stamp now folds in the resolved pto-isa commit, so a bump mismatches the stored stamp and clears the per-target build cache.
- host_runtime compile command carries SIMPLER_PTO_ISA_BUILD_COMMIT so the ccache key changes on a bump (git checkout does not bump header mtimes, which otherwise yields a stale ccache hit under compiler_check=mtime).

Together a plain reinstall recompiles against the new pto-isa with no manual 'ccache -C' / 'rm -rf build/'. Scoped to a2a3 onboard, the only variant that embeds pto-isa headers today.
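The ccache half of this commit can be pictured with a deliberately simplified model in which the cache key is a hash of the compile command line alone. This is an assumption for illustration: real ccache hashes more inputs than the arguments, and the source file name and the `command_key` / `compile_argv` helpers below are hypothetical.

```python
import hashlib
from typing import List, Sequence


def command_key(argv: Sequence[str]) -> str:
    """Toy stand-in for a compiler-cache key: a hash of the arguments only."""
    h = hashlib.sha256()
    for arg in argv:
        h.update(arg.encode("utf-8"))
        h.update(b"\0")  # separator so adjacent arguments cannot run together
    return h.hexdigest()


def compile_argv(isa_commit: str) -> List[str]:
    """Hypothetical compile command that carries the cache-bust define."""
    return [
        "c++", "-c", "host_runtime.cpp", "-o", "host_runtime.o",
        f'-DSIMPLER_PTO_ISA_BUILD_COMMIT="{isa_commit}"',
    ]


key_before = command_key(compile_argv("aaaa111"))
key_after = command_key(compile_argv("bbbb222"))
print(key_before != key_after)
```

In this model the define is never read by the code being compiled. It only has to change the command line so that a lookup keyed on the old command no longer matches.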
e98795d to f8bbf29
close #1139 @ChaoZheng109
Problem (issue #1139)
pto-isa is a header-only dependency compiled into the a2a3 onboard `host_runtime.so` (e.g. `kSdmaMaxChan` in `sdma_workspace_manager.hpp`). A pto-isa bump that leaves the runtime repo HEAD untouched was invisible to both build caches, so a plain reinstall served a stale binary and the SDMA workspace query failed → `ImportByKey -> 507899` cascade in `allocate_domain`.
Two layers each mis-judged "nothing changed":
- The cmake cache's `.git_commit` stamp was keyed only on the runtime repo HEAD, so a pto-isa-only change never cleared the per-target cache (git checkout also doesn't bump header mtimes, so cmake's own incremental check can't see it).
- Under ccache, with `compiler_check=mtime` and unchanged header mtimes after a checkout, the stale `.o` was served even after `rm -rf build/`.
`pto_isa_build.json` is rewritten on every install regardless of whether the binary recompiled, so the existing version guard couldn't catch this case and the failure was silent.
Fix
- The cmake cache stamp is now `runtime_sha:pto-isa=<isa_sha>` (a2a3 onboard only). A bump mismatches the stored stamp → the per-target build cache is cleared → clean reconfigure.
- The `host_runtime` compile command now carries `SIMPLER_PTO_ISA_BUILD_COMMIT="<sha>"` (unused in code; cache-bust only), so a pto-isa bump changes the command line → ccache miss → real recompile.
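The stamp half of the fix can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `runtime_builder.py`: `compose_stamp`, `refresh_cache`, and `demo` are hypothetical names, and only the `.git_commit` file name and the `runtime_sha:pto-isa=<isa_sha>` format come from the description.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List, Optional


def compose_stamp(runtime_sha: str, isa_sha: Optional[str]) -> str:
    """Composite stamp when a pto-isa commit is known, old format otherwise."""
    return f"{runtime_sha}:pto-isa={isa_sha}" if isa_sha else runtime_sha


def refresh_cache(cache_dir: Path, stamp: str) -> bool:
    """Reset `cache_dir` when the stored stamp differs from `stamp`.

    Returns True when the cache was created or cleared, False when kept.
    """
    stamp_file = cache_dir / ".git_commit"
    stored = stamp_file.read_text().strip() if stamp_file.exists() else None
    if stored == stamp:
        return False
    if cache_dir.exists():
        shutil.rmtree(cache_dir)  # drop every stale object in the cache
    cache_dir.mkdir(parents=True)
    stamp_file.write_text(stamp)
    return True


def demo() -> List[bool]:
    """First build, unchanged rebuild, then a pto-isa-only bump."""
    cache = Path(tempfile.mkdtemp()) / "cache"
    results = [refresh_cache(cache, compose_stamp("r1", "i1"))]
    (cache / "obj.o").write_text("stale object")
    results.append(refresh_cache(cache, compose_stamp("r1", "i1")))
    results.append(refresh_cache(cache, compose_stamp("r1", "i2")))
    return results
```

Because the comparison is plain string equality, a stamp stored in the old runtime-only format also mismatches the composite one and is cleared once on the first install after the change.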
Both are required: clearing the cmake cache alone doesn't defeat ccache underneath, and the define alone doesn't help if cmake still thinks it's incrementally up to date. A plain reinstall now recompiles against the new pto-isa with no manual `ccache -C` / `rm -rf build/`.
Scoped to a2a3 onboard, the only variant that embeds pto-isa headers today (a5's `SIMPLER_ENABLE_PTO_SDMA_WORKSPACE` is currently off), matching `_requires_pto_isa_compat_validation()`.
Local test results
Build (real onboard a2a3 toolchain, `pip install --no-build-isolation -e .`):
- `build/lib/pto_isa_build.json` → `pto_isa_commit: 32064ca0…`
- `build/cache/a2a3/onboard/*/host/.git_commit` → `be5f667f…:pto-isa=32064ca0…` (new composite format)
- `compile_commands.json`: `SIMPLER_PTO_ISA_BUILD_COMMIT=\"32064ca0…\"`
Onboard a2a3 (via `task-submit`, device 4):
Confirms the rebuilt `host_runtime.so` (carrying the new define across all TUs) loads and executes on silicon, with no link/load regression from the cache-bust define.
pre-commit: markdownlint-cli2 / ruff check / ruff format / pyright all pass.