Support: cache prebuilt runtime arena images#1193

Merged
ChaoWao merged 1 commit into
hw-native-sys:mainfrom
ChaoWao:support/cache-prebuilt-runtime-arena-images
Jul 1, 2026
@ChaoWao ChaoWao commented Jun 29, 2026

Collaborator

Summary

  • Cache prebuilt runtime-arena images on the DeviceRunner lifetime instead of a process-global cache
  • Reuse cached GM heap, SM, and runtime arena bases on bind hits while rebuilding only on config misses
  • Wire cache hooks through onboard and sim HostApi for both a2a3 and a5
  • Make clang-tidy tolerate unrecoverable stale compile database entries while still linting indexed changed files
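The first two bullets can be pictured with a minimal sketch. Every name below (`ArenaConfig`, `ArenaImage`, `DeviceRunnerSketch`, `bind`) is hypothetical and not taken from the PR, and the key fields and addresses are placeholders. The sketch only shows a cache owned by a runner object that is reused when the configuration matches and rebuilt on a miss.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical configuration key; the real key fields are not shown on this page.
struct ArenaConfig {
    uint32_t ring_count;
    uint32_t ring_depth;
    bool operator<(const ArenaConfig &o) const {
        return std::tie(ring_count, ring_depth) < std::tie(o.ring_count, o.ring_depth);
    }
};

// Hypothetical cached entry: host image bytes plus the device bases it was built against.
struct ArenaImage {
    std::vector<uint8_t> bytes;
    uint64_t gm_heap_base;
    uint64_t sm_base;
    uint64_t arena_base;
};

class DeviceRunnerSketch {
public:
    // Returns the cached image on a hit; builds and stores one on a miss.
    const ArenaImage &bind(const ArenaConfig &cfg) {
        auto it = cache_.find(cfg);
        if (it != cache_.end()) {
            ++hits_;
            return it->second;
        }
        ++builds_;
        return cache_.emplace(cfg, build(cfg)).first->second;
    }
    size_t hits() const { return hits_; }
    size_t builds() const { return builds_; }

private:
    static ArenaImage build(const ArenaConfig &cfg) {
        ArenaImage img;
        img.bytes.assign(static_cast<size_t>(cfg.ring_count) * cfg.ring_depth, 0);
        // Placeholder addresses standing in for real device allocations.
        img.gm_heap_base = 0x1000;
        img.sm_base = 0x2000;
        img.arena_base = 0x3000;
        return img;
    }

    // A member rather than a global: the cache is destroyed with the runner.
    std::map<ArenaConfig, ArenaImage> cache_;
    size_t hits_ = 0;
    size_t builds_ = 0;
};
```

Because the cache is a member, two runner objects never share entries, which is one way to keep a host image and the device bases it was built against tied to the same owner.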

Testing

  • git diff --check upstream/main..HEAD
  • pre-commit run --from-ref upstream/main --to-ref HEAD
  • pip install --no-build-isolation -e .

coderabbitai Bot commented Jun 29, 2026


Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fb8fc117-a900-4dee-b177-29978a6b5a1f

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds a global mutex-protected cache for prebuilt PTO2 runtime arena images keyed on resolved ring/scheduler configuration. On cache hits, cached bytes are uploaded directly to the device arena. A new runtime_rebind_device_pointers_from_layout function reinitializes device-dependent orchestrator, scheduler, and per-ring state after attaching a cached image. The executor calls this function post-wiring with failure handling. Docs are updated to reflect the cache/build terminology.

Changes

Prebuilt Runtime Arena Cache and Pointer Rebind

| Layer | File(s) | Summary |
|---|---|---|
| runtime_rebind_device_pointers_from_layout API and implementation | src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_runtime2.h, src/a2a3/runtime/tensormap_and_ringbuffer/runtime/shared/pto_runtime2_init.cpp | Declares and implements the new function. Validates rt, recomputes heap sizes with overflow check, rebinds orchestrator SM header, heap base/size/fatal, per-ring task allocator and fanin spill pool, scheduler SM header, profiling counters, ring scheduler state, and dep pool backing storage. Returns bool. |
| Host-side prebuilt arena cache | src/a2a3/runtime/tensormap_and_ringbuffer/host/runtime_maker.cpp | Adds <array>, <memory>, <mutex>, <utility>, <vector> includes. Defines PrebuiltRuntimeArenaCacheKey/PrebuiltRuntimeArenaCacheEntry, a global mutex and cache map, and make_prebuilt_cache_key. Replaces unconditional build with keyed cache lookup: cache hit copies bytes to device; cache miss builds, uploads, and stores result. |
| Executor rebind call | src/a2a3/runtime/tensormap_and_ringbuffer/aicpu/aicpu_executor.cpp | After runtime_wire_arena_pointers, calls runtime_rebind_device_pointers_from_layout; on failure logs error, clears rt, signals runtime_init_ready_, and returns -1. |
| Docs | docs/dfx/host-trace.md | Updates run_prepared.bind.prebuilt span description to say "cache/build + upload" instead of "image build + upload". |
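The "overflow check, returns bool" and "on failure returns -1" behaviour summarized above can be sketched as follows. This is an illustration under assumptions, not the code from the PR: `LayoutSketch`, `RuntimeSketch`, `checked_mul`, `rebind_sketch`, and `init_sketch` are hypothetical names, and the real function rebinds far more state than a single heap base and size.

```cpp
#include <cstdint>
#include <cstdio>
#include <limits>

// Hypothetical layout fields; the real runtime structures are not reproduced here.
struct LayoutSketch {
    uint64_t ring_count;
    uint64_t heap_bytes_per_ring;
};

struct RuntimeSketch {
    uint64_t heap_base = 0;
    uint64_t heap_size = 0;
};

// Multiplies a * b, reporting failure instead of silently wrapping on overflow.
bool checked_mul(uint64_t a, uint64_t b, uint64_t *out) {
    if (a != 0 && b > std::numeric_limits<uint64_t>::max() / a) return false;
    *out = a * b;
    return true;
}

// Stand-in for a rebind step: validate, recompute sizes, then patch addresses.
bool rebind_sketch(RuntimeSketch *rt, const LayoutSketch &layout, uint64_t gm_heap_dev_base) {
    if (rt == nullptr) return false;
    uint64_t total = 0;
    if (!checked_mul(layout.ring_count, layout.heap_bytes_per_ring, &total)) return false;
    rt->heap_base = gm_heap_dev_base;
    rt->heap_size = total;
    return true;
}

// Stand-in for the caller: log and return -1 when the rebind fails.
int init_sketch(RuntimeSketch *rt, const LayoutSketch &layout, uint64_t gm_heap_dev_base) {
    if (!rebind_sketch(rt, layout, gm_heap_dev_base)) {
        std::fprintf(stderr, "rebind failed\n");
        return -1;
    }
    return 0;
}
```

Nothing is written to the runtime until every check has passed, so a failed rebind leaves the previous state untouched for the caller to clean up.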

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 Hop hop, no rebuild twice!
The arena image cached so nice,
A mutex guards our precious store,
Rebind the pointers, run some more.
One key to skip the build — how sweet,
The rabbit's cache makes runtime fleet! 🗂️

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly summarizes the main change: caching prebuilt runtime arena images. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The description matches the core changes: prebuilt runtime-arena caching, pointer rebinding, and host trace docs updates. |

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a caching mechanism for prebuilt runtime arena images on the host to avoid redundant rebuilds. It also adds a new function, runtime_rebind_device_pointers_from_layout, to refresh dynamic device addresses (such as shared memory and global memory heap bases) on the AICPU side when a cached image is reused. The review feedback points out a potential issue in this new function where sm_dev_base and gm_heap_dev_base are used without null checks, which could lead to undefined behavior or crashes if either pointer is null.
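The null-check concern raised in the review could be addressed with a guard along these lines. This is a hedged sketch only: `DeviceBasesSketch` and `bases_are_usable` are hypothetical, and only the field names `sm_dev_base` and `gm_heap_dev_base` come from the review text. How the actual function stores or receives these pointers is not shown on this page.

```cpp
// Hypothetical holder for the two device bases named in the review.
struct DeviceBasesSketch {
    void *sm_dev_base;
    void *gm_heap_dev_base;
};

// Returns false before any pointer arithmetic if either base is null.
bool bases_are_usable(const DeviceBasesSketch *b) {
    if (b == nullptr) return false;
    if (b->sm_dev_base == nullptr || b->gm_heap_dev_base == nullptr) return false;
    return true;
}
```

A rebind function that already returns bool could call such a guard first and return false, so a missing base is reported through the existing failure path.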

Comment thread src/a2a3/runtime/tensormap_and_ringbuffer/runtime/shared/pto_runtime2_init.cpp Outdated
@ChaoWao ChaoWao force-pushed the support/cache-prebuilt-runtime-arena-images branch 5 times, most recently from 6a5bb8d to 492a6f1 Compare July 1, 2026 01:56
- Keep the cache on the DeviceRunner lifetime so host image and device arena bases stay bound together
- Reuse cached runtime-arena images on bind hits and defer static size derivation to miss paths
- Wire the cache hooks through onboard and sim HostApi for both a2a3 and a5
@ChaoWao ChaoWao force-pushed the support/cache-prebuilt-runtime-arena-images branch from 492a6f1 to 3e9bef3 Compare July 1, 2026 02:43
@ChaoWao ChaoWao merged commit 9e087b6 into hw-native-sys:main Jul 1, 2026
16 checks passed
@ChaoWao ChaoWao deleted the support/cache-prebuilt-runtime-arena-images branch July 1, 2026 03:00