Fix/mmraw hub download fallback by hubert-marek · Pull Request #83 · PrimeIntellect-ai/renderers

hubert-marek · 2026-06-09T20:50:51Z

Add mmraw multimodal payload mode to Qwen VL renderers and fix Hub download fallback for `preprocessor_config.json`

Introduces mm_store.py, a new multimodal feature store supporting three payload modes: raw (mmraw refs), processed (msgpack artifacts written to disk as mmfile refs), and inline (base64-encoded).
Qwen3-VL, Qwen3.5, and Qwen3.6 renderers now call qwen_image_item_for_render, which emits raw-ref descriptors instead of pixel tensors when in raw mode; pixels can be materialized later via new materialize_pixels / materialize_raw_refs methods.
Adds a cached loader for preprocessor_config.json that falls back to a HuggingFace Hub download when the file is not available locally, with an explicit offline-failure path.
image_cache_max defaults to 0 (disabled) for all Qwen and Kimi K2.5 renderer configs; caching now only activates when explicitly set positive.
Adds RendererPool.materialize_pixels as a proxy that checks out a renderer and forwards the call to the underlying implementation.
Behavioral Change: generate in client.py may now return mmraw or mmfile refs in kwargs_data['image'] instead of tensors depending on mm_payload_mode; callers that previously consumed pixel tensors directly will need to handle the new ref types.
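The pool proxy described above can be sketched roughly as follows. This is a minimal illustration only: the `checkout` context manager, the queue-based pool, and `FakeRenderer` are assumptions made for the example, since the PR text does not show the real `RendererPool` internals or the actual `materialize_pixels` signature.

```python
import queue
from contextlib import contextmanager


class RendererPool:
    """Hypothetical pool; only the proxy idea is taken from the PR summary."""

    def __init__(self, renderers):
        self._free = queue.Queue()
        for renderer in renderers:
            self._free.put(renderer)

    @contextmanager
    def checkout(self):
        renderer = self._free.get()  # blocks until a renderer is free
        try:
            yield renderer
        finally:
            self._free.put(renderer)  # always hand it back, even on error

    def materialize_pixels(self, *args, **kwargs):
        # Proxy: borrow one renderer and forward the call unchanged.
        with self.checkout() as renderer:
            return renderer.materialize_pixels(*args, **kwargs)


class FakeRenderer:
    """Stand-in renderer used only for this demonstration."""

    def materialize_pixels(self, descriptors):
        return [f"pixels:{d}" for d in descriptors]


pool = RendererPool([FakeRenderer()])
result = pool.materialize_pixels(["img-a", "img-b"])
```

Returning the renderer in a `finally` block keeps the pool from leaking capacity when materialization raises.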

generate() now hands back descriptor-only multi_modal_data (image_grid_thw + mm_hashes + mm_placeholders, no pixel_values). Pixels are re-attached only for the engine POST via the new materialize_pixels (cache hit, else reprocess from the message base64; grid_thw asserted), then stripped again. This keeps the env worker from retaining decoded image tensors for the life of a rollout: resident pixel memory is now bounded by the per-image cache instead of growing with turns x concurrency.

Also fixes a latent bridge bug: the merge shallow-copied the mm dict but shared the inner lists, so .extend mutated previous_multi_modal_data in place and corrupted the cumulative sets of earlier trajectory steps. The fix copies the lists.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
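The shallow-copy bug described in this commit is easy to reproduce in isolation. The sketch below is not the repository's merge code; the function names and the `mm_hashes` payload are illustrative assumptions. It only shows why copying the outer dict is not enough when the values are lists.

```python
def merge_shallow(previous, new):
    merged = dict(previous)  # copies the dict, but the inner lists are shared
    for key, values in new.items():
        merged.setdefault(key, []).extend(values)  # also mutates previous[key]
    return merged


def merge_copying_lists(previous, new):
    # Copy each inner list so the earlier step keeps its own snapshot.
    merged = {key: list(values) for key, values in previous.items()}
    for key, values in new.items():
        merged.setdefault(key, []).extend(values)
    return merged


# Buggy path: the earlier step's data grows behind its back.
step1 = {"mm_hashes": ["h1"]}
buggy = merge_shallow(step1, {"mm_hashes": ["h2"]})
step1_after_buggy = list(step1["mm_hashes"])

# Fixed path: the earlier step is left untouched.
step1_fixed = {"mm_hashes": ["h1"]}
fixed = merge_copying_lists(step1_fixed, {"mm_hashes": ["h2"]})
```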

- generate(force_full_pixels): first attempt sends new-turn images full and prior descriptor-only images hash-only; cache-miss fallback materializes all.
- _build_qwen_vl_features: descriptor-aware. Encode only pixel-bearing items, emit hash-only (None kwargs slot) for the rest, scattered back to original positions so kwargs_data stays aligned with mm_hashes / mm_placeholders.
- image_cache_max default 0 (processed pixels stay request-scoped) + a guard so the disabled path never pops an empty cache; RENDERERS_MM_MAX_INFLIGHT semaphore bounds concurrent payload builds.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
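The "encode only pixel-bearing items, then scatter back" step can be sketched as below. This is an assumption-laden illustration: `build_kwargs_slots`, the item dict shape, and the `encode_batch` callback are hypothetical and stand in for whatever `_build_qwen_vl_features` actually does.

```python
def build_kwargs_slots(items, encode_batch):
    """Return one slot per item: an encoded payload, or None for hash-only items.

    `items` is a list of dicts that may carry a "pixels" entry (hypothetical shape).
    `encode_batch` encodes a list of pixel payloads in one call.
    """
    pixel_positions = [
        i for i, item in enumerate(items) if item.get("pixels") is not None
    ]
    encoded = (
        encode_batch([items[i]["pixels"] for i in pixel_positions])
        if pixel_positions
        else []
    )
    slots = [None] * len(items)  # hash-only items keep a None slot
    for position, payload in zip(pixel_positions, encoded):
        slots[position] = payload  # scatter back to the original index
    return slots


items = [
    {"hash": "a"},                    # prior image, descriptor only
    {"hash": "b", "pixels": [1, 2]},  # new-turn image with pixels
    {"hash": "c"},
    {"hash": "d", "pixels": [5]},
]
slots = build_kwargs_slots(items, lambda batch: [("enc", sum(p)) for p in batch])
```

Because `slots` always has the same length and order as `items`, index `i` in the kwargs list still refers to the same image as index `i` in the hash and placeholder lists.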

Single source of truth for the on-disk MM-offload contract, imported by the verifiers env-worker (images), the renderers feature writer, and prime-rl (both readers):

- run-scoped paths under /data/outputs/run_<RUN_ID>/assets/{images,mm_features} (run_id_from_env, run_dir, image_asset_dir, feature_asset_dir + subdir consts).
- mmfile format: version-pinned feature fingerprint, mm_feature_path (+ traversal guard), mmfile_ref emit + split_mmfile_ref parse (co-located so they can't drift).
- msgpack envelope build + match helpers.
- sweep_stale_artifacts: mtime TTL eviction over both asset dirs (content-addressed + re-writable, so over-eviction is safe).

Co-Authored-By: Codex <codex@openai.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
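A rough sketch of the path helpers and the emit/parse pair follows. The function names come from the commit message, but their signatures, the `mmfile:` prefix, and the `fingerprint:hash` layout of the ref string are assumptions made for illustration; the actual on-disk contract lives in mm_store.py and is not reproduced in this PR text.

```python
from pathlib import Path

OUTPUT_ROOT = Path("/data/outputs")  # root taken from the commit message
FEATURE_SUBDIR = "mm_features"
MMFILE_PREFIX = "mmfile:"            # hypothetical ref prefix


def feature_asset_dir(run_id, root=OUTPUT_ROOT):
    return root / f"run_{run_id}" / "assets" / FEATURE_SUBDIR


def mm_feature_path(run_id, name, root=OUTPUT_ROOT):
    """Resolve a feature file path and refuse anything outside the feature dir."""
    base = feature_asset_dir(run_id, root).resolve()
    candidate = (base / name).resolve()
    # Traversal guard: the resolved path must stay strictly under `base`.
    if candidate == base or not candidate.is_relative_to(base):  # Python 3.9+
        raise ValueError(f"feature name escapes the asset dir: {name!r}")
    return candidate


def mmfile_ref(fingerprint, mm_hash):
    return f"{MMFILE_PREFIX}{fingerprint}:{mm_hash}"


def split_mmfile_ref(ref):
    if not ref.startswith(MMFILE_PREFIX):
        raise ValueError(f"not an mmfile ref: {ref!r}")
    fingerprint, sep, mm_hash = ref[len(MMFILE_PREFIX):].partition(":")
    if not (sep and fingerprint and mm_hash):
        raise ValueError(f"malformed mmfile ref: {ref!r}")
    return fingerprint, mm_hash


path = mm_feature_path("r1", "abc.msgpack")
try:
    mm_feature_path("r1", "../../etc/passwd")
    traversal_blocked = False
except ValueError:
    traversal_blocked = True
round_trip = split_mmfile_ref(mmfile_ref("fp1", "abc"))
```

Keeping the emitter and the parser next to each other, as the commit does, means a format change has to touch both in the same diff.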

- _build_qwen_vl_features writes processed vLLM features to mm_store and ships mmfile refs; import the format/paths from mm_store (no local copies).
- Collapse RENDERERS_MM_FEATURE_STORE_MODE to off/on, default on (deleted the never-differentiated disk-write-through/disk-read-nonstrict/disk-strict ladder; the latter two emitted identical refs).
- _existing_mm_feature_valid now also checks placeholder_length: vLLM validates it on load but the envelope match did not, so a stale wrong-length artifact would fail in vLLM and never self-repair (we kept skipping the rewrite). Mismatch -> treat as invalid -> rewrite.

Co-Authored-By: Codex <codex@openai.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
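The self-repair behavior from the last bullet can be illustrated with a small sketch. The envelope field names, the in-memory `store` dict, and both helper functions are hypothetical; the real check is `_existing_mm_feature_valid` operating on msgpack files.

```python
def envelope_matches(envelope, *, fingerprint, mm_hash, placeholder_length):
    """An artifact is reusable only if every validated field agrees."""
    return (
        envelope.get("fingerprint") == fingerprint
        and envelope.get("mm_hash") == mm_hash
        # Without this check a wrong-length artifact would be reused forever.
        and envelope.get("placeholder_length") == placeholder_length
    )


def ensure_feature(store, key, expected, build):
    existing = store.get(key)
    if existing is not None and envelope_matches(existing, **expected):
        return "reused"
    store[key] = build()  # mismatch -> treat as invalid -> rewrite
    return "rewritten"


expected = {"fingerprint": "fp1", "mm_hash": "abc", "placeholder_length": 64}
# Stale artifact: same fingerprint and hash, wrong placeholder length.
store = {"abc": {"fingerprint": "fp1", "mm_hash": "abc", "placeholder_length": 32}}

first = ensure_feature(store, "abc", expected, lambda: dict(expected))
second = ensure_feature(store, "abc", expected, lambda: dict(expected))
```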

sweep_stale_artifacts now evicts only assets/mm_features (the expensive processed MultiModalKwargsItem payloads). assets/images are never swept: screenshots are terminal browser output with no regeneration path, so they are kept for the whole run as the recoverable source of truth, whereas features are a regenerable cache (the trainer rebuilds pixels from the image and never reads these files; the env-worker rewrites any missing feature on demand). Over-eviction of a feature is therefore safe; over-eviction of an image is not, which is why the sweep deliberately excludes the image subdir.

The feature writer (_write_mm_feature_artifact) now refreshes mtime on the already-on-disk-and-valid path, so a recurring feature is treated as hot by the last-use sweep instead of aging out on its first-write mtime and forcing an expensive force_full_pixels reprocess.

Test updated to the features-only / keep-images contract.

Co-Authored-By: Codex <codex@openai.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
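A minimal sketch of a last-use mtime sweep over the feature directory only, with the mtime refresh on a cache hit. The helper names, the TTL value, and the file names are assumptions for the example; the real `sweep_stale_artifacts` and `_write_mm_feature_artifact` are not shown in this PR text.

```python
import os
import tempfile
import time
from pathlib import Path


def sweep_stale_features(feature_dir, ttl_seconds, now=None):
    """Delete feature files whose mtime is older than the TTL; return their names."""
    now = time.time() if now is None else now
    removed = []
    for entry in Path(feature_dir).iterdir():
        if entry.is_file() and now - entry.stat().st_mtime > ttl_seconds:
            entry.unlink(missing_ok=True)  # tolerate a concurrent delete
            removed.append(entry.name)
    return sorted(removed)


def touch_on_hit(path):
    """Refresh mtime so a reused feature counts as recently used."""
    os.utime(path, None)


with tempfile.TemporaryDirectory() as tmp:
    features = Path(tmp) / "mm_features"
    images = Path(tmp) / "images"
    features.mkdir()
    images.mkdir()

    old = time.time() - 3600
    for path in (features / "hot.msgpack", features / "cold.msgpack",
                 images / "shot.png"):
        path.write_bytes(b"x")
        os.utime(path, (old, old))  # pretend everything was written an hour ago

    touch_on_hit(features / "hot.msgpack")  # a valid on-disk hit refreshes mtime
    removed = sweep_stale_features(features, ttl_seconds=600)
    hot_survived = (features / "hot.msgpack").exists()
    image_survived = (images / "shot.png").exists()  # image dir is never swept
```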

- client.py: remove the inline_count/mmfile_count debug counters and the "built qwen-vl mm features ..." debug log (the mode=="off" inline path itself is unchanged).
- mm_store.py: fold the redundant mm_feature_run_root alias into feature_asset_dir (internal-only, no external importers).

Pure cleanup; no behavior change.

Co-Authored-By: Codex <codex@openai.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Env workers emit layout-only descriptors + mmraw refs instead of running the HF image processor; vLLM materializes pixels from the raw image on shared disk (hash + fingerprint + grid/placeholder validated). Avoids AutoProcessor and pixel_values on the env worker, cutting RSS. Co-Authored-By: Codex <codex@openai.com> Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…nfig.json

The raw-image (mmraw) layout path resolves the model's image geometry from preprocessor_config.json, but _load_preprocessor_config_json only checked local paths and the local HF cache (try_to_load_from_cache). Hosted env workers render models they never loaded locally, so hub-style ids always missed the cache and every image rollout failed with

RuntimeError: Qwen raw image layout could not find preprocessor_config.json for 'Qwen/Qwen3.6-35B-A3B' ...

even when the file is publicly available on the Hub.

Add an hf_hub_download fallback on cache miss (a few hundred bytes, lands in the HF cache, then memoized by the lru_cache). Offline/no-network workers fall through to the existing RuntimeError, whose message now also mentions Hub reachability alongside the explicit image_* config escape hatch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
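The cache-then-Hub fallback can be sketched as follows. This is not the PR's `_load_preprocessor_config_json`: the factory function, the injected callables, and the error text are assumptions for illustration. In real use the two callables would presumably be `huggingface_hub.try_to_load_from_cache` and `huggingface_hub.hf_hub_download`; they are injected here so the sketch runs without network access.

```python
import json
import os
import tempfile
from functools import lru_cache

CONFIG_NAME = "preprocessor_config.json"


def make_preprocessor_config_loader(cache_lookup, hub_download):
    """Build a memoized loader: local cache first, Hub download on a miss."""

    @lru_cache(maxsize=None)
    def load(model_id):
        path = cache_lookup(model_id, CONFIG_NAME)
        if not isinstance(path, str):  # None or a "known missing" sentinel
            try:
                path = hub_download(model_id, CONFIG_NAME)
            except Exception as exc:  # offline or unreachable Hub
                raise RuntimeError(
                    f"could not find {CONFIG_NAME} for {model_id!r}; check Hub "
                    "reachability or set the image_* config explicitly"
                ) from exc
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)

    return load


download_calls = []
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = os.path.join(tmp, CONFIG_NAME)
    with open(cfg_path, "w", encoding="utf-8") as fh:
        json.dump({"patch_size": 16}, fh)

    def fake_cache_lookup(repo_id, filename):
        return None  # simulate a worker that never loaded the model locally

    def fake_hub_download(repo_id, filename):
        download_calls.append(repo_id)
        return cfg_path

    def offline_download(repo_id, filename):
        raise OSError("no network")

    load = make_preprocessor_config_loader(fake_cache_lookup, fake_hub_download)
    first = load("org/model")
    second = load("org/model")  # served from lru_cache, no second download

    offline = make_preprocessor_config_loader(fake_cache_lookup, offline_download)
    try:
        offline("org/model")
        offline_error = None
    except RuntimeError as exc:
        offline_error = str(exc)
```

Note that `lru_cache` does not memoize exceptions, so a worker that regains network access can succeed on a later call.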

hubert-marek · 2026-06-09T20:55:06Z

Superseded by #82 — same branch, correct base (feat/ephemeral-mm-pixels); this one diffs the whole feat branch against main.

eligotts and others added 9 commits May 27, 2026 01:59

Merge remote-tracking branch 'origin/main' into feat/ephemeral-mm-pixels

f7696cd

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

hubert-marek closed this Jun 9, 2026


hubert-marek wants to merge 9 commits into main from fix/mmraw-hub-download-fallback

3 participants
