Skip to content

perf(rust): cache parsed VNode subtrees for unchanged loop items (#1970, flag-gated)#1973

Merged
johnrtipton merged 4 commits into
mainfrom
perf/parsed-subtree-cache-1970
Jun 27, 2026
Merged

perf(rust): cache parsed VNode subtrees for unchanged loop items (#1970, flag-gated)#1973
johnrtipton merged 4 commits into
mainfrom
perf/parsed-subtree-cache-1970

Conversation

@johnrtipton

Copy link
Copy Markdown
Contributor

Closes #1970.

Extends the per-item loop render cache (#1967/PR #1969) to also cache the parsed VNode subtree, so a reorder of unchanged-content keyed list items skips both render AND html5ever-parse. The render cache cut the render phase, but parse + VDOM-build are ~60% of render_with_diff — the bigger half #1969 couldn't reach (its render-only end-to-end win was Amdahl-bounded to ~6-11%).

Design as built

A second cache + a placeholder-splice with a full-reparse fallback:

  • LoopRenderCache (crates/djust_templates/src/loop_cache.rs) gains a second map (content-hash u64 → parsed Vec<VNode>) keyed by the same content-hash as the render fragments, plus a per-render item manifest. Same LIVEVIEW_CONFIG['loop_render_cache_enabled'] flag, same two cacheability gates (position-independent + item-only body). When off, byte-identical to before.
  • The Node::For arm, for a parse-cache HIT on a foster-parenting-safe item, emits a tiny <dj-pc h=...> placeholder instead of the item HTML — so the assembled string html5ever parses is a short reduced form. render_with_diff / render_binary_diff parse the reduced HTML, splice the cached subtrees back into the placeholders (djust_vdom::splice_loop_placeholders), then re-assign every dj-id.

The dj-id hazard + the strategy (the crux)

dj-ids are assigned purely positionally, pre-order by the parser. A cached subtree carries ids baked at its original parse position, so reusing it verbatim elsewhere duplicates ids — empirically [0,1,2,3,4,1,2] for a 2-of-3 identical-content list vs fresh [0,1,2,3,4,5,6]. Therefore the splice re-walks the assembled tree pre-order, re-assigning ids from the same counter base the full parse would use (0 for initial parse_html; max(old_ids)+1 for continuing parse_html_continue, after the #1550/#1552 ensure_id_counter_at_least bump). This reproduces a fresh full-parse's ids byte-for-byte (proven for initial [0..6] and continuing [7..d]), so the assembled VDOM, every patch (Insert/Replace embed the new node), and last_vdom are identical to the cache-OFF path.

Why a placeholder (not container-location)

The renderer emits a flat HTML string with no enclosing-element context, so it cannot tell djust_live where loop items sit in the tree. Emitting a placeholder lets html5ever place it exactly where the item was — no container-location logic. The catch: html5ever foster-parents non-table content out of <table>/<select> (a <dj-pc> there destroys structure — empirically confirmed). So the For arm gates on the item's own rendered root tag: foster-unsafe roots (tr/td/th/tbody/thead/tfoot/caption/colgroup/col/option/optgroup) skip the parse cache. Foster-unsafe containers, multi-root items, and any splice anomaly (placeholder cache miss / found-count mismatch / a residual <dj-pc>) fall back to a full parse — always correct, just no parse win that render.

Wire determinism

VNode.attrs now serialize in sorted key order (serialize_attrs_sorted). A plain HashMap serializes in nondeterministic bucket order; the parse-cache path (which assembles a node via a different parse than the cache-OFF full parse) would otherwise surface that as an ON-vs-OFF patch-JSON diff. Sorting matches to_html's already-sorted attribute output and makes the patch wire format deterministic (in the spirit of the #1448 wire-protocol pinning).

Both render_with_diff and render_binary_diff are wired symmetrically (parallel-path-drift cure, #1646).

Correctness proof

  • Byte-identity (html + patches + version) cache ON == OFF across plain / keyed / dj-if / cycle / nested / tuple / div / table / select / multi-root templates × initial/reorder/change/append/remove, for both render_with_diff and render_binary_diff. The dj-key reorder round-trip (post-diff dj-ids/dj-keys match cache-off exactly) is the load-bearing case (TestParseCacheByteIdentity1970).
  • Parse-count probe (TestParseCountProbe1970, via loop_parse_cache_hits()/loop_parse_cache_misses()): a reorder of N unchanged keyed items → N parse hits, 0 re-parses; an append re-parses only the new item.
  • Gate-off (tech-debt: implementer-subagent prompt should mandate gate-the-change-off tautology self-test before reporting tests passing #1468): neutering the dj-id re-walk fails 6 byte-identity tests — the re-walk is load-bearing, the tests are non-tautological.
  • New tests: crates/djust_vdom/tests/test_loop_parse_cache_1970.rs (splice + re-walk + gate-off, 5), the parse_cache_1970 module in crates/djust_templates/tests/test_loop_render_cache_1967.rs (foster-safe gate + placeholder + manifest, 8), TestParseCacheByteIdentity1970 / TestParseCountProbe1970 in python/djust/tests/test_loop_render_cache_1967.py (16).

Per-phase bench (median over 60 distinct shuffles, render_with_diff, ON vs OFF)

N parse OFF→ON total OFF→ON
50 0.145 → 0.113 ms (-21.9%) 0.430 → 0.339 ms (-21.3%)
500 1.394 → 1.159 ms (-16.8%) 4.037 → 3.399 ms (-15.8%)

Beats #1969's render-only ~6-11% by also cutting the parse phase (the render phase drops too — the #1969 cache compounds). Richer item bodies widen the win.

Suite results

  • Rust: workspace 860 passed / 0 failed; djust_live 37 passed / 0 failed.
  • Python (all roots): 8661 passed, 21 skipped, 0 failed.
  • cargo fmt --check clean; cargo clippy clean on the three crates.

🤖 Generated with Claude Code

https://claude.ai/code/session_01D4VsbRKefaPjn6Rxs9ig1A

johnrtipton and others added 2 commits June 26, 2026 22:50
…, flag-gated)

Extends the per-item loop RENDER cache (#1967/#1969) to ALSO cache the
PARSED VNode subtree, so a reorder of unchanged-content keyed list items
skips BOTH render AND html5ever-parse — the parse phase is ~60% of
render_with_diff, the bigger half the render cache could not reach.

Design (placeholder + foster-safe gate + post-parse splice with
full-reparse fallback):
- LoopRenderCache gains a SECOND map (content-hash u64 -> parsed Vec<VNode>)
  keyed by the SAME hash as the render fragments, plus a per-render item
  manifest. Same loop_render_cache_enabled flag, same two cacheability gates.
- The For arm, for a parse-cache HIT on a foster-SAFE item (item root not a
  table/select-family tag), emits a tiny <dj-pc h=..> placeholder instead of
  the item HTML; render_with_diff/render_binary_diff parse the REDUCED html
  (cheap), splice the cached subtrees back in, then re-assign ALL dj-ids by a
  pre-order re-walk so the assembled tree is byte-identical to a full parse.
- dj-id strategy: dj-ids are purely positional (parser assigns pre-order), so
  a cached subtree's baked ids are position-WRONG on reuse (proven: naive reuse
  -> [0,1,2,3,4,1,2] duplicates). The re-walk from the same counter base the
  full parse uses reproduces fresh-parse ids exactly (initial + continuing).
- Foster-unsafe containers (table/select), multi-root items, and any splice
  anomaly (cache miss / count mismatch / residual placeholder) fall back to a
  full parse — always correct, no parse win for that render.
- VNode.attrs now serialize in SORTED key order so the patch wire format is
  deterministic (a HashMap serializes in nondeterministic bucket order, which
  the parse-cache path would otherwise expose as an ON-vs-OFF patch diff).

Correctness (byte-identity cache ON == OFF, html + patches + version) proven
across plain/keyed/dj-if/cycle/nested/tuple/div/table/select/multi-root
templates and initial/reorder/change/append/remove ops, for BOTH
render_with_diff and render_binary_diff. The dj-key reorder round-trip
(post-diff dj-ids/dj-keys match cache-off exactly) is the load-bearing case.
Gate-off (#1468) confirmed: neutering the re-walk fails 6 byte-identity tests.

Parse-count probe (TestParseCountProbe1970): a reorder of N unchanged keyed
items -> N parse hits, 0 re-parses; an append re-parses only the new item.

Per-phase bench (median over 60 distinct shuffles, render_with_diff):
  N=50:  parse 0.145->0.113ms (-21.9%), total 0.430->0.339ms (-21.3%)
  N=500: parse 1.394->1.159ms (-16.8%), total 4.037->3.399ms (-15.8%)
beating #1969's render-only ~6-11% end-to-end win.

New tests: crates/djust_vdom/tests/test_loop_parse_cache_1970.rs (splice +
dj-id re-walk + gate-off, 5 cases); the parse_cache_1970 module in
crates/djust_templates/tests/test_loop_render_cache_1967.rs (foster-safe gate
+ placeholder + manifest lifecycle, 8 cases); TestParseCacheByteIdentity1970 /
TestParseCountProbe1970 in python/djust/tests/test_loop_render_cache_1967.py
(end-to-end byte-identity + parse-count probe + gate-off, 16 cases).

Closes #1970.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01D4VsbRKefaPjn6Rxs9ig1A
…1970)

- CHANGELOG [Unreleased] ### Performance: the #1970 parsed-subtree cache
  (design, dj-id re-walk strategy, foster-safe gate, sorted-attrs wire
  determinism, per-phase bench, byte-identity + parse-count + gate-off proof).
- ROADMAP: v1.0.8-3 milestone block (the parse-phase half of #1967/#1969).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01D4VsbRKefaPjn6Rxs9ig1A
@github-actions github-actions Bot added the security-review PR touches security-sensitive code and needs additional review label Jun 27, 2026
@github-actions

Copy link
Copy Markdown

Security Review Required

This PR modifies security-sensitive files. Please ensure the following areas receive additional review:

  • Template Rendering / XSS Prevention: Rust template engine, VDOM, template tags

Review Checklist

  • Changes do not introduce XSS vulnerabilities (no mark_safe(f'...'))
  • Authentication/authorization checks are not weakened
  • Input validation is maintained or improved
  • No secrets or sensitive data exposed in state
  • Existing security tests still pass
  • New security tests added for changed behavior

See SECURITY_GUIDELINES.md for full security review criteria.

@johnrtipton

Copy link
Copy Markdown
Contributor Author

Stage 11 — Adversarial Code Review: REQUEST_CHANGES 🔴

Reviewed in an isolated worktree against PR head 4b249786 (0 commits behind origin/main, not stale). Built the real PyO3 extension (maturin develop --release, verified djust._rust resolves into the review worktree) and ran independent empirical probes — not just the PR's own suite.

The mechanism is well-engineered and the gate-off proof is genuine, but I found one 🔴 byte-identity violation that breaks the PR's central claim (cache ON == OFF) and falsifies the "any anomaly falls back to a full parse — always correct" guarantee.


🔴 MUST-FIX: a user <dj-pc> element (via |safe/mark_safe) collides with the placeholder scheme → ON ≠ OFF, and the "always-correct" fallback parses corrupted HTML

The placeholder protocol is matched by tag name / raw string with no way to distinguish a framework-emitted placeholder from a <dj-pc> element that appears in user content. dj-pc being a custom element name prevents collision with the HTML namespace (as the doc comment argues) — but it does not prevent collision with literal <dj-pc ...> injected through |safe/mark_safe (CMS/rich-text fields, markdown output, any raw-HTML loop content). Auto-escaping saves the default path (<&lt;), so this only bites raw-HTML rendering — but that is a supported, common djust pattern.

Minimal reproducer (run against the built extension):

from djust._rust import RustLiveView
SRC = '<ul>{% for x in xs %}<li dj-key="{{ x.id }}">{{ x.name|safe }}</li>{% endfor %}</ul>'
def render(enabled):
    lv = RustLiveView(SRC); lv.set_loop_render_cache_enabled(enabled)
    lv.update_state({"xs":[{"id":1,"name":"a"},{"id":2,"name":"b"}]}); lv.render_with_diff()
    lv.set_changed_keys(["xs"])
    # item 1 unchanged -> emits a placeholder; item 2 changes to inject a literal <dj-pc>
    lv.update_state({"xs":[{"id":1,"name":"a"},{"id":2,"name":'<dj-pc h="ffff"></dj-pc>X'}]})
    return lv.render_with_diff()[0]
print("ON :", render(True))
print("OFF:", render(False))
ON : <ul dj-id="0"><li dj-id="1" dj-key="1">a</li><li dj-id="2" dj-key="2">X</li></ul>
OFF: <ul dj-id="0"><li dj-id="1" dj-key="1">a</li><li dj-id="2" dj-key="2"><dj-pc dj-id="6" h="ffff"></dj-pc>X</li></ul>

With the cache ON the user's <dj-pc> is silently stripped; with it OFF it's preserved as a real element. Byte-identity violated (HTML, patches, and the version-bumping diff all diverge).

Trigger boundary (empirically bisected):

  • DIVERGES: reduced HTML contains a literal <dj-pc ...></dj-pc> (open tag with a space + a matching </dj-pc>) from |safe content, and ≥1 other item in the same loop emitted a real placeholder this render (so has_loop_placeholders == true).
  • OK: auto-escaped content (default); <dj-pc> with no space; open-only <dj-pc h=..> with no close tag; a single-item list with no co-occurring placeholder (the splice path is skipped). Confirmed on both render_with_diff and render_binary_diff.

Why the "always-correct fallback" doesn't save it (root-caused): in the diverging case try_parse_cache_splice returns None correctly (the injected h="ffff" misses the subtree map → splice_loop_placeholders errors), so the code falls back to parse_full(&html). But html is reconstruct_full_loop_html(reduced, manifest) (crates/djust_live/src/lib.rs:1195), which replaces <dj-pc ...></dj-pc> substrings in document order against the manifest's placeholder entries and, when entries are exhausted, drops the extra <dj-pc> entirely (if let Some(entry) = ph_iter.next()None → nothing emitted). So the fallback parses an already-corrupted string. The validation guard (found != expected || tree_has_placeholder) is downstream of this and never gets correct input. (I also observed parse_hits get bumped to 2 in the colliding render — get_parsed runs before the splice errors — so the probe also over-counts hits in this case.)

Severity: flag is default-OFF, so not a default-path regression; but when enabled it is a real correctness bug for raw-HTML loop content, and the corruption is non-local (the same |safe item renders fine or gets mangled depending on whether a sibling item happened to be unchanged that render). Note also content_hash uses DefaultHasher::new() (fixed-key SipHash, deterministic — crates/djust_templates/src/loop_cache.rs:397), so the h value for any item is predictable; crafted <dj-pc h="<known-hash>"> user content could force a different cached item's subtree to be spliced into that position (content-confusion), not just a drop.

Suggested fixes (any one closes it):

  1. Make the sentinel unforgeable: emit a per-render random nonce in the tag/attr (<dj-pc-<nonce> ...> or data-djn="<nonce>") and only treat elements bearing the current render's nonce as placeholders. User content can't predict it.
  2. Detect collision and fall back on the un-reconstructed string: if the count of <dj-pc occurrences in reduced_html ≠ the manifest placeholder count, abandon the parse-cache path entirely and re-render/parse without placeholders for this render (don't run reconstruct_full_loop_html on a mismatched string).
  3. Add the gate that's already missing: refuse parse-cache eligibility for any item whose rendered HTML itself contains the literal placeholder tag (item_html.contains("<dj-pc")), AND have reconstruct/splice bail to a true full re-render when the parsed <dj-pc> count exceeds the manifest.

Please add a regression test built exactly like the reproducer (a |safe item rendering <dj-pc ...> alongside an unchanged sibling) asserting ON == OFF.


What I independently verified (empirical)

Gate-off (#1468) — the dj-id re-walk IS load-bearing. I neutered rewalk_djust_ids (crates/djust_vdom/src/lib.rs:282) to an immediate return, rebuilt, and re-ran:

  • My own ON-vs-OFF battery: 36 byte-identity checks went RED across 12 scenarios (plain/keyed/div/multiattr/nested-children/div-root-contains-table/attr-order, both render_with_diff and render_binary_diff). The divergence is exactly the predicted stale-baked-id reuse (e.g. appended item gets k/8 instead of M/c).
  • The PR's own python/djust/tests/test_loop_render_cache_1967.py: 8 failed under gate-off (PR body says "6" — the re-walk is more load-bearing than claimed; harmless doc under-count).
  • Restored, rebuilt, re-ran clean. git status empty, HEAD back at 4b249786.

Independent byte-identity probe (cache load-bearing, not trivially-true). Confirmed the parse cache actually fires (keyed reorder = 3 parse hits / 0 misses, single MoveChild patch, dj-ids 3,1,2 matching OFF exactly). My battery passed ON==OFF (html + patches + version) on: plain, keyed, div, multiattr, nested-children, dj-if, table, select, tuple, single-item, empty→refill, reorder+change-same-render, big repeat-reorder, long 8-shuffle sequence, outer-context {{ prefix }}-{{ x.name }} with prefix A→B→C (the #1969 bug class — correctly served fresh, not stale), auto-escaped <dj-pc in data (escaping defuses it), nested foster-unsafe <div>-root containing <table>, void elements, escaped attrs, whitespace/text between items, nested loops, optgroup. Only the |safe-injected real <dj-pc> case failed (the 🔴 above).

Foster-safety / fallback. Table & select reorders → 0 parse hits / 0 misses (foster-unsafe gate disables the cache) and produce correct output (incl. html5ever <tbody> injection) byte-identical to OFF.

Wire determinism (serialize_attrs_sorted). crates/djust_vdom/tests/wire_protocol_snapshot.rs is green (those snapshots use empty attrs, so unaffected); the full suite below is green, so the global attr-sort shifted no other wire/snapshot test. Attr-order-varied template renders ON==OFF.

Suites reproduced green in my build (matching PR claims):

  • Rust cargo test --workspace --exclude djust_live: 860 passed, 0 failed.
  • Rust cargo test -p djust_live --no-default-features: 37 passed, 0 failed.
  • Python pytest tests/ python/tests/ python/djust/tests/ -n auto: 8661 passed, 21 skipped, 0 failed.
  • cargo fmt --check: clean. cargo clippy -p djust_vdom -p djust_templates -p djust_live: clean.

PASS list

  • dj-id re-walk load-bearing (gate-off RED, 36 + 8 failures) ✅
  • keyed dj-key reorder round-trip byte-identical, parse cache engages (3 hits) ✅
  • outer-context staleness (perf(rust): keyed loop-item render cache for large-list render_with_diff (#1967, flag-gated) #1969 class) correctly NOT served stale ✅
  • foster-unsafe (table/select/optgroup) gate + full-parse fallback correct ✅
  • nested foster-unsafe inside cacheable <div> root correct ✅
  • default-OFF inert; auto-escaped <dj-pc in data harmless ✅
  • serialize_attrs_sorted breaks no existing wire/snapshot test ✅
  • parse-count probe (reorder N hits/0; append re-parses only new) ✅
  • both render_with_diff and render_binary_diff wired symmetrically ✅
  • all three suites green; fmt + clippy clean ✅

Verdict: REQUEST_CHANGES — close the <dj-pc> user-content collision (🔴) with a regression test, then this is a clean approve.

johnrtipton and others added 2 commits June 26, 2026 23:25
…e by |safe content (#1970)

Adversarial-review 🔴: a loop item rendering a LITERAL unescaped `<dj-pc ...>`
element via `|safe`/`mark_safe`, alongside a sibling that emitted a real
placeholder that same render, broke byte-identity — cache-ON stripped the
user's `<dj-pc>` (and foster-relocated following content) while cache-OFF
preserved it. Root cause: `reconstruct_full_loop_html` matched the bare
`<dj-pc ...>` substring, so user content was indistinguishable from the
framework sentinel; when manifest entries were exhausted it DROPPED the extra
`<dj-pc>`, and the full-parse fallback then parsed an already-corrupted string.
Aggravator: a fixed-key `DefaultHasher` made the `h=` hash predictable, so
crafted `<dj-pc h="<known-hash>">` content could splice a DIFFERENT cached
item's subtree into that position (content-confusion).

Fix (structural): the placeholder sentinel tag now carries a per-render random
nonce — `dj-pc-<nonce_hex>` instead of bare `dj-pc`. Generated fresh every
render in `LoopRenderCache::begin_render` (RandomState-seeded, per-render,
unpredictable across requests). The For arm emits `<dj-pc-<nonce> h=..>`;
reconstruct + splice match ONLY the current render's nonce tag, so a literal
`<dj-pc>` in user content (no nonce) is never matched -> preserved verbatim, no
hijack. Defense-in-depth: parse-cache eligibility additionally refuses any item
whose rendered HTML contains the literal sentinel prefix
(`item_html_is_foster_safe` -> `contains_sentinel_prefix`), and the splice's
residual-sentinel guard now checks the `dj-pc` PREFIX (`tree_contains_tag_prefix`)
so a stray sentinel of any nonce forces the full-parse fallback. html5ever
keeps `dj-pc-<hex>` intact (valid custom-element name; hex nonce already
lowercase) — verified.

Reproduce-first: the exact review reproducer is RED before this commit (user's
`<dj-pc>` stripped) and GREEN after (byte-identical). Gate-off (#1468):
neutering the nonce to the bare prefix makes the 3 sentinel-collision tests
RED — the nonce is load-bearing. The crafted-`h=` hijack test confirms no
content-confusion. The reorder parse-cache win is unchanged (REORDER still
N parse hits / 0 re-parses); the full byte-identity battery, parse-count
probe, and the Rust (863+37) + Python (8664) suites stay green.

New tests: `TestParseCacheSentinelCollision1970` (|safe literal + crafted-h=
hijack, render_with_diff + render_binary_diff) in
`python/djust/tests/test_loop_render_cache_1967.py`;
`item_with_literal_sentinel_is_ineligible` + the nonce-tag placeholder assertion
in `crates/djust_templates/tests/test_loop_render_cache_1967.rs`;
`literal_unnonced_dj_pc_is_not_spliced` + `gate_off_bare_prefix_would_match_user_content`
in `crates/djust_vdom/tests/test_loop_parse_cache_1970.rs`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01D4VsbRKefaPjn6Rxs9ig1A
Document the adversarial-review 🔴 fix in the #1970 Performance entry: the
placeholder sentinel now carries a per-render random nonce (`dj-pc-<nonce>`) so
a `|safe`/`mark_safe` item rendering a literal `<dj-pc>` can't be mistaken for a
placeholder or hijack a cached subtree via a crafted `h=`. Adds the new
sentinel-collision test classes to the test list + the bare-prefix gate-off.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01D4VsbRKefaPjn6Rxs9ig1A
@johnrtipton

Copy link
Copy Markdown
Contributor Author

Stage 12 — Re-Review: APPROVE ✅

Re-review of the FIX (commits fb9e723b + 91ad6ac8) for the 🔴 I raised on the first pass. Built + probed in an isolated worktree against the rebuilt PyO3 extension; branch is 0 commits behind origin/main (not stale). Every claim below is empirical from my own probes, not the implementer's self-report.

✅ The original 🔴 is CLOSED

Ran the EXACT prior reproducer against the rebuilt extension (a |safe item rendering a literal <dj-pc h="ffff"></dj-pc>X alongside a real cached sibling):

=== render_with_diff ===
ON : <ul dj-id="0">…<li dj-id="2" dj-key="2"><dj-pc dj-id="6" h="ffff"></dj-pc>X</li></ul>
OFF: <ul dj-id="0">…<li dj-id="2" dj-key="2"><dj-pc dj-id="6" h="ffff"></dj-pc>X</li></ul>
BYTE-IDENTICAL: True
=== render_binary_diff ===
BINARY-IDENTICAL: True

The user's literal <dj-pc> is now preserved ON==OFF — both render_with_diff and render_binary_diff. Confirmed the cache genuinely engages (not silently disabled): a 4-item render with 3 unchanged siblings reports parse_hits=3 parse_misses=0 on the 2nd render while still preserving the literal sentinel in the changed item.

✅ Nonce gate-off (#1468) — the nonce is LOAD-BEARING, tests are NOT tautological

Neutered placeholder_tag(nonce) → bare dj-pc prefix (the pre-fix shape) at crates/djust_templates/src/loop_cache.rs:436, rebuilt, re-ran:

  • My reproducer went RED: ON strips the user's <dj-pc h="ffff"><li>X</li>; OFF preserves it → BYTE-IDENTICAL: False (and BINARY-IDENTICAL: False).
  • The 3 integration collision tests went RED: TestParseCacheSentinelCollision1970::test_literal_dj_pc_via_safe_is_byte_identical[True]/[False] and ::test_crafted_hash_cannot_hijack_a_cached_subtree (python/djust/tests/test_loop_render_cache_1967.py:363,383).
  • Restored the nonce, rebuilt → all 3 GREEN again.

Notable nuance the gate-off surfaced: the contains_sentinel_prefix eligibility gate (loop_cache.rs:511) is genuine belt-and-braces but does not by itself close the hole — with the nonce neutered, the now-ineligible item renders its literal <dj-pc> into reduced_html, and reconstruct_full_loop_html strips it. The per-render nonce is the actual load-bearing protection. That's the right design; just flagging that the defense-in-depth gate is not independently sufficient.

✅ Probed for NEW holes the nonce could open — none found

  • Forged nonce'd tag (<dj-pc-deadbeef h="ffff"> in |safe content): preserved verbatim ON==OFF — the attacker can't guess the per-render nonce.
  • Nonce reuse across renders: no divergence over repeated renders; next_render_nonce() (loop_cache.rs:453) draws a fresh RandomState::new() SipHash key (OS entropy) + counter + nanotime per render. It is NOT seeded from item content/hash/any client-observable value.
  • Nonce never leaks to the client: over 20 reorder renders, no dj-pc-<hex> sentinel ever appeared in output — reconstruction is fully server-side, so the "echo the nonce back" attack is structurally impossible.
  • html5ever round-trip: a full dj-key reorder of 5 items → parse_hits=5 parse_misses=0, byte-identical ON==OFF including the dj-id re-walk (dj-id re-walk confirmed load-bearing). html5ever keeps dj-pc-<hex> intact (no lowercase/hyphen mangling).
  • Residual-sentinel guard (tree_contains_tag_prefix, crates/djust_vdom/src/lib.rs:319): a |safe item rendering <dj-pc-cafe h="1"> is preserved verbatim ON==OFF.

🟡 Minor (perf-only, not blocking): contains_sentinel_prefix matches <dj-pc, and tree_contains_tag_prefix matches the dj-pc prefix — so a legitimate user custom element whose tag starts with dj-pc (e.g. <dj-pcard>, <dj-pc-widget>) is refused parse-cache eligibility. This is correctness-safe (verified byte-identical ON==OFF for both) — it just falls back to a normal render. Such element names are vanishingly rare; no change required, noting for completeness.

✅ Prior PASS list re-verified after the fix touched the hot path

  • Outer-context staleness (perf(rust): keyed loop-item render cache for large-list render_with_diff (#1967, flag-gated) #1969 class): suffix v1→v2 with unchanged item bindings correctly re-renders both items (a-v2,b-v2); no stale serve. ON==OFF.
  • Foster gate + fallback: <tr>-in-table reorder byte-identical ON==OFF with dj-id re-walk; <tr> items correctly NOT parse-cached.
  • Default-OFF inert: loop_render_cache_enabled() defaults to False.
  • Reorder win unchanged: 5 hits / 0 misses (matches the claim).

✅ Suites + lint

  • Rust: 863 passed (cargo test --workspace --exclude djust_live) + 37 passed (cargo test -p djust_live --no-default-features), 0 failed — matches the claim exactly.
  • Python: 8664 passed, 21 skipped, 0 failed (pytest tests/ python/tests/ python/djust/tests/ -n auto) — matches the claim exactly.
  • cargo fmt --check clean; clippy on the 3 touched crates clean.
  • WS-auth flake confirmed pre-existing/unrelated: test_mount_batch_with_login_view_does_not_close_shared_socket passed both in the full -n auto run AND in isolation; it lives in python/djust/tests/test_ws_auth_close_socket.py and is untouched by this PR's diff (Rust loop-cache only).

Worktree-isolated throughout; restored the shared .venv editable install to its prior target and confirmed import djust._rust resolves correctly; never touched git config (core.bare = false).

Verdict: APPROVE ✅ — the 🔴 is empirically closed by a load-bearing per-render nonce; no new hole opened.

@johnrtipton johnrtipton merged commit 0116ea4 into main Jun 27, 2026
15 checks passed
@johnrtipton johnrtipton deleted the perf/parsed-subtree-cache-1970 branch June 27, 2026 03:40
@johnrtipton

Copy link
Copy Markdown
Contributor Author

Retrospective — PR #1973 (pipeline-drain)

Task: v1.0.8-3 — drained #1970 (parse-phase loop render cache, the bigger half of the #1967/#1969 lever). Caches parsed VNode subtrees per item (content-hash → Vec<VNode>, same key + flag + two gates as #1969's render cache) so a reorder of unchanged keyed items skips both render AND html5ever-parse. Flag-gated default-OFF.

Quality: 4/5

Correct, well-tested, flag-gated, real end-to-end win (~16–21% reorder render_with_diff, beating #1969's render-only ~6–11% by also cutting the ~60% parse phase) — but, exactly as in #1969, the first pass shipped a real 🔴 the mandatory adversarial hot-path review caught.

What went well

  • The mandatory worktree-isolated adversarial review earned its keep again (2nd PR running): it broke output-identity (a literal <dj-pc> injected via |safe/mark_safe, with a sibling emitting a real placeholder, was silently stripped cache-ON vs preserved cache-OFF), root-caused it to reconstruct_full_loop_html dropping extra placeholders, and flagged the predictable-content_hash → forgeable-h= content-confusion aggravator. The implementer's own green battery + a pre-emptive "this case is clean" attestation had both missed it.
  • The fix was structural, not a patch: a per-render random nonce on the sentinel tag (dj-pc-<nonce>) made the placeholder unforgeable by user content — closing the strip AND the crafted-h= hijack in one change — plus defense-in-depth eligibility/residual guards. Re-review gate-off proved the nonce is the actual load-bearing protection (the prefix-eligibility gate alone is NOT sufficient).
  • The crux dj-id hazard was handled correctly: cached subtrees are id-agnostic, the assembled tree is re-walked pre-order to re-assign ids from the same counter base a full parse uses — byte-for-byte identical to cache-OFF, so the keyed VDOM diff ({% kanban_board %} emits no dj-id anchors → every card move fails VDOM patching and falls back to html_recovery #1678/fix(vdom): fully fix #1678 — dj-if marker preservation + nested-boundary patch ordering #1682) is unaffected. Foster-parenting-unsafe roots (table/select family) fall back to a full parse.

What didn't (lessons)

  • Coverage-completeness recurrence (rc4 rule tech-debt: djust_live cannot be cargo-tested — gate the extension-module feature behind a Cargo flag #1543), again: the "what content can a loop item render" axis had an untested variant — |safe raw HTML containing the literal sentinel — and the bug lived entirely in it. The auto-escaped variant (which the implementer DID test) is safe; the raw-HTML one is not. Self-attestation ("I tested the collision case, it's clean") covered only the tested variant.
  • A peer's self-test does not substitute for the independent gate: the implementer correctly reported its own tests green both times; the independent adversarial review is what found the variant it missed. The two-gate (CI + independent worktree-adversarial) treatment for hot-path/correctness PRs is vindicated for the 2nd consecutive PR in this arc.

Verified

  • Independent adversarial review: Stage 11 REQUEST_CHANGES 🔴 → fix → Stage 12 APPROVE ✅ (both posted above). Original 🔴 reproducer byte-identical ON==OFF on both render_with_diff + render_binary_diff; nonce gate-off RED-proven load-bearing; no new holes from the fix.
  • CI: 13/13 checks green (incl. rust-tests, python-tests py3.12 + py3.14t free-threaded, playwright, browser-smoke, demo-checks, security-tests), 0 failures.
  • Suites reproduced in the reviewer's own build: Rust 863 + 37 / 0 failed; Python 8664 / 0 failed; fmt + clippy clean. One pre-existing WS-auth pollution flake confirmed unrelated (Rust-only PR).
  • 🟡 non-blocking: custom-element tags literally starting with dj-pc (<dj-pcard>) are refused parse-caching by the prefix gate — correctness-safe, vanishingly rare, no change required.

RETRO_COMPLETE

johnrtipton added a commit that referenced this pull request Jun 28, 2026
… ROADMAP bucket (#1974)

main now tracks the 1.1 development line: 1.0.8 shipped (2026-06-23, tag v1.0.8)
and 1.1.0 is the active release (rc1/rc2/rc3 cut on the `1.1` branch). main's
version files lagged at 1.0.8, which led the #1970 drain to mislabel its ROADMAP
milestone as "v1.0.8-3 / ships in 1.0.8". This:

- Bumps the version 1.0.8 → 1.1.0 across pyproject.toml, Cargo.toml, both
  __init__.py files, uv.lock, Cargo.lock (via `make version VERSION=1.1.0`).
- Renames the drain milestone v1.0.8-3 → v1.1.0-1 (ships in 1.1.0) and marks
  #1970 ✅ PR #1973. The #1967/#1969/#1970 work is under [Unreleased] and lands
  in 1.1.0 via the main→`1.1` resync.

1.0 is maintenance-only going forward (bug/security fixes); 1.1 develops on main.

Co-authored-by: Claude <noreply@anthropic.com>
johnrtipton pushed a commit that referenced this pull request Jun 28, 2026
Brings the post-1.0.8 perf arc onto the v1.1 release line: the keyed
loop-item render cache (#1967/PR #1969), the parsed-VNode-subtree cache
(#1970/PR #1973), and the cold-start filter-bridge warm (#1968) — all
flag-gated default-OFF, under LIVEVIEW_CONFIG['loop_render_cache_enabled'].
Also brings the corrected v1.1.0-1 ROADMAP drain bucket and main's version
bump to the 1.1 line.

Version files kept at 1.1.0rc3 (the release branch owns its version during
rc; main's 1.1.0 bump is overridden here per the resync convention).

Validated on the merged tree: cargo check + full Rust suite green
(workspace excl djust_live + djust_live --no-default-features, 37 passed),
full Python suite 8712 passed / 0 failed across tests/ python/tests/
python/djust/tests/.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
johnrtipton added a commit that referenced this pull request Jun 28, 2026
Cut rc4 from main (the 1.1 dev+release line going forward; 1.0.x patches
live on the `1.0` branch). main's CHANGELOG adopts the release-history rc
sections (imported verbatim from the v1.1.0rc3 tag) so future cuts come
from main directly. The new [1.1.0rc4] section is the delta since rc3:

- #1967 keyed loop-item render cache (PR #1969)
- #1970 parsed VNode subtree cache (PR #1973)
- #1968 cold-start filter-bridge warm

All flag-gated default-OFF (LIVEVIEW_CONFIG['loop_render_cache_enabled']).
Coverage verified: every [Unreleased] entry is captured in rc1/rc2/rc3 or
the rc4 delta (no CHANGELOG history lost).

Co-authored-by: Claude <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

security-review PR touches security-sensitive code and needs additional review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

perf: cache PARSED VNode subtrees for unchanged loop items (the parse-phase lever — bigger than #1967)

1 participant