diff --git a/.agents/skills/dev-render-htmlcss-feature/SKILL.md b/.agents/skills/dev-render-htmlcss-feature/SKILL.md
index 3c218236fc..f72bbb9d70 100644
--- a/.agents/skills/dev-render-htmlcss-feature/SKILL.md
+++ b/.agents/skills/dev-render-htmlcss-feature/SKILL.md
@@ -14,6 +14,12 @@ load only when the user explicitly runs it. The loop is a conductor
over `/research`, `/fixtures`, and `/render-reftest` — those auto-trigger
on their own for narrower work.
+**Sibling skill.** For features in the SVG path
+(`crates/grida/src/htmlcss/svg/`, `resvg-test-suite` corpus,
+multi-oracle scoring against expected.png + baked Chrome PNG), use
+[`dev-render-htmlcss-svg-feature`](../dev-render-htmlcss-svg-feature/SKILL.md)
+instead. Same five-phase shape, different tooling.
+
**Lifecycle.** Expect this skill to grow as new divergence patterns
surface. It will likely go stale in parts once htmlcss hits
Chromium-parity on L0/L1; treat the _phase structure_ as durable and
diff --git a/.agents/skills/dev-render-htmlcss-svg-feature/SKILL.md b/.agents/skills/dev-render-htmlcss-svg-feature/SKILL.md
new file mode 100644
index 0000000000..fffece4d2d
--- /dev/null
+++ b/.agents/skills/dev-render-htmlcss-svg-feature/SKILL.md
@@ -0,0 +1,504 @@
+---
+name: dev-render-htmlcss-svg-feature
+description: >
+ Manual-invocation only. Five-phase feature loop (audit → ground →
+ fixture → implement → verify) for driving a single SVG feature to
+ Chromium parity in the grida `htmlcss::svg` renderer. Sibling to
+ `dev-render-htmlcss-feature` (HTML/CSS path); same loop shape,
+ different corpus (resvg-test-suite + Chrome bake) and different
+ scoring (multi-oracle: consensus / disputed / UB).
+---
+
+# grida-htmlcss::svg — feature loop
+
+**What this is.** A heavy, manually-invoked loop for driving a single
+SVG feature forward in `crates/grida/src/htmlcss/svg/`. Do not
+auto-trigger; load only when the user explicitly runs it.
+
+**Sister skill.** [`dev-render-htmlcss-feature`](../dev-render-htmlcss-feature/SKILL.md)
+covers the HTML/CSS path (different code surface, different fixture
+suite, single oracle). This skill covers the SVG path: same five-phase
+shape, but different corpus and different scoring policy.
+
+| Aspect | HTML/CSS sister skill | This (SVG) skill |
+| --------------- | -------------------------------------------- | ------------------------------------------------------------------- |
+| Renderer module | `crates/grida/src/htmlcss/` (excl. `svg/`) | `crates/grida/src/htmlcss/svg/` |
+| Corpus | `fixtures/test-html/L0/` (we author) | `fixtures/local/resvg-test-suite/` (vendored, 1679 fixtures) |
+| Oracle | Playwright Chromium (refbrowser, on-the-fly) | `expected.png` (suite author) **+** baked Chrome PNG |
+| Scoring | Single oracle, gate=`L0.exact` floor 1.0 | Multi-oracle: consensus / disputed / UB; gate = consensus pass-rate |
+| Tooling         | `cargo run -p grida_wpt -- render --suite …` | `cargo run -p grida_dev -- reftest <subcommand>`                    |
+
+---
+
+## Why multi-oracle (read this before phase 1)
+
+The resvg-test-suite ships one `expected.png` per fixture — but that
+PNG is the **suite author's** read of the spec, not a browser
+oracle. For ~12% of fixtures Chrome diverges from `expected.png`
+(disputed) and ~4% have no defined behavior at all (UB). The harness
+ingests the suite's `results.csv` (a 9-renderer status matrix) to
+classify each fixture:
+
+- **Consensus** (`chrome=PASSED` in csv): `expected.png` is
+ authoritative. Optimize against this set — it is the headline
+ parity number.
+- **Disputed** (`chrome=FAILED/CRASHED`): Chrome diverges from
+ `expected.png`. The harness scores against a baked Chrome PNG too;
+ effective score = max(`vs_expected`, `vs_chrome`). Passing on
+ either oracle counts.
+- **UB** (`chrome=UNKNOWN`): excluded from headline parity entirely.
+ Don't optimize, don't regress, don't celebrate a pass.
+
+When you read a score, always read its bucket. A 0.30 score on a
+disputed fixture is meaningless without `vs_chrome`; a 0.30 score on
+a consensus fixture is a real bug.
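
A minimal sketch of this policy in plain shell (`classify` and
`effective_score` are illustrative helpers, not harness code; the
real classification lives in the harness):

```sh
# Bucket a fixture by the chrome column of results.csv, then compute
# the effective score the way the harness scores disputed fixtures.
classify() {
  case "$1" in              # $1 = chrome status from results.csv
    PASSED) echo consensus ;;
    FAILED|CRASHED) echo disputed ;;
    *) echo ub ;;
  esac
}

effective_score() {         # $1 = vs_expected, $2 = vs_chrome
  awk -v a="$1" -v b="$2" 'BEGIN { print (a > b) ? a : b }'
}

classify CRASHED            # prints "disputed": score against both oracles
effective_score 0.30 0.97   # prints 0.97: passing via the Chrome oracle
```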
+
+---
+
+## The five phases
+
+```text
+┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
+│ 1. AUDIT │→ │2. GROUND │→ │3. FIXTURE│→ │ 4. IMPL │→ │5. VERIFY │
+└──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘
+ │ │
+ └───── ← ─── loop ← ─── score < floor ← ─── diff ← ───────┘
+```
+
+### 1. Audit — "what's the actual state of this feature?"
+
+**Question.** Where does the feature live in the htmlcss::svg pipeline
+today? Which resvg fixtures touch it, and where do they fall in the
+oracle buckets?
+
+**Actions.**
+
+- Scan `crates/grida/src/htmlcss/svg/` for the property/element name
+ in dom/parse, style/cascade, layout, paint. SVG features can be
+ parsed-but-dropped, computed-but-misrouted, or just unhandled —
+ each has a different fix shape.
+- Get the suite-wide picture and the worst-fail shortlist:
+ ```sh
+ cargo run --release -p grida_dev -- reftest summary
+ ```
+ Headline = consensus pass-rate. The `worst consensus failures` block
+ is the real-bug shortlist.
+- For each fixture relevant to your feature, run `inspect`:
+ ```sh
+  cargo run --release -p grida_dev -- reftest inspect <fixture>
+ ```
+ This prints oracle status, all 9 renderer flags, last-run scores
+ vs both oracles, and the four PNG paths. **This is the agent's
+ primary diagnostic tool.** Use `--json` for machine consumption.
+- Filter the run to just the feature's category:
+ ```sh
+ cargo run --release -p grida_dev -- reftest run \
+ --suite-dir fixtures/local/resvg-test-suite \
+ --renderer htmlcss --threshold 0.1 \
+ --filter 'filters_feSpecularLighting'
+ ```
+- Cross-reference [`crates/grida/src/htmlcss/svg/README.md`](../../../crates/grida/src/htmlcss/svg/README.md)
+ for the Blink module map and any prior decision.
+
+**Deliverable.** A short audit note (PR draft / task prompt):
+
+- _Pipeline state_: not-parsed / parsed-but-dropped / partial /
+ Chromium-parity-except-X. Cite the file:line.
+- _Affected fixtures, partitioned by oracle_:
+ - `consensus` (must fix): list with `vs_expected` scores.
+ - `disputed` (situational): list with `vs_expected` and
+ `vs_chrome`. If we already match Chrome, this is a non-issue.
+ - `ub` (ignore): list count only.
+- _Priority bucket_: easy-and-important / easy-low-value /
+ hard-important / hard-low-value.
+
+**Exit when.** You can name the broken pipeline stage **and** point
+at a specific consensus fixture that demonstrates it. If your only
+failing fixture is disputed-but-we-match-chrome, there is no bug to
+fix.
+
+### 2. Ground — "how do real engines solve this?"
+
+**Question.** What's the canonical implementation strategy in a
+mature SVG engine? We adapt; we don't invent.
+
+**Actions.** Invoke `/research`. The references for SVG features:
+
+- **Blink (`third_party/blink/renderer/core/svg/`,
+ `core/layout/svg/`, `core/paint/svg/`)** — authoritative for the
+ layout/paint divergence calls; the engine our chrome-baseline is
+ rendered against.
+- **WebKit (`Source/WebCore/svg/`)** — third voice. Useful when
+ Blink has Chrome-specific behavior (the disputed bucket often
+ contains Chrome quirks).
+- **resvg (`crates/resvg/`, `crates/usvg/`)** — Rust, readable.
+ Useful for parsing and computed-value rules. But remember: when we
+ diverge from resvg toward Chrome, that's correct for the disputed
+ bucket — resvg authored most of the contested expecteds.
+- **Spec** (SVG 1.1 / 2 / specific module). Read this first.
+
+**Deliverable.** A research note with:
+
+- The spec section(s) that govern behavior.
+- 3–6 lines on Blink's structure, plus contrast with resvg's read
+ when relevant.
+- Explicit deviation, if any, and which oracle motivates it (Chrome
+ vs. resvg/expected). Cite the disputed fixture(s) by name.
+
+**Exit when.** You can defend the implementation shape by pointing
+at Blink, **and** if disputed fixtures are involved, you can name
+the oracle the implementation is targeting and why.
+
+### 3. Fixture — "which test proves it?"
+
+**Different from the HTML/CSS path: we don't author SVG fixtures —
+we pick them.** The 1679 resvg-test-suite fixtures already cover
+nearly every SVG primitive. Authoring new ones is reserved for
+genuine gaps in the upstream coverage.
+
+**Actions.**
+
+- Identify 1–3 fixtures that exercise the feature unambiguously.
+  `reftest inspect <fixture>` gives you the rel path, oracle status,
+ and current score in one shot.
+- Prefer **consensus** fixtures as your primary target — they have an
+ unambiguous oracle. A consensus fixture going from 0.3 → 1.0 is
+ a clean win.
+- For features whose semantics Chrome interprets differently, you
+ may need a **disputed** fixture as your target. That is fine, but
+ the bake oracle (Chrome PNG) must be present. Run:
+ ```sh
+  cargo run --release -p grida_dev -- reftest bake --filter <pattern>
+ ```
+ Verify with `inspect` that `chrome.png` is now resolved.
+- If genuinely no upstream fixture covers the case, consider whether
+ the feature is in scope. Authoring a new SVG fixture should be
+ rare; check WPT (`fixtures/local/wpt/svg/`) before authoring.
+
+**Deliverable.**
+
+- A list of 1–3 target fixtures with their rel paths, oracle status,
+ and current scores.
+- For disputed targets: confirmation that the chrome-baseline PNG
+ exists for each.
+
+**Exit when.** Each target fixture renders through `reftest run`
+(filtered), produces all four PNGs in the result dir, and has a
+known starting score against the chosen oracle.
+
+### 4. Implement — "what code change realizes the behavior?"
+
+**Question.** What is the minimum diff in
+`crates/grida/src/htmlcss/svg/` to make the targets pass?
+
+**Actions.**
+
+- Touch the smallest surface possible. Don't combine refactor +
+ feature; the multi-oracle scoring already adds noise to the
+ signal — adding refactor noise on top is a debugging nightmare.
+- Trace the SVG pipeline end-to-end:
+ parse (`dom/`) → style cascade (`style/`) → geometry/layout
+ (`geometry/`, `layout/`) → paint (`paint/`). A feature can fail
+ at any stage.
+- Add unit tests where behavior is data-assertable (computed paint
+ server, resolved length, geometry). The Rust tests catch
+  regressions the reftest cannot, especially for SVG's many
+  "value resolved correctly but ended up at the wrong z-position"
+  bugs.
+- Mirror the Blink structure when in doubt; the htmlcss::svg module
+ map docs are explicit about Blink anchors.
+
+**Deliverable.**
+
+- Code change scoped to the feature.
+- Any new data tests.
+- A one-line behavior summary in the PR description, written in
+ spec terms (e.g. "feSpecularLighting now treats specularExponent=0
+ as the default 1, matching Chrome / SVG 1.1 §15.21.4").
+
+**Exit when.** `cargo check -p grida -p grida_dev` is clean,
+existing unit tests pass, and the targeted fixtures render via
+`reftest run` without panic.
+
+### 5. Verify — "does it actually match Chromium?"
+
+**Actions.**
+
+1. Run the targeted slice:
+ ```sh
+ cargo run --release -p grida_dev -- reftest run \
+ --suite-dir fixtures/local/resvg-test-suite --renderer htmlcss \
+    --threshold 0.1 --filter <pattern>
+ ```
+2. Read the consensus pass-rate and worst-N:
+ ```sh
+ cargo run --release -p grida_dev -- reftest summary --report \
+ target/reftests/resvg-test-suite.htmlcss/report.json
+ ```
+3. For every target fixture, run `reftest inspect <fixture>`. Read both
+ `vs_expected` and `vs_chrome`. If the oracle is `consensus`, both
+ should match. If `disputed`, at least one must clear `pass_floor`
+ (default 0.95).
+4. **Read the diff PNG.** `inspect` prints its path. A high
+ similarity score on a sparse fixture can mask a completely
+ broken feature. Open the PNG.
+5. Run the **full suite** at least once before declaring done. New
+ features routinely regress neighbors via shared codepaths
+ (cascade ordering, paint server resolution, etc.):
+ ```sh
+ cargo run --release -p grida_dev -- reftest run \
+ --suite-dir fixtures/local/resvg-test-suite --renderer htmlcss \
+ --threshold 0.1
+ cargo run --release -p grida_dev -- reftest summary
+ ```
+ Compare the headline consensus pass-rate to the pre-change number.
+ It must not drop. If it dropped, something regressed — diff the
+ `worst_consensus` lists before vs. after.
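
The must-not-drop check is mechanical; a sketch assuming the
`.headline.consensus_pass_rate` field of `reftest summary --json`
(the `gate` helper is illustrative, not part of the tooling):

```sh
# Exit nonzero iff the post-change consensus pass-rate dropped.
gate() {  # $1 = pre pass-rate, $2 = post pass-rate
  awk -v a="$1" -v b="$2" 'BEGIN { exit (b < a) ? 1 : 0 }'
}

gate 0.8312 0.8355 && echo "OK: held or rose"
gate 0.8312 0.8290 || echo "REGRESSION: revert or fix"
```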
+
+**Close the loop.**
+
+- Headline rose, no consensus regressions? Done.
+- Headline rose but disputed bucket got worse? Investigate. We may
+ have over-fit to the resvg interpretation and broken Chrome
+ alignment.
+- Headline dropped on consensus? Stop. Either revert or fix; do
+ **not** lower `pass_floor`.
+
+**Deliverable.** PR description with:
+
+- Before/after consensus pass-rate (from `reftest summary`).
+- Per-fixture before/after scores for the targets, with both
+ `vs_expected` and `vs_chrome` when relevant.
+- Diff PNG review for any score < 1.0.
+- Specific divergence-surface note for any residual gap.
+
+**Exit when.** Someone reading the PR description without the
+context can tell exactly which fixtures moved, which oracle they
+moved against, and whether anything regressed.
+
+---
+
+## Handoffs and artifacts
+
+| Phase | Artifact | Location |
+| --------- | --------------------------------------------------------- | --------------------------------------------------------------- |
+| Audit | Current-state note, oracle-partitioned fixture list | PR description / task prompt |
+| Ground | Research note (spec + Blink/resvg cross-ref) | PR description or `docs/wg/feat-2d/` |
+| Fixture | Target fixture list with oracle status; baked chrome.png | `fixtures/local/resvg-test-suite/chrome-baseline/` (gitignored) |
+| Implement | Code change, data tests, spec-language behavior summary | `crates/grida/src/htmlcss/svg/` |
+| Verify | Before/after summary, per-fixture scores, diff PNG review | PR description |
+
+---
+
+## Gate policy
+
+The gate is the **consensus pass-rate** — `oracle_buckets.consensus.passing
+/ oracle_buckets.consensus.total` from `report.json` (aka the headline
+in `reftest summary`).
+
+- The consensus pass-rate must not drop. Period.
+- Disputed bucket: track but do not gate. Improvements here are good;
+ small regressions are tolerable if `vs_chrome` improved overall.
+- UB bucket: never gate, never optimize, never report.
+- `pass_floor` (default 0.95) is the per-fixture passing threshold.
+ Don't relax it to "make a fixture pass." If a fixture is slipping
+ by 0.001, fix the renderer or eat the score.
+
+### What "destructive" means here
+
+A change is destructive if it:
+
+- Lowers the consensus pass-rate.
+- Lowers `pass_floor` in `reftest.toml`.
+- Removes `[test.oracles]` config or detaches the chrome baseline.
+- Increases `--threshold` to absorb real divergence.
+- Reclassifies a UB fixture as consensus by editing `results.csv` to
+ fit our renderer.
+
+None are acceptable without explicit human approval.
+
+---
+
+## Anti-patterns
+
+| Anti-pattern | Why it fails | Instead |
+| ---------------------------------------------------------- | --------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------- |
+| Optimizing against `expected.png` for a disputed fixture | You're aligning with the suite author, not the browser. Chrome users will see something else. | Bake the chrome baseline; align there. |
+| Treating UB fixtures as bugs | UB has no oracle; "passing" or "failing" is meaningless. | Filter `oracle_status: "ub"` out of your worklist. |
+| Reading `similarity_score` without reading `oracle_status` | Same number means different things across buckets. | Always `inspect` first. The bucket is the context. |
+| Skipping audit, fixing the worst score | The worst score is often UB or disputed-we-match-chrome — fixing them is wrong-headed. | Triage `summary worst_consensus` first; that is the real-bug list. |
+| Combining renderer change + chrome rebake in one PR | Reviewer can't tell which delta came from which. | Bake first, land separately. Or note explicitly which bake was needed. |
+| Authoring a new SVG fixture | The 1679-fixture suite already covers ~all primitives. | Find the existing one. Author only when WPT also lacks coverage. |
+| Lowering `pass_floor` because a fixture is at 0.94 | The floor exists so regressions are loud. | Fix the renderer. Or document the residual surface. |
+| Claiming "verified" without running the full suite | Targeted runs miss neighbor regressions through shared codepaths. | Always do a full-suite check before declaring done. |
+
+---
+
+## Loop mode (autonomous /loop driver)
+
+When invoked via `/loop`, each iteration drives the full five-phase
+cycle end-to-end on one target. Phases do **not** split across wakes.
+The protocol below makes iterations idempotent and resumable.
+
+### State file
+
+In-flight loop state lives in the project memory at
+`memory/project_dev_render_htmlcss_svg_loop.md`. Two fields:
+
+- _Active target_: `<fixture>` + phase reached (1–5) + brief note.
+- _Pre-iteration consensus pass-rate_: snapshot from
+ `reftest summary --json | jq '.headline.consensus_pass_rate'`,
+ written before phase 4 so phase 5 can detect regressions.
+
+If the file is absent or the active target is empty, the iteration
+auto-picks a new target.
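
An illustrative state file (the fixture name and numbers are
hypothetical; only the two fields matter):

```text
Active target: filters_feSpecularLighting_simple (phase 4; paint wired,
lighting math WIP)
Pre-iteration consensus pass-rate: 0.8312
```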
+
+### Skip list
+
+Targets the loop has tried but should not retry sit in
+`memory/project_dev_render_htmlcss_svg_skip.md`:
+
+- One `<fixture>` per line, optional `# reason` suffix.
+- Loaded each iteration; matching targets are excluded from auto-pick.
+- Reasons to add: three iterations without progress, known-OOS feature,
+ blocked on upstream fix.
+
+### Per-iteration protocol
+
+```
+1. Read state.
+ - If active target exists, resume there.
+ - Else: auto-pick (see below).
+
+2. Pre-flight (skip if resuming past phase 1):
+   - reftest inspect <target> --json
+ - If oracle_status == "ub": add to skip list, restart at 1.
+ - If oracle_status == "disputed" AND vs_chrome >= pass_floor:
+ add to skip list (we already match chrome), restart at 1.
+ - If oracle_status == "disputed" AND chrome.png is missing:
+     reftest bake --filter <target>; re-inspect.
+
+3. Snapshot pre-iteration consensus pass-rate to state.
+
+4. Drive phases 1-5 on the target.
+
+5. Verify gate (phase 5):
+ - Diff post vs. pre consensus_pass_rate.
+   - If dropped: revert the diff, log "regression on <target>", add
+ to skip list with reason. Do NOT continue to next target.
+ - If improved or held: clear active target, commit, update state.
+
+6. Update state. If target hit pass_floor, clear it. If progress was
+ made but not done, leave active target with new phase number.
+
+7. Increment "no-progress streak" if score didn't improve. At 3,
+ move target to skip list and clear.
+```
+
+### Auto-pick
+
+```sh
+cargo run --release -p grida_dev -- reftest summary --json \
+ | jq -r '.worst_consensus[].test_name'
+```
+
+Walk the list top-down; pick the first whose `test_name` is **not**
+in the skip list. If the list is empty, the loop's job is done —
+write "consensus saturated; no more bugs in worklist" to state and
+stop.
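
A sketch of the pick, assuming the skip-list format above (one
fixture per line, optional `# reason` suffix). `pick` and the
fixture names are illustrative; real candidates come from the jq
query above. Process substitution requires bash.

```sh
# Pick the first candidate not in the skip list.
pick() {  # $1 = candidates (one per line), $2 = skip-list file
  printf '%s\n' "$1" \
    | grep -vxF -f <(sed 's/[[:space:]]*#.*//; /^[[:space:]]*$/d' "$2") \
    | head -n 1
}

skip=$(mktemp)
printf 'shapes_rect_rx  # blocked upstream\nfilters_feTile_a\n' > "$skip"
pick 'filters_feTile_a
shapes_rect_rx
painting_stroke_b' "$skip"    # prints painting_stroke_b
```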
+
+### Termination
+
+The loop stops (and pings the user) when any of:
+
+- `summary --json | jq '.headline.consensus_pass_rate >= 0.99'` is
+  true — only edge cases remain; switch to manual.
+- `worst_consensus` is empty after skip-list filtering — nothing
+ left to autonomously work on.
+- Three consecutive iterations with no commit — escalate.
+- Any `reftest run` panic that isn't a fixture-level error.
+
+### Bake invariant
+
+The loop must never bake on every iteration — it's a one-time cost
+per fixture per Chrome version. Only invoke `reftest bake` when:
+
+- At least one disputed fixture in the target's category is missing
+  its chrome.png (detected via `inspect`).
+- An explicit `bake-needed` flag is set in state by a prior
+ iteration.
+
+Run `reftest bake --retry-failed` after any bake that produces
+`BAKE_ERRORS.log`; do not skip this step.
+
+### What loop mode does NOT do
+
+- Edit `reftest.toml` (gate config is human-only).
+- Edit `results.csv` (oracle source-of-truth is upstream-only).
+- Force-merge despite a regression (see gate policy).
+- Author new fixtures (the corpus is vendored).
+
+---
+
+## The template — paste this to kick off a cycle
+
+Fill in the brackets. Expect to run the loop in passes
+(audit+ground+fixture → implement → verify), with a checkpoint at
+each pass that future-you or a reviewer can read without the
+conversation.
+
+```text
+Drive the htmlcss::svg feature loop for: <feature>.
+Follow the dev-render-htmlcss-svg-feature skill
+(.agents/skills/dev-render-htmlcss-svg-feature/SKILL.md).
+
+Scope:
+- Feature: <property/element>
+- Hypothesis: <what is broken, and at which pipeline stage>
+- Target: <fixture name(s)>
+
+Produce, in order:
+
+1. Audit note: pipeline stage, fixture list partitioned by oracle bucket
+ (consensus / disputed / ub), before-scores from `reftest summary` and
+ `reftest inspect`.
+2. Ground note: spec section, Blink approach, resvg approach (when
+ different — explain which oracle the implementation targets).
+3. Fixture list: 1-3 target fixtures with rel paths, oracle status,
+ baked chrome.png present (yes/no).
+4. Implementation: minimal diff in crates/grida/src/htmlcss/svg/.
+ Data tests where assertable.
+5. Verify report:
+ - before/after consensus pass-rate (from `reftest summary`)
+ - per-fixture before/after vs_expected and vs_chrome
+ - diff PNG review for any sub-1.0 score
+ - confirm no consensus regressions in the full-suite run
+
+Gate: consensus pass-rate must not drop. pass_floor stays at 0.95.
+Do not relax the gate. If a target doesn't reach pass_floor, leave a
+specific divergence-surface note.
+
+Tools:
+- `cargo run --release -p grida_dev -- reftest summary [--json]`
+- `cargo run --release -p grida_dev -- reftest inspect <fixture> [--json]`
+- `cargo run --release -p grida_dev -- reftest run [--filter <pattern>]`
+- `cargo run --release -p grida_dev -- reftest bake [--filter <pattern>]`
+
+Use /research for phase 2.
+```
+
+---
+
+## Quick reference — reftest subcommands
+
+| Command | Purpose |
+| -------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------- |
+| `reftest run [--filter pat]` | Render fixtures, score against expected.png + chrome.png, write `report.json`. |
+| `reftest summary [--json]` | Headline consensus pass-rate + per-bucket stats + worst-N consensus failures. |
+| `reftest inspect <fixture> [--json]`                     | Per-fixture diagnostic: oracle flags, scores, PNG paths. Accepts `cat_group_name` or `cat/group/name.svg`.                      |
+| `reftest bake [--filter pat] [--retry-failed] [--force]` | Bake Chrome PNGs into `/chrome-baseline/`. Idempotent; `--retry-failed` batches prior errors into one node invocation. |
+| `reftest view <report>`                                  | Serve the dashboard.                                                                                                            |
+
+See [`crates/grida_dev/AGENTS.md`](../../../crates/grida_dev/AGENTS.md) and
+[`crates/grida/src/htmlcss/svg/README.md`](../../../crates/grida/src/htmlcss/svg/README.md)
+for the underlying contracts.
diff --git a/.agents/skills/render-reftest/SKILL.md b/.agents/skills/render-reftest/SKILL.md
index d35f002273..2245c57718 100644
--- a/.agents/skills/render-reftest/SKILL.md
+++ b/.agents/skills/render-reftest/SKILL.md
@@ -130,20 +130,44 @@ strategy before authoring a new fixture.
### SVG — pre-baked reference vs. resvg-generated reference
-Two strategies, depending on whether a reference image already exists:
+Three strategies, depending on what oracle data is available:
-**Strategy A — pre-baked reference (preferred when available)**
+**Strategy A — pre-baked reference (single oracle)**
Use if the SVG comes from a test suite that ships reference PNGs alongside it
-(W3C SVG 1.1, resvg-test-suite, Oxygen Icons). The oracle is the co-located
-PNG; `grida_dev reftest` picks it up automatically via `reftest.toml`.
+(W3C SVG 1.1, Oxygen Icons). The oracle is the co-located PNG;
+`grida_dev reftest run` picks it up automatically via `reftest.toml`.
```sh
# W3C suite — reference PNGs are in png/ next to svg/
-cargo run -p grida_dev --release -- reftest \
+cargo run -p grida_dev --release -- reftest run \
--suite-dir fixtures/local/W3C_SVG_11_TestSuite --bg white
```
+**Strategy A+ — pre-baked reference + Chrome bake (multi-oracle)**
+
+Use for `resvg-test-suite`. The vendored `expected.png` is the suite
+author's read of the spec, but for ~12% of fixtures Chrome diverges
+from it. The harness ingests the suite's `results.csv` (a 9-renderer
+status matrix) and a baked Chrome PNG to classify each fixture into
+**consensus** / **disputed** / **UB** buckets. Effective per-fixture
+score is `max(vs_expected, vs_chrome)`; the headline parity number is
+the consensus pass-rate (excludes disputed and UB).
+
+```sh
+# One-time: bake Chrome PNGs (puppeteer; deterministic per Chrome version)
+cargo run -p grida_dev --release -- reftest bake
+
+# Run + summarize
+cargo run -p grida_dev --release -- reftest run \
+ --suite-dir fixtures/local/resvg-test-suite --renderer htmlcss
+cargo run -p grida_dev --release -- reftest summary
+```
+
+For the full driver loop (audit → ground → fixture → impl → verify
+against the multi-oracle gate), see
+[`dev-render-htmlcss-svg-feature`](../dev-render-htmlcss-svg-feature/SKILL.md).
+
**Strategy B — resvg as dynamic oracle (when no pre-baked image exists)**
For arbitrary SVG files with no reference PNG, resvg is an independent,
diff --git a/.agents/skills/research/SKILL.md b/.agents/skills/research/SKILL.md
index 46f70d2000..406b2f1034 100644
--- a/.agents/skills/research/SKILL.md
+++ b/.agents/skills/research/SKILL.md
@@ -3,16 +3,15 @@ name: research
description: >
Research upstream and peer projects to inform Grida's design and
implementation. Use when investigating how Chromium, Skia, Servo,
- Taffy, or peer canvas editors solve a problem before writing code. Covers
- source-code exploration, research document authoring, and the
- study-adapt-differ pattern used in .plan.md files. Relevant dirs:
- docs/wg/research/, docs/wg/feat-2d/, crates/grida/.
+ Taffy, or peer canvas editors solve a problem before writing code.
+ Covers source-code exploration and research document authoring under
+ docs/wg/research/.
---
# Code Research Skill
-Workflow for going from "how does X work?" to a documented, actionable
-plan grounded in prior art from upstream and peer projects.
+Workflow for going from "how does X work?" to a documented survey
+grounded in prior art from upstream and peer projects.
## When to Use This Skill
@@ -20,7 +19,7 @@ plan grounded in prior art from upstream and peer projects.
- Looking up undocumented Skia API behavior
- Understanding CSS feature semantics before adding support
- Comparing canvas editor architectures (excalidraw, tldraw)
-- Writing or extending `docs/wg/research/` documents or `.plan.md` files
+- Writing or extending `docs/wg/research/` documents
---
@@ -28,220 +27,131 @@ plan grounded in prior art from upstream and peer projects.
Before touching any external repo, check what Grida already knows.
-### 1. Check existing research
-
-```text
-docs/wg/research/chromium/ # 15 docs covering the full compositor pipeline
-├── index.md # START HERE — topic map
-├── glossary.md ├── compositor-architecture.md
-├── property-trees.md ├── render-surfaces.md
-├── damage-tracking.md ├── paint-recording.md
-├── tiling-and-rasterization.md ├── tiling-deep-dive.md
-├── memory-and-priority.md ├── scheduler.md
-├── interaction-and-quality.md
-├── resolution-scaling-during-interaction.md
-├── pinch-zoom-deep-dive.md └── effect-optimizations.md
-```
-
-### 2. Check plan documents
-
-Key plan docs with distilled research:
-
-- `docs/wg/feat-2d/zoom-compositor-strategy.plan.md` — Chromium pinch-zoom adaptation
-- `docs/wg/feat-2d/renderer-rewrite.plan.md` — Chromium compositor mapping
-- `docs/wg/feat-2d/optimization.md` — master optimization catalog
-
-### 3. Check code cross-references
-
-```sh
-grep "chromium\|servo\|adapted from\|ported from\|based on" --include="*.rs"
-```
-
-Known citations:
-
-- `crates/grida/src/runtime/effect_tree.rs` — Chromium EffectTree/EffectNode
-- `crates/csscascade/src/rcdom/mod.rs` — Servo html5ever rcdom
-- `crates/csscascade/` — Servo style system (`style::servo::*`)
-- `third_party/usvg/src/text/layout.rs` — Chromium font metrics
-- `packages/grida-cmath/index.ts` — Snap.svg arc-to-cubic
-
-### Discovery queries
-
| What you need | How to find it |
| ---------------------------- | ------------------------------------------------------------- |
-| Existing research on a topic | `docs/wg/research/chromium/index.md` |
-| Plan docs with external refs | `grep "Chromium\|servo\|upstream" docs/wg/**/*.plan.md` |
+| Existing research on a topic | `docs/wg/research/chromium/index.md` (topic map) |
| Code that cites sources | `grep "based on\|adapted from\|ported from" --include="*.rs"` |
| Vendored third-party code | `ls third_party/` |
| Feature docs for a subsystem | `ls docs/wg/feat-*/` |
+If already documented, cite it and move on.
+
---
## Reference Repositories
**Local clones (optional):** If `~/Documents/GitHub/` exists, it may contain default-style clones (sibling dirs named by repo, e.g. `skia`). Prefer searching there before cloning or using only the web.
-### Graphics & Rendering
-
-| Repo | Lang | When to reference | Key paths |
-| --------------------------------------------------- | ---- | ----------------------------------------------------------------------------- | ----------------------------------------------------- |
-| [chromium](https://github.com/chromium/chromium) | C++ | Skia usage, compositing, layer trees, paint scheduling, tiling, GPU resources | `cc/` `third_party/blink/renderer/` `components/viz/` |
-| [skia](https://github.com/google/skia) | C++ | Undocumented API behavior, GPU internals, filter details | `src/gpu/` `src/core/` `src/effects/` |
-| [rust-skia](https://github.com/rust-skia/rust-skia) | Rust | Rust binding ergonomics — our direct `skia-safe` dependency | `skia-safe/src/` |
-| [resvg](https://github.com/linebender/resvg) | Rust | SVG rendering, path conversion, filter effects | `crates/resvg/src/` `crates/usvg/src/` |
-
-### Web Standards & CSS
-
-| Repo | Lang | When to reference | Key paths |
-| -------------------------------------------- | ---- | ------------------------------------------------------------------------- | ---------------------------------------- |
-| [servo](https://github.com/servo/servo) | Rust | CSS layout, DOM, Rust browser-engine patterns. We vendor its style system | `components/style/` `components/layout/` |
-| [stylo](https://github.com/servo/stylo) | Rust | CSS parsing and style resolution | `style/` |
-| [taffy](https://github.com/DioxusLabs/taffy) | Rust | Flexbox/Grid layout algorithms and Rust-native layout engine internals | `src/tree/` `src/compute/` `src/style/` |
-
-### Canvas Editor Peers
+| Repo | Lang | When to reference | Key paths |
+| ------------------------------------------------------ | ---- | ----------------------------------------------------------------------------- | ----------------------------------------------------- |
+| [chromium](https://github.com/chromium/chromium) | C++ | Skia usage, compositing, layer trees, paint scheduling, tiling, GPU resources | `cc/` `third_party/blink/renderer/` `components/viz/` |
+| [skia](https://github.com/google/skia) | C++ | Undocumented API behavior, GPU internals, filter details | `src/gpu/` `src/core/` `src/effects/` |
+| [rust-skia](https://github.com/rust-skia/rust-skia) | Rust | Rust binding ergonomics — our direct `skia-safe` dependency | `skia-safe/src/` |
+| [resvg](https://github.com/linebender/resvg) | Rust | SVG rendering, path conversion, filter effects | `crates/resvg/src/` `crates/usvg/src/` |
+| [servo](https://github.com/servo/servo) | Rust | CSS layout, DOM, Rust browser-engine patterns. We vendor its style system | `components/style/` `components/layout/` |
+| [stylo](https://github.com/servo/stylo) | Rust | CSS parsing and style resolution | `style/` |
+| [taffy](https://github.com/DioxusLabs/taffy) | Rust | Flexbox/Grid layout algorithms | `src/tree/` `src/compute/` `src/style/` |
+| [excalidraw](https://github.com/excalidraw/excalidraw) | TS | Canvas API optimization, rendering heuristics, interaction | `packages/excalidraw/renderer/` |
+| [tldraw](https://github.com/tldraw/tldraw) | TS | CRDT state model, modular SDK, DOM/SVG canvas | `packages/editor/` `packages/store/` |
-| Repo | Lang | When to reference | Key paths |
-| ------------------------------------------------------ | ---- | ---------------------------------------------------------- | -------------------------------------------------------------- |
-| [excalidraw](https://github.com/excalidraw/excalidraw) | TS | Canvas API optimization, rendering heuristics, interaction | `packages/excalidraw/renderer/` `packages/excalidraw/element/` |
-| [tldraw](https://github.com/tldraw/tldraw) | TS | CRDT state model, modular SDK, DOM/SVG canvas | `packages/editor/` `packages/store/` |
-
-### Searching large repos
-
-**Chromium** — use https://source.chromium.org/ for fast browsing. Narrow to: `cc/layers/` `cc/tiles/` `cc/trees/` (compositing), `cc/paint/` (recording), `cc/scheduler/` (frame scheduling).
-
-**Servo** — `components/style/properties/` (CSS props), `components/style/stylist.rs` (resolution), `components/layout/` (layout).
-
-**Skia** — headers in `include/core/` `include/effects/`, GPU in `src/gpu/ganesh/` (GL) or `src/gpu/graphite/` (Metal/Vulkan).
+**Searching large repos:** use https://source.chromium.org/ for Chromium. Narrow to `cc/layers/` `cc/tiles/` `cc/trees/` (compositing), `cc/paint/` (recording), `cc/scheduler/` (frame scheduling). Servo: `components/style/properties/`, `components/style/stylist.rs`, `components/layout/`. Skia: `include/core/` `include/effects/`, GPU in `src/gpu/ganesh/` (GL) or `src/gpu/graphite/` (Metal/Vulkan).
---
## The Research Workflow
-### Step 1: Frame the question
-
-Write a specific, bounded question. Bad: "how does Chromium handle rendering?" Good: "how does Chromium decide which layers get their own composited surface?"
-
-### Step 2: Check existing knowledge
-
-1. Read `docs/wg/research/chromium/index.md`
-2. `grep "<topic>" docs/wg/**/*.plan.md`
-3. `grep "<topic>" --include="*.rs" crates/`
-4. Read `docs/wg/feat-2d/optimization.md`
-
-If already documented, cite it and move on.
-
-### Step 3: Explore the source
-
-Use targeted searches — do not read entire codebases:
-
-```sh
-grep "struct LayerTreeHost" --include="*.h" -r cc/ # find the owning type
-grep "ShouldCreateRenderSurface" --include="*.cc" -r cc/ # find decision points
-grep "kDefault.*Tile\|kMax.*Memory" --include="*.h" -r cc/ # find design constants
-```
-
-### Step 4: Extract findings
-
-For each finding, record: **what** (mechanism + file path), **why** (rationale), **constants** (thresholds/heuristics), **applicability** (maps to our architecture?).
-
-### Step 5: Document or apply
+1. **Frame the question.** Specific and bounded. Bad: "how does Chromium handle rendering?" Good: "how does Chromium decide which layers get their own composited surface?"
+2. **Check existing knowledge** in `docs/wg/research/` and code comments before going upstream.
+3. **Explore the source** with targeted searches — never read entire codebases:
+ ```sh
+ grep "struct LayerTreeHost" --include="*.h" -r cc/ # owning type
+ grep "ShouldCreateRenderSurface" --include="*.cc" -r cc/ # decision points
+ grep "kDefault.*Tile\|kMax.*Memory" --include="*.h" -r cc/ # design constants
+ ```
+4. **Extract findings:** **what** (mechanism + file path), **why** (rationale), **constants** (thresholds/heuristics).
+5. **Document:**
| Scope | Action |
| ---------------------------- | --------------------------------------------- |
| Quick answer | Code comment citing the source |
| Reusable subsystem knowledge | Research doc in `docs/wg/research/<repo>/` |
-| New feature design | "Reference Approach" section in `.plan.md` |
| Confirming existing approach | Update the relevant research doc |
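
Step 2 above can be a couple of targeted greps. A minimal sketch — the demo corpus is created inline so it runs standalone; the doc path and its one-line content are hypothetical (real docs live under `docs/wg/research/`):

```shell
# Step 2 sketch: check existing research before going upstream.
# The demo doc below is hypothetical; in the repo you would grep
# docs/wg/research/ and crates/ directly.
mkdir -p /tmp/wg-demo/research/chromium
printf '%s\n' 'RenderSurface creation is decided in cc/trees/property_tree_builder.cc' \
  > /tmp/wg-demo/research/chromium/render-surfaces.md

# Case-insensitive, recursive, with line numbers; stop after a few hits.
grep -rin 'rendersurface' /tmp/wg-demo/research/ | head -5
```

If this turns up an existing doc, cite it and skip the upstream dive entirely.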
---
## Writing Research Documents
-Docs live in `docs/wg/research/<repo>/`. Create new subdirectories as needed (`servo/`, `skia/`).
+> **A research document is a pure survey of how upstream solves a problem.** It describes
+> the upstream system on its own terms, in enough depth that a reader could reimplement
+> the design from the doc alone. It is **not** a plan, a proposal, or a gap analysis —
+> Grida should be essentially absent from these pages.
-Every research document must contain:
-
-1. **Title and scope** — what subsystem, what questions it answers
-2. **Source references** — upstream file paths (with commit hash if volatile)
-3. **Architecture description** — how the subsystem works, with diagrams for pipelines
-4. **Key data structures** — important types and relationships (use upstream names)
-5. **Constants and heuristics** — magic numbers and their reasoning
-6. **Relevance to Grida** — how this maps or doesn't map to our architecture
+Docs live in `docs/wg/research/<repo>/`. Create new subdirectories as needed (`servo/`, `skia/`).
-**Conventions:** Use upstream terminology. Include short code excerpts (5-15 lines) with file path citations. Organize by concept, not by file. Update `index.md` when adding new docs. File names: lowercase, hyphenated, topic-descriptive.
+### Stay in survey mode
----
+Write as if Grida did not exist. The reader is someone trying to understand the upstream project — Chromium, Skia, Servo, etc. — not someone planning a Grida change. Concretely:
-## Applying Research to Plan Documents
+- **Frame in upstream terms.** "Blink resolves `clip-path` by..." (✅) — not "Blink does X, which is what we'd need..." (❌).
+- **Use upstream names** for types, files, functions, constants.
+- **Quote the spec, the source, the upstream commit.** Anchor every claim to a file path or spec section.
+- **Compare upstream to upstream.** Chromium vs. resvg vs. Servo belongs here; Chromium vs. Grida does not.
+- **Spec gotchas, magic numbers, edge cases** are on-topic — they explain the upstream design.
-Grida uses a **study-adapt-differ** pattern. Every `.plan.md` referencing external work needs three sections:
+### Keep Grida out of the body
-### 1. "Reference Approach" (study)
+Forbidden in research docs:
-Describe how upstream solves it — cite the research doc, name types/algorithms, include key constants:
+- "Relevance to Grida" / "What we borrow" / "What we differ on" sections.
+- Citations of `crates/grida/...` or any in-repo file path.
+- "Plan for our implementation" / "Implementation checklist" sections.
+- Sentences using "we", "our renderer", "our codebase", or "our fix" for Grida-side work.
+- "This is the gap blocking N fixtures" intros, "TODO" lists, "where the fix lands" footers.
-```markdown
-## Chromium's Approach (Reference)
+A neutral "the Rust binding for this Skia API is `skia_safe::FooBar`" is fine — it surveys the primitive. It becomes off-limits the moment the sentence ties it to Grida ("we use this at filename.rs:42").
-Chromium uses a CoverageIterator that walks tilings from highest to
-lowest resolution. During pinch-zoom, TreePriority is set to
-SMOOTHNESS_TAKES_PRIORITY to show stretched tiles rather than checkerboard.
-See: `docs/wg/research/chromium/pinch-zoom-deep-dive.md`
-```
+### Required structure
-### 2. "What We Borrow" (adapt)
+1. **Title and scope** — what subsystem, what questions it answers (in upstream terms).
+2. **Source references** — upstream file paths (with commit hash if volatile).
+3. **Architecture description** — how the upstream subsystem works, with diagrams for pipelines.
+4. **Key data structures** — important types and relationships (use upstream names).
+5. **Constants and heuristics** — magic numbers and their reasoning.
+6. **Cross-project comparison** (when relevant) — Chromium vs. resvg vs. Servo vs. Skia, each on its own terms.
-Mapping table from upstream concepts to our types:
+**Conventions:** Upstream terminology. Short code excerpts (5–15 lines) with file path citations. Organize by concept, not by file. Update `index.md` when adding new docs. File names: lowercase, hyphenated, topic-descriptive.
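
As an illustration, a skeleton following the six sections above — every name, path, and number in it is a hypothetical placeholder, not real research:

```markdown
# Tile management in cc

Scope: how the Chromium compositor decides which tiles to rasterize.

## Sources

- `cc/tiles/tile_manager.cc` (commit abc1234)

## Architecture

Prose plus a pipeline diagram, in upstream terms.

## Key data structures

- `TileManager` — owns the raster queue.

## Constants and heuristics

- `kSomeTileBudget = 128` — why upstream chose it.

## Cross-project comparison

- How resvg and Servo approach the same problem, each on its own terms.
```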
-```markdown
-| Chromium | Grida | Notes |
-| ----------------------------- | ------------------------------- | ---------------------------- |
-| Stale-tile GPU stretching | `LayerImage` reuse at old scale | Per-node instead of per-tile |
-| Power-of-2 raster scale steps | `snap_to_power_of_two()` | Reduces re-rasterization |
-```
+### Review your draft before saving
-### 3. "What We Do Differently" (differ)
+Treat this as a required step — past drafts have repeatedly slipped Grida-side content into research, and it is cheapest to catch right before saving:
-What we're NOT adopting and why. This prevents future contributors from "fixing" intentional divergences:
+- [ ] Search for `Grida`, `grida`, `crates/grida`, `our `, `we `, `we'd`, `we have`, `we use`. Every match must justify itself — usually by being removed.
+- [ ] Skim for headings like "Relevance to ...", "What we borrow", "What we differ on", "Plan for ...", "Implementation checklist", "Where our code is wrong". Delete them.
+- [ ] Check the intro and conclusion. Framed around an upstream question, or around a Grida gap? Reframe to upstream.
+- [ ] Check "See also" / "References". Internal Grida paths there are a smell.
+- [ ] If the doc would be useless to a reader who didn't know Grida exists, it isn't a research doc yet. Move the Grida-side content out.
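
The first two checklist items can be mechanized. A minimal sketch, where the two sample draft lines are hypothetical and the patterns are the checklist's own search terms:

```shell
# Pre-save sweep: any hit means Grida-side content is still in the draft.
draft='Blink resolves clip-path by walking the paint property tree.
We use this at crates/grida/src/htmlcss/svg/clip.rs.'

# -i catches sentence-initial "We"/"Our"; -n reports the offending line.
printf '%s\n' "$draft" | grep -inE 'grida|\bwe\b|\bour\b'
# → 2:We use this at crates/grida/src/htmlcss/svg/clip.rs.
```

A clean sweep (grep exits non-zero with no output) is necessary but not sufficient — still skim the headings and framing by hand.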
-```markdown
-- **No multiple concurrent tilings.** We cache per-node, not per-tile.
- WASM memory constraints make multi-tiling impractical.
-- **No worker-thread rasterization.** Single-threaded WASM constraint.
- We compensate with time-budgeted incremental re-raster.
-```
+If you find yourself wanting to write "and here's how this maps to our renderer", stop — that's not what a research doc is for.
---
## Pitfalls
-**Researching what's already documented.** Check `docs/wg/research/`, plan docs, and code comments first. The Chromium research alone is 15 documents.
-
-**Cargo-culting without understanding constraints.** Always filter upstream approaches through our constraints:
-
-- Single thread (WASM)
-- Per-node cache (not spatial tiles)
-- Infinite canvas (viewport culling is primary)
-- Stable scene graph (no CSS reflow on zoom)
-
-**Reading too broadly.** Chromium is 30M+ lines. Arrive with a specific question, find the code path, extract the answer, leave. Use source.chromium.org.
-
-**Skipping the "differ" section.** Most dangerous omission. Without it, future contributors will "fix" intentional divergences from Chromium.
-
-**Confusing Skia docs with Skia behavior.** Skia's documentation is minimal and sometimes wrong. Read the implementation for performance characteristics and edge cases.
-
-**Stale source references.** Reference stable concepts (struct names, enum variants) over line numbers. Include enough context to relocate if files move.
-
-**Mixing terminologies.** Research docs: upstream terms. Plan docs: mapping tables. Code comments: our terms with parenthetical upstream reference.
+- **Writing a Grida-flavored survey.** Most common failure mode — the brief is "research X" and the agent writes a survey ending in "here's how we should do it". Stay in survey mode.
+- **Researching what's already documented.** Check `docs/wg/research/` and code comments first. The Chromium research alone is 15 documents.
+- **Reading too broadly.** Arrive with a specific question, find the code path, extract the answer, leave. Use source.chromium.org.
+- **Confusing Skia docs with Skia behavior.** Skia's documentation is minimal and sometimes wrong. Read the implementation.
+- **Stale source references.** Reference stable concepts (struct names, enum variants) over line numbers.
+- **Mixing terminologies.** Research docs: upstream terms. Code comments: Grida terms with parenthetical upstream reference.
---
## Checklist
- [ ] Framed a specific, bounded question
-- [ ] Checked existing research, plan docs, and code comments
+- [ ] Checked existing research and code comments
- [ ] Identified the right repo and narrowed to specific directories
- [ ] Extracted findings with file paths, rationale, and constants
-- [ ] Assessed applicability against our constraints
-- [ ] Documented findings (research doc, plan section, or code comment)
+- [ ] Wrote the research doc as a pure upstream survey — no Grida content
+- [ ] **Reviewed the draft** against the [review checklist](#review-your-draft-before-saving) — searched for `Grida`, `our`, `we`; deleted any planning sections
- [ ] Updated `index.md` if a new research doc was created
diff --git a/.gitignore b/.gitignore
index f56ebf41b7..291ad9d62b 100644
--- a/.gitignore
+++ b/.gitignore
@@ -209,6 +209,10 @@ node-compile-cache/
**/__tests__/artifacts/
**/__tests__/.tmp/
+# Reftest harness — generated baseline PNGs and puppeteer install
+crates/grida_dev/scripts/package.json
+crates/grida_dev/scripts/package-lock.json
+
### Python ###
# Byte-compiled / optimized / DLL files
__pycache__/
diff --git a/crates/grida/examples/tool_gen_bench_fixture.rs b/crates/grida/examples/tool_gen_bench_fixture.rs
index 1bb4c12901..06b50f62e5 100644
--- a/crates/grida/examples/tool_gen_bench_fixture.rs
+++ b/crates/grida/examples/tool_gen_bench_fixture.rs
@@ -19,8 +19,6 @@ use fixture_helpers::*;
use grida::cg::color::CGColor;
use grida::cg::fe::*;
use grida::cg::stroke_width::StrokeWidth;
-use grida::cg::types::*;
-use grida::node::schema::*;
use std::collections::HashMap;
/// 100×100 grid of rectangles (10 000 nodes).
diff --git a/crates/grida/examples/tool_sk_svgdom.rs b/crates/grida/examples/tool_sk_svgdom.rs
deleted file mode 100644
index 41e9c66698..0000000000
--- a/crates/grida/examples/tool_sk_svgdom.rs
+++ /dev/null
@@ -1,84 +0,0 @@
-//! Purpose: To use Skia's builtin svgdom, and its rendering accuracy (this is for testing only).
-//!
-//! Usage:
-//! cargo run --example tool_sk_svgdom -- <input.svg> [-o <output.png>]
-
-use clap::Parser;
-use skia_safe::{surfaces, svg, Data, EncodedImageFormat};
-use std::fs;
-use std::path::PathBuf;
-
-#[derive(Parser, Debug)]
-#[command(author, version, about = "Render SVG using Skia's native SVG module")]
-struct Cli {
- /// Input SVG file path
- input: PathBuf,
-
- /// Output PNG file path
- #[arg(short, long)]
- output: Option<PathBuf>,
-}
-
-fn main() {
- let cli = Cli::parse();
-
- if !cli.input.exists() {
- eprintln!(
- "Error: Input file '{}' does not exist.",
- cli.input.display()
- );
- std::process::exit(1);
- }
-
- // 1. Load SVG data
- let svg_data = fs::read(&cli.input).expect("Failed to read input file");
- let data = Data::new_copy(&svg_data);
-
- // 2. Load SVG into DOM
- let dom =
- svg::Dom::from_bytes(&data, skia_safe::FontMgr::default()).expect("Failed to parse SVG");
-
- // 3. Determine output size (default to 400x400 if not specified in SVG, or use a reasonable default)
- // Note: svg::Dom doesn't easily expose intrinsic size in all versions without container size.
- // We'll try to get a container size or default to 800x600.
- // 3. Determine output size
- let root = dom.root();
- let width = root.width().value.ceil() as i32;
- let height = root.height().value.ceil() as i32;
-
- // Ensure positive dimensions
- let width = width.max(1);
- let height = height.max(1);
-
- // 4. Create a Surface to render into
- let mut surface =
- surfaces::raster_n32_premul((width, height)).expect("Failed to create surface");
- let canvas = surface.canvas();
-
- // Optional: Clear with white background? Or transparent?
- // canvas.clear(skia_safe::Color::WHITE);
-
- // 5. Render
- dom.render(canvas);
-
- // 6. Save
- let output_path = if let Some(path) = cli.output {
- path
- } else {
- let file_stem = cli.input.file_stem().expect("Invalid input filename");
- std::env::temp_dir().join(file_stem).with_extension("png")
- };
-
- let image = surface.image_snapshot();
- let data = image
- .encode(None, EncodedImageFormat::PNG, 100)
- .expect("Failed to encode image");
-
- fs::write(&output_path, data.as_bytes()).expect("Failed to write PNG data");
-
- println!(
- "Successfully rendered '{}' to '{}'",
- cli.input.display(),
- output_path.display()
- );
-}
diff --git a/crates/grida/src/htmlcss/collect.rs b/crates/grida/src/htmlcss/collect.rs
index 4adffc5154..a142efa8a7 100644
--- a/crates/grida/src/htmlcss/collect.rs
+++ b/crates/grida/src/htmlcss/collect.rs
@@ -518,8 +518,8 @@ fn detect_img_element(node: &DemoNode) -> ReplacedContent {
/// (components/script/dom/svg/svgsvgelement.rs): the `