0.6.10 - deterministic waiting, safe self-healing, token controls, domain understanding by divshekhar · Pull Request #2 · syrin-labs/iris

divshekhar · 2026-06-18T11:35:51Z

What this is

The 0.6.10 feature release — live-verified, not 1.0.0 (see plan/V1-ROADMAP.md for the 1.0 gate). Bumps all 10
publishable @syrin/iris-* packages 0.5.0 → 0.6.10.

Highlights (full list in CHANGELOG `[0.6.10]`)

Deterministic waiting (M3) — kills the #1 flake cause (sleeps)

New { kind: "settled", quietMs } predicate: passes once network + structural-DOM activity is quiet for quietMs. Ambient
dom.text/animation churn (count-ups, spinners) is ignored so an animated page still settles.
iris_act_and_wait with no until now auto-settles — "act, then wait for quiet" in one call, the documented replacement
for a fixed sleep.

Self-healing (M5) — heal the locator, never the intent

iris_flow_heal apply:true re-replays the healed flow and re-asserts its success consequence; if a rebound locator resolves
but the consequence no longer fires, the write is refused (status:consequence_broken, file untouched).

Token efficiency (M2) — every read tool self-budgets

iris_query gains limit + count_only; iris_network/iris_console gain limit + a cost:{bytes,tokens} hint
(most-recent-N, non-silent truncation).

Domain understanding (M4)

iris_domain now reports mustHold per flow — the success consequence that must hold — so an agent can answer "what are the
critical flows and what must hold for each?" from the model alone.

Rigor
success-oracle thesis, now enforced at the live-assert layer too).

Reliability

Browser observers (network/route/console) now restore the true original global on teardown instead of a bound copy.

Verification

Unit: build 10/10, lint 9/9, typecheck 18/18, unit tests 16/16 packages (~978 tests).
Live e2e on this build: full Playwright battery 14/14 (Vite/React demo + Next.js 15 + api) → no regression; a new
new-features-test.mjs spec 7/7 positively exercises the new surfaces in a real browser — including settled resolving
against the demo's count-up animations (the exact ambient-churn condition the fix targets).

Summary by CodeRabbit

New Features
- Deterministic settling predicates for reliable page waits
- iris_act_and_wait auto-settles when until is omitted
- iris_query now supports limit and count_only pagination controls
- Network and console tools now support limit parameter
- New iris_domain tool for inspecting app's testable surface
- New iris init command for guided project setup with Vite/Next/Claude/Cursor integration
Bug Fixes
- Browser teardown now fully restores patched globals to original identities
- iris_flow_heal re-verifies success consequences; refuses to persist if broken
Documentation
- Updated getting-started guides for new setup command and Vite plugin integration
- Added token-efficiency guidance for diffed snapshots

Collapse Iris integration from a 6-step manual process into `npx @syrin/iris init` plus a dev-server restart, and add a Vite plugin that makes the whole React integration a single config line. @syrin/iris-vite-plugin (exposed as @syrin/iris/vite): - iris() with apply:'serve' so Vite drops it from `vite build` entirely — production-leak protection is structural, not a user-managed env gate - reuses the babel plugin to stamp data-iris-source, and injects a dev-only iris.connect() via transformIndexHtml (no entry-file edit needed) - options: port/session/token/sourceMapping/inject; non-default port is baked into the injected connect() url - integration test runs Vite's real resolveConfig to prove the plugin is filtered out of the build pipeline and present in serve iris init: - detects framework (Next/Vite/HTML), React major, and package manager - merges an iris entry into .mcp.json without clobbering, runs the dependency install, and for Vite patches the config with iris() - bail-to-manual safety: only auto-patches the obvious shape, otherwise prints the exact paste-in snippet rather than half-editing a build config; Next config wrap + layout mount are intentionally manual - pure modules (detect/mcp-config/vite-config/plan) + a thin injectable IO shell, fully unit-tested; --dry-run, --no-mcp, --no-install, --yes, --port Docs: getting-started.md now leads with the init fast path and recommends the Vite plugin. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…script An inline <script type="module"> injected via transformIndexHtml is not run through Vite's import resolution, so its bare `import '@syrin/iris'` failed in the browser with "Failed to resolve module specifier". Found by driving the real demo app with a headless browser. Serve the connect code as a real module (resolveId + load) referenced by a <script src>, so Vite resolves the import normally. Verified end-to-end against the demo: session connects, the react adapter loads, iris_inspect returns the component source file:line, and a real `vite build` emits zero Iris artifacts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…entry under pnpm Three issues found by a deep end-to-end stress test (adversarial config shapes + a faithful npm-tarball install driven by a real browser): 1. A .mcp.json with comments/trailing commas (jsonc — what the docs show and the MCP convention) crashed init with an uncaught JSON.parse SyntaxError. Now it bails to a manual instruction and never rewrites the file (preserving comments). 2. `iris init --port N` wrote the port into .mcp.json but injected a bare iris(), so the browser SDK connected to 4400 while the bridge listened on N. The port now flows into iris({ port: N }) in the Vite patch and into the Next/HTML connect snippets, so bridge and SDK always agree. 3. The CLI bin silently no-opped under pnpm installs: the bin shim runs the symlinked node_modules/@syrin/iris path as argv[1] while ESM import.meta.url is the realpath, so the entry guard was false and main() never ran — breaking the documented `npx @syrin/iris init` / `mcp`. The guard now also compares the realpath of argv[1]. Verified end-to-end: packed the real tarball, installed it into a fresh Vite + React 19 app, ran `iris init` via the .bin shim, drove a headless browser — the plugin connected the session and iris_inspect returned src/App.tsx:8 — and a real `vite build` emitted zero Iris artifacts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…oject The bridge + MCP server is a single process that serves every project, so it should be registered once — not re-added to a .mcp.json in each repo. `iris init` now runs `claude mcp add iris -s user` (idempotent: probes `claude mcp get iris` first, skips if present) instead of writing a project .mcp.json. Per project, only the SDK (the iris() Vite plugin / connect call) is added. - new init/mcp.ts: builds the `claude mcp add -s user` command, an existence probe, an availability probe, and a manual fallback (printed when the claude CLI is absent, with a Cursor-style global-config snippet too) - run.ts gains a quiet `probe` IO capability; gathers claudeCli + mcpExists - removed init/mcp-config.ts (the project .mcp.json JSON merge) — no longer used - --port flows into the global registration too, so bridge and SDK agree - docs (getting-started, integrate-with-claude-code): register once, globally Verified end-to-end in an isolated CLAUDE_CONFIG_DIR: init writes the iris server to the user config (no project .mcp.json), and a re-run reports ALREADY instead of erroring. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

`iris init` now registers Iris once, globally, for every supported agent it detects — not just Claude: - Claude Code: `claude mcp add iris -s user` (as before) - Cursor: merges an `iris` entry into ~/.cursor/mcp.json (Cursor has no CLI, but its global MCP config is a small dedicated file — safe to merge, unlike ~/.claude.json). Idempotent; never clobbers an existing entry; bails to manual on unparseable jsonc. Detection is per-agent (claude CLI present / ~/.cursor exists), so each gets its own step and only installed agents are touched. If neither is found, a single manual instruction is printed. --port flows into both registrations. The init IO gained homeDir() and absolute-path support for writing the global Cursor file. Verified end-to-end in an isolated HOME: with Cursor present, init writes ~/.cursor/mcp.json (and registers Claude too); a re-run reports ALREADY for both. Confirmed the real ~/.claude.json and ~/.cursor configs were untouched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…nly flows Iris now tells an agent, the moment it saves a flow, whether that flow actually asserts an observable consequence — or whether it will pass even when the feature is broken. iris_flow_save returns assertions.grade: - asserted — a step or the success end-condition asserts a signal/network consequence (a wrong element can't fake it) - presence-only — only element-presence checks; a locator healed to the WRONG element can still pass (the failure self-healing vendors admit) - assertion-free — acts but asserts nothing observable For non-"asserted" grades it returns a warning telling the agent to add a consequence assertion (iris_annotate assert-signal / assert-net / success-state). Grounded in: Fowler "Assertion-Free Testing" (100% coverage, 0 assertions), Kent C. Dodds "make your test fail", mabl/qate.ai (healed-wrong-locator ships bugs green), and AI agents agreeing with human pass/fail only ~68% of the time — so the flow must carry a real oracle, not rely on the agent eyeballing success. First slice of V1-ROADMAP M1 (intent-anchored assertions). Pure classifier + 7 unit tests; wired into iris_flow_save. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ed oracle) flow.success was checked only by the @syrin/iris-test CI spec runner — the live iris_flow_replay MCP tool an agent calls never evaluated it, so a replay reported ok even when the end-condition (the actual intent) was never met. Move the success oracle (successToPredicate + assertSuccess) down into iris-server as flow-success.ts so ONE implementation is shared by both the MCP tool and the spec runner — no divergent oracle. @syrin/iris-test/success-assert now re-exports it (its 12 tests still pass against the moved code). iris_flow_replay now, after every step runs clean, asserts the success end-condition as a real consequence: a signal/net success that never fires makes the replay status:error "intent not satisfied" even though all locators resolved — catching the regression a healed-but-wrong locator ships green (mabl/qate.ai). Dynamic (LLM-output) success fields are skipped, symmetric with the step layer. A `success` result row is appended so the agent sees the verdict. V1-ROADMAP M1 slice 2 (corrected: shared oracle, not a duplicate in replayFlow). Discovered + fixed via the full suite catching a turbo-cached masking failure. 8 new server tests; iris-server 488, iris-test 97; all 16 tasks green uncached. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…before huge reads High-volume read results now carry cost:{ bytes, tokens } (tokens estimated at ~chars/4) so an agent can re-scope BEFORE spending context on a large body — re-scope a snapshot (mode:interactive/status or a tighter `scope`) or narrow a query (`name`/`scope`). Pure helpers in output-budget.ts (estimateTokens, sizeCost, withSizeCost) reuse the existing JSON-byte idiom; the wrapper is pure and unit-tested, the handlers just .then(withSizeCost). Grounded in: by ~step 15 a screenshot/Playwright-MCP agent accrues 60–80K tokens of stale accessibility-tree data and starts hallucinating selectors that don't exist; a single screenshot is up to ~4,784 tokens. Giving the agent a cheap size signal up front is the first lever of V1-ROADMAP M2 (token efficiency). 11 output-budget tests (+8); 493 server tests; all 16 tasks green uncached. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…what changed After the first snapshot, an agent can pass diff:true to get back only the delta since its last look of the same scope/mode — mode:delta with added/removed lines, or mode:unchanged — instead of the full accessibility tree again. A route change automatically resets to a full snapshot (a cross-page delta would be meaningless), so the agent never sees a misleading diff. Why: screenshot/Playwright-MCP agents accrue 60–80K tokens of stale a11y data over a session and start hallucinating selectors that no longer exist. Returning only the change set attacks both the token cost AND that stale-context failure in one move. The agent-facing cost is the MCP result, so the delta is computed server-side, reusing the same normalize+diff the baseline layer uses. Pure decision (snapshotDelta) + a route-invalidated, bounded SnapshotCache + applySnapshotDelta shaper — all unit-tested without a browser. V1-ROADMAP M2. 10 new tests; 503 server tests; all 16 tasks green uncached. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

A reproducible, regression-guarding benchmark (runs the shipped sizeCost / estimateTokens / applySnapshotDelta, not mocks) over a representative 150-row operational dashboard — the shape that makes full snapshots expensive. Measured: full re-snapshot = 4,246 tok (16,983 bytes) diff (1 row changed) = 60 tok → 99% saved unchanged re-snapshot = 17 tok Asserts the diff is <10% of a full re-read and ≥90% saved, so the M2 win can't silently regress. This is the data behind the "token efficient" claim — the absolute figure uses the ~chars/4 heuristic, the relative saving is robust. V1-ROADMAP M0 (first numbers; real-app capture is the follow-up). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…efore testing New tool + pure builder that synthesizes every saved flow and the registered capabilities into a compact domain model an agent reads BEFORE driving the app: each flow with its assertion grade and the signals/testids it exercises, coverage counts, and — the differentiator — GAPS: declared signals/testids that NO flow asserts (untested intent), plus flows that assert no consequence. A one-line summary headlines it. Why: automation "checks the DOM, not the intent" — tests pass while integrated bugs ship (Bolton; Fowler). Pointing the agent straight at "you declared signal order:placed but no flow asserts it" is domain understanding + rigor in one read, instead of crawling the whole app or reading all the source. Reuses the flow assertion classifier so the two reinforce each other. Pure buildDomainModel (no IO) + a thin iris_domain tool reading .iris/flows/ + .iris/contract.json (no browser). V1-ROADMAP M4 first slice. 5 domain-model tests; 511 server tests; all 16 tasks green uncached. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ver the intent) When two present testids tie at the minimum edit distance, the prior nearest-pick was an arbitrary lexical tiebreak — exactly how a self-healing rebind lands on the WRONG element and ships a bug green (the failure mabl/qate.ai admit). Drift now carries `ambiguous:true` in that case, and the heal proposal layer refuses to auto-rebind an ambiguous drift: it still surfaces the drift + a nearest candidate for a human/agent to choose, but never applies a coin-flip. - protocol Drift gains optional `ambiguous` - flow-replay: pure `nearestIsAmbiguous` (≥2 candidates at min distance) sets it - heal.proposeRebindWith refuses when drift.ambiguous (regardless of confidence) Pairs with the M1 replay success oracle (which catches a wrong rebind at run time): this stops the wrong rebind from ever being proposed. V1-ROADMAP M5. 6 new tests; 517 server tests; all 16 tasks green uncached. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…grades, heal) Eight iterations shipped capabilities the human/agent docs didn't mention — features are invisible without docs. Surfaced across the three read surfaces: - token-efficiency.md: diffed snapshots section with the measured numbers (full 4,246 tok vs diff 60 tok, ~99% saved) + the cost-preview note. - agent-cheatsheet.md: "read iris_domain first" in Start-here, iris_domain in the core tool set, and a token note covering diff:true, cost, assertion grades, and ambiguous-heal refusal. - usage.md: iris_snapshot diff:true + cost documented; new iris_domain reference entry (flows + coverage + untested-intent gaps). Docs only; no code change. V1-ROADMAP M8 (discoverability slice). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… quality) iris_domain now folds .iris/project.json run history into the model and returns `riskRanked` (flow names worst-first) plus a per-flow `risk:{ level, reason, lastStatus }`. Risk is the WORSE of two signals: - run history: last run errored/drifted = high; passed-but-with-console/network errors = medium; passed clean = low; never run = unknown. - assertion quality: an assertion-free flow is medium EVEN when it passed — a green run of a flow that asserts nothing is false confidence. Test-priority order is high > medium > unknown (unvalidated) > low, so the agent tests the genuinely riskiest journeys first instead of re-running known-good ones. Synthesizes "domain-aware" + "rigorous": a flow that looks green but can't catch a regression surfaces as risk. Pure flow-risk.ts; iris_domain loads the history. V1-ROADMAP M4 (deepening). 13 new tests (flow-risk + domain); 528 server tests; all 16 tasks green uncached. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Completes the risk-ranking work: the one-line `summary` now names the flow to test first ("test first: <name> (<reason>)") when run history flags a high/medium risk flow — so the single most actionable fact (where the real regression risk is) is in the headline, not buried in the arrays. Omitted when the top flow is only low risk (no false alarm). Pure; 2 new tests; 529 server tests; all 16 tasks green uncached. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…utputSchema The handler returned bare flow-name strings while its declared outputSchema promises {name, path, createdAt?} objects — so a schema-validating MCP client rejected the result ("Expected object, received string"). Caught for real while driving the live demo via MCP. The unit test had codified the wrong (string[]) shape, so it passed while the contract was violated — corrected too. Now maps each name → {name, path: flowPath(irisRoot, name)}. 529 server tests; all gates green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…endation) Observed live driving the demo: a single login flooded iris_observe with 319 events / ~37KB because the dashboard's count-up animations emit a dom.text per frame — the agent only avoided the token hit because it knew to pass filters. Now costHint adds a `recommendation` when a timeline is large (>=80 events or >=8KB) telling the agent to pass filters:[...] or max_events next time. Surfaced in iris_observe's cost output schema. Backed by real session data; V1-ROADMAP M2. 3 new tests; 532 server tests; all 16 tasks green uncached. (No daemon restart — the live demo session is untouched.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…rdown The network observer stored window.fetch.bind(window) and assigned that bound copy back on teardown, so window.fetch was never restored to its original identity — violating the "fully reversible" contract in types.ts. Keep the true original for restore; use a separate window-bound copy only for invocation (fetch throws "Illegal invocation" on the wrong `this`). Adds the first test coverage for installNetwork, pinning GET->500 emission (the case a live MCP session appeared to miss; the observer itself is sound, so that miss was a long-session/window artifact, not a code bug), POST method capture, reject-path status:0, and teardown identity. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…eardown Same bind-on-restore bug audited from the network observer fix: route.ts and console.ts stored bound copies (history.pushState.bind / console[m].bind) and assigned those back on teardown, so the globals were never restored to their original identity — breaking the "reversible" contract. Keep the true originals for restore; use bound copies only for invocation. Adds first-ever observer-level tests for both (emit + teardown identity). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…-based flakiness) The #1 flake cause in agent-driven UI testing is async-wait + concurrency handled with fixed sleeps. Adds a { kind: 'settled', quietMs } predicate: the page is settled when no network/DOM/animation activity has occurred for quietMs (default 500). "No activity in the last N ms" is relative to now, so the buffer clock is injected via PredicateSession.elapsed() (CLAUDE.md rule 7) and the wait loop's poll interval flips it to pass once activity stops. Usable today via iris_wait_for and iris_act_and_wait's `until` (PredicateSchema already gates both), composes inside allOf with the consequence you expect, and both tool descriptions now steer agents to it over raw sleeps. FlowReplaySession gains elapsed(); test fakes across server + the public @syrin/iris-test package updated to match. All 16 packages green, 537 server tests (+5 settled cases). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… (M3) Makes "act then wait for the page to go quiet" a single zero-config call: omit `until` on iris_act_and_wait and it now waits for the { kind: 'settled' } predicate (network/DOM/animation idle) instead of requiring a predicate — the documented alternative to a fixed sleep. Explicit predicates are unchanged; to assert a consequence AND settle, allOf them. Tool + field descriptions updated to steer agents here. Adds onEvent to the live-control test fake so the wait loop is exercised. 538 server tests green (+1). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…5 — heal the locator, never the intent) iris_flow_heal apply:true rewrote a drifted testid to its nearest match and wrote it to disk WITHOUT checking that the healed flow still does what it was testing. A rebind can resolve to a real but wrong element (a look-alike control) that no longer triggers the flow's success signal/net — persisting that ships a green test that asserts nothing. The maintenance-tax research is explicit: locator-only healing covers <1/3 of failures and can ship bugs. Now, before writing, apply re-replays the healed flow in memory and re-asserts its success consequence. If the consequence no longer fires, the write is REFUSED (status:consequence_broken, file untouched) and the proposal is still surfaced for a human. Flows with no declared success heal but say so loudly (the rebind couldn't be verified — add a success-state assertion). Extracts the pure applyHealChanges (shared by FlowStore.heal's on-disk writer and the in-memory verification so both rewrite identically). 541 server tests green (+3: verified-heal, refused-broken-consequence, unverified-no-success). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

A broad query (by=role value=button) on a busy page returns every descriptor, spending agent context on data it may not need. Adds two opt-in trims to the result the AGENT sees: - count_only: return just { count }, drop the element array (when you only need "how many match?"). - limit: keep the first N descriptors; if more matched, carry total + truncated:true so the trim is never silent — the agent narrows with name/scope rather than assuming it saw everything. Pure, result-shape-tolerant paginateQueryResult (non-{elements} results pass through). 548 server tests green (+7). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Both tools returned every matching call/log uncapped and carried no size hint — unlike iris_observe/iris_query/iris_snapshot. A buggy page spams the console and a wide `since` window returns many calls, both spending agent context. Adds a `limit` (most-recent-N via the already-tested applyEventBudget, reporting total + droppedOldest so a trim is never silent) and a `cost:{bytes,tokens}` hint so the agent can self-budget — consistent with the other read tools. 551 server tests green (+3). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…w (M4) iris_domain told an agent that a flow asserts SOMETHING (boolean) but not WHAT its success depends on. The M4 bar is that the agent can answer "what are the critical flows and what must hold for each?" from Iris alone — the "what must hold" was missing. Each DomainFlowSummary now carries mustHold: the human label of the flow's success consequence (signal name / net URL), or undefined when the flow declares no success (then asserts:false — it tests nothing observable). 552 server tests green (+1). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…control, safe-heal, and domain features Records this session's user-facing changes under [Unreleased]: the settled predicate + act_and_wait auto-settle (M3), iris_query/network/console token controls + cost hints (M2), iris_domain mustHold (M4), consequence-verified flow_heal (M5), and the observer teardown-identity fix. (Also folds in a pre-existing cosmetic prettier reflow of the 0.5.0 section.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…nsequence (rigor) The success-oracle thesis ("green means the feature worked") was enforced for saved flows (classifyFlowAssertions) but not for ad-hoc iris_assert. An agent that asserts only { element }/{ text } presence gets a green verdict even though a healed-but-wrong element or a stale render can satisfy it (Fowler, Assertion- Free Testing; Dodds, Make Your Test Fail). iris_assert now attaches `advice` to a PASSING presence-only assertion, steering toward a { signal }/{ net } consequence (or an allOf with one). Never on a failing verdict or when a consequence is already asserted. Pure isPresenceOnlyAssertion walks allOf/anyOf/not. 563 server tests green (+11). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…rts, and read caps Surfaces tonight's habits in the canonical quick-reference so they're discoverable to agents and humans, not just buried in tool descriptions: - Never sleep — iris_act_and_wait with no `until` settles; { kind:"settled" } waits for network/DOM/animation idle; allOf a consequence with settled. - Assert a consequence ({signal}/{net}) over presence ({element}/{text}); a passing presence-only assert returns `advice`. - Cap broad reads: iris_query limit/count_only, iris_network/iris_console limit + cost. - flow_heal refuses to write a rebind that breaks the success consequence. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Self-review caught a real bug in tonight's settled predicate: it counted dom.text and animation events as activity, so any page with an ambient effect — a count-up counter, spinner, or pulsing dot — emits an event every frame and NEVER goes quiet, making act->settle time out at 4s on a healthy page. Observed live in iter 18: one login flooded 319 dom.text events from the dashboard's count-up animations. Same trap that deprecated Playwright's networkidle. SETTLE_ACTIVITY is now network + structural DOM (added/removed/attr) only — the real "app is doing work" signals. Ambient text/anim churn is ignored; for an animation-gated outcome, assert the specific consequence instead. Tool descriptions, cheatsheet, and CHANGELOG corrected to "network + DOM idle". 564 server tests green (+1 ambient-animation regression test). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ignal false pass) Self-review of the M5 heal verification found a real honesty gap: assertSuccess evaluated the success predicate against the WHOLE session buffer (no since floor). In flow_heal this is acute — one heal call does two replays (the pre-heal drift replay's prefix, then the verify replay), so a success signal emitted by the first could satisfy the verify even when the rebound step never fired it: a false-positive heal, the exact thing the feature exists to prevent. flow_replay had the same staleness across runs in one session. assertSuccess now takes a `since` floor (WaitForSignal gains the param, already honored by waitForPredicate); flow_replay and the heal verify each capture the cursor before their replay so only that replay's events count — the same since-floor honesty principle used by iris_assert/wait_for. 565 server tests green (+1 floor regression test); flow-replay test fake gains elapsed(). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Positively exercises tonight's additions end-to-end in a real browser (the existing battery only proves no-regression): iris_query count_only/limit, the settled predicate + act_and_wait auto-settle, and the iris_assert presence-only advice. The settled check is run against the demo's count-up animations on purpose — the exact ambient dom.text churn that the iter-29 fix must tolerate; it now settles (quietForMs ~307) instead of timing out. 7/7 live. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ontrols, domain mustHold Bumps all 10 publishable @syrin/iris-* packages 0.5.0 -> 0.6.10 and promotes the CHANGELOG [Unreleased] section to [0.6.10]. This is the live-verified feature release (NOT 1.0.0 — see plan/V1-ROADMAP.md for the 1.0 gate): full e2e battery 14/14 + a new-features spec 7/7 green on a real browser across the Vite/React demo, Next.js 15, and the api. Cleanup: removed a throwaway headless-launcher script (untracked) and its leaked browser process. No dead code to remove otherwise — noUnusedLocals + no-unused-vars are enforced and green; knip's "unused files" are all intentional dynamically-spawned e2e specs and standalone asset scripts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-06-18T11:36:05Z

Warning

Review limit reached

@divshekhar, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 45 minutes and 26 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 08274d45-919f-4fb3-a8e8-888c7bcc4f0e

📥 Commits

Reviewing files that changed from the base of the PR and between 3b0e8e0 and 777a60b.

⛔ Files ignored due to path filters (1)

pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

📒 Files selected for processing (2)

packages/vite-plugin/package.json
pre-commit.sh

📝 Walkthrough

Walkthrough

This PR adds new Iris tool behaviors, flow and domain analysis, iris init, a Vite plugin package and export path, browser observer teardown fixes, end-to-end and unit coverage, documentation updates, and release/version metadata for v0.6.10.

Changes

Core release changes

Layer / File(s)	Summary
Tool runtime controls and predicates `packages/server/src/tools/*`, `packages/server/src/events/predicate.ts`, `packages/server/src/session/output-budget.ts`, `apps/e2e/specs/new-features-test.mjs`, `docs/agent-cheatsheet.md`, `docs/token-efficiency.md`, `docs/usage.md`	Adds `settled` predicate support, snapshot diff responses, query pagination, cost metadata, observe recommendations, passing presence-only assert advice, and network or console limit handling, with tests and docs.
Flow grading and domain summaries `packages/server/src/flows/flow-classify.ts`, `packages/server/src/flows/flow-success.ts`, `packages/server/src/domain/`, `packages/server/src/index.ts`, `packages/test/src/`	Adds flow assertion grading, success assertion helpers, domain-model and risk-ranking builders, the `iris_domain` tool, richer flow save/list outputs, public re-exports, and shared test-side usage.
Replay success verification and heal safeguards `packages/protocol/src/`, `packages/server/src/flows/flow-replay.ts`, `packages/server/src/flows/heal.ts`, `packages/server/src/flows/flow-tools.ts`, `packages/server/src/flows/flows.ts`, `packages/server/src/flows/test.ts`	Adds replay time-floor support, ambiguous drift reporting, reusable heal application, replay success verification, and refusal to persist healed flows when the declared consequence no longer passes.
Vite plugin package and SDK export `packages/vite-plugin/*`, `packages/iris/src/vite.ts`, `packages/iris/package.json`, `packages/iris/tsup.config.ts`, `docs/getting-started.md`	Introduces the Vite plugin package, exposes `@syrin/iris/vite`, adds tests and README coverage, updates build references, and documents plugin-based Vite and React wiring.
Init command and project wiring `packages/server/src/cli.ts`, `packages/server/src/init/*`, `docs/getting-started.md`, `docs/integrate-with-claude-code.md`	Adds the `iris init` CLI flow, project detection, MCP registration planning for Claude and Cursor, install and framework-specific wiring steps, Node-backed execution, and updated setup guidance.
Browser observer restoration and release metadata `packages/browser/src/observers/`, `CHANGELOG.md`, `packages//package.json`	Restores original browser global identities after console, fetch, and route observer teardown, adds regression tests, updates the changelog, and bumps package versions to `0.6.10`.

Sequence Diagram(s)

sequenceDiagram
  participant CLI as iris init
  participant Runner as runInit
  participant Planner as buildPlan
  participant Claude as claude mcp add
  participant Cursor as ~/.cursor/mcp.json
  participant Project as vite.config.ts

  CLI->>Runner: init options
  Runner->>Planner: gather inputs and build plan
  Planner-->>Runner: MCP, install, framework steps
  Runner->>Claude: register iris MCP when planned
  Runner->>Cursor: merge iris server config when planned
  Runner->>Project: patch Vite config when planned

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~105 minutes

Poem

🐇 I sniffed through tools and trails anew,
Found quieter waits and snapshots too.
A Vite burrow bloomed beside the lane,
While heals now test their hops again.
With tidy paws, the globals came back home.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 46.75% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main features of this release: deterministic waiting, self-healing, token controls, and domain understanding are all core changes reflected in the changeset.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/iris-init-vite-plugin

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

…6j-xcff + 2 moderates) and add build+audit to pre-commit audit: vite-plugin's devDependency vite "^5" resolved to 5.4.21, flagged by 1 high (server.fs.deny bypass) + 2 moderate (optimized-deps path traversal, launch-editor NTLM disclosure) advisories (all vite <=6.4.2). Bumped to "^7" so it dedupes to the already-in-tree patched vite 7.3.5; peer stays >=4. `pnpm audit --audit-level high` now reports no known vulnerabilities; vite-plugin builds + 12 tests pass under vite 7. pre-commit: the hook was missing two CI steps. It now mirrors .github/workflows/ci.yml one-for-one — added `pnpm build` (step 2) and `pnpm audit --audit-level high` (step 7), so a green commit is a green CI run and a high+ vuln is caught before push, not in CI. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 15

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

docs/getting-started.md (1)
443-443: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Align the troubleshooting package name with Step 3.

This line points to @syrin/iris-babel-plugin, while Step 3 instructs @syrin/iris/babel. Keeping both names in one guide is confusing and can misdirect setup.
Suggested doc fix
-- Wire up `@syrin/iris-babel-plugin` (Step 3). Without it, only component identity is available.
+- Wire up `@syrin/iris/babel` (Step 3). Without it, only component identity is available.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/getting-started.md` at line 443, The troubleshooting section mentions
`@syrin/iris-babel-plugin` but Step 3 of the guide references
`@syrin/iris/babel`. Update the package name reference at line 443 to use
`@syrin/iris/babel` instead to match the package name specified in Step 3,
ensuring consistency throughout the documentation and preventing setup
confusion.

🟡 Minor comments (8)

packages/browser/src/observers/console.test.ts-27-36 (1)

27-36: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert forwarding explicitly in the first test.

The test name says forwarding is verified, but only event emission is asserted. Add an assertion that the original console.error path is invoked so this regression is actually covered.

Suggested patch

 it('emits CONSOLE_ERROR and still forwards to the original console', () => {
+  const errorSpy = vi.spyOn(console, 'error').mockImplementation(() => {});
   const { emit, events } = collect();
   teardown = installConsole(emit);

   console.error('boom', 42);

   expect(events).toHaveLength(1);
   expect(events[0]?.type).toBe(EventType.CONSOLE_ERROR);
   expect(events[0]?.data.message).toBe('boom 42');
+  expect(errorSpy).toHaveBeenCalledWith('boom', 42);
+  errorSpy.mockRestore();
 });

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/browser/src/observers/console.test.ts` around lines 27 - 36, The
test verifies event emission but does not assert that the original console.error
method is actually being forwarded. Add a spy or mock of the original
console.error method before calling installConsole, then after the console.error
call, add an assertion to verify that the original console.error was invoked
with the correct arguments ('boom' and 42) to ensure the forwarding behavior is
actually covered by the test.

CHANGELOG.md-7-7 (1)

7-7: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add the missing link definition for 0.6.10.

The heading uses [0.6.10] reference-link syntax, but there is no matching link definition at the bottom, so this version entry won’t hyperlink.

📝 Proposed fix

+[0.6.10]: https://github.com/syrin-labs/iris/releases/tag/v0.6.10
 [0.5.0]: https://github.com/syrin-labs/iris/releases/tag/v0.5.0
 [0.4.0]: https://github.com/syrin-labs/iris/releases/tag/v0.4.0

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@CHANGELOG.md` at line 7, The heading for version 0.6.10 uses markdown
reference-link syntax with [0.6.10], but there is no corresponding link
definition at the bottom of the CHANGELOG.md file to complete the reference. Add
a link definition at the end of the file following the pattern of other version
links in the document to match the [0.6.10] reference in the heading and enable
proper hyperlinking for this version entry.

docs/usage.md-267-267 (1)

267-267: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add mustHold? to the documented iris_domain flow shape.

Line 267 omits mustHold?, but buildDomainModel sets it when a flow has success (see packages/server/src/domain/domain-model.ts Line 117). The docs currently under-specify the returned payload.

Doc fix

-- `iris_domain({})` → `{ flowCount, flows: [{ name, steps, grade, asserts, signals, testids, warning?, risk? }], declared: { testids, signals, stores }, coverage: { asserted, presenceOnly, assertionFree }, gaps: { unassertedFlows, declaredUntestedSignals, declaredUntestedTestids }, riskRanked, summary }`
+- `iris_domain({})` → `{ flowCount, flows: [{ name, steps, grade, asserts, signals, testids, mustHold?, warning?, risk? }], declared: { testids, signals, stores }, coverage: { asserted, presenceOnly, assertionFree }, gaps: { unassertedFlows, declaredUntestedSignals, declaredUntestedTestids }, riskRanked, summary }`

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/usage.md` at line 267, The `iris_domain({})` return type documentation
at line 267 of docs/usage.md is missing the `mustHold?` optional property from
the flow object shape. Add `mustHold?` to the documented flow properties in the
return type definition to match what `buildDomainModel` actually sets in the
packages/server/src/domain/domain-model.ts file when a flow has a success
condition.

packages/server/src/flows/flows.heal.test.ts-85-93 (1)

85-93: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make the heal test session cursor-aware so stale-signal protection is actually exercised.

heal(..., apply: true) verifies success using a since floor from session.elapsed(). Here, eventsSince() ignores the cursor and elapsed() is always 0, so buffered signals can satisfy verification even when they should be considered stale.

Suggested test-harness fix

-  eventsSince(): IrisEvent[] {
-    return this.events;
+  eventsSince(cursor: number): IrisEvent[] {
+    return this.events.filter((event) => event.t >= cursor);
   }
@@
   elapsed(): number {
-    return 0;
+    return 1;
   }
@@
 function signalEvent(name: string): IrisEvent {
-  return { t: 0, type: EventType.SIGNAL, sessionId: 's', data: { name, data: {} } };
+  return { t: 1, type: EventType.SIGNAL, sessionId: 's', data: { name, data: {} } };
 }

Also applies to: 96-99

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/flows/flows.heal.test.ts` around lines 85 - 93, The test
mock session needs to be made cursor-aware to properly exercise stale-signal
protection. Modify the mock session to track a cursor position and elapsed time:
make eventsSince() accept and use a cursor parameter to filter events rather
than returning all events, and make elapsed() return an actual tracked value
(either incrementing or controllable) instead of always returning 0. This will
ensure that heal(..., apply: true) verification with the since floor from
session.elapsed() properly validates that buffered signals outside the time
window are considered stale.

packages/server/src/tools/snapshot-cost.test.ts-6-11 (1)

6-11: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove design-doc identifiers from comments, test title, and log labels.

M0/M2 markers are internal design-doc style identifiers and should not appear in comments/test strings.

✏️ Suggested edit

- * M0 — a reproducible measurement of the M2 token wins, run over the REAL shipped functions
+ * Reproducible measurement of token savings, run over the real shipped functions
@@
-describe('M0 — snapshot token cost (M2 wins, measured on real functions)', () => {
+describe('Snapshot token cost regression (measured on real functions)', () => {
@@
-      `[M0] full re-snapshot=${String(fullTokens)} tok  diff=${String(deltaTokens)} tok  saved=${String(savedPct)}%`,
+      `[snapshot-cost] full re-snapshot=${String(fullTokens)} tok  diff=${String(deltaTokens)} tok  saved=${String(savedPct)}%`,
@@
-    console.log(`[M0] unchanged re-snapshot=${String(unchangedTokens)} tok`);
+    console.log(`[snapshot-cost] unchanged re-snapshot=${String(unchangedTokens)} tok`);
@@
-      `[M0] cost preview: ${String(preview.tokens)} tok / ${String(preview.bytes)} bytes`,
+      `[snapshot-cost] cost preview: ${String(preview.tokens)} tok / ${String(preview.bytes)} bytes`,

As per coding guidelines, "Never include internal tracking tags in comments, file names, directory names, or test descriptions. Do not use design-doc reference codes (letter + digit patterns like N5, G4, M8, P2, F1, R1) or internal version strings."

Also applies to: 39-40, 59-60, 78-78, 87-88

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/tools/snapshot-cost.test.ts` around lines 6 - 11, Remove
the internal design-doc identifiers M0 and M2 from the comment block in the
snapshot-cost.test.ts file. Specifically, remove "M0 — a reproducible
measurement of the" and replace it with a direct description, and remove "M2"
references so the comment describes the measurement approach without internal
tracking codes. Apply the same fix to all other locations in the file where
these design-doc reference patterns appear (at lines 39-40, 59-60, 78, and
87-88), ensuring all letter-digit patterns like N5, G4, M8, etc. are removed
from comments, test titles, and log labels throughout the file.

Source: Coding guidelines

apps/e2e/specs/new-features-test.mjs-1-1 (1)

1-1: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove the internal changelog tag from the file comment.

[Unreleased] in an inline comment violates the internal-tag prohibition.
✏️ Suggested edit
-// Live verification of the features added in the [Unreleased] CHANGELOG section, against the real
+// Live verification of the features added in the CHANGELOG section, against the real
As per coding guidelines, "Never include internal tracking tags in comments, file names, directory names, or test descriptions."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/e2e/specs/new-features-test.mjs` at line 1, The file comment at the
start of new-features-test.mjs contains the internal changelog tag [Unreleased]
which violates the coding guideline prohibiting internal tracking tags in
comments. Remove the reference to [Unreleased] and the CHANGELOG section from
the comment, keeping only the core description of what the file does for live
feature verification.
Source: Coding guidelines

packages/server/src/tools/assert-grade.ts-40-46 (1)

40-46: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Mixed predicates are incorrectly flagged as presence-only.

At Line 44-46 and Line 60, allOf/anyOf with { kind: 'element' } plus a non-presence predicate like { kind: 'settled' } still returns presence-only advice. That contradicts the contract in this file and causes false advice emission.

Proposed fix

 interface PredicateKinds {
   /** A signal/net leaf is present — proves the app did something a wrong element cannot fake. */
   consequence: boolean;
   /** An element/text leaf is present — a weak presence check. */
   presence: boolean;
+  /** Any non-presence, non-consequence leaf (route/console/animation/settled). */
+  other: boolean;
 }

 function walk(predicate: Predicate): PredicateKinds {
   switch (predicate.kind) {
     case 'signal':
     case 'net':
-      return { consequence: true, presence: false };
+      return { consequence: true, presence: false, other: false };
     case 'element':
     case 'text':
-      return { consequence: false, presence: true };
+      return { consequence: false, presence: true, other: false };
     case 'route':
     case 'console':
     case 'animation':
     case 'settled':
-      // Observable but not the weak presence pattern we nudge — neither flags the advice.
-      return { consequence: false, presence: false };
+      return { consequence: false, presence: false, other: true };
     case 'allOf':
     case 'anyOf': {
       const subs = predicate.predicates.map(walk);
       return {
         consequence: subs.some((s) => s.consequence),
         presence: subs.some((s) => s.presence),
+        other: subs.some((s) => s.other),
       };
     }
     case 'not':
       return walk(predicate.predicate);
   }
 }

 export function isPresenceOnlyAssertion(predicate: Predicate): boolean {
   const kinds = walk(predicate);
-  return kinds.presence && !kinds.consequence;
+  return kinds.presence && !kinds.consequence && !kinds.other;
 }

Also applies to: 58-60

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/tools/assert-grade.ts` around lines 40 - 46, The walk
function's handling of allOf and anyOf predicates incorrectly marks mixed
predicates (containing both presence-only and non-presence predicates) as
presence-only. In the return statement for these cases, the presence property is
computed using subs.some() which returns true if ANY sub-predicate is
presence-only; this should instead use subs.every() to return true only when ALL
sub-predicates are presence-only. This ensures that a combined predicate is only
flagged as presence-only when all of its constituent predicates are
presence-only, fixing the false advice emission issue that contradicts the
file's contract. Apply the same fix to both the allOf/anyOf case block starting
around line 42 and the similar logic at line 58-60.

packages/server/src/init/mcp.ts-42-45 (1)

42-45: ⚠️ Potential issue | 🟡 Minor

Extract CLI command strings as named constants.

The function uses inline strings ('get', 'add', 'user', '-s', '--', '--version') for CLI arguments. Per the coding guidelines, all domain/wire/UI strings must be named constants. Define constants for these command tokens (e.g., const MCP_GET_SUBCOMMAND = 'get', const MCP_ADD_SUBCOMMAND = 'add', const SCOPE_USER = 'user', etc.) and reference them instead.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/init/mcp.ts` around lines 42 - 45, The
claudeExistsProbe() function and other functions in this file use inline string
literals for CLI command arguments (such as 'get', 'add', 'user', '-s', '--',
'--version') instead of named constants. Define named constants at the top of
the file for each of these CLI command tokens (e.g., MCP_GET_SUBCOMMAND,
MCP_ADD_SUBCOMMAND, SCOPE_USER, etc.) following the same naming pattern as the
existing CLAUDE_CLI, MCP_SUBCOMMAND, and MCP_SERVER_NAME constants, then replace
all inline string occurrences throughout the file with references to these new
constants.

🧹 Nitpick comments (10)

packages/server/src/flows/flow-success.ts (1)

30-75: ⚡ Quick win

Extract domain/wire string literals into named constants.

This segment introduces multiple free domain/wire literals ('signal', 'net', 'element', 'allOf', 'testid', 'role', 'name', 'net', 'success'). Please promote them to named constants to match repository rules.

Proposed refactor

+const PredicateKind = {
+  signal: 'signal',
+  net: 'net',
+  element: 'element',
+  allOf: 'allOf',
+} as const;
+
+const ElementQueryKey = {
+  testid: 'testid',
+  role: 'role',
+  name: 'name',
+} as const;
+
+const SuccessLabelFallback = {
+  net: 'net',
+  success: 'success',
+} as const;
+
 export function successLabel(success: FlowExpect): string {
   if (success.signal !== undefined) return success.signal;
-  if (success.net !== undefined) return success.net.urlContains ?? success.net.method ?? 'net';
-  return success.element?.testid ?? success.element?.name ?? success.element?.role ?? 'success';
+  if (success.net !== undefined)
+    return success.net.urlContains ?? success.net.method ?? SuccessLabelFallback.net;
+  return (
+    success.element?.testid ??
+    success.element?.name ??
+    success.element?.role ??
+    SuccessLabelFallback.success
+  );
 }
 ...
-      const query: Record<string, string> = {};
-      if (testid !== undefined) query['testid'] = testid;
-      if (element.role !== undefined) query['role'] = element.role;
-      if (element.name !== undefined) query['name'] = element.name;
-      if (Object.keys(query).length > 0) parts.push({ kind: 'element', query });
+      const query: Record<string, string> = {};
+      if (testid !== undefined) query[ElementQueryKey.testid] = testid;
+      if (element.role !== undefined) query[ElementQueryKey.role] = element.role;
+      if (element.name !== undefined) query[ElementQueryKey.name] = element.name;
+      if (Object.keys(query).length > 0) parts.push({ kind: PredicateKind.element, query });

As per coding guidelines, “All domain/wire/UI strings must be named constants, never free strings in code.”

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/flows/flow-success.ts` around lines 30 - 75, Extract all
hardcoded string literals from the successLabel and successToPredicate functions
into named constants at the module level. Replace all occurrences of 'signal',
'net', 'element', 'allOf', 'testid', 'role', 'name', and 'success' string
literals with these named constants throughout both functions to comply with the
domain/wire string literal guidelines.

Source: Coding guidelines

packages/server/src/domain/domain-model.test.ts (1)

59-67: ⚡ Quick win

Add a regression test for “anchor signal only” coverage.

Please add a case where a flow references a signal anchor but does not assert expect.signal/success.signal, and verify declaredUntestedSignals still flags it. That protects the stated “asserted signal” contract.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/domain/domain-model.test.ts` around lines 59 - 67, Add a
new test case in the test file after the existing test that verifies when a flow
defines a signal anchor (via the signal property in flow()) but does not assert
it with expect.signal or success.signal, the signal is still properly flagged in
declaredUntestedSignals. Use buildDomainModel, flow, and contract functions
similar to the existing test pattern to construct a scenario where a flow
references a signal without asserting it, then verify that signal appears in the
gaps.declaredUntestedSignals array to protect the stated asserted signal
contract.

packages/server/src/domain/flow-risk.test.ts (1)

29-52: ⚡ Quick win

Add a PRESENCE_ONLY risk test case.

Current coverage skips the FlowAssertionGrade.PRESENCE_ONLY branch, so risk-priority regressions there won’t be caught.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/domain/flow-risk.test.ts` around lines 29 - 52, The test
suite for flowRisk is missing coverage for the FlowAssertionGrade.PRESENCE_ONLY
case, which could allow regressions to go undetected. Add a new test case in the
describe block that exercises the PRESENCE_ONLY grade by calling flowRisk with
FlowAssertionGrade.PRESENCE_ONLY as the first argument and a run result as the
second argument (similar to the existing test cases like the assertion-free flow
test), then assert the expected RiskLevel that should be returned for this grade
to ensure the branch is properly tested and documented.

packages/server/src/events/predicate.ts (1)

363-373: ⚡ Quick win

Extract newly introduced wire strings to named constants.

Line 363 and Line 372 introduce new inline domain/wire strings (note and failureReason). Please hoist these to named constants to match the repository rule and keep message contracts centralized.

As per coding guidelines, "All domain/wire/UI strings must be named constants, never free strings in code."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/events/predicate.ts` around lines 363 - 373, The inline
strings `'no activity to settle'` (used as the note value) and the failureReason
string template (used in the failureReason property) need to be extracted to
named constants following the repository's coding guidelines. Define these
strings as named constants at the module level or in a centralized constants
location, then replace the inline string literals with references to those
constants. Ensure the failureReason constant properly handles the dynamic values
by using a template string with the placeholders for lastType, quietForMs, and
quietMs.
Source: Coding guidelines

packages/server/src/events/predicate.test.ts (1)

228-311: ⚡ Quick win

Consolidate repeated predicate/event literals into named test constants.

The new settled suite adds many inline domain/wire strings (e.g., predicate kinds, signal names, URLs, messages). Please move these to file-level constants for compliance and easier fixture reuse.

As per coding guidelines, "All domain/wire/UI strings must be named constants, never free strings in code."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/events/predicate.test.ts` around lines 228 - 311, Extract
all inline domain/wire/UI strings from the settled predicate test suite (the
describe block starting at line 228) into file-level named constants. This
includes predicate kinds like 'settled', 'signal', and 'allOf', signal names
like 'deploy:shipped' and 'x', URLs like '/api/x' and '/api/deploy', event
property names and values, and failure reason strings like 'not settled'.
Replace all occurrences of these inline literals throughout the test cases with
references to the corresponding named constants to ensure compliance with the
coding guideline that all domain/wire/UI strings must be named constants.
Source: Coding guidelines

packages/server/src/tools/tools.near-miss.test.ts (1)

79-127: ⚡ Quick win

Extract newly added event/tool literals to shared constants.

The new token-budget tests add inline domain/wire strings (URLs, messages, tool-facing literals). Please promote them to named constants for compliance and consistency with other fixtures.

As per coding guidelines, "All domain/wire/UI strings must be named constants, never free strings in code."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/tools/tools.near-miss.test.ts` around lines 79 - 127, The
token budget tests in the describe block contain hardcoded string literals for
URLs, console messages, and log levels that should be extracted to named
constants. Identify all inline strings in the test cases including the URLs
'/1', '/2', '/3' used in iris_network tests, the messages 'a', 'b', 'c' used in
iris_console tests, and the level string 'error' used in the iris_console test.
Create named constants for each of these values and replace the inline strings
throughout the test methods (in the ev() calls, handler calls, and expect
assertions) with references to these constants to ensure compliance with the
coding guideline that all domain/wire strings must be named constants.
Source: Coding guidelines

packages/server/src/tools/assert-grade.test.ts (1)

5-60: ⚡ Quick win

Apply the string-constant rule in this new test suite.

This file introduces many inline domain/wire strings in predicates and assertions. Please define named constants and reuse them across cases.

As per coding guidelines, "All domain/wire/UI strings must be named constants, never free strings in code."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/tools/assert-grade.test.ts` around lines 5 - 60, The test
suite in isPresenceOnlyAssertion contains many inline string literals for
predicate kinds (element, text, signal, net, allOf, not, route, settled,
console) and test data values (button, Saved, order:placed, /api/order, Done,
Welcome, /success, error, etc.). Define named constants at the top of the test
file for all these domain/wire/UI strings and replace each inline string literal
throughout all test cases in the describe block with references to the
corresponding constant. This ensures consistency and makes the strings reusable
across the test suite.
Source: Coding guidelines

packages/server/src/session/output-budget.ts (1)

46-46: ⚡ Quick win

Extract the recommendation message into a named constant.

This introduces a free domain/wire string inline in a .ts file; move it to a named constant and format it from parameters.

♻️ Suggested change

+const LARGE_TIMELINE_RECOMMENDATION_TEMPLATE =
+  'large timeline ({events} events, ~{tokens} tokens) — pass filters:[...] (e.g. ["signal","net"]) or max_events to scope your next call and cut tokens';
+
 export function costHint(payload: unknown, events: number, droppedOldest = 0): CostHint {
   const json = JSON.stringify(payload) ?? '';
   const bytes = json.length;
   const base: CostHint = droppedOldest > 0 ? { events, bytes, droppedOldest } : { events, bytes };
   if (events >= LARGE_TIMELINE_EVENTS || bytes >= LARGE_TIMELINE_BYTES) {
-    base.recommendation = `large timeline (${String(events)} events, ~${String(estimateTokens(json))} tokens) — pass filters:[...] (e.g. ["signal","net"]) or max_events to scope your next call and cut tokens`;
+    base.recommendation = LARGE_TIMELINE_RECOMMENDATION_TEMPLATE
+      .replace('{events}', String(events))
+      .replace('{tokens}', String(estimateTokens(json)));
   }
   return base;
 }

As per coding guidelines, "All domain/wire/UI strings must be named constants, never free strings in code."

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/session/output-budget.ts` at line 46, Extract the
hardcoded recommendation message string currently assigned to
base.recommendation into a named constant at the module level. The constant
should contain the template with placeholders for the dynamic values (events
count and estimated tokens), and then reference this constant in the assignment
while still providing the formatted parameters. This ensures the domain/wire
string is not a free string in the code but follows the guideline of being a
named constant.

Source: Coding guidelines

packages/server/src/init/run.test.ts (1)

61-67: ⚡ Quick win

Add a regression test for malformed package.json.

The suite checks the missing-file case but not invalid JSON content. A targeted test here will prevent reintroducing the crash path.

🧪 Suggested test case

 describe('runInit', () => {
   it('errors cleanly without a package.json', () => {
     const io = memoryIo({});
     const r = runInit(OPTS, io);
     expect(r.ok).toBe(false);
     expect(io.lines.join('\n')).toContain('No package.json');
   });

+  it('errors cleanly with an invalid package.json', () => {
+    const io = memoryIo({ 'package.json': '{ "name": "bad-json",' });
+    const r = runInit(OPTS, io);
+    expect(r.ok).toBe(false);
+    expect(io.lines.join('\n')).toContain('Invalid package.json');
+  });
+
   it('registers iris globally via the claude CLI (not a project .mcp.json) and patches vite', () => {

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/init/run.test.ts` around lines 61 - 67, Add a new
regression test case in the same test suite after the existing "errors cleanly
without a package.json" test. Create a test that verifies the runInit function
handles malformed JSON content gracefully. Use memoryIo to provide a
package.json file with invalid JSON syntax (such as incomplete or syntactically
broken JSON), call runInit with OPTS and the io object, and verify that the
result has ok set to false and that io.lines contains an appropriate error
message indicating the JSON parsing failure. This test will prevent crashes when
package.json exists but contains invalid JSON.

packages/server/src/init/vite-config.test.ts (1)

30-34: ⚡ Quick win

Add a regression case for “import exists, plugin missing”.

Current idempotency coverage only verifies fully patched input. Add a test where the config already imports @syrin/iris/vite but plugins lacks iris(); expected result should still be APPLY.

Suggested test

+  it('does not treat import-only configs as already patched', () => {
+    const src = `import { defineConfig } from 'vite';
+import { iris } from '`@syrin/iris/vite`';
+import react from '`@vitejs/plugin-react`';
+
+export default defineConfig({
+  plugins: [react()],
+});
+`;
+    const r = patchViteConfig(src);
+    expect(r.kind).toBe(VitePatchKind.APPLY);
+  });

Also applies to: 56-61

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/init/vite-config.test.ts` around lines 30 - 34, Add a new
test case after the existing idempotency test in the test file to cover the
regression scenario where a Vite config already has the import statement for
`@syrin/iris/vite` but is missing the iris() plugin in the plugins array. Create a
test config string that includes the import line but lacks the iris() function
call in the plugins configuration, then call patchViteConfig on this config and
assert that the returned kind is VitePatchKind.APPLY, confirming that the
function correctly identifies and applies the patch for this incomplete state.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/e2e/specs/new-features-test.mjs`:
- Around line 54-91: The test code lacks proper error handling and resource
cleanup. If any await statement throws an error before reaching b.close() and
server.close(), resources will leak and hang the CI. Wrap all the test logic
(from chromium.launch through the final chk call) in a try block, and move the
b.close() and server.close() calls into a finally block that runs regardless of
success or failure. Keep the process.exit call after the finally block to ensure
it only executes after cleanup is complete.

In `@packages/server/src/domain/domain-model.ts`:
- Around line 174-190: Extract all hardcoded strings in the coverage status
message building code into named constants at the module or class level. This
includes the "No saved flows yet" message, the template strings used for
constructing the parts array such as "flow"/"flows", "test first: ", "declared
signal(s) no flow asserts (", and "flow(s) assert no consequence". Replace each
free string in the code with a reference to its corresponding named constant to
ensure consistency and maintainability of domain strings throughout the
codebase.
- Around line 77-85: The flowSignals function incorrectly includes anchor
signals in addition to asserted signals, which causes flows that only reference
a signal anchor to be treated as tested even when they don't actually assert or
expect that signal. Remove the line that adds anchor signals to the set (the
condition checking if step.anchor.kind === AnchorKind.SIGNAL) so that only
signals that are explicitly asserted through expect.signal or success.signal are
counted as covered by the flow.

In `@packages/server/src/domain/domain-tools.ts`:
- Around line 23-37: The outputSchema.flows object definition is missing the
optional risk field that is actually returned by the handler's buildDomainModel
function call, causing a schema contract mismatch for consumers. Add an optional
risk field to the Zod schema object within the flows definition (around the
z.object block), using the same pattern as the other optional fields like
mustHold and warning, declaring it as z.string().optional() or with an
appropriate type based on what buildDomainModel returns.
- Around line 17-19: Extract all free string literals from the tool definition
(including the long description string in the description field, and any strings
at the locations mentioned in lines 31-33 and 57-58) into named constants at the
top of the file, then replace each free string with a reference to its
corresponding constant. This ensures all domain/wire/UI strings follow the
repository policy of being named constants rather than inline literals.

In `@packages/server/src/domain/flow-risk.ts`:
- Around line 64-75: The gradeRisk function currently maps
FlowAssertionGrade.PRESENCE_ONLY to RiskLevel.LOW, which incorrectly ranks it as
equivalent to consequence-asserting flows. Instead, change the return value for
the PRESENCE_ONLY condition to return a higher risk level (such as
RiskLevel.MEDIUM) to properly reflect that presence-only assertions represent a
weaker oracle and should be prioritized higher in triage according to the
module's risk prioritization strategy.

In `@packages/server/src/flows/flow-tools.ts`:
- Around line 109-117: The assertions schema object in the outputSchema (which
contains fields like grade, hasConsequenceAssertion, totalSteps,
consequenceSteps, weakSteps, and warning) is missing the successIsConsequence
field that is actually returned by the classifyFlowAssertions function on line
143. Add the successIsConsequence field to the z.object() schema definition with
an appropriate zod type (likely z.boolean()) to match the payload being returned
and prevent strict schema consumers from rejecting the response.
- Around line 246-247: The condition for `stepsClean` includes `steps.length >
0`, which prevents success verification when a flow has zero replayed steps.
This allows flows with declared `success` values but no steps to bypass the
`assertSuccess` check. Remove the `steps.length > 0` requirement from the
`stepsClean` variable definition so that success verification occurs regardless
of whether there are any steps, ensuring that any declared success consequence
is properly validated. Apply this same fix to the similar pattern at lines
483-484.

In `@packages/server/src/init/cursor.ts`:
- Around line 39-44: The parseConfig function currently returns a success result
with an empty config object when encountering non-object JSON roots or null
values, which incorrectly allows automatic rewrites that could overwrite
existing configuration. Modify the parseConfig function to return a result
indicating MANUAL handling mode instead of ok: true when the parsed JSON root is
not an object or is null. Additionally, apply the same MANUAL handling approach
for invalid mcpServers shapes in the configuration validation logic (in the
range around lines 54-67) to prevent unintended rewrites of existing
configuration content.

In `@packages/server/src/init/mcp.ts`:
- Around line 38-50: Extract the inline CLI command tokens ('add', 'user',
'get', '--version') into named constants at the module level, following the
no-free-strings rule. Define constants for each token (e.g., similar to how
CLAUDE_CLI, MCP_SUBCOMMAND, and MCP_SERVER_NAME are defined), then replace all
occurrences of these hardcoded strings in the command builder function and the
claudeExistsProbe and claudeAvailableProbe functions with references to the
newly created constants. This ensures all domain/wire strings are captured as
named constants throughout the module.

In `@packages/server/src/init/plan.ts`:
- Around line 68-255: Extract the inline string literals for step titles and
details throughout the file into named constants at the top, following the
pattern of CLAUDE_MCP_TITLE and CURSOR_MCP_TITLE. This applies to strings in the
installStep, viteSteps, nextSteps, and buildPlan functions including titles like
"Install dependency", "Vite plugin", "IrisDev component", "Next config
(withIris)", "Mount IrisDev", "Connect snippet", and their corresponding detail
messages. Replace each inline string literal with the appropriate constant
reference to ensure all domain/UI strings are centralized and follow the
repository's coding guidelines.

In `@packages/server/src/init/run.ts`:
- Around line 70-71: The JSON.parse call in the gatherPlanInput function at line
71 can throw an exception if the pkgRaw string contains malformed JSON, causing
the CLI to crash. Wrap the JSON.parse(pkgRaw) call in a try-catch block to
handle the parse error gracefully, and either throw a custom error with a
user-friendly message or return an error state that the caller can handle.
Similarly, check the related code around lines 157-166 where similar JSON
parsing may occur and apply the same error handling pattern to ensure all JSON
parsing in the runInit path handles malformed input without crashing.

In `@packages/server/src/init/vite-config.ts`:
- Around line 49-57: The ALREADY detection in patchViteConfig function is too
broad because checking only if the source includes IRIS_MARKER can match
occurrences in comments or imports without verifying the iris plugin is actually
in the plugins array. Replace the simple source.includes(IRIS_MARKER) check with
a more robust validation that confirms the iris() plugin call is present in the
plugins array, similar to how PLUGINS_ARRAY.test(source) validates the plugins
array structure. This ensures ALREADY is only returned when the plugin injection
is truly complete.

In `@packages/server/src/tools/tools.ts`:
- Line 1299: The tools.ts file exceeds the repository's 500-line file limit at
1299 lines. Refactor by extracting related tool definition groups (such as
snapshot tools, query tools, assert tools, and network-console tools) into
separate dedicated modules. Then import these extracted tool groups back into
tools.ts and compose them together in the DOMAIN_TOOLS spread operator to
maintain the same public API while keeping each file within the 500-line limit.
- Around line 339-356: The snapshot cache key generation in the handler function
uses a fallback value of 'default' when sessionId is not provided, causing
different sessions without an explicit sessionId to share the same cache entry
in the process-wide SNAPSHOT_CACHE. This creates privacy and correctness issues
as unrelated sessions can get deltas computed from another session's prior
snapshot. Remove the fallback to 'default' in the sessionId field passed to
applySnapshotDelta and ensure each session has a uniquely identifiable sessionId
rather than silently defaulting multiple sessions to the same cache key value.

---

Outside diff comments:
In `@docs/getting-started.md`:
- Line 443: The troubleshooting section mentions `@syrin/iris-babel-plugin` but
Step 3 of the guide references `@syrin/iris/babel`. Update the package name
reference at line 443 to use `@syrin/iris/babel` instead to match the package
name specified in Step 3, ensuring consistency throughout the documentation and
preventing setup confusion.

---

Minor comments:
In `@apps/e2e/specs/new-features-test.mjs`:
- Line 1: The file comment at the start of new-features-test.mjs contains the
internal changelog tag [Unreleased] which violates the coding guideline
prohibiting internal tracking tags in comments. Remove the reference to
[Unreleased] and the CHANGELOG section from the comment, keeping only the core
description of what the file does for live feature verification.

In `@CHANGELOG.md`:
- Line 7: The heading for version 0.6.10 uses markdown reference-link syntax
with [0.6.10], but there is no corresponding link definition at the bottom of
the CHANGELOG.md file to complete the reference. Add a link definition at the
end of the file following the pattern of other version links in the document to
match the [0.6.10] reference in the heading and enable proper hyperlinking for
this version entry.

In `@docs/usage.md`:
- Line 267: The `iris_domain({})` return type documentation at line 267 of
docs/usage.md is missing the `mustHold?` optional property from the flow object
shape. Add `mustHold?` to the documented flow properties in the return type
definition to match what `buildDomainModel` actually sets in the
packages/server/src/domain/domain-model.ts file when a flow has a success
condition.

In `@packages/browser/src/observers/console.test.ts`:
- Around line 27-36: The test verifies event emission but does not assert that
the original console.error method is actually being forwarded. Add a spy or mock
of the original console.error method before calling installConsole, then after
the console.error call, add an assertion to verify that the original
console.error was invoked with the correct arguments ('boom' and 42) to ensure
the forwarding behavior is actually covered by the test.

In `@packages/server/src/flows/flows.heal.test.ts`:
- Around line 85-93: The test mock session needs to be made cursor-aware to
properly exercise stale-signal protection. Modify the mock session to track a
cursor position and elapsed time: make eventsSince() accept and use a cursor
parameter to filter events rather than returning all events, and make elapsed()
return an actual tracked value (either incrementing or controllable) instead of
always returning 0. This will ensure that heal(..., apply: true) verification
with the since floor from session.elapsed() properly validates that buffered
signals outside the time window are considered stale.

In `@packages/server/src/init/mcp.ts`:
- Around line 42-45: The claudeExistsProbe() function and other functions in
this file use inline string literals for CLI command arguments (such as 'get',
'add', 'user', '-s', '--', '--version') instead of named constants. Define named
constants at the top of the file for each of these CLI command tokens (e.g.,
MCP_GET_SUBCOMMAND, MCP_ADD_SUBCOMMAND, SCOPE_USER, etc.) following the same
naming pattern as the existing CLAUDE_CLI, MCP_SUBCOMMAND, and MCP_SERVER_NAME
constants, then replace all inline string occurrences throughout the file with
references to these new constants.

In `@packages/server/src/tools/assert-grade.ts`:
- Around line 40-46: The walk function's handling of allOf and anyOf predicates
incorrectly marks mixed predicates (containing both presence-only and
non-presence predicates) as presence-only. In the return statement for these
cases, the presence property is computed using subs.some() which returns true if
ANY sub-predicate is presence-only; this should instead use subs.every() to
return true only when ALL sub-predicates are presence-only. This ensures that a
combined predicate is only flagged as presence-only when all of its constituent
predicates are presence-only, fixing the false advice emission issue that
contradicts the file's contract. Apply the same fix to both the allOf/anyOf case
block starting around line 42 and the similar logic at line 58-60.

In `@packages/server/src/tools/snapshot-cost.test.ts`:
- Around line 6-11: Remove the internal design-doc identifiers M0 and M2 from
the comment block in the snapshot-cost.test.ts file. Specifically, remove "M0 —
a reproducible measurement of the" and replace it with a direct description, and
remove "M2" references so the comment describes the measurement approach without
internal tracking codes. Apply the same fix to all other locations in the file
where these design-doc reference patterns appear (at lines 39-40, 59-60, 78, and
87-88), ensuring all letter-digit patterns like N5, G4, M8, etc. are removed
from comments, test titles, and log labels throughout the file.

---

Nitpick comments:
In `@packages/server/src/domain/domain-model.test.ts`:
- Around line 59-67: Add a new test case in the test file after the existing
test that verifies when a flow defines a signal anchor (via the signal property
in flow()) but does not assert it with expect.signal or success.signal, the
signal is still properly flagged in declaredUntestedSignals. Use
buildDomainModel, flow, and contract functions similar to the existing test
pattern to construct a scenario where a flow references a signal without
asserting it, then verify that signal appears in the
gaps.declaredUntestedSignals array to protect the stated asserted signal
contract.

In `@packages/server/src/domain/flow-risk.test.ts`:
- Around line 29-52: The test suite for flowRisk is missing coverage for the
FlowAssertionGrade.PRESENCE_ONLY case, which could allow regressions to go
undetected. Add a new test case in the describe block that exercises the
PRESENCE_ONLY grade by calling flowRisk with FlowAssertionGrade.PRESENCE_ONLY as
the first argument and a run result as the second argument (similar to the
existing test cases like the assertion-free flow test), then assert the expected
RiskLevel that should be returned for this grade to ensure the branch is
properly tested and documented.

In `@packages/server/src/events/predicate.test.ts`:
- Around line 228-311: Extract all inline domain/wire/UI strings from the
settled predicate test suite (the describe block starting at line 228) into
file-level named constants. This includes predicate kinds like 'settled',
'signal', and 'allOf', signal names like 'deploy:shipped' and 'x', URLs like
'/api/x' and '/api/deploy', event property names and values, and failure reason
strings like 'not settled'. Replace all occurrences of these inline literals
throughout the test cases with references to the corresponding named constants
to ensure compliance with the coding guideline that all domain/wire/UI strings
must be named constants.

In `@packages/server/src/events/predicate.ts`:
- Around line 363-373: The inline strings `'no activity to settle'` (used as the
note value) and the failureReason string template (used in the failureReason
property) need to be extracted to named constants following the repository's
coding guidelines. Define these strings as named constants at the module level
or in a centralized constants location, then replace the inline string literals
with references to those constants. Ensure the failureReason constant properly
handles the dynamic values by using a template string with the placeholders for
lastType, quietForMs, and quietMs.

In `@packages/server/src/flows/flow-success.ts`:
- Around line 30-75: Extract all hardcoded string literals from the successLabel
and successToPredicate functions into named constants at the module level.
Replace all occurrences of 'signal', 'net', 'element', 'allOf', 'testid',
'role', 'name', and 'success' string literals with these named constants
throughout both functions to comply with the domain/wire string literal
guidelines.

In `@packages/server/src/init/run.test.ts`:
- Around line 61-67: Add a new regression test case in the same test suite after
the existing "errors cleanly without a package.json" test. Create a test that
verifies the runInit function handles malformed JSON content gracefully. Use
memoryIo to provide a package.json file with invalid JSON syntax (such as
incomplete or syntactically broken JSON), call runInit with OPTS and the io
object, and verify that the result has ok set to false and that io.lines
contains an appropriate error message indicating the JSON parsing failure. This
test will prevent crashes when package.json exists but contains invalid JSON.

In `@packages/server/src/init/vite-config.test.ts`:
- Around line 30-34: Add a new test case after the existing idempotency test in
the test file to cover the regression scenario where a Vite config already has
the import statement for `@syrin/iris/vite` but is missing the iris() plugin in
the plugins array. Create a test config string that includes the import line but
lacks the iris() function call in the plugins configuration, then call
patchViteConfig on this config and assert that the returned kind is
VitePatchKind.APPLY, confirming that the function correctly identifies and
applies the patch for this incomplete state.

In `@packages/server/src/session/output-budget.ts`:
- Line 46: Extract the hardcoded recommendation message string currently
assigned to base.recommendation into a named constant at the module level. The
constant should contain the template with placeholders for the dynamic values
(events count and estimated tokens), and then reference this constant in the
assignment while still providing the formatted parameters. This ensures the
domain/wire string is not a free string in the code but follows the guideline of
being a named constant.

In `@packages/server/src/tools/assert-grade.test.ts`:
- Around line 5-60: The test suite in isPresenceOnlyAssertion contains many
inline string literals for predicate kinds (element, text, signal, net, allOf,
not, route, settled, console) and test data values (button, Saved, order:placed,
/api/order, Done, Welcome, /success, error, etc.). Define named constants at the
top of the test file for all these domain/wire/UI strings and replace each
inline string literal throughout all test cases in the describe block with
references to the corresponding constant. This ensures consistency and makes the
strings reusable across the test suite.

In `@packages/server/src/tools/tools.near-miss.test.ts`:
- Around line 79-127: The token budget tests in the describe block contain
hardcoded string literals for URLs, console messages, and log levels that should
be extracted to named constants. Identify all inline strings in the test cases
including the URLs '/1', '/2', '/3' used in iris_network tests, the messages
'a', 'b', 'c' used in iris_console tests, and the level string 'error' used in
the iris_console test. Create named constants for each of these values and
replace the inline strings throughout the test methods (in the ev() calls,
handler calls, and expect assertions) with references to these constants to
ensure compliance with the coding guideline that all domain/wire strings must be
named constants.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 47fd35ff-a94f-48c2-880e-ff8bf8781c13

📥 Commits

Reviewing files that changed from the base of the PR and between ff54d8d and 3b0e8e0.

⛔ Files ignored due to path filters (1)

pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

📒 Files selected for processing (89)

CHANGELOG.md
apps/e2e/specs/new-features-test.mjs
docs/agent-cheatsheet.md
docs/getting-started.md
docs/integrate-with-claude-code.md
docs/token-efficiency.md
docs/usage.md
packages/babel-plugin/package.json
packages/browser/package.json
packages/browser/src/observers/console.test.ts
packages/browser/src/observers/console.ts
packages/browser/src/observers/network.test.ts
packages/browser/src/observers/network.ts
packages/browser/src/observers/route.test.ts
packages/browser/src/observers/route.ts
packages/eslint-plugin/package.json
packages/iris/package.json
packages/iris/src/vite.ts
packages/iris/tsup.config.ts
packages/next/package.json
packages/protocol/package.json
packages/protocol/src/constants.ts
packages/protocol/src/types.ts
packages/react/package.json
packages/server/package.json
packages/server/src/cli.test.ts
packages/server/src/cli.ts
packages/server/src/domain/domain-model.test.ts
packages/server/src/domain/domain-model.ts
packages/server/src/domain/domain-tools.ts
packages/server/src/domain/flow-risk.test.ts
packages/server/src/domain/flow-risk.ts
packages/server/src/events/predicate.test.ts
packages/server/src/events/predicate.ts
packages/server/src/flows/flow-classify.test.ts
packages/server/src/flows/flow-classify.ts
packages/server/src/flows/flow-replay.dynamic.test.ts
packages/server/src/flows/flow-replay.test.ts
packages/server/src/flows/flow-replay.ts
packages/server/src/flows/flow-success.test.ts
packages/server/src/flows/flow-success.ts
packages/server/src/flows/flow-tools.ts
packages/server/src/flows/flows.heal.test.ts
packages/server/src/flows/flows.ts
packages/server/src/flows/heal-proposal.test.ts
packages/server/src/flows/heal.ts
packages/server/src/flows/tools.flow-replay.test.ts
packages/server/src/flows/tools.flows.test.ts
packages/server/src/index.ts
packages/server/src/init/cursor.test.ts
packages/server/src/init/cursor.ts
packages/server/src/init/detect.test.ts
packages/server/src/init/detect.ts
packages/server/src/init/mcp.test.ts
packages/server/src/init/mcp.ts
packages/server/src/init/node-io.ts
packages/server/src/init/plan.test.ts
packages/server/src/init/plan.ts
packages/server/src/init/run.test.ts
packages/server/src/init/run.ts
packages/server/src/init/snippets.ts
packages/server/src/init/vite-config.test.ts
packages/server/src/init/vite-config.ts
packages/server/src/session/output-budget.test.ts
packages/server/src/session/output-budget.ts
packages/server/src/session/tools.live-control.test.ts
packages/server/src/tools/assert-grade.test.ts
packages/server/src/tools/assert-grade.ts
packages/server/src/tools/query-paginate.test.ts
packages/server/src/tools/query-paginate.ts
packages/server/src/tools/snapshot-cost.test.ts
packages/server/src/tools/snapshot-delta.test.ts
packages/server/src/tools/snapshot-delta.ts
packages/server/src/tools/tool-names.ts
packages/server/src/tools/tools.assert-advice.test.ts
packages/server/src/tools/tools.near-miss.test.ts
packages/server/src/tools/tools.ts
packages/test/package.json
packages/test/src/flow-spec.test.ts
packages/test/src/register.test.ts
packages/test/src/success-assert.test.ts
packages/test/src/success-assert.ts
packages/vite-plugin/README.md
packages/vite-plugin/package.json
packages/vite-plugin/src/build.integration.test.ts
packages/vite-plugin/src/index.test.ts
packages/vite-plugin/src/index.ts
packages/vite-plugin/tsconfig.json
tsconfig.json

coderabbitai · 2026-06-18T11:50:21Z

+const b = await chromium.launch({ headless: true });
+const p = await b.newPage();
+await p.goto('http://localhost:4310/?session=demo', { waitUntil: 'networkidle' });
+for (let i = 0; i < 200 && server.bridge.sessions.count() === 0; i++) await sleep(50);
+
+console.log('\n=== Iris × new features (:4310) ===');
+chk('dashboard SDK connected', server.bridge.sessions.count() > 0);
+
+// count_only — just the match count, no descriptors.
+const co = await T('iris_query', { by: 'role', value: 'button', count_only: true });
+chk('iris_query count_only returns a count, drops elements', typeof co.count === 'number' && co.count >= 1 && co.elements === undefined, `count=${co.count}`);
+
+// limit — cap descriptors; when more matched, total + truncated flag it.
+const lim = await T('iris_query', { by: 'role', value: 'button', limit: 1 });
+const moreThanOne = (co.count ?? 0) > 1;
+chk('iris_query limit caps descriptors (truncated when more)', (lim.elements?.length ?? 0) <= 1 && (!moreThanOne || (lim.truncated === true && lim.total === co.count)), `returned=${lim.elements?.length}, total=${lim.total ?? 'n/a'}`);
+
+// Auth (pre-filled) → dashboard with its count-up animations.
+await T('iris_act_and_wait', { ref: await refOf('testid', 'login-submit'), action: 'click', until: { kind: 'signal', name: 'auth:granted' }, timeout_ms: 5000 });
+chk('login → dashboard', (await refOf('testid', 'nav-deployments')) !== null);
+
+// settled wait — the dashboard's count-up counters emit dom.text every frame; settle must STILL
+// resolve (the ambient-animation fix). Pre-fix this would time out at 4s with pass:false.
+const settled = await T('iris_wait_for', { predicate: { kind: 'settled', quietMs: 300 }, timeout_ms: 4000 });
+chk('settled resolves despite count-up animation churn', settled.pass === true, JSON.stringify(settled.evidence ?? {}));
+
+// act_and_wait with NO `until` → auto-settle after a nav click; verdict carries settled evidence.
+const aw = await T('iris_act_and_wait', { ref: await refOf('testid', 'nav-deployments'), action: 'click' });
+chk('act_and_wait (no until) auto-settles', aw.verdict?.pass === true && aw.verdict?.evidence?.settled === true, JSON.stringify(aw.verdict?.evidence ?? {}));
+
+// presence-only advice — a PASSING element assertion is nudged toward a consequence.
+const adv = await T('iris_assert', { predicate: { kind: 'element', query: { testid: 'deploy-list' } } });
+chk('iris_assert presence-only attaches advice', adv.pass === true && typeof adv.advice === 'string' && adv.advice.includes('consequence'), adv.advice ? 'advice present' : 'no advice');
+
+console.log(`\n${fail === 0 ? '✅ NEW FEATURES VERIFIED' : '❌ FAILED'} (${pass} passed, ${fail} failed)`);
+await b.close();
+await server.close();
+process.exit(fail === 0 ? 0 : 1);


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guarantee teardown with try/finally around the e2e run.

Any thrown await in the flow skips b.close()/server.close(), which can leak resources and hang CI jobs.

🛠️ Suggested hardening

-const b = await chromium.launch({ headless: true }); -const p = await b.newPage(); -await p.goto('http://localhost:4310/?session=demo', { waitUntil: 'networkidle' }); -for (let i = 0; i < 200 && server.bridge.sessions.count() === 0; i++) await sleep(50); +let b; +let p; +try { + b = await chromium.launch({ headless: true }); + p = await b.newPage(); + await p.goto('http://localhost:4310/?session=demo', { waitUntil: 'networkidle' }); + for (let i = 0; i < 200 && server.bridge.sessions.count() === 0; i++) await sleep(50); @@ -console.log(`\n${fail === 0 ? '✅ NEW FEATURES VERIFIED' : '❌ FAILED'} (${pass} passed, ${fail} failed)`); -await b.close(); -await server.close(); + console.log(`\n${fail === 0 ? '✅ NEW FEATURES VERIFIED' : '❌ FAILED'} (${pass} passed, ${fail} failed)`); +} finally { + await p?.close(); + await b?.close(); + await server.close(); +} process.exit(fail === 0 ? 0 : 1);

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/e2e/specs/new-features-test.mjs` around lines 54 - 91, The test code lacks proper error handling and resource cleanup. If any await statement throws an error before reaching b.close() and server.close(), resources will leak and hang the CI. Wrap all the test logic (from chromium.launch through the final chk call) in a try block, and move the b.close() and server.close() calls into a finally block that runs regardless of success or failure. Keep the process.exit call after the finally block to ensure it only executes after cleanup is complete.

coderabbitai · 2026-06-18T11:50:21Z

+function flowSignals(flow: FlowFile): string[] {
+  const set = new Set<string>();
+  for (const step of flatten(flow.steps)) {
+    if (step.anchor.kind === AnchorKind.SIGNAL) set.add(step.anchor.name);
+    if (step.expect?.signal !== undefined) set.add(step.expect.signal);
+  }
+  if (flow.success?.signal !== undefined) set.add(flow.success.signal);
+  return [...set];
+}


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Signal-gap computation currently over-credits non-asserting flows.

On Line 80, flowSignals includes anchor signals, but declaredUntestedSignals is defined as signals that no flow asserts. A flow that only references a signal anchor (without expect.signal/success.signal) will be treated as covered, which can suppress real intent gaps.

Suggested fix

function flowSignals(flow: FlowFile): string[] { const set = new Set<string>(); for (const step of flatten(flow.steps)) { - if (step.anchor.kind === AnchorKind.SIGNAL) set.add(step.anchor.name); if (step.expect?.signal !== undefined) set.add(step.expect.signal); } if (flow.success?.signal !== undefined) set.add(flow.success.signal); return [...set]; }

Also applies to: 133-136

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/server/src/domain/domain-model.ts` around lines 77 - 85, The flowSignals function incorrectly includes anchor signals in addition to asserted signals, which causes flows that only reference a signal anchor to be treated as tested even when they don't actually assert or expect that signal. Remove the line that adds anchor signals to the set (the condition checking if step.anchor.kind === AnchorKind.SIGNAL) so that only signals that are explicitly asserted through expect.signal or success.signal are counted as covered by the flow.

coderabbitai · 2026-06-18T11:50:21Z

+    return 'No saved flows yet — record the critical journeys (iris_record_start) so the agent learns the app.';
+  }
+  const parts = [
+    `${String(flowCount)} flow${flowCount === 1 ? '' : 's'}: ${String(coverage.asserted)} asserted, ${String(coverage.presenceOnly)} presence-only, ${String(coverage.assertionFree)} assertion-free`,
+  ];
+  if (topRisk !== undefined) {
+    parts.push(`test first: ${topRisk.name} (${topRisk.reason})`);
+  }
+  if (gaps.declaredUntestedSignals.length > 0) {
+    parts.push(
+      `${String(gaps.declaredUntestedSignals.length)} declared signal(s) no flow asserts (${gaps.declaredUntestedSignals.join(', ')})`,
+    );
+  }
+  if (gaps.unassertedFlows.length > 0) {
+    parts.push(`${String(gaps.unassertedFlows.length)} flow(s) assert no consequence`);
+  }
+  return parts.join('. ') + '.';


🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Extract runtime/domain strings into named constants.

Lines 174–190 introduce multiple free strings in tool-facing output. This violates the repo rule and makes wording changes harder to maintain consistently.

As per coding guidelines, “All domain/wire/UI strings must be named constants, never free strings in code.”

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/server/src/domain/domain-model.ts` around lines 174 - 190, Extract all hardcoded strings in the coverage status message building code into named constants at the module or class level. This includes the "No saved flows yet" message, the template strings used for constructing the parts array such as "flow"/"flows", "test first: ", "declared signal(s) no flow asserts (", and "flow(s) assert no consequence". Replace each free string in the code with a reference to its corresponding named constant to ensure consistency and maintainability of domain strings throughout the codebase.

Source: Coding guidelines

coderabbitai · 2026-06-18T11:50:21Z

+    description:
+      'Read the app domain model BEFORE testing: every saved flow with its assertion grade, the consequence that MUST hold for it (mustHold = what it actually tests), the anchors/signals it exercises, plus GAPS — declared signals/testids that NO flow asserts (untested intent), and flows that assert no observable consequence. Use this to decide what to test and where the real risk is, instead of crawling the whole app. Reads .iris/flows/ + .iris/contract.json (no browser needed).',
+    inputSchema: {},


🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Move tool-facing strings to named constants.

This file introduces multiple free strings in tool description and schema metadata. Please extract them to constants to satisfy the repo string policy.

As per coding guidelines, “All domain/wire/UI strings must be named constants, never free strings in code.”

Also applies to: 31-33, 57-58

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/server/src/domain/domain-tools.ts` around lines 17 - 19, Extract all free string literals from the tool definition (including the long description string in the description field, and any strings at the locations mentioned in lines 31-33 and 57-58) into named constants at the top of the file, then replace each free string with a reference to its corresponding constant. This ensures all domain/wire/UI strings follow the repository policy of being named constants rather than inline literals.

Source: Coding guidelines

coderabbitai · 2026-06-18T11:50:21Z

+        z.object({
+          name: z.string(),
+          steps: z.number(),
+          grade: z.string(),
+          asserts: z.boolean(),
+          mustHold: z
+            .string()
+            .optional()
+            .describe(
+              'The success consequence that must hold for this flow (what it actually tests).',
+            ),
+          warning: z.string().optional(),
+          signals: z.array(z.string()),
+          testids: z.array(z.string()),
+        }),


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

outputSchema.flows is missing the optional risk field returned by the handler.

The handler returns buildDomainModel(...) (Line 71), and DomainFlowSummary includes risk when run history exists. But outputSchema.flows doesn’t declare it, creating a wire-contract mismatch for schema-aware consumers.

Suggested schema addition

flows: z.array( z.object({ name: z.string(), steps: z.number(), grade: z.string(), asserts: z.boolean(), @@ warning: z.string().optional(), signals: z.array(z.string()), testids: z.array(z.string()), + risk: z + .object({ + level: z.string(), + reason: z.string(), + lastStatus: z.string().optional(), + }) + .optional(), }), ),

Also applies to: 61-72

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/server/src/domain/domain-tools.ts` around lines 23 - 37, The outputSchema.flows object definition is missing the optional risk field that is actually returned by the handler's buildDomainModel function call, causing a schema contract mismatch for consumers. Add an optional risk field to the Zod schema object within the flows definition (around the z.object block), using the same pattern as the other optional fields like mustHold and warning, declaring it as z.string().optional() or with an appropriate type based on what buildDomainModel returns.

coderabbitai · 2026-06-18T11:50:21Z

+const CLAUDE_MCP_TITLE = 'MCP server (Claude, global)';
+const CURSOR_MCP_TITLE = 'MCP server (Cursor, global)';
+
+function claudeMcpStep(input: PlanInput): Step | null {
+  if (!input.claudeCli) return null;
+  if (input.mcpExists) {
+    return {
+      title: CLAUDE_MCP_TITLE,
+      target: MCP_TARGET,
+      status: StepStatus.ALREADY,
+      detail: 'iris already registered (install once, used by every project)',
+    };
+  }
+  const cmd = claudeAddCommand(input.options.port);
+  return {
+    title: CLAUDE_MCP_TITLE,
+    target: MCP_TARGET,
+    status: StepStatus.APPLY,
+    detail: 'register iris globally for all projects',
+    exec: { command: cmd.command, args: cmd.args, fallback: cmd.display },
+  };
+}
+
+function cursorMcpStep(input: PlanInput): Step | null {
+  if (!input.cursorPresent) return null;
+  const r = mergeCursorConfig(input.cursorConfig, input.options.port);
+  if (r.status === CursorMergeStatus.ALREADY) {
+    return {
+      title: CURSOR_MCP_TITLE,
+      target: input.cursorConfigPath,
+      status: StepStatus.ALREADY,
+      detail: 'iris already in Cursor global config',
+    };
+  }
+  if (r.status === CursorMergeStatus.MANUAL) {
+    return {
+      title: CURSOR_MCP_TITLE,
+      target: input.cursorConfigPath,
+      status: StepStatus.MANUAL,
+      detail: `couldn't parse ${input.cursorConfigPath} — add this server by hand:\n  "iris": ${JSON.stringify(cursorServerEntry(input.options.port))}`,
+    };
+  }
+  return {
+    title: CURSOR_MCP_TITLE,
+    target: input.cursorConfigPath,
+    status: StepStatus.APPLY,
+    detail: 'register iris in Cursor global config',
+    write: { path: input.cursorConfigPath, content: r.content },
+  };
+}
+
+/** One global registration per detected agent (Claude + Cursor). Falls back to a manual note. */
+function mcpSteps(input: PlanInput): Step[] {
+  if (!input.options.mcp) {
+    return [
+      {
+        title: 'MCP server (global)',
+        target: MCP_TARGET,
+        status: StepStatus.SKIP,
+        detail: '--no-mcp',
+      },
+    ];
+  }
+  const steps = [claudeMcpStep(input), cursorMcpStep(input)].filter((s): s is Step => s !== null);
+  if (steps.length > 0) return steps;
+  // No supported agent detected — print the one-time global instructions.
+  return [
+    {
+      title: 'MCP server (global)',
+      target: MCP_TARGET,
+      status: StepStatus.MANUAL,
+      detail: mcpManual(input.options.port),
+    },
+  ];
+}
+
+function installStep(input: PlanInput): Step {
+  const pm = input.detection.packageManager;
+  const command = installCommand(pm, IRIS_PACKAGE);
+  if (!input.options.install) {
+    return {
+      title: 'Install dependency',
+      target: 'package.json',
+      status: StepStatus.MANUAL,
+      detail: command,
+    };
+  }
+  const parts = installCommandParts(pm, IRIS_PACKAGE);
+  return {
+    title: 'Install dependency',
+    target: 'package.json',
+    status: StepStatus.APPLY,
+    detail: command,
+    exec: { command: parts.command, args: parts.args, fallback: command },
+  };
+}
+
+function viteSteps(input: PlanInput): Step[] {
+  const cfg = input.viteConfig;
+  const port = input.options.port;
+  if (cfg === null) {
+    return [
+      {
+        title: 'Vite plugin',
+        target: 'vite.config',
+        status: StepStatus.MANUAL,
+        detail: viteManual(port),
+      },
+    ];
+  }
+  const patch = patchViteConfig(cfg.source, port);
+  if (patch.kind === VitePatchKind.ALREADY) {
+    return [
+      {
+        title: 'Vite plugin',
+        target: cfg.path,
+        status: StepStatus.ALREADY,
+        detail: 'iris() already in plugins',
+      },
+    ];
+  }
+  if (patch.kind === VitePatchKind.MANUAL) {
+    return [
+      {
+        title: 'Vite plugin',
+        target: cfg.path,
+        status: StepStatus.MANUAL,
+        detail: `${patch.reason}\n\n${viteManual(port)}`,
+      },
+    ];
+  }
+  return [
+    {
+      title: 'Vite plugin',
+      target: cfg.path,
+      status: StepStatus.APPLY,
+      detail: 'add iris() to plugins (also injects connect())',
+      write: { path: cfg.path, content: patch.code },
+    },
+  ];
+}
+
+function nextSteps(input: PlanInput): Step[] {
+  const configFile = input.nextConfigFile ?? 'next.config.mjs';
+  const devFile: Step = input.nextIrisDevExists
+    ? {
+        title: 'IrisDev component',
+        target: NEXT_IRIS_DEV_PATH,
+        status: StepStatus.ALREADY,
+        detail: 'file exists',
+      }
+    : {
+        title: 'IrisDev component',
+        target: NEXT_IRIS_DEV_PATH,
+        status: StepStatus.APPLY,
+        detail: 'create dev-only connect component',
+        write: { path: NEXT_IRIS_DEV_PATH, content: nextIrisDevFile(input.options.port) },
+      };
+  return [
+    devFile,
+    {
+      title: 'Next config (withIris)',
+      target: configFile,
+      status: StepStatus.MANUAL,
+      detail: nextConfigManual(configFile),
+    },
+    {
+      title: 'Mount IrisDev',
+      target: 'app/layout.tsx',
+      status: StepStatus.MANUAL,
+      detail: NEXT_LAYOUT_MANUAL,
+    },
+  ];
+}
+
+export function buildPlan(input: PlanInput): Plan {
+  const steps: Step[] = [...mcpSteps(input), installStep(input)];
+  if (input.detection.framework === Framework.VITE) {
+    steps.push(...viteSteps(input));
+  } else if (input.detection.framework === Framework.NEXT) {
+    steps.push(...nextSteps(input));
+  } else {
+    steps.push({
+      title: 'Connect snippet',
+      target: 'index.html',
+      status: StepStatus.MANUAL,
+      detail: htmlManual(input.options.port),
+    });


🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Extract remaining step title/detail literals into named constants.

This file still has many inline domain/UI strings in plan steps; centralizing them as constants will align with the repo rule and reduce drift in CLI messaging.

As per coding guidelines, All domain/wire/UI strings must be named constants, never free strings in code.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/server/src/init/plan.ts` around lines 68 - 255, Extract the inline string literals for step titles and details throughout the file into named constants at the top, following the pattern of CLAUDE_MCP_TITLE and CURSOR_MCP_TITLE. This applies to strings in the installStep, viteSteps, nextSteps, and buildPlan functions including titles like "Install dependency", "Vite plugin", "IrisDev component", "Next config (withIris)", "Mount IrisDev", "Connect snippet", and their corresponding detail messages. Replace each inline string literal with the appropriate constant reference to ensure all domain/UI strings are centralized and follow the repository's coding guidelines.

Source: Coding guidelines

coderabbitai · 2026-06-18T11:50:21Z

+function gatherPlanInput(options: InitOptions, io: InitIo, pkgRaw: string): PlanInput {
+  const pkg: unknown = JSON.parse(pkgRaw);


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle malformed package.json without crashing iris init.

Line 71 can throw during parse, and that exception is currently uncaught in the runInit path. This turns a recoverable input error into a hard CLI failure.

💡 Proposed fix

export function runInit(options: InitOptions, io: InitIo): InitResult { const pkgRaw = io.readFile(PACKAGE_JSON); if (pkgRaw === null) { io.print('No package.json found. Run `iris init` from your project root.'); return { ok: false, applied: 0, manual: 0 }; } - const plan = buildPlan(gatherPlanInput(options, io, pkgRaw)); + let plan: Plan; + try { + plan = buildPlan(gatherPlanInput(options, io, pkgRaw)); + } catch { + io.print('Invalid package.json. Fix JSON syntax and re-run `iris init`.'); + return { ok: false, applied: 0, manual: 0 }; + } const failed = options.dryRun ? new Set<string>() : applyEffects(plan, io); return report(plan, options.dryRun, failed, io); }

Also applies to: 157-166

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/server/src/init/run.ts` around lines 70 - 71, The JSON.parse call in the gatherPlanInput function at line 71 can throw an exception if the pkgRaw string contains malformed JSON, causing the CLI to crash. Wrap the JSON.parse(pkgRaw) call in a try-catch block to handle the parse error gracefully, and either throw a custom error with a user-friendly message or return an error state that the caller can handle. Similarly, check the related code around lines 157-166 where similar JSON parsing may occur and apply the same error handling pattern to ensure all JSON parsing in the runInit path handles malformed input without crashing.

coderabbitai · 2026-06-18T11:50:21Z

+export function patchViteConfig(source: string, port?: number): VitePatch {
+  if (source.includes(IRIS_MARKER)) {
+    return { kind: VitePatchKind.ALREADY };
+  }
+  if (!PLUGINS_ARRAY.test(source)) {
+    return { kind: VitePatchKind.MANUAL, reason: NO_PLUGINS_REASON };
+  }
+  return { kind: VitePatchKind.APPLY, code: insertImport(insertPlugin(source, port)) };
+}


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

ALREADY detection is too broad and can skip required plugin injection.

Checking only @syrin/iris/vite presence can return ALREADY when the config has the import (or even just a comment) but no iris(...) in plugins, leaving setup incomplete.

Suggested fix

const IRIS_MARKER = '`@syrin/iris/vite`'; +const IRIS_PLUGIN_IN_PLUGINS = /plugins\s*:\s*\[[\s\S]*?\biris\s*\(/; function insertImport(source: string): string { + if (source.includes(IRIS_MARKER)) return source; const matches = [...source.matchAll(IMPORT_LINE)]; const last = matches[matches.length - 1]; if (last?.index === undefined) { return `${VITE_IMPORT}\n${source}`; } const end = last.index + last[0].length; return `${source.slice(0, end)}\n${VITE_IMPORT}${source.slice(end)}`; } export function patchViteConfig(source: string, port?: number): VitePatch { - if (source.includes(IRIS_MARKER)) { + if (source.includes(IRIS_MARKER) && IRIS_PLUGIN_IN_PLUGINS.test(source)) { return { kind: VitePatchKind.ALREADY }; }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/server/src/init/vite-config.ts` around lines 49 - 57, The ALREADY detection in patchViteConfig function is too broad because checking only if the source includes IRIS_MARKER can match occurrences in comments or imports without verifying the iris plugin is actually in the plugins array. Replace the simple source.includes(IRIS_MARKER) check with a more robust validation that confirms the iris() plugin call is present in the plugins array, similar to how PLUGINS_ARRAY.test(source) validates the plugins array structure. This ensures ALREADY is only returned when the plugin injection is truly complete.

coderabbitai · 2026-06-18T11:50:21Z

+    handler: (deps, args) => {
+      const sessionId = asString(args['sessionId']);
+      const mode = asString(args['mode']) ?? SnapshotMode.FULL;
+      return commandOrThrow(deps, sessionId, IrisCommand.SNAPSHOT, {
        scope: args['scope'],
-        mode: args['mode'] ?? SnapshotMode.FULL,
-      }),
+        mode,
+      }).then((raw) =>
+        withSizeCost(
+          applySnapshotDelta(
+            raw,
+            {
+              sessionId: sessionId ?? 'default',
+              scope: asString(args['scope']) ?? '',
+              mode,
+              diff: args['diff'] === true,
+            },
+            SNAPSHOT_CACHE,
+          ),


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Snapshot diff cache key can collide across sessions when sessionId is omitted.

At Line 350, sessionId ?? 'default' is used with a process-wide SNAPSHOT_CACHE. Different sessions that omit sessionId can share one cache key and get deltas computed from another session’s prior snapshot, which is both incorrect and a privacy boundary risk.

Proposed fix

handler: (deps, args) => { - const sessionId = asString(args['sessionId']); + const requestedSessionId = asString(args['sessionId']); + const resolvedSessionId = deps.sessions.resolve(requestedSessionId).id; const mode = asString(args['mode']) ?? SnapshotMode.FULL; - return commandOrThrow(deps, sessionId, IrisCommand.SNAPSHOT, { + return commandOrThrow(deps, resolvedSessionId, IrisCommand.SNAPSHOT, { scope: args['scope'], mode, }).then((raw) => withSizeCost( applySnapshotDelta( raw, { - sessionId: sessionId ?? 'default', + sessionId: resolvedSessionId, scope: asString(args['scope']) ?? '', mode, diff: args['diff'] === true, }, SNAPSHOT_CACHE, ), ), ); },

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/server/src/tools/tools.ts` around lines 339 - 356, The snapshot cache key generation in the handler function uses a fallback value of 'default' when sessionId is not provided, causing different sessions without an explicit sessionId to share the same cache entry in the process-wide SNAPSHOT_CACHE. This creates privacy and correctness issues as unrelated sessions can get deltas computed from another session's prior snapshot. Remove the fallback to 'default' in the sessionId field passed to applySnapshotDelta and ensure each session has a uniquely identifiable sessionId rather than silently defaulting multiple sessions to the same cache key value.

coderabbitai · 2026-06-18T11:50:21Z

  },
  // iris_capabilities (live | fromDisk) + iris_contract_save. See contract-tools.ts.
  ...CONTRACT_TOOLS,
+  ...DOMAIN_TOOLS,


🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

tools.ts exceeds the repository’s 500-line file limit.

This file now reaches at least Line 1299. Please split tool definitions into smaller modules (for example: snapshot/query/assert/network-console groups) and compose them in TOOLS to stay within the limit.

As per coding guidelines, **/*.{ts,tsx,js,jsx} requires that no file exceeds 500 lines.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/server/src/tools/tools.ts` at line 1299, The tools.ts file exceeds the repository's 500-line file limit at 1299 lines. Refactor by extracting related tool definition groups (such as snapshot tools, query tools, assert tools, and network-console tools) into separate dedicated modules. Then import these extracted tool groups back into tools.ts and compose them together in the DOMAIN_TOOLS spread operator to maintain the same public API while keeping each file within the 500-line limit.

Source: Coding guidelines

divshekhar and others added 30 commits June 18, 2026 02:04

divshekhar and others added 2 commits June 18, 2026 16:52

coderabbitai Bot reviewed Jun 18, 2026

View reviewed changes

divshekhar merged commit ad2ff44 into main Jun 18, 2026
3 checks passed

		function gatherPlanInput(options: InitOptions, io: InitIo, pkgRaw: string): PlanInput {
		const pkg: unknown = JSON.parse(pkgRaw);

Conversation

divshekhar commented Jun 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What this is

Highlights (full list in CHANGELOG [0.6.10])

Verification

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

divshekhar commented Jun 18, 2026 •

edited by coderabbitai Bot

Loading

Highlights (full list in CHANGELOG `[0.6.10]`)

coderabbitai Bot commented Jun 18, 2026 •

edited

Loading