Releases · syrin-labs/iris

20 Jun 09:22

v0.8.0

9dd7f44

v0.8.0 Latest

[0.8.0] — 2026-06-20

The "developers love it" release. 0.7.0 won the agent; 0.8.0 wins the human — the dev who watches the
agent work, points at what's wrong, and trusts the green.

Added

Human review marks — "annotate the bug where you see it" (packages/browser, packages/server,
packages/protocol). A dev-only "Flag a bug" button rides with the presenter: the human toggles
it, clicks the element that looks wrong, types what's wrong, and Iris drops a numbered pin + emits a
HUMAN_MARK. The mark carries the element's re-resolvable anchor (the same durable address a
recorded flow uses) and the source file:line — so the agent fixes the exact element and code,
not a guess. The agent drains marks with the new iris_review tool: each pending mark comes with
a ready-to-act fix hint (Open src/Checkout.tsx:42 and fix: <note>. Then iris_review { resolve: m1 }),
reading never consumes a mark, and resolve retires it once fixed. Off the deterministic benchmark
path (human-driven) — pnpm bench unchanged.
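
The mark lifecycle described above (reading never consumes, resolve retires) can be sketched as a small in-memory queue. This is only an illustration: the `HumanMark` fields, the `MarkQueue` class, and its method names are hypothetical and not taken from the Iris codebase. Only the hint wording follows the example in the release note.

```typescript
// Hypothetical shapes: the real HUMAN_MARK payload and iris_review schema may differ.
interface HumanMark {
  id: string; // e.g. "m1"
  anchor: string; // re-resolvable element address
  source: string; // "file:line", e.g. "src/Checkout.tsx:42"
  note: string; // what the human typed
  resolved: boolean;
}

class MarkQueue {
  private marks: HumanMark[] = [];
  private nextId = 1;

  add(anchor: string, source: string, note: string): HumanMark {
    const mark: HumanMark = { id: `m${this.nextId++}`, anchor, source, note, resolved: false };
    this.marks.push(mark);
    return mark;
  }

  // Reading is non-destructive: a pending mark stays pending until it is resolved.
  pending(): Array<HumanMark & { hint: string }> {
    return this.marks
      .filter((m) => !m.resolved)
      .map((m) => ({
        ...m,
        hint: `Open ${m.source} and fix: ${m.note}. Then iris_review { resolve: ${m.id} }`,
      }));
  }

  // Returns false for an unknown or already-resolved id.
  resolve(id: string): boolean {
    const mark = this.marks.find((m) => m.id === id && !m.resolved);
    if (!mark) return false;
    mark.resolved = true;
    return true;
  }
}
```

Calling `pending()` twice in a row returns the same marks; only `resolve(id)` removes one from the pending set.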
First-run readiness + loop intro — iris_wait_ready (packages/server). Call it right after
init: it blocks until the app's SDK connects (returns instantly if a session already exists, so zero
latency on the happy path and on the benchmark), or times out with a recovery hint. Smooths the
most common first-5-minutes footgun — the agent's first real call racing the WebSocket connect. Its
ready response also carries a one-line loop guide (look → act → observe → assert → regress, plus
the human-flag → iris_review loop), so a fresh agent learns how to drive Iris on its first call
without reading docs. Pure, injected clock/sleep; off the benchmark path.
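
One way to picture a readiness wait that is "pure, injected clock/sleep" is to keep the decision in a pure step function and leave the looping to a thin driver. The sketch below assumes that design. `readyStep`, `waitReady`, `ReadyDeps`, the 50 ms poll interval, and the hint text are all hypothetical, not the actual iris_wait_ready implementation.

```typescript
type ReadyStep =
  | { kind: "ready" }
  | { kind: "sleep"; ms: number }
  | { kind: "timeout"; hint: string };

// Pure decision: no timers, no I/O, so it can be tested with plain numbers.
function readyStep(hasSession: boolean, elapsedMs: number, timeoutMs: number, pollMs = 50): ReadyStep {
  if (hasSession) return { kind: "ready" }; // happy path: returns immediately
  if (elapsedMs >= timeoutMs) {
    // Illustrative hint wording only.
    return { kind: "timeout", hint: "No session yet: check that the app is running and its SDK has connected." };
  }
  // Never sleep past the deadline.
  return { kind: "sleep", ms: Math.min(pollMs, timeoutMs - elapsedMs) };
}

interface ReadyDeps {
  hasSession: () => boolean; // true once the app's SDK has connected
  now: () => number; // injected clock, in milliseconds
  sleep: (ms: number) => Promise<void>; // injected sleep
}

// Thin driver: loops over the pure step until it is ready or timed out.
async function waitReady(deps: ReadyDeps, timeoutMs: number): Promise<ReadyStep> {
  const start = deps.now();
  for (;;) {
    const step = readyStep(deps.hasSession(), deps.now() - start, timeoutMs);
    if (step.kind !== "sleep") return step;
    await deps.sleep(step.ms);
  }
}
```

Because the clock and sleep are injected, a test can advance a fake clock inside `sleep` and never wait on real time.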
Deterministic visual regression — iris_viewport (packages/server). Pin the driven page to a
fixed viewport size (clamped to sane bounds) so a screenshot baseline is reproducible across machines
— the last missing piece of CI-stable visual diffing, alongside the already-shipped iris_visual_diff
masks (neutralize volatile regions) and a frozen clock (iris_clock). Drive-only, additive; off the
benchmark path. Provider-driven and tested via a fake page like iris_network_mock.
CDP network mock / intercept — iris_network_mock (packages/server). On a driven page
(iris drive), stub a request deterministically: return a 500, force offline (abort), or delay a
response — so "verify the app handles a failed payment" is one declared rule, no backend changes. The
matcher is pure (first rule whose url-substring + optional method matches wins → fulfill/abort/continue)
and the Playwright page.route wiring is driven in tests with a fake Page/Route. Needs a driven
browser; returns a recommendation to iris drive otherwise. Off the agent/benchmark path.
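
The matching rule above (first rule whose URL substring and optional method match wins, otherwise the request continues) can be sketched as a pure function. The `MockRule` and `MockAction` shapes and the `decide` name are assumptions made for illustration; the real iris_network_mock schema may look different.

```typescript
// Hypothetical rule shape, not the actual iris_network_mock schema.
type MockAction =
  | { kind: "fulfill"; status: number; body?: string }
  | { kind: "abort" }
  | { kind: "delay"; ms: number };

interface MockRule {
  urlIncludes: string; // substring matched against the request URL
  method?: string; // optional, compared case-insensitively
  action: MockAction;
}

type Decision = MockAction | { kind: "continue" };

// Pure matcher: rules are tried in order and the first match wins.
function decide(rules: readonly MockRule[], url: string, method: string): Decision {
  for (const rule of rules) {
    const urlOk = url.includes(rule.urlIncludes);
    const methodOk = rule.method === undefined || rule.method.toUpperCase() === method.toUpperCase();
    if (urlOk && methodOk) return rule.action;
  }
  return { kind: "continue" }; // no rule matched: let the request through
}
```

Keeping the matcher free of browser objects means the routing layer only has to translate a `Decision` into the corresponding route call, which is the part a fake Page/Route exercises in tests.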
iris status shows sessions + health at a glance (packages/server). The daemon exposes a
local GET /status; iris status now reports each connected tab (url, throttled, stale, pending
human marks) and the session count, not just "running: pid". This delivers the plan's "no more pkill
in a README" daemon DX. Local-only, off the agent/benchmark path.
Actionable error recovery (packages/server). Every tool error returned to the agent now carries
a recovery hint when the failure is recognized — the no-session footgun, multiple/unknown sessions,
a throttled tab, a missing baseline/recording, the pairing-token config — so the first 5 minutes never
dead-end on "what do I do now?". Conservative: an unrecognized error gets no invented advice.
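
The conservative behavior above (a hint only when the failure is recognized, nothing invented otherwise) can be sketched as a lookup over known failure patterns. The patterns, the hint wording, and the `withRecovery` name are hypothetical; Iris's actual error strings and hints are not shown in this changelog.

```typescript
interface RecoveryRule {
  test: RegExp;
  hint: string;
}

// Illustrative patterns and hints only, not Iris's real strings.
const RECOVERY_RULES: readonly RecoveryRule[] = [
  { test: /no session/i, hint: "Call iris_wait_ready, then retry." },
  { test: /multiple sessions/i, hint: "Name the session you want explicitly." },
  { test: /throttled/i, hint: "Bring the tab to the foreground, then retry." },
];

interface ToolError {
  error: string;
  recovery?: string; // present only for recognized failures
}

function withRecovery(message: string): ToolError {
  const rule = RECOVERY_RULES.find((r) => r.test.test(message));
  // Unrecognized errors pass through unchanged: no invented advice.
  return rule ? { error: message, recovery: rule.hint } : { error: message };
}
```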
The panel always reflects the agent's real state — iris_yield (packages/server,
packages/browser, packages/protocol). A human watching the browser must never see "live" when the
agent has actually stopped. The agent signals its turn boundary with iris_yield({ mode: "waiting" })
(done responding, will resume on your next message) or { mode: "ask", note } (blocked, needs your
answer — the question shows on the panel); the session is revived automatically on the agent's next
call. Taught as the mandatory last step in the session lease, the loop guide, and the skill — and it's
agent-independent (Codex / OpenCode / Claude / Hermes). The panel renders each handback distinctly
via a PRESENTER tone: waiting = calm teal ✋, ask = amber ❓ pulse, agent crashed/disconnected =
amber ⚠ pulse, a clean end = calm green. When the last agent's MCP connection drops, the daemon ends
every session and pushes the "switch to your terminal" notice (verified end-to-end through a SIGKILL-ed
agent). Off the benchmark path.
Don't lose a panel prompt in the death-race (packages/server, packages/protocol). If the human
typed a message into the panel at the exact moment the agent stopped, it used to land in a dead inbox.
Now both the agent-detach and idle paths fold any unread note into the end banner, quoted and labeled
Undelivered (paste into your terminal): "…", so the words are surfaced back, not silently dropped.
Replay a saved flow from the panel — no agent (packages/browser, packages/server,
packages/protocol). The daemon pushes the saved-flow names to the HUD on connect; the human clicks
▶ on a flow and it re-runs with no agent in the loop — the page animates via the normal replay path
and the ✓ / ⚠ drift / ✗ verdict lands in the same activity log they watch the agent in. The dev plays
the regression suite directly. Off the benchmark path (a panel-driven control, not a tool).

Changed

Internal cohesion split (no behavior change): SessionManager moved to its own
session-manager.ts, and the on-disk-artifact constants to flow-constants.ts, bringing both
parent files back under the 500-line cap. All public import paths unchanged (re-exported).

Fixed

Panel composer is now multi-line (packages/browser). The HUD message box was a single-line
<input> that sent on any Enter; it's a <textarea> now — Enter sends, Shift+Enter inserts a
newline, and it auto-grows to fit.
Flag mode keeps the right cursors (packages/browser). In "Flag a bug" mode every element showed
the crosshair, including the Flag button and its popover — which are clickable; they keep the pointer
cursor now. And the hover outline that boxes the element under the cursor no longer snaps jumpily: it
waits for the cursor to rest (~130 ms), then glides into place on an ease and fades in.

18 Jun 14:34

divshekhar

v0.6.10

ad2ff44

[0.6.10] — 2026-06-18

Added

Deterministic waiting — the settled predicate (packages/server). A new predicate
{ kind: "settled", quietMs } passes once network + structural-DOM activity has been quiet for
quietMs (default 500ms); ambient dom.text/animation churn (count-ups, spinners) is ignored so
an animated page can still settle. Usable in iris_wait_for and iris_assert, and composable inside
allOf with the consequence you expect. Replaces fixed sleeps — the #1 cause of flaky agent tests.
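
As a rough model of the settled predicate, assume the observer records timestamped activity events. The page is then settled once the newest non-ambient event is at least `quietMs` old. The `Activity` type, the `"network"` and `"dom.structure"` kind names, and `isSettled` are assumptions for illustration; only `dom.text`, animation, and the 500 ms default come from the note above.

```typescript
type ActivityKind = "network" | "dom.structure" | "dom.text" | "animation";

interface Activity {
  kind: ActivityKind;
  at: number; // timestamp in milliseconds
}

// Ambient churn (count-ups, spinners) never blocks settling.
const AMBIENT: ReadonlySet<ActivityKind> = new Set<ActivityKind>(["dom.text", "animation"]);

function isSettled(events: readonly Activity[], now: number, quietMs = 500): boolean {
  let lastRelevant = -Infinity;
  for (const e of events) {
    if (!AMBIENT.has(e.kind) && e.at > lastRelevant) lastRelevant = e.at;
  }
  // With no relevant events at all, the page counts as settled.
  return now - lastRelevant >= quietMs;
}
```

This simplified model looks only at completed events; a real implementation would presumably also treat in-flight requests as activity.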
iris_act_and_wait auto-settle (packages/server). Omit until and the tool waits for the page
to settle instead of requiring a predicate — "act, then wait for quiet" is now a single zero-config
call, the documented alternative to a sleep.
iris_query token controls (packages/server) — limit (cap returned descriptors; reports
total + truncated so a trim is never silent) and count_only (return just the match count).
iris_network / iris_console token controls (packages/server) — limit (keep the most
recent N matches, reporting total + droppedOldest) and a cost:{bytes,tokens} hint, matching the
other read tools so the agent can self-budget everywhere.
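
The two limit behaviors above differ in which end they trim: the query-style limit keeps the first N and flags truncation, while the log-style limit keeps the most recent N and reports how many older entries were dropped. A minimal sketch, with hypothetical function and type names (only `total`, `truncated`, and `droppedOldest` appear in the note above):

```typescript
interface Limited<T> {
  items: T[];
  total: number;
  truncated: boolean;
}

// Query-style: cap the result and say so, so a trim is never silent.
function limitFirst<T>(matches: readonly T[], limit?: number): Limited<T> {
  const items = limit === undefined ? [...matches] : matches.slice(0, Math.max(0, limit));
  return { items, total: matches.length, truncated: items.length < matches.length };
}

interface Recent<T> {
  items: T[];
  total: number;
  droppedOldest: number;
}

// Log-style: keep the newest N entries (assumes oldest-first ordering).
function keepRecent<T>(matches: readonly T[], limit?: number): Recent<T> {
  const n = limit === undefined ? matches.length : Math.max(0, limit);
  // slice(-0) would return everything, so handle n === 0 explicitly.
  const items = n === 0 ? [] : matches.slice(-n);
  return { items, total: matches.length, droppedOldest: matches.length - items.length };
}
```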
iris_domain mustHold per flow (packages/server) — each flow now reports the success
consequence that must hold for it (signal name / net URL), so an agent can answer "what are the
critical flows and what must hold for each?" from the domain model alone.

Changed

Self-healing now verifies the consequence before persisting (packages/server). iris_flow_heal
with apply:true re-replays the healed flow and re-asserts its success consequence; if a rebound
locator resolves but the flow no longer satisfies its intent, the write is refused
(status:consequence_broken, file untouched). It heals the locator, never the intent.
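
The verify-before-persist order can be sketched with injected dependencies: rebind the locators, re-check the consequence on the healed flow, and write only if it still holds. Everything here is hypothetical except the `consequence_broken` status. In particular, the `healed` and `dry_run` status names and the `HealDeps` interface are assumptions.

```typescript
// Only "consequence_broken" is taken from the changelog; the other names are illustrative.
type HealStatus = "healed" | "consequence_broken" | "dry_run";

interface HealDeps<F> {
  rebind: (flow: F) => F; // returns a copy with rebound locators
  replayHoldsConsequence: (flow: F) => boolean; // re-replay and re-assert the success consequence
  write: (flow: F) => void; // persist to disk
}

function healFlow<F>(flow: F, apply: boolean, deps: HealDeps<F>): { status: HealStatus; flow: F } {
  const healed = deps.rebind(flow);
  if (!apply) return { status: "dry_run", flow: healed };
  // The locator resolved, but the intent must still hold before anything is written.
  if (!deps.replayHoldsConsequence(healed)) return { status: "consequence_broken", flow };
  deps.write(healed);
  return { status: "healed", flow: healed };
}
```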

Fixed

Browser observers fully restore patched globals on teardown (packages/browser). The network,
route, and console observers stored a bound copy and assigned it back on teardown, so window.fetch
/ history.pushState / console.* were never restored to their original identity. They now keep the
true original for restore and a bound copy only for invocation.
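
The identity issue is easy to reproduce: `fn.bind(obj)` always returns a new function, so assigning the bound copy back on teardown leaves the global pointing at something other than the original. A minimal sketch of the fixed pattern, using a stand-in `Target` object instead of `window` (the `Target` interface and `observeFetch` are hypothetical names):

```typescript
// Stand-in for a global such as window; fetch depends on `this`, like many native methods.
interface Target {
  origin: string;
  fetch(this: Target, url: string): string;
}

// Patches target.fetch and returns a teardown function.
function observeFetch(target: Target, onCall: (url: string) => void): () => void {
  const original = target.fetch; // true original, kept only for restore
  const invoke = original.bind(target); // bound copy, used only to call through
  target.fetch = (url: string) => {
    onCall(url);
    return invoke(url);
  };
  return () => {
    target.fetch = original; // restores the exact original identity
  };
}
```

After teardown, a strict-equality check of `target.fetch` against the original function passes, which would not be the case if `invoke` were assigned back instead.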

15 Jun 22:01

divshekhar

v0.5.0

ea42054

[0.5.0] — 2026-06-15

Added

iris mcp — smart proxy with auto-start (packages/server). Run iris mcp --drive <url> and you're
done: it starts the daemon if one isn't running, waits for it to be ready, then bridges Claude Code's stdin/stdout to the daemon's SSE endpoint. Users no longer manage the daemon manually.
iris mcp --drive <url> / iris serve --drive <url> — pass a URL and Iris launches its own
Playwright browser at that URL, giving the agent full autonomous control without relying on the user's open browser tab.
--headed flag (for example, iris mcp --headed): opt in to a visible browser window so you can watch exactly what the agent is doing.
Three new update MCP tools (packages/server):
- iris_version_info — returns the installed version, execution kind (npx / global / local), and
  whether a newer version is available on npm.
- iris_apply_update — upgrades Iris in place; requires confirm: true to actually run.
- iris_rollback — downgrades to the previous version; requires confirm: true.
Presenter mode (packages/browser, packages/server) — iris.connect({ present: true }) mounts a
dev-only HUD overlay that the agent can control: iris_narrate shows a caption, iris_highlight
draws a ring around any element. The HUD is excluded from snapshots and tree-shaken in production.
Unified SKILL.md at repo root — a single skill file auto-detects mode: setup wizard on first
run (no .iris.json), live-app testing on every run after. Covers Claude Code, OpenCode, Codex CLI, Cursor, Windsurf, VS Code, and Zed MCP config formats.
.iris.json project config — written after first-run setup; persists port, headed,
framework, and harnesses so subsequent runs need zero questions.
dev:iris script in apps/demo — second Vite dev server on port 4310, isolated from the user's normal dev port.

Fixed

All-throttled session auto-selection (packages/server). When every connected tab is hidden
(e.g. user is in VS Code with Chrome on another desktop), SessionManager.resolve() now picks the session with the freshest heartbeat instead of throwing "multiple sessions connected".
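
The fallback can be sketched as follows, assuming each session tracks a throttled flag and a last-heartbeat timestamp. The `Session` shape, the `resolveSession` function, and the "no session connected" message are hypothetical. Only the "multiple sessions connected" error and the freshest-heartbeat rule come from the note above.

```typescript
interface Session {
  id: string;
  throttled: boolean; // true when the tab is hidden
  lastHeartbeat: number; // timestamp in milliseconds
}

function resolveSession(sessions: readonly Session[]): Session {
  if (sessions.length === 0) throw new Error("no session connected");
  const active = sessions.filter((s) => !s.throttled);
  if (active.length === 1) return active[0]!;
  if (active.length > 1) throw new Error("multiple sessions connected");
  // Every tab is throttled: pick the freshest heartbeat instead of throwing.
  return sessions.reduce((best, s) => (s.lastHeartbeat > best.lastHeartbeat ? s : best));
}
```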
Presenter HUD shows on bridge connect — the overlay now mounts as soon as the SDK connects to the bridge, not only after the first iris_narrate call.
iris_narrate MCP schema validation — relaxed the output schema so the tool no longer rejects responses from narration calls.
iris_inspect / iris_clock output schemas — relaxed to pass through extra fields instead of stripping them, fixing spurious validation errors.
