Add Desktop Pet beta companion#2916

Open
franksong2702 wants to merge 6 commits into nesquena:master from franksong2702:franksong2702/desktop-pet-beta

Conversation

@franksong2702
Contributor

Thinking Path

  • Hermes WebUI is already strong as a browser surface for interactive agent work, but long-running sessions often become background activity: the browser is covered, the user is in another desktop space, or the session is waiting for approval/clarification.
  • This PR introduces Desktop Pet as an ambient desktop attention layer, not a second chat client. The browser remains the source of truth for sessions, auth, settings, unread state, approvals, clarify prompts, and replies.
  • The first slice is intentionally conservative: source-only, disabled by default, loopback-bound, and documented as a desktop beta. The pet is not installed, launched, or shown until the user opts in through Settings -> Appearance or /pet wakeup.
  • The implementation risk is mostly around native-window behavior, runtime identity, browser navigation, stale WebView assets, and click-through areas, so this PR includes targeted regression coverage and a manual beta acceptance checklist.

What Changed

Product surface

  • Added an optional Desktop Pet beta for desktop WebUI users.
  • Added Settings -> Appearance Desktop Pet (Beta) as a feature-first long-term preference, separate from whether a native pet process is currently running.
  • Added /pet wakeup and /pet sleep slash commands. /pet wakeup uses the same install-before-launch path as Settings when the native shell is missing, keeps setup feedback alive during long first builds, and only reports success after launch succeeds.
  • The default bundled skin is keeper, displayed to users as May.
  • Added attention bubbles for running sessions, ready/completed sessions, approvals, clarify choices and custom clarify replies, and focused inline replies.
  • Added a first-launch Welcome Card for the no-active-session case, while real running/ready/action-required bubbles take priority over onboarding.
  • Added bundled keeper / May, courier, and shiba skins for review.

Runtime and bridge

  • Added pet-owned routes in api/pet_routes.py for skins, attention state, preference sync, install/launch/status/close, native shell registration, and session navigation.
  • Added standalone /pet and /pet/bubbles pages backed by vanilla JS.
  • Added a thin Tauri shell in desktop-pet/ with separate pet and bubble windows.
  • Added static/pet_bridge.js so an existing WebUI tab can consume pet navigation commands and keep the browser session as the source of truth.
  • Added sanitized loopback browser fallback for cold starts or cases where no live WebUI tab consumes the command.
  • Registered the native shell PID and WebUI base URL in the active state directory so 8787, 8788, and other loopback runtimes are not confused.
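
To make the runtime-identity point concrete, here is a minimal sketch of how a shell registration keyed by state directory and base URL could look. It is not the code in api/pet_routes.py: the file name `desktop_pet_shell.json` and both function names are hypothetical and exist only for illustration.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional

REGISTRATION_FILE = "desktop_pet_shell.json"  # hypothetical file name


def register_shell(state_dir: Path, pid: int, base_url: str) -> Path:
    """Atomically record which native shell belongs to this runtime."""
    state_dir.mkdir(parents=True, exist_ok=True)
    target = state_dir / REGISTRATION_FILE
    payload = {"pid": pid, "base_url": base_url.rstrip("/")}
    # Write to a temp file first so readers never see a half-written record.
    fd, tmp_path = tempfile.mkstemp(dir=state_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    os.replace(tmp_path, target)
    return target


def registered_shell_for(state_dir: Path, base_url: str) -> Optional[int]:
    """Return the registered PID only if it was recorded for this base URL."""
    try:
        record = json.loads((state_dir / REGISTRATION_FILE).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None  # no registration, or an unreadable one
    if not isinstance(record, dict):
        return None
    if record.get("base_url") != base_url.rstrip("/"):
        return None  # a shell registered for a different loopback runtime
    pid = record.get("pid")
    return pid if isinstance(pid, int) and pid > 0 else None
```

Because each runtime reads only its own state directory and also compares the stored base URL, a shell started by the 8787 runtime would not be mistaken for one owned by 8788 under these assumptions.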

Native-window and interaction details

  • Kept bubble windows above the pet window and prevented hidden transparent bubble windows from intercepting clicks.
  • Made bubbles follow pet movement, including the macOS child-window path used during drag and the multi-display placement path.
  • Added edge placement, top/bottom fallback, overflow handling, +N, and Latest controls.
  • Expanded the visible green badge hit target, made the hover dismiss X visually larger, and suppressed native checkbox ghost rendering in the Settings switch during launch.
  • Added immediate opening feedback when a bubble is clicked so the user is not left wondering whether the click landed.
  • Added pet-specific asset cache-busting so WebView reloads do not silently keep stale Desktop Pet JS/CSS after WebUI updates.
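
One common way to get this kind of cache-busting is to append a content-derived version to each asset URL. The sketch below assumes that approach; the PR does not say which mechanism it uses, and `asset_version`, `versioned_url`, and the `v` query parameter are illustrative names.

```python
import hashlib
from pathlib import Path
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def asset_version(path: Path) -> str:
    """Short content hash; it changes whenever the file bytes change."""
    return hashlib.sha256(path.read_bytes()).hexdigest()[:12]


def versioned_url(url: str, version: str) -> str:
    """Return url with a single v=<version> query parameter."""
    parts = urlsplit(url)
    # Drop any earlier version parameter so repeated calls do not stack them.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "v"
    ]
    query.append(("v", version))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

A page template could then emit something like `versioned_url("/static/desktop_pet/pet.js", asset_version(path_to_pet_js))`, so a WebView that reloads after an update requests a URL it has never cached.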

Documentation

  • Added docs/desktop-pet.md for beta status, product behavior, runtime model, boundaries, local development, troubleshooting, and follow-ups.
  • Updated desktop-pet/README.md to clarify native-shell ownership and local development flow.
  • Updated README.md, TESTING.md, and CHANGELOG.md with release-facing Desktop Pet notes and PR evidence expectations.

Why It Matters

Desktop Pet is meant to make Hermes better at the parts of agent work that are not purely foreground chat:

  • It gives users a lightweight way to notice when an agent is running, blocked, ready, or asking for a decision.
  • It preserves the WebUI as the full-control surface instead of fragmenting session state into a separate desktop client.
  • It makes long-running sessions feel more continuous on desktop systems, especially when the browser is not frontmost.
  • It opens a path toward a more ambient Hermes experience while keeping the first reviewable slice small enough to reason about.

The product direction is broader than the initial mascot: Desktop Pet can become the desktop attention layer for Hermes sessions. This PR only ships the foundation: status, session return path, approval/clarify/reply interactions, safe native lifecycle, and clear beta boundaries.

Screenshots

Settings entry and first-launch progress:

[Screenshots: Desktop Pet Beta setting; Desktop Pet setup progress]

Default May pet and collapsed badge:

[Screenshots: Default May desktop pet; Collapsed Desktop Pet badge count]

Welcome and session bubbles:

[Screenshots: Desktop Pet Welcome Card; Desktop Pet running and ready bubbles]

Approval, clarify, and inline reply interactions:

[Screenshots: Desktop Pet approval bubble expanded on hover; Desktop Pet clarify choices and Other input; Desktop Pet inline quick reply]

Overflow controls:

[Screenshots: Desktop Pet overflow more button; Desktop Pet Latest button after scrolling]

Verification

  • git diff --cached --check
  • node --check static/panels.js static/sessions.js static/commands.js static/pet_bridge.js static/desktop_pet/bubbles.js static/desktop_pet/pet.js static/sw.js static/i18n.js
  • python -m pytest tests/test_pet_routes.py tests/test_desktop_pet_static.py tests/test_desktop_pet_regressions.py tests/test_desktop_pet_slash_command.py -q -> 66 passed
  • cargo fmt --manifest-path desktop-pet/src-tauri/Cargo.toml -- --check
  • cargo clippy --manifest-path desktop-pet/src-tauri/Cargo.toml -- -D warnings
  • npm --prefix desktop-pet audit --audit-level=moderate -> found 0 vulnerabilities

Manual/local evidence:

  • macOS local QA covered Settings opt-in, from-zero setup, Welcome Card behavior, bubble click/open feedback, green badge hit target, hover dismiss X, running/ready bubbles, approval/clarify hover interactions, inline reply, overflow controls, drag following, and multi-display bubble placement.
  • Screenshots above were generated from the isolated 8788 runtime. Settings screenshots use the real WebUI page; pet/bubble screenshots use the real /pet and /pet/bubbles pages with mocked attention data to cover all states deterministically.

Risks / Follow-ups

  • Windows host validation is still required. The code is structured as macOS/Windows desktop beta work, but the current manual evidence is macOS-led.
  • Packaging, signing, notarization, release distribution, and auto-update are intentionally outside this first slice.
  • This is not a mobile or tablet feature.
  • Maintainers may want to trim the bundled skin set before release.
  • Future product work could refine skin policy, OS login startup, richer desktop notifications, and whether Desktop Pet should graduate from beta into a packaged companion app.

Model Used

OpenAI GPT-5 via Codex assisted with implementation review, documentation, screenshot preparation, and PR text drafting.

@franksong2702 franksong2702 marked this pull request as ready for review May 25, 2026 08:22
@franksong2702 franksong2702 marked this pull request as draft May 25, 2026 08:38
@franksong2702 franksong2702 marked this pull request as ready for review May 25, 2026 09:53
@nesquena-hermes nesquena-hermes added the hold and ux (User experience / visual polish) labels May 25, 2026
@nesquena-hermes
Collaborator

Holding for now — this is a substantial new feature (58 files, +13K LOC including new Tauri-based desktop pet shell). I want to give it a focused review pass rather than batching it with smaller fixes.

Specific things I'll be checking:

  1. The desktop-pet/ Tauri subproject — does it ship as a separate optional install, or does it become part of the default build pipeline?
  2. New api/pet_routes.py — does it follow the same auth + CSRF + path-validation patterns as the other api/ routes?
  3. The 11 PNG screenshots in docs/images/desktop-pet/ — confirm they're checked into the repo (not just referenced).
  4. ServiceWorker changes in static/sw.js — what's the cache invalidation story for the new desktop-pet/ assets on update?

Thanks @franksong2702 — flagging this for @nesquena's eyes before merge. No action needed from you right now.

@franksong2702
Contributor Author

Thanks for the focused review pass. Quick confirmations on the four areas you called out:

  1. desktop-pet/ stays optional/source-only in this slice. Starting WebUI alone does not launch it; it only starts from Settings -> Appearance or /pet wakeup, and docs/desktop-pet.md / desktop-pet/README.md call out that there is no signed installer, release bundle, or default build-pipeline integration yet.
  2. api/pet_routes.py keeps the pet control surface loopback-only for launch/status/register/close/navigation/open-session paths, browser-originating POSTs use the normal WebUI CSRF token headers, and browser fallback session URLs are sanitized to loopback http(s) hosts only.
  3. The 11 PR screenshots are committed under docs/images/desktop-pet/2026-05-25-pr-screenshots/ rather than only referenced externally.
  4. static/sw.js intentionally does not pre-cache the optional /pet pages, static/desktop_pet/*, or pet spritesheets. The service worker keeps only core shell assets in SHELL_ASSETS (plus the narrow static/pet_bridge.js hook), uses the git-versioned cache name, bypasses API/health/stream requests, and serves shell assets network-first, so updates invalidate through the existing app-shell path.

Happy to split, trim bundled skins/screenshots, or adjust the beta boundary if that would make the first review pass easier.
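
For reviewers who want a concrete picture of the loopback-only sanitization described in point 2, here is a rough sketch of the idea. It is not the code in api/pet_routes.py: the function name `sanitize_loopback_url` is hypothetical, and rejecting embedded credentials is an extra assumption of this sketch.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit


def sanitize_loopback_url(url: str) -> Optional[str]:
    """Return url unchanged if it is http(s) on a loopback host, else None."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
        _ = parts.port  # accessing .port raises ValueError on a malformed port
    except ValueError:
        return None
    if parts.scheme not in ("http", "https") or not host:
        return None
    # Assumption: credentials in the URL are never needed, and they are a
    # common way to disguise the real host (http://127.0.0.1@evil.example/).
    if parts.username or parts.password:
        return None
    if host == "localhost":
        return url
    try:
        return url if ipaddress.ip_address(host).is_loopback else None
    except ValueError:
        return None  # a non-IP hostname other than localhost


# Example: only the first URL survives.
candidates = [
    "http://127.0.0.1:8787/session/abc",
    "https://example.com/session/abc",
    "javascript:alert(1)",
]
safe = [u for u in candidates if sanitize_loopback_url(u)]
```

Parsing the host with `ipaddress` rather than matching on a string prefix means lookalike names such as `127.0.0.1.evil.example` are rejected.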

@franksong2702 franksong2702 force-pushed the franksong2702/desktop-pet-beta branch from 9aed189 to 4f59c44 May 26, 2026 01:13
When a WebUI tab is polling the bridge, _queue_and_focus now skips
_reuse_existing_pet_browser_tab (which changes the URL and causes a
full page reload through Loading Session...).  Instead the bridge
receives the queued command and calls loadSession(sid) for a smooth
in-page transition.  Hard URL reuse is kept as the fallback for cold
starts (no live bridge tab) and for ack timeouts (bridge polled but
didn't ack within 1.6 s).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
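
The routing decision in the commit message above can be sketched roughly as follows. This is an illustration only, not the body of _queue_and_focus: `route_open_session` and its callback parameters are hypothetical names, and only the 1.6 s timeout comes from the commit message.

```python
import threading
from typing import Callable

ACK_TIMEOUT_S = 1.6  # value quoted in the commit message


def route_open_session(
    sid: str,
    bridge_is_polling: bool,
    enqueue: Callable[[str], None],
    acked: threading.Event,
    reuse_tab: Callable[[str], None],
    timeout: float = ACK_TIMEOUT_S,
) -> str:
    """Prefer an in-page transition; fall back to hard URL reuse."""
    if not bridge_is_polling:
        # Cold start: no live WebUI tab would consume a queued command.
        reuse_tab(sid)
        return "url-reuse (cold start)"
    enqueue(sid)  # the bridge tab is expected to pick this up and load the session
    if acked.wait(timeout):
        return "in-page"
    # The bridge polled but did not acknowledge in time.
    reuse_tab(sid)
    return "url-reuse (ack timeout)"
```

In this sketch the acknowledgement is modeled as a `threading.Event` that the bridge-facing route would set. The real implementation may track acks differently.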