feat(app): add Voice Mode extension #1888

Draft
benjaminshafii wants to merge 4 commits into dev from feat/voice-mode

@benjaminshafii (Member) commented May 22, 2026

Summary

  • Add a Voice Mode OpenWork extension with a right-side Realtime voice panel.
  • Add host-side OpenAI Realtime client-secret minting using the existing env store.
  • Prefer user-saved env-store Realtime keys over stale process env keys.
  • Keep Voice Mode panel and runtime state open across session changes so the voice timeline survives navigation/new sessions.
  • Extend OpenWork UI MCP with inspect/assert/wait tools for deterministic evals.
  • Add Voice Mode eval docs covering transcript injection, Realtime text tool calls, audio-buffer transcription, and session-transition continuity.
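The key-precedence rule in the summary (user-saved env-store key first, process env key only as a fallback) can be sketched as a small pure function. This is a minimal illustration, not the code in this PR: `EnvSource`, `ResolvedKey`, and `resolveRealtimeKey` are hypothetical names, and the env store is modeled as a plain record because its real API is not shown here.

```typescript
// Hypothetical shape: the env store's real API is not shown in this PR.
type EnvSource = Record<string, string | undefined>;

interface ResolvedKey {
  value: string;
  source: "env-store" | "process-env";
}

// Return the first non-empty key, checking the user-saved store
// before the (possibly stale) process environment.
function resolveRealtimeKey(
  envStore: EnvSource,
  processEnv: EnvSource,
  name: string = "OPENAI_API_KEY",
): ResolvedKey | undefined {
  const saved = envStore[name]?.trim();
  if (saved) return { value: saved, source: "env-store" };

  const inherited = processEnv[name]?.trim();
  if (inherited) return { value: inherited, source: "process-env" };

  return undefined; // no key available; the caller should surface an error
}
```

Returning the source alongside the value makes it easy for an eval to assert which key was actually used when both are present.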

Verification

  • pnpm --filter @openwork/app typecheck
  • pnpm --filter openwork-server typecheck
  • pnpm --filter openwork-ui-mcp check
  • git diff --check
  • Isolated Electron/CDP eval on local feature worktree: connected OpenWork UI Control, verified opencode.jsonc MCP config, reloaded MCP, verified UI MCP can navigate/open Voice Mode.
  • Real OpenAI Realtime eval with local OPENAI_API_KEY env: verified Realtime connection, model called openwork_execute_action, composer changed to "REPEAT CHAT EVAL PASSED".
  • Direct OpenAI Realtime client-secret verification with temporary key: returned 200 and a client secret.
  • Realtime audio-buffer eval: injected PCM16 hello-world audio and verified Realtime transcription produced "Hello world".
  • Exact session-transition CDP eval: opened Voice Mode, injected "FIRST VOICE MESSAGE MUST REMAIN", created a new session, verified the first voice message remained visible in the Voice timeline, then injected "SECOND SESSION MESSAGE" into the new session composer.
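Evals like the session-transition check depend on waiting for UI state to settle rather than sleeping for a fixed time. The sketch below shows one common way to build such a wait step as a poll-until-deadline helper. It is an assumption about the general technique, not the actual OpenWork UI MCP tool: `waitFor`, `WaitOptions`, and `sleep` are hypothetical names.

```typescript
interface WaitOptions {
  timeoutMs: number;  // give up after this long
  intervalMs: number; // delay between probes
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Poll `probe` until it returns true or the deadline passes.
// Resolves to true on success and false on timeout, so the caller
// decides whether a timeout is an assertion failure.
async function waitFor(
  probe: () => boolean | Promise<boolean>,
  opts: WaitOptions,
): Promise<boolean> {
  const deadline = Date.now() + opts.timeoutMs;
  for (;;) {
    if (await probe()) return true;
    if (Date.now() >= deadline) return false;
    await sleep(opts.intervalMs);
  }
}
```

In an eval, the probe would typically inspect the page (for example, checking that the first voice message is still present in the timeline after a new session is created), and a `false` result would fail the run.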

Notes

  • Did not include API keys or eval temp artifacts in the commit.
  • Chromium fake microphone flags were silent locally, so deterministic audio eval uses Realtime input_audio_buffer injection.
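The audio-buffer injection mentioned above can be sketched as follows: convert float samples to PCM16, base64-encode them, and wrap them in Realtime `input_audio_buffer.append` events followed by `input_audio_buffer.commit`. This is a minimal sketch under stated assumptions, not the eval code from this PR: the helper names and the 4800-byte chunk size are hypothetical, and the explicit commit assumes server-side turn detection is not committing the buffer on its own.

```typescript
// Convert float samples in [-1, 1] to 16-bit little-endian PCM bytes.
function floatToPcm16(samples: Float32Array): Uint8Array {
  const bytes = new Uint8Array(samples.length * 2);
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i] ?? 0));
    const scaled = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(i * 2, Math.round(scaled), true); // true = little-endian
  }
  return bytes;
}

const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Dependency-free base64 encoder so the sketch runs in Node or a browser.
function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const triple =
      ((bytes[i] ?? 0) << 16) | ((bytes[i + 1] ?? 0) << 8) | (bytes[i + 2] ?? 0);
    out += B64.charAt((triple >> 18) & 63) + B64.charAt((triple >> 12) & 63);
    out += i + 1 < bytes.length ? B64.charAt((triple >> 6) & 63) : "=";
    out += i + 2 < bytes.length ? B64.charAt(triple & 63) : "=";
  }
  return out;
}

type RealtimeClientEvent =
  | { type: "input_audio_buffer.append"; audio: string }
  | { type: "input_audio_buffer.commit" };

// Split PCM bytes into append events, then commit the buffer.
// chunkBytes is a hypothetical default; keep it even so samples are not split.
function buildAudioEvents(pcm: Uint8Array, chunkBytes: number = 4800): RealtimeClientEvent[] {
  const events: RealtimeClientEvent[] = [];
  for (let offset = 0; offset < pcm.length; offset += chunkBytes) {
    const chunk = pcm.subarray(offset, offset + chunkBytes);
    events.push({ type: "input_audio_buffer.append", audio: toBase64(chunk) });
  }
  events.push({ type: "input_audio_buffer.commit" });
  return events;
}
```

Each event would then be serialized with `JSON.stringify` and sent over whatever transport the Realtime session uses. Building the events separately from sending them keeps the audio path deterministic and testable without a microphone.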

vercel (bot, Contributor) commented May 22, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| openwork-app | Ready | Preview, Comment | May 22, 2026 4:31am |
| openwork-den | Ready | Preview, Comment | May 22, 2026 4:31am |
| openwork-den-worker-proxy | Ready | Preview, Comment | May 22, 2026 4:31am |
| openwork-landing | Ready | Preview, Comment, Open in v0 | May 22, 2026 4:31am |
| openwork-share | Ready | Preview, Comment | May 22, 2026 4:31am |
