This is a transcript-with-commentary of a Claude Code session that built gstack × AgentCall: a dashboard that turns every gstack specialist (CEO, CSO, Eng Manager, QA, etc.) into a voice agent that joins your Google Meet with a unique 3D avatar, hears the room, and responds in character.
Most coding-agent transcripts show the agent writing code in isolation. This one is different. We built the product inside the product.
- Founder (Anand) joins a Google Meet from his laptop.
- Claude Code joins the same Meet as a voice bot via the AgentCall skill.
- The team (Anand + John K G + Anoop) talks; Claude hears the transcript via AgentCall's STT, plans, edits files, and replies with TTS — all inside the meeting.
- Specialists from the product (CEO, Staff Engineer) get dispatched into the same call to weigh in on their domain. They use the product on itself.
The transcript below is the engineering record of that session — every real bug we hit, every fix we shipped, the security audit we ran in parallel, and the docs we wrote in parallel. The meeting is the dev loop.
If you want to see the recursion most clearly: we are pitching a tool that lets a team build software in their meeting, and we built it inside a meeting, and the founder asked the in-meeting CEO bot to critique the pitch deck, and CEO's feedback shaped the next commit. That happened.
Bring your GStack team into the meeting.
GStack (Garry Tan, YC) is a Claude Code skill pack: 18 named specialists — CEO, CSO, Eng Manager, QA, Designer, Release Engineer, etc. — each one a slash-command persona inside Claude Code. Powerful, but trapped in text.
AgentCall is voice infrastructure for AI agents: a bot can join Google Meet / Zoom / Teams with a custom avatar and TTS, hear the participants, and speak back.
gstack × AgentCall marries them. Dashboard at localhost:8765:
paste a Meet URL, click "Dispatch CEO Review," and within ~30s the CEO
specialist joins your meeting with the gstack mustache-character avatar,
introduces itself, then engages in character on whatever the team is
discussing. Six pre-built team presets ("Founding Team," "Design Team,"
"QA & Ship") let you drop a whole virtual department in at once.
The wedge is the recursion: a one-developer team can be a real team. CEO challenges your strategy in the standup; CSO catches the auth bug in the code review; QA pushes back when you say "ship it." Same voice flow as a real meeting.
The full transcript spans ~12 hours of intermittent collaboration over several days. The artifacts that survived:
| Component | Lines | What it does |
|---|---|---|
| index.html | 2200+ | Dashboard. Specialist grid, team presets, search, brief textarea, dispatch dock, session history, recall buttons. Pure stdlib, zero JS frameworks. |
| server.py | 720 | Stdlib HTTP server. GET / dashboard, GET /avatars/<id>.svg, POST /dispatch, POST /recall. Spawns specialist_runner.py per dispatched specialist. |
| specialist_runner.py | 540 | Wraps bridge.py/bridge-visual.py. Reads outbox from intelligence bus, writes inbox. Cross-bot speech lock with watchdog. |
| avatars/gen.py + 18× *.svg | 90 + assets | DiceBear-generated 3D-character avatars per specialist with role-color backgrounds. Generated deterministically from specialist id. |
| avatar-page/ | static | Bot's video feed; runs in FirstCall's headless browser. Reads ?name= from tunnel URL, picks the right avatar, plays TTS via Web Audio API. |
| recap-page/ | static | Auxiliary page the CEO bot can screenshare during a recap. |
| vendor/bridge.py, vendor/bridge-visual.py | 31KB + 47KB | Vendored AgentCall bridges. Patched for websockets>=13 API compat (ws.closed → ws.state). |
| scripts/launch.sh + launch-visual.sh + kill-session.sh | ~150 | Launcher shell scripts. Spawn bridge in background, tail cmds file as stdin, redirect output to per-session jsonl. |
| README.md, ARCHITECTURE.md, CONTRIBUTING.md, SECURITY.md | 800+ | Written by sub-agents in parallel during the session. |
Eight real failures during the session. Most started with someone saying "it's broken" out loud in the meeting; the last two came from the security audit. Each ended with code committed and the fix verified.
Symptom. CEO bot joined the meeting in avatar mode but the video tile was blank. Avatar server log was empty — no requests from FirstCall's headless browser.
Diagnosis.
$ lsof -iTCP:3000 -sTCP:LISTEN -n -P
Python 12593 anand IPv4 TCP 127.0.0.1:3000 (LISTEN) ← our avatar server
node 33669 anand IPv6 TCP *:3000 (LISTEN) ← Next.js dev server
A Next.js dev server from an unrelated project was bound to *:3000 on
IPv6. macOS resolves localhost to the IPv6 loopback (::1) before
127.0.0.1, so the AgentCall tunnel was proxying to the Next.js process,
which served its own 404 page back to FirstCall's headless browser. Our
avatar SVG never got loaded.
Fix. Killed PID 33669; from then on _ensure_avatar_server() in
server.py:632-668 logs the listener and warns if another process holds
the port.
Verification. Avatar log immediately showed
GET /?name=CEO&ws=wss://api.agentcall.dev/v1/calls/.../ws/ui?call_token=ct_...
followed by GET /avatars/plan-ceo-review.svg — and the gold mustache
character rendered in the meeting tile.
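The port collision above can be probed directly. The following is a minimal sketch, not the code in _ensure_avatar_server(): the helper name loopback_listeners is hypothetical, and it only checks whether something accepts TCP connections on each loopback address.

```python
import socket


def loopback_listeners(port, timeout=0.5):
    """Return the loopback addresses where something accepts TCP on `port`."""
    found = []
    for family, addr in ((socket.AF_INET, "127.0.0.1"), (socket.AF_INET6, "::1")):
        try:
            with socket.socket(family, socket.SOCK_STREAM) as s:
                s.settimeout(timeout)
                # connect_ex returns 0 on success instead of raising.
                if s.connect_ex((addr, port)) == 0:
                    found.append(addr)
        except OSError:
            # Address family unavailable on this host (e.g. IPv6 disabled).
            continue
    return found
```

If the avatar server binds only 127.0.0.1 and ::1 also answers on the same port, a second process is listening there, which is the situation the lsof output shows.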
Symptom. Even after fix #1, the avatar tile was a circular ring with no character inside it.
Diagnosis. avatar-page/index.html built the avatar URL as
/avatars/<id>.svg, an absolute path. AgentCall's tunnel serves our page
under https://<tunnel>.conn.agentcall.dev/k/<key>/ui/. An absolute
/avatars/... from FirstCall's browser resolves against the root of the
tunnel host and drops the /k/<key>/ui/ prefix, so the request never
reaches our local server.
Fix. avatar-page/index.html:178 — switched to a relative
avatars/${id}.svg so the URL resolves through the tunnel back to our
local server.
Verification. A screenshot of the meeting, taken with the bridge's
screenshot command, shows the character in CEO's tile.
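The difference between the two paths follows from standard URL resolution, which Python's urllib.parse.urljoin reproduces. In this sketch the tunnel host and key are placeholders (the real values are not in this write-up); only the shape of the base URL is taken from the diagnosis above.

```python
from urllib.parse import urljoin

# Placeholder base URL with the shape https://<tunnel>.conn.agentcall.dev/k/<key>/ui/
BASE = "https://example-tunnel.conn.agentcall.dev/k/KEY/ui/"

# A leading slash discards the /k/KEY/ui/ prefix and lands on the host root.
absolute = urljoin(BASE, "/avatars/plan-ceo-review.svg")

# A relative path is appended to the page's directory and stays inside the tunnel.
relative = urljoin(BASE, "avatars/plan-ceo-review.svg")

print(absolute)  # https://example-tunnel.conn.agentcall.dev/avatars/plan-ceo-review.svg
print(relative)  # https://example-tunnel.conn.agentcall.dev/k/KEY/ui/avatars/plan-ceo-review.svg
```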
Symptom. tts.done events fired, the avatar's voice.state cycled
through "speaking", but participants heard nothing.
Diagnosis. agentcall-audio.js created AudioContext lazily inside
playChunk(). Modern Chrome and FirstCall's headless browser both
default it to state === "suspended" because there's no user gesture.
A suspended context happily queues AudioBufferSourceNode instances
that never play.
Fix. Two-line change.

// avatar-page/agentcall-audio.js:88-101
this.ctx = new (window.AudioContext || window.webkitAudioContext)({
  sampleRate: this.sampleRate,
});
if (this.ctx.state === "suspended") {
  this.ctx.resume(); // ← the actual fix
}

Plus a primer block in avatar-page/index.html:152-167 that constructs
the AudioContext on DOMContentLoaded and calls .resume() even
before any audio chunk arrives.
Verification. Next CEO dispatch — the meeting transcript itself
echoed the bot's greeting (speaker: "CEO", text: "Hi, I'm the CEO from GStack...") — proof that audio reached the meeting and STT picked it up.
Symptom. CEO joined a Meet with a lobby. tts.done fired ~20s
after spawn, but the participant never heard a greeting — and after
admission, no greeting fired either.
Diagnosis. specialist_runner.py had a 20s timeout-fallback that
called greet_once("timeout-fallback") regardless of whether the bot
had reached call.bot_ready. While stuck in the waiting room, the
bridge sent the TTS to the AgentCall server, which acknowledged it, but
there was no meeting audio context yet — so AgentCall returned
tts.done immediately and dropped the audio on the floor. After
admission, greeted = True short-circuited the real greeting.prompt
handler.
Fix. specialist_runner.py:268-296. Greeting only fires after
bot_ready=True. The fallback now polls every 2s, only triggers once
the bot is in the room, then waits 3s for the natural greeting.prompt
before firing.
Verification. Next dispatch's orchestrator log:
greeting (call.bot_ready): "Hi, I'm the CEO from gstack. ..."
[bridge] sending tts.speak
{"event": "tts.done"}
reason now call.bot_ready instead of timeout-fallback. Audio
reached the meeting.
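The gated fallback described in the fix can be sketched as follows. This is an illustration under assumptions, not the code at specialist_runner.py:268-296: the function name and the three callables are hypothetical, and the sketch has no overall deadline while the bot sits in the lobby.

```python
import time


def fallback_greeting(is_bot_ready, already_greeted, greet,
                      poll_s=2.0, grace_s=3.0, sleep=time.sleep):
    """Fire a fallback greeting only once the bot is actually in the room.

    is_bot_ready and already_greeted are zero-argument callables reporting
    the runner's state; greet(reason) speaks the greeting.
    Returns True if the fallback fired.
    """
    # Phase 1: stay quiet for as long as the bot is in the waiting room.
    while not is_bot_ready():
        sleep(poll_s)
    # Phase 2: give the natural greeting.prompt a chance to arrive first.
    sleep(grace_s)
    if already_greeted():
        return False
    greet("timeout-fallback")
    return True
```

Passing sleep as a parameter keeps the timing logic testable without real delays.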
Symptom. Bridge orchestrator log showed
AttributeError: 'ClientConnection' object has no attribute 'closed'
in a background asyncio task. Tunnel and main WS still worked, so the
bot could join — but ping-pong was broken and after ~30s the tunnel
silently drifted.
Diagnosis. vendor/bridge-visual.py:496-503 did
while not self._ws.closed. The new asyncio implementation in websockets
(introduced in 13.0, the default since 14.0) has no .closed attribute;
its ClientConnection exposes .state (a State enum) instead.
Fix. Compat shim that feature-detects.
def _is_open(ws):
    if ws is None:
        return False
    if hasattr(ws, "closed"):  # legacy connection API
        return not ws.closed
    state = getattr(ws, "state", None)  # newer API: a State enum
    # With neither attribute present, assume the socket is open.
    return state is None or getattr(state, "name", "") == "OPEN"

Plus: vendored both bridges into vendor/ so future plugin
updates don't silently overwrite the patch. scripts/launch.sh and
scripts/launch-visual.sh try the vendored copy first.
Symptom. Anand: "there were two CEOs at the same time and they were just talking at the same time. It was a disturbance."
Cause. Each bridge has its own VAD (waits for human silence before TTS), but VAD doesn't see other bots. Two specialists responding to the same trigger talked over each other.
Fix. Filesystem-based cross-bot speech lock at
/tmp/gstack-intelligence-<uid>/speaking.lock. Format: "<pid> <ts>".
Acquire blocks up to 12s for the lock to clear. Self-healing — if the
holder PID is dead OR the lock is older than TTS_MAX_HOLD = 12s, the
next acquirer steals it. A per-runner watchdog thread also
force-releases its own stale lock if the bridge crashes mid-TTS.
specialist_runner.py:347-410. Verified by dispatching CEO + Eng Manager
and asking them the same question — they answered serially.
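A lock of this kind can be sketched with an exclusive file create plus a staleness check. This is a simplified, POSIX-only illustration, not the code at specialist_runner.py:347-410: the function names are hypothetical, the watchdog thread is omitted, and two processes stealing a stale lock at the same instant can still race.

```python
import os
import time

TTS_MAX_HOLD = 12.0  # seconds, the hold limit described above


def _pid_alive(pid):
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but belongs to another user
    return True


def _is_stale(path, now):
    try:
        with open(path) as f:
            pid_s, ts_s = f.read().split()  # format: "<pid> <ts>"
        return not _pid_alive(int(pid_s)) or now - float(ts_s) > TTS_MAX_HOLD
    except FileNotFoundError:
        return False  # holder just released it; the caller retries
    except ValueError:
        # Half-written or corrupt contents: fall back to the file's age.
        try:
            return now - os.path.getmtime(path) > TTS_MAX_HOLD
        except FileNotFoundError:
            return False


def acquire(path, timeout=12.0, poll=0.1):
    """Try to take the lock; return False if it stays held past `timeout`."""
    deadline = time.time() + timeout
    while True:
        try:
            # O_EXCL makes creation atomic: exactly one process succeeds.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        except FileExistsError:
            if _is_stale(path, time.time()):
                try:
                    os.unlink(path)  # steal from a dead or overdue holder
                except FileNotFoundError:
                    pass
                continue
            if time.time() >= deadline:
                return False
            time.sleep(poll)
            continue
        with os.fdopen(fd, "w") as f:
            f.write(f"{os.getpid()} {time.time()}")
        return True


def release(path):
    """Remove the lock, but only if this process is the holder."""
    try:
        with open(path) as f:
            if int(f.read().split()[0]) != os.getpid():
                return
        os.unlink(path)
    except (FileNotFoundError, ValueError, IndexError):
        pass
```

A runner would call acquire() before sending TTS and release() once playback finishes, so a second specialist waits instead of talking over the first.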
Found by: sub-agent we kicked off mid-call to do an OWASP/STRIDE pass.
Cause. server.py accepts JSON POST with no Origin / Sec-Fetch-Site
check. A <form enctype="text/plain"> on a phishing page can shape its
body as parseable JSON (the server calls json.loads without checking
Content-Type) and trigger /dispatch against localhost:8765 while the
dev is on the hostile page. The dev's machine then joins an
attacker-controlled Meet URL using the dev's API key.
Fix. In server.py:481-503, _csrf_ok() rejects any POST whose Origin is
not localhost:8765 or 127.0.0.1:8765, and any whose Sec-Fetch-Site is
present and not same-origin. curl and Python clients, which send
neither header, still work for the runner's own internal calls.
Verified.
$ curl -X POST -H 'Origin: https://evil.example' http://127.0.0.1:8765/dispatch -d '{}'
{"error": "cross-origin request blocked"} HTTP 403
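The check can be sketched as a pure function over the request headers. This is an illustration of the rule described above, not the body of _csrf_ok(): the function name, the lower-cased header keys, and the http scheme in the allowed origins are assumptions.

```python
# Assumed to be served over plain http on the dashboard port.
ALLOWED_ORIGINS = {"http://localhost:8765", "http://127.0.0.1:8765"}


def csrf_ok(headers):
    """Return True if a POST with these headers should be accepted.

    `headers` is a mapping with lower-cased header names.
    """
    origin = headers.get("origin")
    if origin is not None and origin not in ALLOWED_ORIGINS:
        return False
    site = headers.get("sec-fetch-site")
    if site is not None and site != "same-origin":
        return False
    # No Origin and no Sec-Fetch-Site: a non-browser client such as curl.
    return True
```

Browsers attach Origin to cross-origin form POSTs, so the phishing form is rejected, while header-less local clients pass.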
Found by: same security sub-agent.
Cause. validate_meet_url accepted any https://* URL. Combined
with #7 (or any future trigger), an attacker could repeatedly drive the
dev's API key against arbitrary hostnames.
Fix. server.py:319-345 — host allow-list of
meet.google.com, zoom.us (+ subdomains), teams.microsoft.com,
teams.live.com, webex.com. https://attacker.example now returns
HTTP 400.
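A host allow-list like the one above can be sketched with urllib.parse. The function and constant names are hypothetical, and this is not the code at server.py:319-345; the host list and the subdomain rule for zoom.us are taken from the fix description.

```python
from urllib.parse import urlsplit

EXACT_HOSTS = {"meet.google.com", "teams.microsoft.com", "teams.live.com", "webex.com"}
SUFFIX_HOSTS = {"zoom.us"}  # subdomains such as us02web.zoom.us are allowed


def is_allowed_meeting_url(url):
    """Accept only https URLs whose host is on the allow-list."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL
    host = (parts.hostname or "").lower()
    if parts.scheme != "https" or not host:
        return False
    if host in EXACT_HOSTS or host in SUFFIX_HOSTS:
        return True
    # Match on a dot boundary so "evilzoom.us" does not pass as "zoom.us".
    return any(host.endswith("." + base) for base in SUFFIX_HOSTS)
```

Comparing the parsed hostname, rather than searching the raw string, keeps URLs such as https://meet.google.com@attacker.example/ from slipping through.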
While we kept iterating on the dashboard live, we kicked off two sub-agent tasks in the background and kept the meeting flowing:
Sub-agent 1 — Security audit (SECURITY.md). OWASP Top 10 + STRIDE
pass over the codebase. 9 findings, every one with a file:line, a
concrete attacker scenario, and a paste-ready fix. Top 3:
- CSRF on /dispatch and /recall (HIGH): fixed (#7 above).
- meetUrl validator accepts any URL (HIGH): fixed (#8 above).
- Intelligence-bus outbox world-writable in /tmp (MEDIUM): fixed by moving to /tmp/gstack-intelligence-<uid>/ with mode 0700.
We also fixed PID-spoof on /recall (subprocess argv check before
SIGTERM) and env-leak (subprocess env scrubbed to a 15-key allow-list,
so unrelated dev secrets — AWS, GitHub tokens — no longer reach
vendored third-party code).
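Two of these hardening steps are easy to sketch: a per-user bus directory with mode 0700 and an allow-listed subprocess environment. The helper names are hypothetical and POSIX-only, and the three keys shown are placeholders, since the real 15-key list is not enumerated here.

```python
import os

# Placeholder allow-list; the project's real list has 15 keys.
ENV_ALLOW = ("PATH", "HOME", "LANG")


def scrubbed_env(environ=None, allow=ENV_ALLOW):
    """Copy only allow-listed variables, dropping everything else."""
    environ = os.environ if environ is None else environ
    return {k: environ[k] for k in allow if k in environ}


def private_bus_dir(base="/tmp"):
    """Create (or tighten) a per-user directory readable only by its owner."""
    path = os.path.join(base, f"gstack-intelligence-{os.getuid()}")
    os.makedirs(path, mode=0o700, exist_ok=True)
    # makedirs leaves an existing directory's mode untouched, so set it explicitly.
    os.chmod(path, 0o700)
    return path
```

The scrubbed mapping would be passed as env= to subprocess.Popen when spawning a bridge, so unrelated secrets in the parent shell are not inherited.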
Sub-agent 2 — Docs (README.md + ARCHITECTURE.md + CONTRIBUTING.md).
~620 lines of release-quality documentation. ASCII diagrams of the
process tree, avatar tunnel, and file bus. Lifecycle timeline. Component
walkthroughs with "what / why / could be replaced." Four extension
points spelled out. Vendored-bridge update flow with the diff -u
against upstream baked in.
Both sub-agents finished while we were still mid-call. No context switch on our end. Real parallel coding-agent work.
Anand ┌────────────────────────────────────────────────────┐
───► │ Google Meet — meet.google.com/wzx-edwn-chd │
Speaks │ │
into mic │ [Anand] [Claude] [CEO bot] [Eng Mgr bot] │
└─────────▲─────────────▲───────────────▲────────────┘
│ │ │
AgentCall WSS tunnel + TTS audio injection (mode webpage-av)
│ │ │
┌─────────────────────────┴───┐ ┌───────┴───┐ ┌─────────┴─────────┐
│ vendor/bridge-visual.py │ │ runner │ │ runner │
│ ⇡ tunnels port 3000 ──┬───►│ │ (CEO) │ │ (Eng Manager) │
│ ⇡ subscribes /ws/ui │ │ └─┬─────────┘ └─┬─────────────────┘
│ ⇡ sends tts.speak │ │ │ │
└─────────────────────────│────┘ │ outbox tail │ outbox tail
│ ▼ ▼
│ /tmp/gstack-intelligence-<uid>/
│ ├── inbox.jsonl (transcripts in)
│ ├── outbox/<id>.jsonl (replies out)
│ └── speaking.lock (cross-bot mutex)
│ ▲
┌─────────────────────────▼─┐ │ Claude Code session reads
│ avatar-page/ (port 3000) │ │ inbox, decides reply, writes
│ index.html + audio.js │ │ outbox per specialist
│ + 18 character SVGs │ │
└───────────────────────────┘ │
│
┌────────────────────────────────────────┴──────────┐
│ server.py (port 8765) │
│ GET / dashboard POST /dispatch │
│ GET /avatars/<id>.svg POST /recall │
│ ⇡ CSRF guard, meetUrl host allow-list, env-scrub│
└───────────────────────────────────────────────────┘
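The file bus in the diagram is a set of append-only JSONL files. A minimal sketch of the append and tail operations follows; the function names are hypothetical, the speaker/text fields mirror the transcript snippet quoted earlier, and the real schema may carry more fields.

```python
import json
import os


def append_event(path, event):
    """Append one JSON object as a single line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")


def read_new(path, offset=0):
    """Return (events, new_offset) for complete lines written after `offset`."""
    try:
        with open(path, "rb") as f:
            f.seek(offset)
            data = f.read()
    except FileNotFoundError:
        return [], offset
    # Stop at the last newline so a half-written line is left for next time.
    end = data.rfind(b"\n") + 1
    events = [json.loads(line) for line in data[:end].splitlines() if line.strip()]
    return events, offset + end
```

A runner tailing outbox/<id>.jsonl would keep the returned offset between polls and speak each new event in order.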
The CEO bot — running on this codebase — was asked by the founder to
critique the pitch during the build session. Verbatim from the live
transcript (/tmp/gstack-intelligence-503/inbox.jsonl):
Wedge: Pick one painful job. Note-taking, dev standups, sales recaps, customer success — own one before going wide.
Frame: "Bring your own AI teammate." Zoom owns the meeting; you own the brain that joins.
One-liner: "Ship code with your AI team in the meeting." A verb, not a feature.
Demo: 90-second Loom showing one developer doing the work of five.
Three to nail before raising:
- Distribution — How does a dev hear about it Tuesday and have it in standup Wednesday?
- Pricing — per-seat or per-meeting? Tells investors which game.
- Enemy — replacing the notetaker, the teammate, or the PM? Pick one. Hiding loses the round.
That advice came from the product. The next commit (recap-page/index.html)
turned it into a slide the bot can screenshare.
git clone https://github.com/pattern-ai-labs/gstack-joins-meeting
cd gstack-joins-meeting
python3 server.py # starts dashboard + avatar server
# open http://localhost:8765
# paste a Meet URL → click "Dispatch CEO Review"
# admit the bot from the Meet lobby
# the CEO joins, greets you, and engages.

Prerequisites: Python 3.10+, an AGENTCALL_API_KEY at
~/.agentcall/config.json (free tier on agentcall.dev). Zero JS deps,
zero Python deps outside stdlib.
- Wire screenshare into every specialist runner (currently only the Claude bridge can screenshare; CEO can speak the recap but not show it).
- Architecture simplification pass: collapse the four hand-maintained copies of the name→id mapping into one source of truth.
- Live-data pipeline for recap-page/ (currently hardcoded sample).
- Hosted version of the dashboard (currently localhost-only by design).
- Velocity. A two-person team shipped this in a few sessions — dashboard, 18 specialist personas, voice + avatar mode, security audit, full docs — using a coding agent that participated in the meetings as a teammate.
- Recursive demo. The product builds the product. Every change in the transcript above happened during a real meeting where Claude was one of the speakers.
- Real bugs, real fixes. Eight production-grade bugs caught and fixed during the session, each with a citation a reviewer can verify in the repo.
- Security & docs done in parallel — same session, sub-agents running concurrently. No "we'll get to it after launch."
This is what the dev loop looks like when your team includes specialists in the meeting. GStack × AgentCall is the tool that makes that loop the default.