🎭 feat(agents): playwright-chatgpt agent (browser-driven ChatGPT via ACP bridge) by chaizhenhua · Pull Request #62 · awakenworks/oversight

chaizhenhua · 2026-05-17T15:43:01Z

Summary

Adds a fifth agent type, playwright-chatgpt, that drives a real Chromium session against chatgpt.com via a new Node.js ACP bridge. Slots into the existing GenericCliAdapter path with zero Rust adapter changes outside this PR's own provisioning hardening — registration is a single seed manifest entry.

Authentication model: No upstream API key is used. The bridge authenticates by attaching Chromium to a pre-seeded persistent profile that has already been logged into chatgpt.com via the bootstrap flow. That profile directory is a credential-equivalent artifact and must be protected accordingly.

crates/oversight-agents/manifests/playwright-chatgpt.toml: adapter manifest (id playwright-chatgpt, detect via file_exists on the built CLI, pnpm_workspace install strategy).
bridges/playwright-chatgpt/: TypeScript Node bridge (~1.6k lines) with an ACP JSON-RPC server over stdio; a persistent-context Chromium launcher with bootstrap guard; a layered stream-complete detector (SSE [DONE] / DOM stop-button / idle / hard-timeout); a versioned selectorRegistry; session resume via chatgpt://conversation/<uuid>; honest text-only capabilities; hard-timeout surfaced as an error; SessionNewParams.cwd honoured for artifact paths; and cancel signals propagated as stopReason: "cancelled".
crates/oversight-worker/: provisioning + discovery hardening for in-repo builtins. The default allowlist gains pnpm / node / hermes; the pnpm_workspace install strategy now requires OVERSIGHT_WORKER_REPO_ROOT and fails fast otherwise; file_exists detect and runtime spawn args resolve the same env so detect / install / runtime never disagree on where to look.
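
The layered stream-complete detector listed above can be pictured as a priority-ordered decision over observed signals. The following is a minimal sketch, not the bridge's actual code: the `StreamSignals` shape, the threshold names, and the `end_turn` stop reason are assumptions made for illustration. Only the four layers, the hard-timeout-as-error rule, and `stopReason: "cancelled"` come from the description.

```typescript
// Hypothetical snapshot of what the bridge has observed so far.
// Field names are illustrative, not the bridge's real types.
interface StreamSignals {
  sseDone: boolean;         // an SSE "[DONE]" frame was seen
  stopButtonGone: boolean;  // the DOM stop button has disappeared
  msSinceLastChunk: number; // time since the last streamed text chunk
  msElapsed: number;        // time since the prompt was submitted
  cancelled: boolean;       // the client sent a cancel signal
}

// Assumed thresholds; the real values are not stated in the PR.
interface DetectorConfig {
  idleMs: number;
  hardTimeoutMs: number;
}

type Verdict =
  | { kind: "pending" }
  | { kind: "done"; stopReason: "end_turn" | "cancelled"; via: string }
  | { kind: "error"; message: string };

function evaluate(s: StreamSignals, cfg: DetectorConfig): Verdict {
  // Cancellation is reported as a stop reason, not as an error.
  if (s.cancelled) {
    return { kind: "done", stopReason: "cancelled", via: "cancel" };
  }
  // Layer 1: explicit end-of-stream marker.
  if (s.sseDone) {
    return { kind: "done", stopReason: "end_turn", via: "sse-done" };
  }
  // Layer 2: the UI no longer offers a stop button.
  if (s.stopButtonGone) {
    return { kind: "done", stopReason: "end_turn", via: "stop-button" };
  }
  // Layer 3: no new text for a while.
  if (s.msSinceLastChunk >= cfg.idleMs) {
    return { kind: "done", stopReason: "end_turn", via: "idle" };
  }
  // Layer 4: the hard timeout is surfaced as an error, never as success.
  if (s.msElapsed >= cfg.hardTimeoutMs) {
    return { kind: "error", message: "hard timeout waiting for stream completion" };
  }
  return { kind: "pending" };
}
```

A caller would poll `evaluate` with fresh signals until the verdict is no longer `pending`. Keeping the decision pure makes each layer easy to unit-test without a browser.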

Stack

Base: master
Sibling PR (split out of this branch's history): #64, covering credentials / authz / workflow improvements that were authored on the same working tree but are orthogonal in scope. Reviewing #62 no longer requires scrolling through those.

Review range

12 commits, all playwright-chatgpt scope:

📐 docs(superpowers) × 2 — design spec + implementation plan
🎭 feat(agents) × 2 — playwright-chatgpt adapter manifest + server-side seed
🎭 feat(bridges) × 4 — TS bridge layers (types/codec/sessions; ACP server; browser/CLI/bootstrap/fixture; scaffolding)
🧪 test(worker) × 2 — manifest round-trips + cross-language smoke against fake-chatgpt fixture
🛠️ chore(make) × 1 — bridges-* targets; make setup installs the bridge
✨ feat(worker) × 1 — embed playwright-chatgpt in embedded_builtin_manifests() fallback (also absorbs the four follow-on provisioning fixes: pnpm/node allowlist, pnpm_workspace cwd, repo-relative path resolution, hermes allowlist + test coverage)

The previous 25-commit stack was squashed: the bridge review-fix commits were folded into feat(bridges): browser layer + CLI + bootstrap, and the worker review-fix commits into feat(worker): mirror playwright-chatgpt in embedded_builtin_manifests fallback. Six unrelated commits were extracted to #64. Net effect: same diff, half the SHAs.

Builtin auto-provision requirements

playwright-chatgpt's install steps live inside this repo (pnpm --dir bridges/playwright-chatgpt …). The worker provisioner used to default to a per-install tempdir as cwd, which silently broke the in-repo paths. Two operator-visible knobs make this explicit:

| Env | Default | When to set |
| --- | --- | --- |
| `OVERSIGHT_WORKER_REPO_ROOT` | unset → `pnpm_workspace` installs fail with a clear error; `file_exists` detect and runtime spawn fall back to process cwd | Set to the oversight repo root on workers that should autoinstall this builtin. Other strategies still use a tempdir. |
| `WORKER_PROVISION_ALLOWLIST` | now includes `pnpm` / `node` / `hermes` so the builtin is accepted out of the box | Override only to add/remove binaries. |
| (manual fallback) | n/a | Run `make bridges-install` on the worker host and skip autoinstall entirely. |
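
The resolution rule in the table can be summarised in a few lines. The real logic lives in the Rust worker, so the TypeScript below is only an illustrative sketch of the rule: the function names are hypothetical, the path join is simplified to POSIX-style paths, and treating an empty value as unset is an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Read the repo root, treating an empty string as unset (assumption).
function repoRoot(env: Env): string | undefined {
  const root = env["OVERSIGHT_WORKER_REPO_ROOT"];
  return root !== undefined && root !== "" ? root : undefined;
}

// Simplified join for illustration; assumes POSIX-style paths.
function joinPath(base: string, rel: string): string {
  if (rel.charAt(0) === "/") return rel; // absolute paths are used as-is
  return base.replace(/\/+$/, "") + "/" + rel.replace(/^\.\//, "");
}

// file_exists detect and runtime spawn args share one base directory:
// the repo root when set, otherwise the process cwd.
function resolveRepoRelative(env: Env, processCwd: string, rel: string): string {
  const base = repoRoot(env);
  return joinPath(base !== undefined ? base : processCwd, rel);
}

// pnpm_workspace installs have no fallback: they fail when the variable is unset.
function installCwd(env: Env): string {
  const base = repoRoot(env);
  if (base === undefined) {
    throw new Error("pnpm_workspace install requires OVERSIGHT_WORKER_REPO_ROOT to be set");
  }
  return base;
}
```

Routing detect and runtime through the same helper is what keeps the two from disagreeing about where the built bridge lives.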

Test plan

Covered by CI

cargo test -p oversight-worker --test playwright_chatgpt_manifest — manifest round-trips through GenericCliConfig::validate()
cargo test -p oversight-worker provisioning::tests::default_allowlist_accepts_every_builtin_manifests_install — walks claude/codex/hermes/playwright-chatgpt against the default allowlist
cargo test -p oversight-worker discovery::tests::detect_file_exists_resolves_relative_path_against_repo_root_env
cargo test -p oversight-worker adapters::generic::tests::resolve_runtime_arg_*
pnpm --dir bridges/playwright-chatgpt test:unit — codec / sessions / selectors / server (includes the test that pins the honest prompt capabilities + cwd-forwarding test)
pnpm --dir bridges/playwright-chatgpt test:integration — Playwright test over local fake-chatgpt fixture
cargo test -p oversight-worker --test playwright_chatgpt_smoke — real Node bridge spawned over stdio against the fake fixture, Rust asserts ACP frames
make bridges-install / bridges-build / bridges-test / bridges-test-integration targets; make setup now installs the bridge

Not covered by CI (manual until follow-up e2e runner lands)

Real chatgpt.com smoke: reviewer runs node bridges/playwright-chatgpt/dist/bootstrap.js --profile-dir /tmp/oversight-test-profile, logs in, then exercises the live site via a worker
Cross-host headless behaviour (different Chromium revisions, sandbox profiles)
Rate-limit / session-expiry handling against the live site

Risks / limitations

CI does not exercise the real chatgpt.com path. All automated coverage runs against the local fake-chatgpt fixture; live-site behaviour is verified only by manual smoke until a tests/e2e/ runner lands.
Persistent profile is credential-equivalent. The pre-seeded Chromium profile holds an authenticated web session for chatgpt.com. Treat the profile directory like a secret.
Selector drift. chatgpt.com's DOM is not a stable interface; selectorRegistry is versioned so patches ship without recompiling.
Browser + host dependency. Operators must install Playwright's bundled Chromium (or a compatible system Chromium) and configure host sandboxing themselves.
Single-profile, single-session per worker. Multi-profile pooling, attachment uploads, and explicit model-selector clicks are deferred.
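
One way to picture a versioned selector registry is as plain data with ordered fallback candidates per logical element. The sketch below is an assumption about shape only: the keys, version labels, and CSS selectors are made up for illustration and are not taken from the bridge's selectorRegistry.

```typescript
// Hypothetical registry entry: one version label plus candidate selectors.
interface SelectorSet {
  version: string;                     // illustrative date-style label
  selectors: Record<string, string[]>; // logical name -> ordered candidates
}

// Illustrative data only; not the real chatgpt.com selectors.
const registry: SelectorSet[] = [
  {
    version: "2026-05-01",
    selectors: {
      promptInput: ["#prompt-textarea"],
      stopButton: ["button[data-testid='stop-button']"],
    },
  },
  {
    version: "2026-05-15",
    selectors: {
      promptInput: ["div[contenteditable='true']", "#prompt-textarea"],
    },
  },
];

// Walk versions newest-first and return the first set that defines the key.
function candidatesFor(
  reg: SelectorSet[],
  key: string
): { version: string; candidates: string[] } {
  const newestFirst = reg
    .slice()
    .sort((a, b) => (a.version < b.version ? 1 : a.version > b.version ? -1 : 0));
  for (const set of newestFirst) {
    const found = set.selectors[key];
    if (found !== undefined && found.length > 0) {
      return { version: set.version, candidates: found };
    }
  }
  throw new Error('no selector registered for "' + key + '"');
}
```

If the registry is loaded as data at startup (an assumption), a selector patch becomes a data change rather than a code change, and older versions remain available as fallbacks.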

Deferred / follow-ups

Real chatgpt.com e2e runner under tests/e2e/ (lifts the "not covered by CI" caveat)
Model selector clicks (currently passes ?model=<id> via URL)
Attachment upload path (attach.ts)
Multi-profile pool support
Selector drift detector with daily DOM snapshot diff
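
The session-resume URI and the current `?model=<id>` approach can be combined into a small URL builder. This is a sketch under assumptions: the function names are hypothetical, and the `/c/<uuid>` page path is assumed rather than taken from this PR. Only the `chatgpt://conversation/<uuid>` scheme and the `model` query parameter come from the description.

```typescript
// Matches chatgpt://conversation/<uuid> with a canonical 8-4-4-4-12 hex UUID.
const SESSION_RE =
  /^chatgpt:\/\/conversation\/([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})$/i;

// Returns the lower-cased conversation id, or null if the URI does not match.
function parseSessionUri(uri: string): string | null {
  const m = SESSION_RE.exec(uri);
  const id = m ? m[1] : undefined;
  return id !== undefined ? id.toLowerCase() : null;
}

// Builds the page URL to open. Pass null to start a fresh conversation.
function buildChatUrl(sessionUri: string | null, modelId?: string): string {
  let url = "https://chatgpt.com/";
  if (sessionUri !== null) {
    const id = parseSessionUri(sessionUri);
    if (id === null) {
      throw new Error("not a chatgpt://conversation/<uuid> URI: " + sessionUri);
    }
    url += "c/" + id; // assumed page path for an existing conversation
  }
  if (modelId !== undefined && modelId !== "") {
    // The model is selected via the query string instead of clicking the UI.
    url += "?model=" + encodeURIComponent(modelId);
  }
  return url;
}
```

Rejecting malformed URIs up front keeps a bad session id from silently opening a new conversation instead of resuming the intended one.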

chaizhenhua force-pushed the chore/manifest-cli branch from 51f5729 to efa0145 Compare May 17, 2026 17:00

chaizhenhua force-pushed the feat/playwright-chatgpt-agent branch from f011572 to 988dc9a Compare May 17, 2026 17:03

Base automatically changed from chore/manifest-cli to master May 17, 2026 23:01

chaizhenhua mentioned this pull request May 18, 2026

✨ feat(worker): manifest-driven CLI adapter cutover (delete 4 per-CLI adapters) #59

Merged

5 tasks

chaizhenhua added 8 commits May 19, 2026 00:15

📐 docs(superpowers): playwright-chatgpt agent design spec

63f333d

📐 docs(superpowers): playwright-chatgpt agent implementation plan

7b10a45

🎭 feat(agents): playwright-chatgpt adapter manifest

96a039c

🎭 feat(agents): seed playwright-chatgpt manifest

6c0845b

🧪 test(worker): playwright-chatgpt manifest round-trips through GenericCliConfig

3f70fc6

🎭 feat(bridges): playwright-chatgpt scaffolding

b9edbfa

🎭 feat(bridges): ACP types, codec, sessions, selectors, errors, logger

7648e85

🎭 feat(bridges): ACP server with pluggable chat backend

3b2063e

chaizhenhua force-pushed the feat/playwright-chatgpt-agent branch from 988dc9a to b9737ce Compare May 18, 2026 16:18

chaizhenhua mentioned this pull request May 19, 2026

Credentials: dynamic model discovery + autonomous coding loop #64

Merged

5 tasks

chaizhenhua added 4 commits May 19, 2026 08:40

🎭 feat(bridges): browser layer + CLI + bootstrap + fixture + integration test

5da4e97

🧪 test(worker): cross-language smoke against fake-chatgpt fixture

e411739

🛠️ chore(make): bridges-install/build/test/test-integration targets

4e79e25

✨ feat(worker): mirror playwright-chatgpt in embedded_builtin_manifests fallback

96d70ab

chaizhenhua force-pushed the feat/playwright-chatgpt-agent branch from 4a00905 to 96d70ab Compare May 19, 2026 00:42

chaizhenhua added 7 commits May 19, 2026 23:17

🛡️ feat(bridges): playwright-chatgpt experimental — disabled default + opt-in setup

39dd149

🛡️ fix(worker): default allowlist covers claude/codex/openclaw post_check

52a36f7

✅ ci(rust): add cargo test --workspace gate for crates/migrations changes

3a8fb7b

🐛 fix(bridges): chat integration test passes workspaceDir + artifacts emit relative paths

a25714a

🐛 fix(bridges): cleanup timers/listeners on race resolve + cwd is required

aab0717

🐛 fix(bridges): waitForAbort removes its listener on race cleanup

f44a2f6

🐛 fix(bridges): waitForStopGone selector waits race the cleanup signal

096cfcd

🎭 feat(agents): playwright-chatgpt agent (browser-driven ChatGPT via ACP bridge) #62: chaizhenhua wants to merge 19 commits into master from feat/playwright-chatgpt-agent
