feat(plan-14): pillars 4 + 6 — computer_use IPC + OTel observability #24
Merged
Conversation
Lands two PLAN-14 pillars that were already drafted on disk but never committed.

Pillar 4 — desktop computer_use IPC (desktop/src-tauri/src/computer_use.rs)

Tauri command surface for OS-level computer-use actions, complementing the existing Playwright/CDP browser-use path on the Node side. The unified `computer_use` tool delegates browser actions to Node and OS-level actions (screenshot, mouse, keyboard) to these Tauri commands. Every action passes through the gateway's exec-approval-manager before reaching here, so the user can approve or deny each invocation.

Crates:
- xcap — cross-platform screen capture (X11 / Wayland / macOS / Windows)
- enigo — cross-platform input synthesis (mouse + keyboard)
- base64 — transports screenshots over the JSON IPC bridge

Gating: every command checks BITTERBOT_COMPUTER_USE=1. Default off, so the desktop binary doesn't ship an enabled OS-control surface to users who didn't opt in. Future: gateway-mediated session-level capability grants will replace the env-var gate.

Pillar 6 — OpenTelemetry init (src/observability/otel.ts + .test.ts)

Production-grade OTel SDK initialization, enabled only when OTEL_TRACES_EXPORTER (or OTEL_EXPORTER_OTLP_ENDPOINT) is set in the environment. This follows the standard OTel auto-config convention, so any OTLP-compatible collector (Grafana Tempo, Honeycomb, Datadog, Jaeger, etc.) works out of the box with no new config surface. When disabled, every helper is a no-op and the runtime cost is one env-var read per call to initOtel(). Dynamic imports keep the SDK out of the cold-start path for users who don't opt in, so adding the @opentelemetry/* deps is a separate, reversible step; the build doesn't break while that step is pending. A minimal sketch of the gating follows below. Pillar 6 unblocks Pillar 5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
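For readers skimming the diff, a minimal sketch of the env-gated init shape described above. The `initOtel` name and the two env vars come from this PR; the `NodeSDK` wiring is ordinary `@opentelemetry/sdk-node` usage, not a copy of `otel.ts`:

```ts
// otel-init-sketch.ts — env-gated, dynamically imported OTel init.
let sdkStarted = false;

export async function initOtel(): Promise<void> {
  // One env-var read per call when disabled — the advertised no-op cost.
  const enabled =
    process.env.OTEL_TRACES_EXPORTER ?? process.env.OTEL_EXPORTER_OTLP_ENDPOINT;
  if (!enabled || sdkStarted) return;

  // Dynamic imports keep @opentelemetry/* off the cold-start path
  // (and out of builds where the deps aren't installed yet).
  const { NodeSDK } = await import("@opentelemetry/sdk-node");
  const { OTLPTraceExporter } = await import(
    "@opentelemetry/exporter-trace-otlp-http"
  );

  const sdk = new NodeSDK({ traceExporter: new OTLPTraceExporter() });
  sdk.start();
  sdkStarted = true;
}
```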
Pillar 6 of PLAN-14: ship the first concrete OTel instrumentation
without forcing every operator onto a collector.
Three wire-points:
1. Gateway boot (`startGatewayServer`) calls `initOtel()`. No-op when
OTEL_TRACES_EXPORTER / OTEL_EXPORTER_OTLP_ENDPOINT is unset, so the
default install path keeps zero overhead.
2. Every gateway RPC method gets one `gateway.rpc.<method>` span via
`withSpan` in `handleGatewayRequest` (sketched after this list).
Captures rpc.method as an attribute and propagates exceptions to span
status automatically.
3. Each pi-embedded tool execution gets a paired
`agent.tool.<toolName>` span from start->end. Stored in
`toolSpansById` keyed by toolCallId; awaited only on the cold end
path so the start path stays hot.
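A plausible shape for the `withSpan` helper behind wire-point 2, built on the public `@opentelemetry/api` surface — the real signature in `otel.ts` may differ:

```ts
// with-span-sketch.ts — wrap a single async fn in one span.
import { trace, SpanStatusCode } from "@opentelemetry/api";

export async function withSpan<T>(
  name: string,
  attrs: Record<string, string>,
  fn: () => Promise<T>,
): Promise<T> {
  const tracer = trace.getTracer("bitterbot");
  return tracer.startActiveSpan(name, { attributes: attrs }, async (span) => {
    try {
      return await fn();
    } catch (err) {
      // Exceptions propagate to span status, as described above.
      span.recordException(err as Error);
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw err;
    } finally {
      span.end();
    }
  });
}

// e.g. in handleGatewayRequest:
//   await withSpan(`gateway.rpc.${method}`, { "rpc.method": method }, () => invoke());
```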
Adds `startSpan` to observability/otel.ts for any future paired-event
instrumentation (memory ops, dream phases) where withSpan's single-fn
shape doesn't fit.
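For the paired-event case, a sketch of how `startSpan` and the `toolSpansById` map described above could fit together — names come from this PR, internals are assumed:

```ts
// paired-span-sketch.ts — open a span at tool start, close it at tool end.
import { trace, type Span } from "@opentelemetry/api";

const toolSpansById = new Map<string, Span>();

export function onToolStart(toolCallId: string, toolName: string): void {
  // Hot path: open the span and return immediately.
  const span = trace.getTracer("bitterbot").startSpan(`agent.tool.${toolName}`);
  toolSpansById.set(toolCallId, span);
}

export function onToolEnd(toolCallId: string): void {
  // Cold path: close and drop the paired span.
  toolSpansById.get(toolCallId)?.end();
  toolSpansById.delete(toolCallId);
}
```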
Deps: @opentelemetry/{api,sdk-node,exporter-trace-otlp-http,resources,
semantic-conventions} added at workspace root. Dynamic-imported in
otel.ts so a pre-install dev tree still builds.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lays down the LangGraph-parity primitive that, paired with the Pillar 5 long-horizon runtime, lets a 6+ hour run be branched from any prior state without touching the original timeline.

Core: src/checkpoints/store.ts (schema sketched below)
- Single-table SQLite schema keyed on (thread_id, step_id), with parent_step_id enabling DAG branches.
- gzip-compressed state blobs + sha256 dedup hash; idempotent on repeated saves of the same step.
- ancestors() walks back to the root, oldest-first, for replay.
- fork() copies a chosen lineage into a new thread and adds a fork_root marker so timeline UIs can render branch points.
- WAL mode + busy_timeout=5000 so dashboard reads don't block writers.
- Separate DB from the memory store so checkpoint volume doesn't bloat the embedding index.

CLI: src/cli/checkpoints-cli.ts (registered as `bitterbot checkpoints`)
- threads: list threads with last activity + step count.
- list: enumerate checkpoints in a thread, oldest-first.
- show: print a single checkpoint's full state.
- fork: branch a thread from a chosen step.
- delete: drop every checkpoint in a thread.

Each subcommand accepts a --db override and --json. The default DB lives at ~/.bitterbot/checkpoints.sqlite (BITTERBOT_CHECKPOINT_DB overrides).

Tests: src/checkpoints/store.test.ts — 6 tests covering save, idempotency, ancestor walk, fork (including non-mutation of the source thread), thread listing, and delete-by-thread isolation.

This is the storage primitive only; integration with pi-embedded-runner to write user_message / assistant_message / tool_call / tool_result boundaries is the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
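A sketch of the single-table layout the bullets above imply. The key columns follow the commit text; the remaining column names and the use of `better-sqlite3` are assumptions, not the store's actual code:

```ts
// checkpoint-schema-sketch.ts — single-table checkpoint store layout.
import Database from "better-sqlite3";

export function openCheckpointDb(path: string): Database.Database {
  const db = new Database(path);
  // WAL + busy_timeout so dashboard reads don't block writers.
  db.pragma("journal_mode = WAL");
  db.pragma("busy_timeout = 5000");
  db.exec(`
    CREATE TABLE IF NOT EXISTS checkpoints (
      thread_id      TEXT NOT NULL,
      step_id        TEXT NOT NULL,
      parent_step_id TEXT,            -- enables DAG branches
      state_gz       BLOB NOT NULL,   -- gzip-compressed state blob
      state_sha256   TEXT NOT NULL,   -- dedup hash: repeated saves are idempotent
      created_at     INTEGER NOT NULL,
      PRIMARY KEY (thread_id, step_id)
    );
  `);
  return db;
}
```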
Phase 1 of the Pillar 6 #2 integration: an onAgentEvent listener (shape sketched below) writes each meaningful event to the checkpoint graph, using runId as thread_id and a per-run monotonic seq as step_id. Tool start/result pairs become parent-child checkpoints; assistant text deltas are deliberately skipped to keep the timeline navigable.

This converts the checkpoint store from a primitive into a working capability — `bitterbot checkpoints threads` now lists every run that produced a tool call, and `bitterbot checkpoints fork <thread> <step>` branches a fresh thread from any chosen point.

Wired into startGatewayServer alongside initOtel; gated by BITTERBOT_CHECKPOINTS=1 so the default install path stays zero-overhead.

Phase 2 (deferred) will dump full session snapshots from the runner at compaction/turn boundaries, enabling true replay. Today's state is the event payload only — sufficient for inspection and lineage UI, but not for full state reconstruction.

Tests: 4 new (parented chain, partial-frame skipping, env-gating, idempotency). Combined checkpoint suite: 10 tests, all passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
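A sketch of the Phase 1 listener shape described above, with the event and store types simplified to illustrate the runId → thread_id / seq → step_id mapping; field names are assumptions, not the real payloads:

```ts
// checkpoint-listener-sketch.ts — Phase 1 event-to-checkpoint wiring.
type AgentEvent = {
  runId: string;
  kind: "tool_start" | "tool_result" | "assistant_delta";
  payload: unknown;
};

interface CheckpointStore {
  save(cp: {
    threadId: string;
    stepId: string;
    parentStepId?: string;
    state: unknown;
  }): void;
}

const seqByRun = new Map<string, number>();

export function onAgentEvent(store: CheckpointStore, ev: AgentEvent): void {
  if (process.env.BITTERBOT_CHECKPOINTS !== "1") return; // default path: zero overhead
  if (ev.kind === "assistant_delta") return; // skip text deltas: keep timeline navigable

  const seq = (seqByRun.get(ev.runId) ?? 0) + 1; // per-run monotonic seq as step_id
  seqByRun.set(ev.runId, seq);

  store.save({
    threadId: ev.runId, // runId as thread_id
    stepId: String(seq),
    // chaining each event to its predecessor yields the parent-child
    // tool start/result pairs described above
    parentStepId: seq > 1 ? String(seq - 1) : undefined,
    state: ev.payload, // today: event payload only, not a full session snapshot
  });
}
```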
Headless-friendly OS control: the orchestrator daemon gains six IPC commands (screenshot, screen_size, mouse_move, mouse_click, type, key) that any agent on any platform can drive without a Tauri window in between. Browser automation continues to flow through the existing pw-tools-core path; this is the OS counterpart.

Two-stage gating, by design:
1. Build-time: `cargo build --features=computer-use` opts into linking xcap + enigo. Default builds (the relay fleet, generic Linux boxes) omit the deps entirely, so X11/libxdo system requirements don't leak.
2. Runtime: even on a feature-built binary, BITTERBOT_COMPUTER_USE=1 must be set before the orchestrator will act. A misconfigured node can never silently start clicking.

orchestrator/src/computer.rs holds the actual wrapper (xcap for capture, enigo for input synthesis). The cfg(not(feature)) path returns a clear "feature not built" envelope so Node-side callers surface the cause.

Node side:
- OrchestratorBridge gains computerScreenshot / computerScreenSize / computerMouseMove / computerMouseClick / computerType / computerKey, each returning a normalized ComputerUseResult discriminated by `ok` (see the sketch below).
- A module-scoped accessor (setActiveOrchestratorBridge / getActiveOrchestratorBridge) lets agent tools reach the live bridge without threading it through every factory; the gateway registers it on startup right after `Bridge.start()`.
- A new unified `computer_use` agent tool routes screenshot / mouse / keyboard actions through the bridge. Wired into bitterbot-tools.ts alongside the existing browser tool.

5 unit tests, all passing. Default-feature orchestrator build verified: 1m14s, exit 0, no new deps pulled in. The relay fleet's cloud-init is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
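A sketch of the normalized Node-side surface described above. The `ok` discriminant and the accessor/function names come from this PR; the payload fields and the minimal bridge interface are assumed for illustration:

```ts
// computer-use-bridge-sketch.ts — normalized result + module-scoped accessor.
export type ComputerUseResult =
  | { ok: true; data?: unknown } // e.g. a base64 screenshot from the daemon
  | { ok: false; error: string }; // e.g. the "feature not built" envelope

// Minimal bridge surface assumed for illustration.
interface OrchestratorBridge {
  computerScreenshot(): Promise<ComputerUseResult>;
  computerMouseClick(x: number, y: number): Promise<ComputerUseResult>;
}

let activeBridge: OrchestratorBridge | undefined;

export function setActiveOrchestratorBridge(b: OrchestratorBridge): void {
  activeBridge = b; // gateway registers this right after Bridge.start()
}

export function getActiveOrchestratorBridge(): OrchestratorBridge | undefined {
  return activeBridge; // agent tools reach the live bridge without factory threading
}
```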
Pillar 5 lands the orchestration layer that turns the checkpoint store into a genuine long-horizon capability: a runtime that drives an agent through work → rest → dream cycles, writes a parent-chained checkpoint timeline at every phase boundary, and resumes from the latest tip.

LongHorizonRuntime in src/agents/long-horizon/runtime.ts (cycle sketched below):
- Phases: work (configurable workMs window), rest (cool-down), dream (one pass of the supplied dreamStep). Repeats until any of: workStep returns done, max iterations hit, wall-clock budget exhausted, or AbortSignal fires.
- Checkpoints at each phase boundary using `kind: "custom"` with a `phase` metadata field, so timeline UIs can colour-code the cycle.
- Test seams (`now`, `sleep`) so the cycle can be driven through fake time without real timers.
- LongHorizonRuntime.resume(threadId, store) returns the latest step id from the store — the entry point for resume after restart.
- Wrapped in `long_horizon.run` / `long_horizon.work_step` / `long_horizon.dream_step` OTel spans, so a multi-hour run produces trace coverage at the right granularity for production debugging.

Pillar 6 #1 follow-up — memory hot-path spans:
- `memory.search` and `memory.dream` now run inside withSpan, so a collector sees the same boundaries the agent itself sees. The search span carries query-length and max-results attributes; the dream span marks engine identity. Zero overhead when OTel is disabled.

Tests: 5 new long-horizon tests covering work-rest-dream rotation with parent-chained checkpoints, early-done, abort, budget cap, and resume. Combined PLAN-14 suite: 26 tests, all passing.

This is the final foundational piece for Pillar 5 — wiring an actual 6+ hour agent run on top is now a `workStep: () => agent.step()` composition, not an architectural project.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
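A sketch of the work → rest → dream cycle shape. The option names (`workMs`, `workStep`, `dreamStep`, the `now`/`sleep` seams) follow the commit text; the loop body is a simplified assumption, with checkpoint writes and span wrapping elided:

```ts
// long-horizon-sketch.ts — simplified work/rest/dream driver.
interface LongHorizonOpts {
  workMs: number; // bound for a single work phase (enforcement elided here)
  restMs: number;
  maxIterations: number;
  budgetMs: number; // wall-clock budget for the whole run
  workStep: () => Promise<{ done: boolean }>;
  dreamStep: () => Promise<void>;
  signal?: AbortSignal;
  now?: () => number; // test seam: fake clock
  sleep?: (ms: number) => Promise<void>; // test seam: fake timers
}

export async function runLongHorizon(opts: LongHorizonOpts): Promise<void> {
  const now = opts.now ?? Date.now;
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((r) => setTimeout(r, ms)));
  const start = now();

  for (let i = 0; i < opts.maxIterations; i++) {
    // Stop on abort or exhausted wall-clock budget.
    if (opts.signal?.aborted || now() - start >= opts.budgetMs) return;

    const { done } = await opts.workStep(); // work phase (checkpoint boundary)
    if (done) return; // early-done

    await sleep(opts.restMs); // rest phase (checkpoint boundary)
    await opts.dreamStep(); // dream phase (checkpoint boundary)
  }
}
```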
Summary
Lands two PLAN-14 pillars that were already drafted on disk but never committed.
Pillar 4 — desktop computer_use IPC (desktop/src-tauri/src/computer_use.rs)

Tauri command surface for OS-level computer-use actions, complementing the existing Playwright/CDP browser-use path on the Node side. The unified `computer_use` tool delegates browser actions to Node and OS-level (screenshot, mouse, keyboard) actions to these Tauri commands. The gateway exec-approval-manager mediates each invocation, so the user approves or denies before anything fires.

Crates: `xcap` (cross-platform screen capture), `enigo` (cross-platform input synthesis), `base64` (screenshot transport over JSON IPC).

Gated by `BITTERBOT_COMPUTER_USE=1`. Default off — the binary doesn't ship an enabled OS-control surface to users who didn't opt in. Future: gateway-mediated session capability grants replace the env-var gate.

Pillar 6 — OpenTelemetry init (src/observability/otel.{ts,test.ts})

Production-grade OTel SDK initialization. Enabled only when `OTEL_TRACES_EXPORTER` or `OTEL_EXPORTER_OTLP_ENDPOINT` is set in the environment, matching the standard auto-config convention, so any OTLP-compatible collector (Grafana Tempo, Honeycomb, Datadog, Jaeger) works out of the box with no new config surface.

When disabled, every helper is a no-op (one env-var read per call). Dynamic imports keep the SDK out of the cold-start path, so adding the `@opentelemetry/*` deps is a separate, reversible step; the build doesn't break while that step is pending.

Pillar 6 unblocks Pillar 5.
Test plan
- `BITTERBOT_COMPUTER_USE` unset — desktop bundle still ships; Tauri commands return errors when invoked
- `BITTERBOT_COMPUTER_USE=1` + `OTEL_TRACES_EXPORTER=otlp` — verify spans land in a local OTel collector
- `pnpm test src/observability/otel.test.ts` passes

🤖 Generated with Claude Code