Implement the daemon process separation described in ADR-0010. The current in-process architecture allows agents to forge, suppress, or tamper with their own receipts, and concretely breaks under concurrent emitters (see comments below). This work moves signing and storage into a separate agent-receipts daemon running as its own OS user, and reduces every emitter (OpenClaw plugin, MCP proxy, SDK) to a thin fire-and-forget IPC client.
Background
See ADR-0010 for the full rationale. The short version: an agent auditing itself is not a meaningful audit. The daemon separation restores the tamper-evidence property and collapses N independent crypto/storage stacks into one shared chain.
The two comments below document concrete bugs the daemon split fixes that no smaller intervention can:
Phase 1 CI landed in ci: add daemon module workflow #326: .github/workflows/daemon.yml triggers on daemon/** or sdk/go/** and runs vet, build of ./cmd/..., and the combined unit + integration test suite with -race.
macOS peer.exe_path + macOS CI matrix + subprocess peer-cred test landed in daemon(socket): populate peer.exe_path on macOS via SYS_PROC_INFO syscall #328. peercred_darwin.go calls SYS_PROC_INFO(PROC_PIDPATHINFO) directly via unix.Syscall6 (no CGO). daemon.yml now runs [ubuntu-latest, macos-latest] with fail-fast: false, so the darwin syscall paths (SYS_PROC_INFO, LOCAL_PEERCRED, LOCAL_PEEREPID) are exercised every PR. New TestPeerCredFromSubprocess re-execs the test binary as a separate process and asserts peer.pid != os.Getpid(), closing the gap where same-process tests can't distinguish client from server. capturePeer tolerates LOCAL_PEEREPID returning ENOTCONN (rapid connect→write→close races macOS's live-pcb readout) by recording pid=0 rather than dropping the frame.
mcp-proxy / OpenClaw / SDK emitters are still unchanged — that's the next phase.
Suggested next piece
Phase 1 follow-ups are now drained on the daemon side. Three paths from here, ordered by impact:
Critical-path (recommended): resolve Open Questions 2, 3, and 4. None require code — they're design calls about chain migration policy, cutover sequencing, and uniform session_id allocation. Resolving the three is the strict gate on Section 3 (thin-emitter refactor); without them, every emitter PR risks being re-litigated mid-review. Best done as a discussion that produces a short ADR-0010 follow-up amendment or a new ADR. Right size for a single review session.
Spec change: top-level peer field on receipts. Moves peer attestation out of action.parameters_disclosure into the canonical spec field per the ADR-0010 Schema split section. Doc-only but touches spec/ and so requires explicit human approval per AGENTS.md before any agent work. Smaller and more deterministic than the OQ resolution above; good if you want a quick concrete win before the bigger design pass.
Begin Section 5 (Packaging) design. Homebrew formula + launchd plist + systemd unit + operator docs. Independent of the emitter refactor, but substantive enough to need a design pass first (operator-facing ownership policy for agentreceipts / agentreceipts-read is part of this). Could be split into its own issue.
MVP scope (first cut)
In scope for MVP:
macOS launchd + Linux systemd
Homebrew formula for daemon distribution
Single chain, single signing key, file-backed
Thin emitters across mcp-proxy + OpenClaw + Go/TS/Py SDKs
Out of scope for MVP — split into follow-up issues:
Windows Service installer + named pipes (separate issue)
.deb / .rpm packaging (separate issue)
agent-receipts tail -f read socket (already noted in ADR-0010)
Unprivileged-install fallback path: see Open Question 1 — needs a single rule for both Linux ($XDG_RUNTIME_DIR/...) and macOS (no XDG_RUNTIME_DIR) — resolved: Linux $XDG_RUNTIME_DIR/agentreceipts or /run/agentreceipts; macOS $TMPDIR/agentreceipts. Captured in ADR-0010 via docs(adr-0010): amend IPC framing and default socket paths to match shipped defaults #327.
Socket path configurable via env var (AGENTRECEIPTS_SOCKET)
Non-blocking send on emitter side; EAGAIN increments a local drop counter instead of blocking — deferred by design to the thin-emitter refactor; the daemon side has nothing to do here.
Drop counter flushed on next successful event; document the narrow loss window (emitter crash after drop, before flush) — same: emitter-side, ships with the thin-emitter refactor.
2. Daemon process (agent-receipts-daemon)
Sole owner of Ed25519 signing keys and SQLite database
Internal KeySource interface — file-backed adapter for MVP. Shape must satisfy ADR-0015 (Sign, PublicKey, Rotate, Init, Teardown) so PKCS#11 / cloud-KMS adapters land as adapters later, not as a redesign — daemon: phase 1 of ADR-0010 — standalone signing daemon foundation #322 ships Sign / PublicKey / VerificationMethod / Rotate / Init / Teardown; file-backed PEM adapter refuses keys looser than 0600.
Peer credential capture at connection-accept time:
Phase 1 stashes peer attestation in action.parameters_disclosure under peer.* keys (signature-protected) instead of the ADR's top-level peer field, because the latter requires a spec change that's out of scope per AGENTS.md. Tracked as follow-up.
RFC 8785 canonicalization (moved exclusively from emitters to daemon)
Hash-chaining and Ed25519 signing (seq, prev_hash, ts_recv, peer, id added by daemon)
In-memory ownership of (sequence, prev_hash) resumed on startup via GetChainTail(chainID) -> (seq, hash, found, err) — single ORDER BY sequence DESC LIMIT 1 query. Emitters must never allocate sequence numbers (per comment below) — added to ReceiptStore interface and SQLite *Store in sdk/go/store.
SQLite persistence with events_dropped synthetic receipt when a gap is recorded — deferred by design: this mechanism belongs with the emitter side (EAGAIN handling, dropped-counter flush) and ships with the thin-emitter refactor. See daemon/README.md "Phase 1 scope and deviations".
DB permissions: 0640 owner agentreceipts, group agentreceipts-read; public key 0644 — file-mode portion done in daemon: ship agent-receipts verify and 0644 public-key publishing #325: tightenDBFiles caps DB/WAL/SHM at 0640 after store.Open and a restrictive umask catches the new-file case at the source; daemon publishes the SPKI public key at <KeyPath>.pub (overridable via --public-key / AGENTRECEIPTS_PUBLIC_KEY) with mode 0644 on every startup, refusing to overwrite a mismatched file or a non-regular path. Owner/group ownership (agentreceipts user / agentreceipts-read group) is a packaging concern and lands with launchd / systemd / Homebrew.
3. Thin emitter refactor
Remove signing, storage, and canonicalization from @agnt-rcpt/openclaw (→ v2)
Remove signing, storage, and canonicalization from mcp-proxy (→ v2)
Remove signing, storage, and canonicalization from Go / TS / Py SDKs (→ v2)
session_id allocation rule per Open Question 4 — must be uniform across all three SDK emitters
Silent drop when daemon is not running (connect fails); EAGAIN drop counter flush on next successful send
4. Read interface
agent-receipts verify CLI reads DB and public key directly via filesystem — must work when daemon is down — daemon: ship agent-receipts verify and 0644 public-key publishing #325: new cmd/agent-receipts binary with verify subcommand. Uses sdk/go/store.OpenReadOnly (?mode=ro DSN, no schema/migration writes) so it coexists with the active daemon writer; reads the daemon-published public-key file. Stable exit codes: 0 valid, 1 broken, 2 usage error. Validates the public-key PEM/SPKI/Ed25519 shape upfront so a malformed key is ExitUsageError, not ExitChainBad.
Independent verifiability is not gated on daemon availability — daemon: ship agent-receipts verify and 0644 public-key publishing #325: integration tests TestVerifyCLIWhileDaemonRunning (daemon up + writing) and TestVerifyCLIWithDaemonStopped (daemon shut down between emit and verify) pin both halves.
5. Packaging (MVP)
Homebrew formula
launchd plist for macOS
systemd unit file for Linux (raw unit, not yet .deb/.rpm)
Dropped events are never invisible: gaps appear as events_dropped receipts in the chain
The daemon runs as its own OS user via standard service-manager integration on macOS and Linux
Regression tests for the bugs that motivated this
Two mcp-proxy instances started concurrently both emit successfully into one chain — no UNIQUE index conflict, no retry loop, no chain integrity break (regression for the concurrent-tail-allocation bug in comment 2) — covered by TestConcurrentEmittersSingleChain in daemon/integration_test.go (4 emitters × 50 frames). The shape of the test is daemon-level rather than two mcp-proxy processes; the listener-collision regression below covers the two-process angle once thin emitters land.
Two mcp-proxy instances started concurrently do not collide on a single listener port (regression for comment 1, listener case) — blocked on the thin-emitter refactor.
A sandboxed emitter (read-only filesystem access to the canonical DB path) emits successfully via the daemon socket (regression for comment 1, RO-DB case) — blocked on the thin-emitter refactor.
Existing chain migration policy. v1 users have per-emitter SQLite databases. Three options: (a) in-place migration into the daemon's DB on first run; (b) abandon old chains, daemon starts a fresh chain; (c) one-shot agent-receipts import-chain script. Solo-dev usage means (b) is cheap; pick deliberately.
SDK cutover sequencing. All three SDKs + OpenClaw + mcp-proxy in one PR/release, or phased per channel? Phased keeps PRs reviewable but means mixed v1/v2 chain state for the duration.
session_id allocation rule. Emitter generates at startup? Per agent-run? How does it survive emitter reconnect to the daemon? Affects all three SDK emitter designs and must be uniform.
In-scope for ADR-0016 (audit encryption at rest)? ADR-0016 names the daemon as the natural home for BEACON_ENCRYPTION_KEY. Default position: defer to a follow-up issue, but make it explicit so the daemon's storage layer does not need re-architecting.
CI workflow for the new daemon/ Go module. Done (ci: add daemon module workflow #326): .github/workflows/daemon.yml triggers on daemon/** or sdk/go/** and runs vet, build of ./cmd/..., and the combined unit + integration test suite with -race.
Subprocess peer-cred test (TestPeerCredCaptured runs daemon.Run in a goroutine, so client and server share a process and the test can't distinguish a regression that records the listener's pid). Done (daemon(socket): populate peer.exe_path on macOS via SYS_PROC_INFO syscall #328): TestPeerCredFromSubprocess re-execs os.Args[0] with a guarded helper env var and asserts peer.pid != os.Getpid().
Operator-facing ownership policy for agentreceipts user / agentreceipts-read group — lands with packaging.
Related
ADR-0001: Ed25519 signing — key now lives only in the daemon
ADR-0002: RFC 8785 canonicalization — moves exclusively to the daemon
ADR-0004: SQLite storage — daemon is sole writer; readers use filesystem permissions
ADR-0010: this issue's substrate
ADR-0015: key rotation, BYOK, anchoring — daemon must hold its key behind a KeySource interface so 0015 lands as adapters
ADR-0016: audit encryption at rest — see Open Question 5
Summary
Implement the daemon process separation described in ADR-0010. The current in-process architecture allows agents to forge, suppress, or tamper with their own receipts, and concretely breaks under concurrent emitters (see comments below). This work moves signing and storage into a separate agent-receipts daemon running as its own OS user, and reduces every emitter (OpenClaw plugin, MCP proxy, SDK) to a thin fire-and-forget IPC client.
Background
See ADR-0010 for the full rationale. The short version: an agent auditing itself is not a meaningful audit. The daemon separation restores the tamper-evidence property and collapses N independent crypto/storage stacks into one shared chain.
The two comments below document concrete bugs the daemon split fixes that no smaller intervention can:
Comment 1, listener case: concurrent instances collide on a single listener port (listen tcp 127.0.0.1:8082: bind: address already in use).
Comment 2: mcp-proxy instances racing on chain tail allocation, breaking chain integrity (UNIQUE index rejects the second insert at seq=N+1).
ADR-0015 (KeySource, BYOK, anchoring) and ADR-0016 (audit encryption at rest) both build on this ADR as substrate.
Status
Done in daemon: ship agent-receipts verify and 0644 public-key publishing #325: file-mode portion of Section 2 DB-permissions, public-key publishing at 0644, and the agent-receipts verify CLI in Section 4.
Phase 1 CI landed in ci: add daemon module workflow #326: .github/workflows/daemon.yml triggers on daemon/** or sdk/go/** and runs vet, build of ./cmd/..., and the combined unit + integration test suite with -race.
docs(adr-0010): amend IPC framing and default socket paths to match shipped defaults #327: ADR-0010 now describes the SOCK_STREAM + 4-byte length-prefix framing that actually shipped in daemon: phase 1 of ADR-0010 - standalone signing daemon foundation #322 (macOS AF_UNIX doesn't implement SOCK_SEQPACKET), plus the per-user default socket paths ($TMPDIR/... on macOS, $XDG_RUNTIME_DIR/... on Linux when set, /run/... fallback). New Amendments section records both deviations.
macOS peer.exe_path + macOS CI matrix + subprocess peer-cred test landed in daemon(socket): populate peer.exe_path on macOS via SYS_PROC_INFO syscall #328. peercred_darwin.go calls SYS_PROC_INFO(PROC_PIDPATHINFO) directly via unix.Syscall6 (no CGO). daemon.yml now runs [ubuntu-latest, macos-latest] with fail-fast: false, so the darwin syscall paths (SYS_PROC_INFO, LOCAL_PEERCRED, LOCAL_PEEREPID) are exercised every PR. New TestPeerCredFromSubprocess re-execs the test binary as a separate process and asserts peer.pid != os.Getpid(), closing the gap where same-process tests can't distinguish client from server. capturePeer tolerates LOCAL_PEEREPID returning ENOTCONN (rapid connect→write→close races macOS's live-pcb readout) by recording pid=0 rather than dropping the frame.
mcp-proxy / OpenClaw / SDK emitters are still unchanged; that is the next phase.
Suggested next piece
Phase 1 follow-ups are now drained on the daemon side. Three paths from here, ordered by impact:
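Critical-path (recommended): resolve Open Questions 2, 3, and 4. None require code; they are design calls about chain migration policy, cutover sequencing, and uniform session_id allocation. Resolving the three is the strict gate on Section 3 (thin-emitter refactor); without them, every emitter PR risks being re-litigated mid-review. Best done as a discussion that produces a short ADR-0010 follow-up amendment or a new ADR. Right size for a single review session.
Spec change: top-level peer field on receipts. Moves peer attestation out of action.parameters_disclosure into the canonical spec field per the ADR-0010 Schema split section. Doc-only but touches spec/ and so requires explicit human approval per AGENTS.md before any agent work. Smaller and more deterministic than the OQ resolution above; good if you want a quick concrete win before the bigger design pass.
Begin Section 5 (Packaging) design. Homebrew formula + launchd plist + systemd unit + operator docs. Independent of the emitter refactor, but substantive enough to need a design pass first (operator-facing ownership policy for agentreceipts / agentreceipts-read is part of this). Could be split into its own issue.
MVP scope (first cut)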
In scope for MVP:
Out of scope for MVP — split into follow-up issues:
.deb / .rpm packaging (separate issue)
agent-receipts tail -f read socket (already noted in ADR-0010)
Work breakdown
1. IPC transport layer
Socket paths /run/agentreceipts/events.sock (Linux) and /var/run/agentreceipts/events.sock (macOS). daemon: phase 1 of ADR-0010 - standalone signing daemon foundation #322 ships SOCK_STREAM with 4-byte big-endian length-prefix framing instead of SOCK_SEQPACKET (macOS AF_UNIX doesn't support SEQPACKET); peer-cred works identically on stream sockets, so the trust model is unchanged. ADR amended in docs(adr-0010): amend IPC framing and default socket paths to match shipped defaults #327.
Unprivileged-install fallback path: see Open Question 1. Needs a single rule for both Linux ($XDG_RUNTIME_DIR/...) and macOS (no XDG_RUNTIME_DIR). Resolved: Linux $XDG_RUNTIME_DIR/agentreceipts or /run/agentreceipts; macOS $TMPDIR/agentreceipts. Captured in ADR-0010 via docs(adr-0010): amend IPC framing and default socket paths to match shipped defaults #327.
Socket path configurable via env var (AGENTRECEIPTS_SOCKET)
Non-blocking send on emitter side; EAGAIN increments a local drop counter instead of blocking. Deferred by design to the thin-emitter refactor; the daemon side has nothing to do here.
2. Daemon process (agent-receipts-daemon)
Internal KeySource interface: file-backed adapter for MVP. Shape must satisfy ADR-0015 (Sign, PublicKey, Rotate, Init, Teardown) so PKCS#11 / cloud-KMS adapters land as adapters later, not as a redesign. daemon: phase 1 of ADR-0010 - standalone signing daemon foundation #322 ships Sign / PublicKey / VerificationMethod / Rotate / Init / Teardown; file-backed PEM adapter refuses keys looser than 0600.
Peer credential capture at connection-accept time:
Linux: SO_PEERCRED (uid, gid, pid)
macOS: LOCAL_PEERCRED + LOCAL_PEEREPID. ENOTCONN on LOCAL_PEEREPID is tolerated as pid=0 (race window between accept and getsockopt for fast connect→write→close emitters) per daemon(socket): populate peer.exe_path on macOS via SYS_PROC_INFO syscall #328.
peer.exe_path: /proc/<pid>/exe (Linux); macOS uses the SYS_PROC_INFO(PROC_PIDPATHINFO) syscall directly (the call libproc's proc_pidpath() wraps), keeping the daemon CGO-free. Linux populated since #322; macOS populated since #328 and exercised in CI on macos-latest.
Phase 1 stashes peer attestation in action.parameters_disclosure under peer.* keys (signature-protected) instead of the ADR's top-level peer field, because the latter requires a spec change that's out of scope per AGENTS.md. Tracked as follow-up.
In-memory ownership of (sequence, prev_hash) resumed on startup via GetChainTail(chainID) -> (seq, hash, found, err): a single ORDER BY sequence DESC LIMIT 1 query. Emitters must never allocate sequence numbers (per comment below). Added to ReceiptStore interface and SQLite *Store in sdk/go/store.
SQLite persistence with events_dropped synthetic receipt when a gap is recorded. Deferred by design: this mechanism belongs with the emitter side (EAGAIN handling, dropped-counter flush) and ships with the thin-emitter refactor. See daemon/README.md "Phase 1 scope and deviations".
DB permissions: 0640 owner agentreceipts, group agentreceipts-read; public key 0644. File-mode portion done in daemon: ship agent-receipts verify and 0644 public-key publishing #325: tightenDBFiles caps DB/WAL/SHM at 0640 after store.Open and a restrictive umask catches the new-file case at the source; daemon publishes the SPKI public key at <KeyPath>.pub (overridable via --public-key / AGENTRECEIPTS_PUBLIC_KEY) with mode 0644 on every startup, refusing to overwrite a mismatched file or a non-regular path. Owner/group ownership (agentreceipts user / agentreceipts-read group) is a packaging concern and lands with launchd / systemd / Homebrew.
3. Thin emitter refactor
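Remove signing, storage, and canonicalization from @agnt-rcpt/openclaw (→ v2)
Remove signing, storage, and canonicalization from mcp-proxy (→ v2)
Emitter-supplied event fields: v, ts_emit, session_id, channel, tool, input, output, error, decision
session_id allocation rule per Open Question 4; must be uniform across all three SDK emitters
Silent drop when daemon is not running (connect fails); EAGAIN drop counter flush on next successful send
4. Read interface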
agent-receipts verify CLI reads DB and public key directly via filesystem and must work when daemon is down. daemon: ship agent-receipts verify and 0644 public-key publishing #325: new cmd/agent-receipts binary with verify subcommand. Uses sdk/go/store.OpenReadOnly (?mode=ro DSN, no schema/migration writes) so it coexists with the active daemon writer; reads the daemon-published public-key file. Stable exit codes: 0 valid, 1 broken, 2 usage error. Validates the public-key PEM/SPKI/Ed25519 shape upfront so a malformed key is ExitUsageError, not ExitChainBad.
Independent verifiability is not gated on daemon availability. daemon: ship agent-receipts verify and 0644 public-key publishing #325: integration tests TestVerifyCLIWhileDaemonRunning (daemon up + writing) and TestVerifyCLIWithDaemonStopped (daemon shut down between emit and verify) pin both halves.
5. Packaging (MVP)
systemd unit file for Linux (raw unit, not yet .deb / .rpm)
6. Version and migration
@agnt-rcpt/openclaw, mcp-proxy, and all three SDKs (daemon is now a runtime requirement)
Acceptance criteria
seq
agent-receipts verify works with the daemon stopped; covered by TestVerifyCLIWithDaemonStopped (daemon: ship agent-receipts verify and 0644 public-key publishing #325).
Dropped events are never invisible: gaps appear as events_dropped receipts in the chain
Regression tests for the bugs that motivated this
Two mcp-proxy instances started concurrently both emit successfully into one chain: no UNIQUE index conflict, no retry loop, no chain integrity break (regression for the concurrent-tail-allocation bug in comment 2). Covered by TestConcurrentEmittersSingleChain in daemon/integration_test.go (4 emitters × 50 frames). The shape of the test is daemon-level rather than two mcp-proxy processes; the listener-collision regression below covers the two-process angle once thin emitters land.
Two mcp-proxy instances started concurrently do not collide on a single listener port (regression for comment 1, listener case). Blocked on the thin-emitter refactor.
pid / uid / exe_path recorded on the synthesised peer field for both linux and darwin discriminators. TestPeerCredCaptured covers same-process capture; TestPeerCredFromSubprocess (daemon(socket): populate peer.exe_path on macOS via SYS_PROC_INFO syscall #328) re-execs the test binary as a separate process and asserts peer.pid != os.Getpid(), closing the gap where same-process tests can't distinguish client from server. Both run on ubuntu-latest and macos-latest from #328.
TestResumesChainAfterRestart exercises the full GetChainTail wire-through.
(daemon: ship agent-receipts verify and 0644 public-key publishing #325) agent-receipts verify succeeds with the daemon up and writing: TestVerifyCLIWhileDaemonRunning.
(#325) Published public-key file is mode 0644 on every daemon startup: TestPublishedPublicKeyHasMode0644.
(#325) Fresh-write path refuses a pre-planted symlink at the public-key path; the attacker's target is unchanged: TestPublishPublicKey_FreshWriteRefusesPreCreatedSymlink.
(#325) agent-receipts verify reports a malformed public key as a usage error rather than implicating the chain: TestRun_MalformedPublicKeyIsUsageError.
Open questions (resolve before kickoff)
Unprivileged-install socket path on macOS. Resolved (daemon: phase 1 of ADR-0010 - standalone signing daemon foundation #322): Linux $XDG_RUNTIME_DIR/agentreceipts or /run/agentreceipts; macOS $TMPDIR/agentreceipts. Configurable via AGENTRECEIPTS_SOCKET.
Existing chain migration policy. v1 users have per-emitter SQLite databases. Three options: (a) in-place migration into the daemon's DB on first run; (b) abandon old chains, daemon starts a fresh chain; (c) one-shot agent-receipts import-chain script. Solo-dev usage means (b) is cheap; pick deliberately.
SDK cutover sequencing. All three SDKs + OpenClaw + mcp-proxy in one PR/release, or phased per channel? Phased keeps PRs reviewable but means mixed v1/v2 chain state for the duration.
session_id allocation rule. Emitter generates at startup? Per agent-run? How does it survive emitter reconnect to the daemon? Affects all three SDK emitter designs and must be uniform.
In-scope for ADR-0016 (audit encryption at rest)? ADR-0016 names the daemon as the natural home for BEACON_ENCRYPTION_KEY. Default position: defer to a follow-up issue, but make it explicit so the daemon's storage layer does not need re-architecting.
Advance ADR-0010 to Accepted before kickoff. Resolved (#322): ADR-0010 status flipped Proposed → Accepted.
Phase 1 follow-ups (from #322 / #325 / #328 deviations)
ADR-0010 amendment: SOCK_STREAM + length-prefix framing instead of SOCK_SEQPACKET. Done (docs(adr-0010): amend IPC framing and default socket paths to match shipped defaults #327): IPC transport section rewritten to match what shipped, with a new Amendments section recording both the framing and per-user-default-path deviations.
Spec change: top-level peer field on receipts, then move peer attestation out of action.parameters_disclosure.
macOS peer.exe_path via proc_pidpath (CGO or raw libSystem call). Done (daemon(socket): populate peer.exe_path on macOS via SYS_PROC_INFO syscall #328): peercred_darwin.go calls SYS_PROC_INFO(PROC_PIDPATHINFO) directly via unix.Syscall6, with no CGO. Exercised on macos-latest in CI.
CI workflow for the new daemon/ Go module. Done (ci: add daemon module workflow #326): .github/workflows/daemon.yml triggers on daemon/** or sdk/go/** and runs vet, build of ./cmd/..., and the combined unit + integration test suite with -race.
Add macos-latest to the daemon CI matrix. Done (#328): fail-fast: false matrix runs both Linux (SO_PEERCRED, /proc/<pid>/exe) and macOS (LOCAL_PEERCRED, LOCAL_PEEREPID, SYS_PROC_INFO(PROC_PIDPATHINFO)) peer-credential paths every PR.
Subprocess peer-cred test (TestPeerCredCaptured runs daemon.Run in a goroutine, so client and server share a process and the test can't distinguish a regression that records the listener's pid). Done (#328): TestPeerCredFromSubprocess re-execs os.Args[0] with a guarded helper env var and asserts peer.pid != os.Getpid().
Operator-facing ownership policy for agentreceipts user / agentreceipts-read group; lands with packaging.
Related
ADR-0015: key rotation, BYOK, anchoring; daemon must hold its key behind a KeySource interface so 0015 lands as adapters