feat: contracted-grid auth + self-routing commands + persona persistence/cognition fixes by joelteply · Pull Request #1726 · CambrianTech/continuum

joelteply · 2026-06-22T05:46:02Z

The full stretch since canary (32 commits, each individually validated). Three coherent bodies of work, all on headless Rust, all proven live where applicable.

1. Self-routing command infrastructure

DynCommand object + ActionCommand base trait → stateless self-registration, one dynamic registry (no switch/list duplication), commands/list. Procedural param adaptation + schema exposure so every interface (cu CLI, persona tools, SDKs) adapts from one source. Identity flows into Ctx (listed == callable, per identity). Pure-Rust cu start/stop + client; legacy Node start orchestrator quarantined. Docs: COMMAND-ORGANIZATION, BUILD-AND-PACKAGING.

2. The contracted grid — capability-grant auth (READY FOR COMPUTE)

End-to-end signed-grant authorization so the grid can sell compute. Proven E2E with two real airc peers (issue → present → verify → run; tier-deny holds without a grant):

airc (merged to canary): #1276 transport (HEADER_AIRC_CAPABILITY_GRANT, SignedCapabilityGrant::sign, Airc::peer_public_key), #1277 mesh_identity pub, #1278 Airc::sign_grant. continuum airc pin → 55790e1.
Verify: GrantAuthorizer (ed25519, durable SqliteEpochWatermark anti-replay, authenticated presenting key) + gate fast-path + handler, installed on the live persona path. All 3 review hard gates satisfied.
Issue/present: issue_grant primitive, PresentedGrantStore + AircTransport stamping, grid/grant/issue (Owner-only) operator command.

3. Persona persistence + cognition (Asha, proven live)

Persistent identity: seed::ensure_seed self-heals the seed on every bootstrap + preserves birth time; regression test pins write-path == resumer-scan-path. Live: Asha resumes as herself (resumed_count=1, same id 90e758b2, 12 engrams intact) across a restart.
Reasoning separation: TextGenerationResponse.reasoning + extract_reasoning strip <think> at the adapter boundary (server reasoning_content → inline split → unclosed-runaway → empty text). Fixed the leak where the persona dumped its whole chain-of-thought; reasoning captured for the harness, room sees clean text.
Thinking toggle: ThinkingMode + Qwen3 /no_think soft-switch; the local unsloth reasoning gateway defaults to Suppress (env override UNSLOTH_THINKING=on). Live: Asha answers clean + correct ("144", "Blue.", "4:30pm").
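A minimal sketch of the reasoning split at the adapter boundary, assuming the precedence listed above (server-supplied reasoning first, then an inline `<think>…</think>` block, then an unclosed `<think>` treated as all reasoning). The `Separated` struct and this exact signature are illustrative assumptions, not the real `extract_reasoning` in the adapter.

```rust
/// Reasoning kept for the harness, plus the clean text the room sees.
#[derive(Debug, PartialEq)]
pub struct Separated {
    pub reasoning: Option<String>,
    pub text: String,
}

/// Hypothetical helper: separate chain-of-thought from the visible answer.
pub fn extract_reasoning(raw: &str, server_reasoning: Option<&str>) -> Separated {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";

    // 1. The server already split reasoning out (e.g. a reasoning_content field).
    if let Some(r) = server_reasoning {
        return Separated { reasoning: Some(r.trim().to_string()), text: raw.trim().to_string() };
    }
    // No tag at all: the whole completion is visible text.
    let Some(start) = raw.find(OPEN) else {
        return Separated { reasoning: None, text: raw.trim().to_string() };
    };
    let before = &raw[..start];
    let after_open = &raw[start + OPEN.len()..];
    match after_open.find(CLOSE) {
        // 2. Inline split: drop the tagged block, keep what surrounds it.
        Some(end) => {
            let rest = &after_open[end + CLOSE.len()..];
            Separated {
                reasoning: Some(after_open[..end].trim().to_string()),
                text: format!("{before}{rest}").trim().to_string(),
            }
        }
        // 3. Unclosed runaway: everything after the tag is reasoning.
        None => Separated {
            reasoning: Some(after_open.trim().to_string()),
            text: before.trim().to_string(),
        },
    }
}
```

With this shape, a completion that never closes its `<think>` tag yields empty visible text rather than leaking the chain-of-thought into the room.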

Validation

Workspace cargo check clean; touched-module sweep green (ai::openai_adapter 11, routing::grid_capability 5, epoch_watermark 4, persona::seed 8, citizen_path 6, grant_issuance 2, grid_trust_policy 7, command_handler 15, …); E2E + persona integration tests green; all three live behaviors proven on the rebuilt core.

🤖 Generated with Claude Code

…outing foundation The routing-side erasure + the first base-trait shape, so a command becomes a self-contained routable object and a command author writes only a `run` body. - `DynCommand`: object-safe, type-erased command the kernel can hold in a flat `name -> Arc<dyn DynCommand>` map and route to directly (no per-module match arm, no prefix double-routing). Blanket impl makes EVERY `CommandHandler` a `DynCommand` for free; `invoke` delegates to the existing `dispatch`, so the routing side and the typed authoring side share one `CommandSpec` and can't drift. - `ActionCommand`: fire-and-forget verb shape with blanket `CommandSpec` (Bare wire) + `CommandHandler` impls. Implementing the shape IS implementing the command — the chain `ActionCommand ⟹ CommandSpec ⟹ CommandHandler ⟹ DynCommand` means declare the shape, get the routable object. Cross-cutting policy (`ACCESS` default AiSafe) is declared per command, not re-implemented. Validated against two outliers in isolation (not yet wired into the executor hot path): a stateless action (ping-shaped, captures no deps) and a stateful, dep-holding action (owns an Arc'd counter, tightens ACCESS to Privileged) — both route identically through the type-erased object. Error-mapping at the erased boundary preserved (bad params → named `invalid` refusal). Anchoring design: docs/architecture/COMMAND-ORGANIZATION.md — self-routing map (typed-path-wins, prefix/ServiceModule fallback during migration), composition via `ctx.call` through the same chain, and machine+environment-agnostic execution (cross-tower routing + `Provided` adapters) with latency as a first-class constraint. Slice 1 of #42. Next: boot-time command_map + executor consult (typed path wins, fallback preserved), then QueryCommand/CrudCommand/SessionCommand. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
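The blanket-impl chain can be sketched with std only. This is a simplified illustration, not the crate's real traits: params and results are plain strings instead of typed/JSON values, `CommandSpec` is folded into the handler layer, and `Ping`, `Count` and `erased` are hypothetical stand-ins for the two outliers (stateless and dep-holding).

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Authoring side: a command author writes only `NAME` and `run`.
pub trait ActionCommand: Send + Sync + 'static {
    const NAME: &'static str;
    fn run(&self, params: &str) -> Result<String, String>;
}

/// Typed handler layer (stands in for CommandSpec + CommandHandler).
pub trait CommandHandler: Send + Sync + 'static {
    fn command_name(&self) -> &'static str;
    fn dispatch(&self, params: &str) -> Result<String, String>;
}

/// Routing side: object-safe, so the kernel can hold `Arc<dyn DynCommand>`.
pub trait DynCommand: Send + Sync {
    fn name(&self) -> &'static str;
    fn invoke(&self, params: &str) -> Result<String, String>;
}

// Blanket impl 1: every ActionCommand is a CommandHandler.
impl<T: ActionCommand> CommandHandler for T {
    fn command_name(&self) -> &'static str { T::NAME }
    fn dispatch(&self, params: &str) -> Result<String, String> { self.run(params) }
}

// Blanket impl 2: every CommandHandler is a DynCommand; invoke delegates to dispatch.
impl<T: CommandHandler> DynCommand for T {
    fn name(&self) -> &'static str { self.command_name() }
    fn invoke(&self, params: &str) -> Result<String, String> { self.dispatch(params) }
}

/// Outlier 1: stateless, captures no deps.
pub struct Ping;
impl ActionCommand for Ping {
    const NAME: &'static str = "ping";
    fn run(&self, _params: &str) -> Result<String, String> { Ok("pong".to_string()) }
}

/// Outlier 2: stateful, owns an Arc'd counter.
pub struct Count { pub hits: Arc<AtomicUsize> }
impl ActionCommand for Count {
    const NAME: &'static str = "count";
    fn run(&self, _params: &str) -> Result<String, String> {
        Ok((self.hits.fetch_add(1, Ordering::SeqCst) + 1).to_string())
    }
}

/// Both outliers become the same type-erased object.
pub fn erased() -> Vec<Arc<dyn DynCommand>> {
    let ping: Arc<dyn DynCommand> = Arc::new(Ping);
    let count: Arc<dyn DynCommand> = Arc::new(Count { hits: Arc::new(AtomicUsize::new(0)) });
    vec![ping, count]
}
```

Because `invoke` only forwards to `dispatch`, which only forwards to `run`, the routing side and the authoring side cannot disagree about what a command does.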

…re-Rust `cu` client Slice 2 of #42 — the DynCommand object map is now consulted on EVERY live dispatch path, and `ping` is migrated end-to-end onto it (proven via a pure-Rust `cu ping`, no Node). Runtime wiring (typed-path-wins, prefix/ServiceModule fallback preserved): - `ServiceModule::commands()` — default-empty hook; a module contributes its self-routing DynCommand objects (each owns its deps), so the kernel routes a name straight to the object with no per-module match arm. - `ModuleRegistry` — `command_objects: name -> Arc<dyn DynCommand>` map, populated at register() from each module's commands(), with a duplicate-name panic (the registry is the backstop the "no central list" design removes). New `route_object()` (O(1), lock-free) + `list_command_objects()`. - `dispatch_object_with_panic_guard()` — catch_unwind guard for object dispatch, mirroring the module path (persona tool calls converge here). - Consult added to ALL three live paths: `CommandExecutor::execute_inner`, `Runtime::route_command` (the IPC/`cu` socket route), and `route_command_sync` (rayon). Object map wins before prefix routing. (Unifying these three into one path is the COMMAND-ORGANIZATION.md follow-up.) ping migrated: `PingCommand` is now an `ActionCommand` (one type + a `run` body; CommandSpec/CommandHandler/DynCommand all blanket-derived), removed from HealthModule's command_prefixes and match arm, exposed via commands(). Off the prefix table, onto the typed object map. `cu` — the pure-Rust CLI client (`src/bin/cu.rs`), replacing the legacy Node `./jtag`. `cu <command> [json]` dispatches through the SAME uniform Connection every client uses (CLI/persona/web/mobile), over the core IPC socket via CoreIpcTransport (same transport as continuum-mcp). No tsx, no bundle, no Node. Validated live: built the core + cu directly with cargo (no npm start), ran the core on its socket, `cu ping` → {"ok":true,"roundTripMs":0} through route_object. 
Plus 10 unit/integration tests green (blanket-chain outliers, registry routing, executor + health typed-path). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
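A rough sketch of the typed-path-wins routing described above, assuming a plain `HashMap` behind the registry. The `Registry` type, `register_prefix`, and `legacy_health` are hypothetical names for illustration; the real `ModuleRegistry` is lock-free and routes to `ServiceModule`s on fallback.

```rust
use std::collections::HashMap;
use std::sync::Arc;

pub trait DynCommand: Send + Sync {
    fn name(&self) -> &'static str;
    fn invoke(&self, params: &str) -> Result<String, String>;
}

/// Stand-in for the legacy prefix/module path.
type PrefixHandler = fn(&str, &str) -> Result<String, String>;

#[derive(Default)]
pub struct Registry {
    command_objects: HashMap<&'static str, Arc<dyn DynCommand>>,
    prefixes: Vec<(&'static str, PrefixHandler)>,
}

impl Registry {
    /// Duplicate names panic at registration, not at dispatch.
    pub fn register_object(&mut self, cmd: Arc<dyn DynCommand>) {
        let name = cmd.name();
        if self.command_objects.insert(name, cmd).is_some() {
            panic!("duplicate command object: {name}");
        }
    }

    pub fn register_prefix(&mut self, prefix: &'static str, handler: PrefixHandler) {
        self.prefixes.push((prefix, handler));
    }

    /// O(1) lookup by exact command name.
    pub fn route_object(&self, name: &str) -> Option<Arc<dyn DynCommand>> {
        self.command_objects.get(name).cloned()
    }

    /// Object map wins; prefix routing is the fallback during migration.
    pub fn route(&self, name: &str, params: &str) -> Result<String, String> {
        if let Some(obj) = self.route_object(name) {
            return obj.invoke(params);
        }
        for (prefix, handler) in &self.prefixes {
            if name.starts_with(prefix) {
                return handler(name, params);
            }
        }
        Err(format!("unknown command: {name}"))
    }
}

pub struct Ping;
impl DynCommand for Ping {
    fn name(&self) -> &'static str { "ping" }
    fn invoke(&self, _params: &str) -> Result<String, String> { Ok("pong".to_string()) }
}

pub fn legacy_health(name: &str, _params: &str) -> Result<String, String> {
    Ok(format!("legacy:{name}"))
}
```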

…start is THE start Move-first excision of the Node `npm start` poison (Joel: "move shit first, compilation blows up, makes it easy to find all the smell"). - `git mv tools/scripts/parallel-start.sh → legacy/node-startup/`. `legacy/` is NOT a cargo workspace member, NOT referenced by any npm script / Dockerfile / CI workflow, with a README marking it dead and off-limits to editing. - Both `start` scripts now point at the EXISTING pure-Rust `start-server.sh` (root package.json already did; src/package.json was the poison path → parallel-start.sh). Dropped the `desktop:legacy` pointer. Verified no live (non-comment) consumers of parallel-start.sh remain. - `start-server.sh` now also builds the `cu` CLI client alongside continuum-mcp, so the headless start produces core + mcp + cu — pure Rust, no Node. - .gitignore: add `tools/models/` (the current voice/avatar model download path; the workers→core/tools restructure left the old `src/workers/models/` rule stale, so large model binaries were no longer ignored). `npm start` (from root or src) is now the headless Rust core via cargo run; the Node orchestrator that broke on stale `cd workers` / scene-gen is out of the path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…deterministic) The "working persona on the clean command infra" regression guard, with zero live deps (no inference, no airc, no models). A persona's CommandToolExecutor routes a `ping` tool call through the uniform Connection → InProcessTransport → CommandExecutor → execute_inner → route_object → the ping DynCommand (migrated via ActionCommand, off the prefix table), and the bare PingResult comes back. Proves the command-infra cleanup didn't break the persona's ability to ACT, and that the self-routing typed path serves personas — not just internal callers. The existing suite covered the prefix path (test/echo); this covers the new object path end-to-end on the persona's real dispatch route. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ommands `cu` is now the reliable upstart, not just the client (Joel: "make it cu", "reliable upstart", "cu-driven start"): - `cu start` — locate `tools/scripts/start-server.sh` (env override or walk up from cwd), spawn it in its OWN session (setsid) so the core outlives the CLI, log to /tmp/continuum-core-start.log, write a pidfile, and poll `ping` until the core is ready (or fail loud with the log tail). Idempotent: no-op if a core already answers. start-server.sh stays the pure-Rust implementation detail (cargo run, per-platform GPU features, no Node). - `cu stop` — SIGTERM the recorded process group (setsid made the core a group leader, so cargo + core are reaped together), pkill fallback if no pidfile, remove the socket. - `cu <command> [json]` — unchanged dispatch through the uniform Connection. Validated live: `cu stop` (clean) → `cu start` (core ready in 28s, detached) → `cu ping` {"ok":true} → `cu start` idempotent ("already running"). No npm, no Node, no manual launch/poll dance. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
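The readiness wait in `cu start` amounts to a bounded poll. A minimal sketch, assuming the probe is a closure that returns true once `ping` answers; `wait_until_ready` and its parameters are hypothetical, and the real client also spawns the script, writes a pidfile, and tails the log on failure.

```rust
use std::time::Duration;

/// Poll `probe` up to `attempts` times, sleeping `delay` between tries.
/// Returns the attempt number that succeeded, or an error to fail loud.
pub fn wait_until_ready<F: FnMut() -> bool>(
    mut probe: F,
    attempts: u32,
    delay: Duration,
) -> Result<u32, String> {
    for attempt in 1..=attempts {
        if probe() {
            return Ok(attempt);
        }
        std::thread::sleep(delay);
    }
    Err(format!("core not ready after {attempts} attempts"))
}
```

Running the same probe once before spawning anything gives the idempotent "already running" check.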

…sting/packaging)

The strict, design-first contract that makes startup + testing reliable (the two weakest links) so personas-on-airc is a dependable foundation to build clients on and roll out as dockerized/k8s nodes (Joel: "adherence to strict principles and design first"). Defines:

- the foundation thesis (install → start → personas on airc → iterate WITH them → clients on top → dockerized nodes → k8s);
- 8 strict principles (one pure-Rust startup, headless core + equal clients, modular units = build units = containers, deterministic layered testing, no Node in the foundation, move-first excision, and a SINGLE DYNAMIC command surface: cu calls every command, no duplicated lists / switch-on-name);
- the modular unit table (core/mcp/cu/inference/livekit/unsloth/clients);
- the cu-driven startup;
- the three test layers;
- the Docker/k8s rollout shape (existing compose + per-unit Dockerfiles → continuum node as the k8s unit);
- the Node boundary (web client only);
- status + next slices.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ts (13GB) were unignored The cargo workspace root is the repo root, so `cargo run --manifest-path core/continuum-core/...` (start-server.sh / cu start) builds into /target — and .gitignore had NO `target` pattern at all (the workers→core restructure left it uncovered), so 13GB of build artifacts were staged-able. Add /target/ and **/target/. Canonical build target stays $HOME/.continuum/cache/cargo-target. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…emony, single-source discovery) Kills the "every command needs a host module to expose it" friction and makes the command catalog dynamically discoverable from the ONE registry. - `register_stateless_command!(T)` — a stateless command (no deps, `Default`) self-registers BOTH its static descriptor AND a runtime constructor via inventory. `ModuleRegistry::new()` seeds the typed object map from these (`stateless_command_objects()`), so the command is live on the typed path with NO host module, NO `commands()` override, NO match arm. Dep-holding commands still come from a module's `commands()` (their deps must be constructed). Duplicate-name panic guards both paths. - `commands/` tree (per COMMAND-ORGANIZATION.md): self-contained command files, no central list. First inhabitant: `commands/catalog.rs`. - `commands/list` — dynamic, single-source command discovery: returns a snapshot of `command_registry()` (name, description, access, wire, params type), optional name filter. Clients/trays/cu never hardcode a catalog — they call this and adapt. It's itself a zero-ceremony stateless command (dogfoods the mechanism). - `ping` migrated to `register_stateless_command!` — dropped HealthModule's `commands()` override; ping is now a pure stateless command, no ceremony. ts-rs bindings generated (protocol/typescript/commands/). 14 tests green incl. the catalog self-listing + filter, ping still routing via the typed object map, and the persona executing ping through it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…, no per-command code The CLI edge of the uniform param-adaptation principle (Joel: "ideally it naturally translates or adapts all params, so meet humans or AIs in the middle at every interface… automatically though, not switch statements, procedural"). `cu <command> [args]` adapts params with ONE generic rule for ALL commands — never a per-command switch: - nothing → `{}` - a single positional JSON object/array → verbatim (the AI / tool-call path) - `--key value` / `--flag` → a JSON object built by one loop: keys normalized kebab/snake → camelCase (`--round-trip-ms` → `roundTripMs`, matching the canonical wire fields), values coerced by trying JSON first (`5`→number, `true`→bool, `{…}`→object) then falling back to string, bare flag → true. So a human types `cu ping --message hi` and the typed command receives `{"message":"hi"}`; an AI sends the JSON object directly; both hit the same command. Schema-AWARE coercion/validation lands when the registry exposes param JSON schemas via commands/list — same single source, every interface adapts. Validated live: `cu ping --message hi` → {"ok":true}; `cu commands/list --filter commands/` → the live catalog. 2 adapter unit tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
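The one-loop flag adapter can be sketched as below. This is a std-only approximation under stated assumptions: the small `Value` enum stands in for a real JSON value, coercion only tries bool and number before falling back to string (the commit describes trying full JSON first), and the `--key=value` split follows the later review fix. All function names here are illustrative.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Bool(bool),
    Number(f64),
    Text(String),
}

/// `round-trip-ms` or `round_trip_ms` becomes `roundTripMs`.
pub fn to_camel_case(key: &str) -> String {
    let mut out = String::new();
    let mut upper_next = false;
    for ch in key.chars() {
        if ch == '-' || ch == '_' {
            upper_next = !out.is_empty();
        } else if upper_next {
            out.extend(ch.to_uppercase());
            upper_next = false;
        } else {
            out.push(ch);
        }
    }
    out
}

/// Try the typed readings first, then fall back to a string.
pub fn coerce(raw: &str) -> Value {
    match raw {
        "true" => Value::Bool(true),
        "false" => Value::Bool(false),
        _ => match raw.parse::<f64>() {
            Ok(n) if n.is_finite() => Value::Number(n),
            _ => Value::Text(raw.to_string()),
        },
    }
}

/// One generic rule for every command: no per-command switch.
pub fn adapt_flags(args: &[&str]) -> Vec<(String, Value)> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < args.len() {
        let Some(flag) = args[i].strip_prefix("--") else {
            i += 1; // not a flag; positional handling is out of scope here
            continue;
        };
        let (key, value) = match flag.split_once('=') {
            Some((k, v)) => (k, Some(v)),
            None => match args.get(i + 1) {
                Some(next) if !next.starts_with("--") => {
                    i += 1; // consume the value token
                    (flag, Some(*next))
                }
                _ => (flag, None), // bare flag
            },
        };
        out.push((to_camel_case(key), value.map(coerce).unwrap_or(Value::Bool(true))));
        i += 1;
    }
    out
}
```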

…e adapts (symmetry) Each command's params now carry a JSON Schema derived AUTOMATICALLY from the Rust type, so every interface handles any command from the one canonical schema — "all SDKs automatically handle the rust command, across environments, symmetry" (Joel). It's adapters everywhere over one source, thin code each. - `CommandSpec::params_schema()` — provided method, default `Null`. The base traits override it to derive the schema via `schemars` (ActionCommand::Params: JsonSchema). So a command declared (or ported) onto a base trait gains a real schema with ZERO extra code; manual CommandSpec impls stay `Null` until migrated — breakage-free. CommandDescriptor carries `params_schema`. - The adapters (one schema → each paradigm): - AI / RAG: `persona_tools` projects the schema into the tool `input_schema` (was an open object — the reasoner now sees real fields). - cu / CLI: `cu <cmd> --help` renders the manual as bash flags (property → --kebab, type, description, required) from the same schema — "the manual matches the paradigm." Plus the existing procedural `--key value` adapter. - web / mobile / RAG: `commands/list` returns `paramsSchema` to build forms / tool-schemas — single source, no per-command code. - schemars dep added (uuid1). PingParams/CommandsListParams/EchoParams derive JsonSchema. Validated live: `cu commands/list --filter ping` returns the derived schema (Option<String> → ["string","null"], doc-comments as descriptions); `cu ping --help` renders `--message <string>`. 12 tests green (schema projection, catalog, cu adapters + help renderer). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…tity, every interface Threads the authenticated caller through the typed dispatch path so a command can gate/scope/compose BY identity — and makes discovery (commands/list) honor it, so "what's listed == what you can call" holds at the CLI and persona, matching the call gate that was already enforced. Cross-grid identity (airc-verified sender) keeps flowing the same way. - `caller_trust(caller)` — ONE source for the caller→trust rule (local/substrate → Owner; airc-sourced → Provisional ceiling). `GridTrustAuthPolicy::gate` refactored to use it (behavior preserved; tests green) so the gate and any trust-aware consumer can't drift. - `Ctx.caller: Option<CallerIdentity>` threaded via `dispatch_with_caller` → `DynCommand::invoke(params, caller)` → `dispatch_object_with_panic_guard`. The executor passes the identity it just gated on (persona / cross-grid airc sender); local in-process + IPC pass `None` (owner). Module `handle_command` path unchanged (legacy, owner-local). - `commands/list` filters by `caller_trust(ctx.caller)` + `is_command_authorized` — the SAME rule the gate uses. Local owner sees all; a Provisional persona/peer sees only its authorized surface. Test: provisional ⊆ owner, and every listed command is callable at its trust. Status: identity is now available in Ctx for handlers and gates discovery. Full composition-propagation (a handler re-dispatching via `ctx.call` as the same caller) is the next step — the caller is now in Ctx to enable it. Gating the local IPC path through the executor is NOT needed for correctness (local == owner by policy); it'd only matter for scoped local identities (future). 19 tests green (gate refactor, identity-gated list, command/handler/catalog/persona). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
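A sketch of "listed == callable" under simplifying assumptions: each command carries a single required trust level (the real ACL maps access classes such as AiSafe/Privileged to trust), and `Descriptor` and `list_commands` are hypothetical names. The point is that discovery and the gate both go through the same `caller_trust` and `is_command_authorized`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TrustLevel { Blocked, Provisional, Trusted, Owner }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CallerSource { Local, Airc }

#[derive(Debug, Clone)]
pub struct CallerIdentity {
    pub source: CallerSource,
    pub peer_id: String,
}

/// ONE source for the caller -> trust rule.
pub fn caller_trust(caller: Option<&CallerIdentity>) -> TrustLevel {
    match caller.map(|c| c.source) {
        // No identity or a local one: owner by locality.
        None | Some(CallerSource::Local) => TrustLevel::Owner,
        // airc-sourced: Provisional ceiling.
        Some(CallerSource::Airc) => TrustLevel::Provisional,
    }
}

pub struct Descriptor {
    pub name: &'static str,
    pub required: TrustLevel, // simplification of the real access classes
}

pub fn is_command_authorized(cmd: &Descriptor, trust: TrustLevel) -> bool {
    trust >= cmd.required
}

/// Discovery applies the same rule the call gate applies.
pub fn list_commands(catalog: &[Descriptor], caller: Option<&CallerIdentity>) -> Vec<&'static str> {
    let trust = caller_trust(caller);
    catalog
        .iter()
        .filter(|d| is_command_authorized(d, trust))
        .map(|d| d.name)
        .collect()
}
```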

…ty-propagation status A command composing another propagates the ORIGINAL caller via `execute_with_caller(sub, params, ctx.caller.clone())`, and the gate enforces that caller's trust on the sub-call. New test `composed_call_propagates_caller_no_escalation`: an airc/Provisional caller composing into `data/delete` (Owner-only) is gate-FORBIDDEN; the local owner passes the gate. No escalation: identity flows through composition (and, by the same mechanism, across the grid via the airc-verified caller). COMMAND-ORGANIZATION.md updated to state the real status: identity propagation works today (ctx.caller + execute_with_caller); the typed `ctx.call::<C>(p)` sugar (an executor handle on Ctx so a handler can't forget to pass ctx.caller) is the remaining ergonomic follow-up on this foundation. 19 executor tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…y command `system/info` (version + pid) as a stateless ActionCommand in its own file: register_stateless_command! and it's instantly callable via cu/persona/SDKs with a derived param schema + ACL gating, no wiring elsewhere. The "minimal code per command" shape the ported catalog will follow. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…composition test, schema/edge notes Two adversarial reviewers (security + correctness) audited the branch. Security verdict: NO exploitable escalation; gate refactor behavior-preserving. Fixes for the actionable findings: - H3 (perf): `command_registry()` was rebuilding the descriptor Vec + running schemars reflection (schema_for!) for EVERY command on EVERY call — so commands/list and the persona tool surface were O(commands × reflection) per call. Now built once into a OnceLock and cloned out. - H1 (cu correctness): `cu cmd --key=value` was mis-parsed into a junk `{"key=value": true}` key. Now splits on the first `=` (both `--key value` and `--key=value` work). Extracted `coerce()`. Test added. - L1 (test): the composition test only exercised the gate. Replaced with a REAL composing handler (`Composer: ActionCommand`) that composes `data/delete` with `ctx.caller.clone()` — proving identity propagates through a handler and an airc/Provisional caller can't escalate (owner can). - L2 (test): commands/list identity-gating test now asserts the Provisional surface is non-empty (subset check no longer vacuous) and ≤ owner surface. - M3 (schema): cu `--help` renders a nested-type `$ref` as its type name instead of `<value>`; persona_tools tool_input_schema_from carries a TODO for nested $defs/$ref (latent — all current params are flat). - M2 (doc): caller_trust carries a TODO that every airc caller maps to Provisional (Blocked peers not yet distinguished — needs the airc↔grid trust bridge). - C1 (doc): the typed object path on the IPC route deliberately bypasses per-MODULE metrics/concurrency (objects are module-independent); documented at the site + flagged per-command observability as the command-framework's slice. 
Not fixed here (reported, tracked): TCP IPC listener treats remote connections as local Owner (pre-existing CRITICAL, config-gated to 127.0.0.1 by default — needs a non-Owner caller for TCP-sourced requests); composition propagation is author-discipline (the `ctx.compose` helper that forces it is the next slice); AllowAllPolicy default (one refactor from bypass — consider GridTrust default). 55 tests green (52 lib + 3 cu). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…nticated-Owner hole Adversarial security review found: the TCP IPC listener funneled into the same `handle_client` → `route_command(caller=None)` = local Owner, UNGATED. With the Docker `0.0.0.0` bind, anyone who could open the port got unauthenticated Owner command execution (data/delete, grid/trust, …). Pre-existing, but `route_object` now rides that path too. Closed: - New `CallerSource::Tcp` (honest provenance — an unauthenticated remote socket, distinct from airc's verified envelope) + `CallerIdentity::tcp(peer_id)`. - `caller_trust(Tcp)` = Provisional ceiling (remote, never Owner) — same one-source rule the gate uses. So TCP can run the AiSafe surface + ai/generate but is FORBIDDEN every Owner-gated command. - `handle_client` now takes the connection's `caller`: the Unix socket passes `None` (owner-by-locality — the operator on the box), the TCP listener stamps `CallerIdentity::tcp(nil)`. A boundary ACL-gate (`caller_trust` + `is_command_authorized`) refuses Owner-gated commands for remote callers before dispatch. The caller is also threaded into `Runtime::route_command` so the typed object path / composition sees the REMOTE identity (no escalation via a composing command over TCP), not silently Owner. Unix-socket behavior unchanged (local owner). Test: `tcp_caller_is_remote_not_owner` (Provisional, ai/generate allowed, data/delete|grid/trust|grid/pair forbidden). 58 lib tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…open-TCP residual

Second adversarial pass on the TCP fix: verdict = Owner-execution hole genuinely closed, no bypass. Addressing the two residual risks it surfaced:

- Default-AiSafe migration footgun: destructive `data/*` are safe only because unregistered (unclassified→Owner default-deny). `ActionCommand` defaults ACCESS to AiSafe, so migrating one to a command object and forgetting `const ACCESS = Privileged` would silently expose it at Provisional (i.e. over TCP / to cross-grid peers). New regression test `destructive_data_commands_stay_owner_only` (data/delete|update|truncate|clear-all) trips CI if that ever happens.
- Open-TCP residual: documented at the TCP listener that the Provisional AiSafe surface (arbitrary data/list reads, chat/send writes, ai/generate) is reachable UNauthenticated over a non-loopback bind. TODO(authenticated-tcp): shared-secret / signed handshake (+ optional sub-Provisional read-only ceiling) before relying on 0.0.0.0; pairs with the airc↔grid per-peer trust bridge. Until then: don't bind 0.0.0.0 on an untrusted network.

acl tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…(NodeRegistry is wrong key space) The airc↔grid trust bridge mechanism (task #38): a gate that resolves a remote caller's REAL grid TrustLevel instead of the flat Provisional ceiling — built as a validated SEAM, but deliberately NOT wired in production yet. - `PeerTrustSource` trait (airc peer_id → TrustLevel) — the abstraction the gate depends on, so it's not coupled to any concrete store; mock-tested. - `GridTrustAuthPolicy::with_trust_source(..)` + `resolve_trust`: a remote (Airc/Tcp) caller's registered trust CAPPED at Trusted (REMOTE_TRUST_CEILING — Owner is local-only, a remote peer can never reach Owner-gated commands); Blocked → denied; unknown → Provisional. `new()` keeps the flat ceiling. - Test `per_peer_trust_bridge_blocks_blocked_and_caps_remote_at_trusted` proves the logic with a mock source: Blocked denied everything, Trusted graduated but data/delete still local-only, Owner-registered peer capped at Trusted, unknown → Provisional. WHY NOT WIRED: adversarial self-review caught that the grid `NodeRegistry` is keyed by transport ADDRESS (`address_to_node_id` → Tailscale IP / Reticulum hash), NOT by the airc `peer_id` the `CallerIdentity` carries — different identity spaces. Wiring it would silently no-op (every airc caller → "unknown" → Provisional) AND mislead (grid/trust by address wouldn't gate airc callers). So `NodeRegistry` does NOT impl `PeerTrustSource`, and the IPC gate keeps `new()` (flat ceiling — honest, zero behavior change). The seam activates when a real peer_id-keyed airc trust source exists — the airc↔grid identity unification (task #38). Net: the gate's behavior is unchanged in production; the bridge is a tested seam ready for the airc-side trust source. 12 trust/acl tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
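The capping rule of the seam can be sketched as follows, assuming a trait method named `trust_of` and a free-standing `resolve_trust` (both illustrative; the real method lives on `GridTrustAuthPolicy`). `MockSource` plays the role of the mock used in the test described above.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TrustLevel { Blocked, Provisional, Trusted, Owner }

/// airc peer_id -> TrustLevel, decoupled from any concrete store.
pub trait PeerTrustSource {
    fn trust_of(&self, peer_id: &str) -> Option<TrustLevel>;
}

/// Owner is local-only: a remote peer never resolves above this.
pub const REMOTE_TRUST_CEILING: TrustLevel = TrustLevel::Trusted;

/// Resolve a REMOTE caller's trust. `None` as the source models the flat
/// Provisional ceiling kept in production.
pub fn resolve_trust(source: Option<&dyn PeerTrustSource>, peer_id: &str) -> TrustLevel {
    match source.and_then(|s| s.trust_of(peer_id)) {
        Some(TrustLevel::Blocked) => TrustLevel::Blocked,
        Some(level) => level.min(REMOTE_TRUST_CEILING),
        None => TrustLevel::Provisional, // unknown peer
    }
}

pub struct MockSource(pub HashMap<String, TrustLevel>);

impl PeerTrustSource for MockSource {
    fn trust_of(&self, peer_id: &str) -> Option<TrustLevel> {
        self.0.get(peer_id).copied()
    }
}
```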

…y unification (design) The keystone design (Joel: "unification is everything, do it right"): identity, authorization, and the grid economy unified into ONE cryptographically-signed object — airc's `grid_auth` SignedCapabilityGrant / SignedMeshMembership. A peer doesn't assert who/what it is — it PRESENTS a grant the owner signed, and the executing node VERIFIES it (issuer-pin → sig → key-binding → mesh → expiry, stateless) and authorizes iff `grant.grants(command)` (capabilities use the SAME vocabulary as command names). This DISSOLVES the two-identity-space problem (the grant binds peer_id + pubkey, verified against the owner's key — no shared trust store, no address↔peer mismatch) and IS the contracted/for-sale grid (a paid grant = capabilities + expiry, signed, revocable by epoch). Specifies: the model (membership→tier→ACL + capability→grants(command), one verifier); the airc primitives (have, public: grid_auth); the continuum gate integration (verify on dispatch, Owner stays local-only, composition propagates the verdict); why it's the identity unification; issuance + transport + consumer-side epoch anti-replay; 3-phase plan (membership-tier → capability grants → economy); the cross-repo split (airc: envelope transport + issuance; continuum: verifying gate + epoch store + capability map); open questions. The continuum gate seam (GridTrustAuthPolicy/resolve_trust/cap) stays valid as the gate shape — this is what it verifies against. Next: Phase 1 (verify SignedMeshMembership → tier), a joint continuum+airc slice. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…cted-grid gate core) Phase 2 core of the identity unification (docs/grid/GRID-CAPABILITY-AUTH.md): the continuum-side engine that verifies an airc `SignedCapabilityGrant` and authorizes a command from it. Identity + authorization + contract = one signed object. - `Ed25519GrantVerifier`: impl of airc's `grid_auth::GrantVerifier` using the substrate's ed25519 (verify_strict — same primitive as L1-6 envelope sigs). - `GrantAuthorizer::authorize_command(signed, presenting_pubkey, command, now)`: verify via grid_auth (issuer-pin → sig → key-binding → mesh → expiry, stateless) → consumer-side epoch anti-replay (reject a superseded lower epoch; revocation = higher-epoch empty-caps grant) → `grant.grants(command)`. Returns a TYPED `GrantAuthOutcome` (Authorized / Invalid(GrantVerdict) / Superseded / NotGranted) so the gate + audit see exactly why. - The capability vocabulary IS the command vocabulary (`grants("ai/generate")`) — no parallel namespace. Owner-gated commands are never delegated (a grant confers only its named capabilities). This is the verification CORE — the heart of the contracted/for-sale gate. It's a tested SEAM, NOT yet wired to live dispatch: the airc command envelope doesn't carry grants yet (the airc-side transport slice). When it does, CommandRequestHandler extracts the grant + presenting key and calls authorize_command from the gate. 4 tests green: valid-grant-authorizes (+ NotGranted for others), typed rejections (UntrustedIssuer/BadSignature/KeyMismatch — stolen grant can't ride another peer), epoch anti-replay + revocation, and the REAL ed25519 signature verify + tamper rejection. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ngine Adversarial review of be5254d (verdict: crypto/trust-roots/key-binding sound; not exploitable today — unwired). Must-fixes folded: - TOCTOU (2.1): epoch check + advance are now ONE atomic `entry()` critical section — a superseded epoch can't pass its check while a higher epoch commits in the gap. Multi-thread stress test (gated `stress-tests`) proves monotonicity under concurrent same-grantee presentation. - Revocation actually revokes (2.2): the watermark advances on ANY valid grant (latest-epoch-authoritative, airc's model), so a higher-epoch empty-caps grant supersedes the old real-caps grant. Fixed the test that had enshrined the broken behavior — it now asserts the revoked grant returns Superseded. - Boundary-aware capability match (4): `confers()` matches exact OR on a `/` boundary (`ai/generate` confers `ai/generate/stream`, NOT `ai/generatex`) — consistent with the command-ACL's prefix rules, never a bare starts_with. - Test integrity (7.1): verifier is injectable (`with_verifier`); tests now drive the REAL `authorize_command` with a stub (no duplicated logic). Added malformed-proof reject vectors (wrong-length key/sig). Hard gates documented before live wiring (2.3/5.1/3.1): persist + bound the epoch watermark (volatile/unbounded today), and the presenting key MUST come from the authenticated sender, never the grant body. 6 tests green (5 + stress). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
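Two of the must-fixes are small enough to sketch directly: the boundary-aware capability match and the single-critical-section epoch check. This assumes a `Mutex<HashMap>` in place of the concurrent map `entry()` the commit mentions, and the `EpochWatermark` type name is hypothetical.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// A capability confers a command on an exact match or on a `/` boundary,
/// never on a bare prefix.
pub fn confers(capability: &str, command: &str) -> bool {
    match command.strip_prefix(capability) {
        Some(rest) => rest.is_empty() || rest.starts_with('/'),
        None => false,
    }
}

#[derive(Default)]
pub struct EpochWatermark {
    latest: Mutex<HashMap<String, u64>>,
}

impl EpochWatermark {
    /// Check and advance under ONE lock, so a superseded epoch cannot pass
    /// its check while a higher epoch commits in between.
    /// Returns false when `epoch` is below the grantee's watermark.
    pub fn check_and_advance(&self, grantee: &str, epoch: u64) -> bool {
        let mut map = self.latest.lock().unwrap();
        let seen = map.entry(grantee.to_string()).or_insert(epoch);
        if epoch < *seen {
            return false;
        }
        // Advance on ANY valid grant, including an empty-caps revocation.
        *seen = epoch;
        true
    }
}
```

Because the watermark advances even for a grant with no capabilities, presenting the older real-caps grant after a revocation fails the epoch check.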

…transport Adopts airc#1276 (the capability-grant transport continuum's GrantAuthorizer verifies against): grid_auth::SignedCapabilityGrant::sign, Airc::peer_public_key, HEADER_AIRC_CAPABILITY_GRANT. Pulls the ~50-commit canary delta since 72824ba — the ai/generate 5090 compute-lease facility (#1242), the ai/embedding grid facility (#1239-1241), TranscriptKind::ChannelPurposePublished + channel_purpose (the typed room-purpose seam for RoomPurposeSource), relay self-election + stream-plane crypto, and StatusResponse.connected_lan_peers. ABI deltas are additive to types continuum only decodes (connected_lan_peers is #[serde(default)]; the new TranscriptKind variant has no exhaustive match against it), so the bump is decode-compatible. Validated: cargo check -p continuum-core --features metal,accelerate clean; routing::{grid_capability, grid_trust_policy, command_handler} tests green (5 + 5 + 13). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The gate-side half of the capability-grant wiring — the receiving seam a verified grant flows into, built + tested independently of the airc producer. - CallerIdentity gains `granted_capabilities`: the capability tags a transport boundary CRYPTOGRAPHICALLY VERIFIED for this dispatch (conferred by an owner-signed SignedCapabilityGrant, populated ONLY after GrantAuthorizer::authorize_command returns Authorized against the authenticated sender key). Default empty; `with_granted_capabilities` builder for the boundary. - GridTrustAuthPolicy::gate adds the contracted-grid fast-path: if a caller's verified granted_capabilities confer the command, it's authorized regardless of the tier ceiling — the explicit signed contract overrides the coarse default trust. Gated on trust > Blocked so a grant can't resurrect a Blocked peer. - grid_capability::confers is now pub(crate) — the gate re-checks granted caps through the SAME boundary-aware match rule (one source of truth, no divergent copy). Sound because the field is populated ONLY post-verification by a boundary; no local/Tcp constructor sets it. The airc command handler is the producer (next slice — needs Airc::own_public_key + owner-key provenance + epoch-watermark persistence). Until then the field stays empty and the gate is unchanged in behavior. Tests: grid_trust_policy verified_grant_overrides_tier_ceiling_for_conferred_command + verified_grant_does_not_resurrect_a_blocked_peer; auth_policy (9) + grid_capability (5) + grid_trust_policy (7) all green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
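The fast-path ordering can be sketched as a three-step gate. This is an illustrative reduction: `Caller` and `gate` are hypothetical, the command's requirement is passed in as a trust level, and `confers` is repeated here only to keep the block self-contained (the commit reuses the one rule from `grid_capability`).

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TrustLevel { Blocked, Provisional, Trusted, Owner }

pub struct Caller {
    pub trust: TrustLevel,
    /// Populated only after a grant was verified at the transport boundary.
    pub granted_capabilities: Vec<String>,
}

/// Same boundary-aware rule the verifier uses.
pub fn confers(capability: &str, command: &str) -> bool {
    match command.strip_prefix(capability) {
        Some(rest) => rest.is_empty() || rest.starts_with('/'),
        None => false,
    }
}

pub fn gate(caller: &Caller, command: &str, required: TrustLevel) -> bool {
    // 1. A grant can never resurrect a Blocked peer.
    if caller.trust == TrustLevel::Blocked {
        return false;
    }
    // 2. Contracted-grid fast-path: a verified grant overrides the tier ceiling.
    if caller.granted_capabilities.iter().any(|cap| confers(cap, command)) {
        return true;
    }
    // 3. Otherwise fall back to coarse tier gating.
    caller.trust >= required
}
```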

Satisfies the review HARD GATE: the consumer-side anti-replay watermark was in-memory + unbounded, so a node restart reopened the entire replay window (a peer could re-present a grant the owner already superseded). The grid expects mundane restarts, so this must be durable + bounded before grants gate live traffic.

- New EpochWatermarkStore trait (routing/epoch_watermark.rs) behind which the anti-replay state lives:
  - InMemoryEpochWatermark: DashMap, atomic per-grantee (default for tests).
  - SqliteEpochWatermark: durable (survives restart), bounded (evict_older_than drops entries no live grant could reference, expiry-aligned by updated_at_ms). Atomic check-and-advance runs in a serialized write transaction via spawn_blocking, off the async executor (substrate concurrency style).
- GrantAuthorizer holds Arc<dyn EpochWatermarkStore>; authorize_command is now async and consults the store. new() keeps in-memory; with_watermark() / with_verifier_and_watermark() inject the durable store for the live path. A store error fails CLOSED → GrantAuthOutcome::WatermarkUnavailable (deny), never authorizes a grant whose replay status is unknown.
- VerifyContext (holding a non-Sync &dyn GrantVerifier) is scoped to drop before the await so authorize_command's future is Send, which the multi-threaded handler runtime requires.

Tests: epoch_watermark anti-replay on BOTH impls + durability-across-reopen + bounded eviction (4); grid_capability decision path migrated to async (5); both stress concurrency proofs through the REAL SQLite path. Full routing suite 260 green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
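A synchronous, in-memory sketch of the check-and-advance and fail-closed behavior described above. The trait name follows the commit, but the method signature, the `Outcome` enum, `InMemoryWatermark`, and `BrokenStore` are hypothetical, and the sketch assumes that re-presenting the current epoch is accepted while any lower (superseded) epoch is rejected. The real store is async and SQLite-backed; none of that is modeled here.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

#[derive(Debug, PartialEq, Eq)]
pub enum WatermarkError {
    Unavailable,
}

/// Hypothetical synchronous shape of the anti-replay store.
pub trait EpochWatermarkStore {
    /// Ok(true): epoch is at or above the highest seen and the watermark is
    /// advanced. Ok(false): epoch was superseded. Err: state is unknown.
    fn check_and_advance(&self, grantee: &str, epoch: u64) -> Result<bool, WatermarkError>;
}

pub struct InMemoryWatermark {
    highest_seen: Mutex<HashMap<String, u64>>,
}

impl InMemoryWatermark {
    pub fn new() -> Self {
        Self { highest_seen: Mutex::new(HashMap::new()) }
    }
}

impl EpochWatermarkStore for InMemoryWatermark {
    fn check_and_advance(&self, grantee: &str, epoch: u64) -> Result<bool, WatermarkError> {
        // The lock makes check + advance one atomic step per call.
        let mut map = self.highest_seen.lock().map_err(|_| WatermarkError::Unavailable)?;
        let highest = map.entry(grantee.to_string()).or_insert(epoch);
        if epoch < *highest {
            return Ok(false);
        }
        *highest = epoch;
        Ok(true)
    }
}

/// A store that always errors, used to exercise the fail-closed branch.
pub struct BrokenStore;

impl EpochWatermarkStore for BrokenStore {
    fn check_and_advance(&self, _grantee: &str, _epoch: u64) -> Result<bool, WatermarkError> {
        Err(WatermarkError::Unavailable)
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Authorized,
    Superseded,
    WatermarkUnavailable,
}

/// Fail closed: a store error is a denial, never an authorization.
pub fn authorize(store: &dyn EpochWatermarkStore, grantee: &str, epoch: u64) -> Outcome {
    match store.check_and_advance(grantee, epoch) {
        Ok(true) => Outcome::Authorized,
        Ok(false) => Outcome::Superseded,
        Err(_) => Outcome::WatermarkUnavailable,
    }
}
```

Keeping the error as its own outcome, rather than folding it into the superseded case, lets a caller distinguish "this grant is stale" from "replay status could not be determined" while still denying both.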

The producer half of the capability-grant wiring: built + tested against the gate seam, not yet installed on the live boot path (that needs the GrantAuthorizer constructed from airc identity + mesh + a durable watermark, next slice).

- CommandRequestHandler gains an optional GrantAuthorizer (with_grant_authorizer); new() keeps the tier-only default (grants ignored).
- parse_envelope decodes the optional base64 HEADER_AIRC_CAPABILITY_GRANT into a typed SignedCapabilityGrant (ParsedEnvelope.presented_grant). A present-but-undecodable header is surfaced loudly, never silently dropped.
- process_request verifies a presented grant via the authorizer against the AUTHENTICATED sender key (airc.peer_public_key(sender): the enrolled key from the same registry that signature-verified the envelope, NOT the grant's self-asserted grantee_pubkey; this is the review's hard gate #3). On Authorized, the grant's conferred capabilities ride into the gate via CallerIdentity::with_granted_capabilities; otherwise the caller falls back to tier gating.
- Dispatch refactored into dispatch_request(executor, parsed, caller); the static process_request_via keeps its exact prior behavior (plain authenticated caller, no grant) for tests + the LocalGridTransport fixture.

Tests: parse_envelope decodes a presented grant + rejects a malformed grant header; the no-grant default stays None. command_handler suite 15 green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…sona path

Closes the loop: every persona that boots now VERIFIES presented capability grants. A visiting peer presenting an owner-signed grant gets the conferred command past its tier ceiling; absent/invalid grants fall back to tier gating.

- airc bump a7ae4f4 → 4aa717d (airc#1277): Airc::mesh_identity is now pub.
- build_grant_authorizer(airc, home): constructs the per-persona GrantAuthorizer. "This node is the owner": trusted issuer = the node's OWN enrolled ed25519 key (self-enrolled at Airc::open; it signs the grants it hands out); expected mesh = the node's own mesh (airc.mesh_identity()); anti-replay = a DURABLE SqliteEpochWatermark under <persona-home>/grant_watermark.sqlite (survives restart, per the review hard gate). Typed GrantAuthorizerBuildError; provider≠owner (pinned-issuer-key distribution) is the deferred generalization.
- PersonaCommandInboundPump::spawn takes the authorizer and builds the handler via with_grant_authorizer. Both PersonaAircRuntime install sites (bootstrap + install_command_pump) build it first; a build failure is a typed bootstrap failure (PersonaAircRuntimeError::GrantAuthorizerBuild), never a silent fall-through to an unverified path.

Validated: cargo check (metal,accelerate) clean; the production-shape persona_command_inbound_pump integration test passes (persona answers a tier-gated command through the installed pump + authorizer); routing lib suite 262 green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The grantee side of the contracted grid: the SEND half. A node holds grants an owner issued it and presents them so the owner can authorize otherwise-tier-denied commands.

- PresentedGrantStore trait + InMemoryPresentedGrantStore (routing/presented_grant_store.rs): base64 grants keyed by TARGET peer (the owner that will verify). Latest-wins on insert so a re-issued / higher-epoch grant supersedes; sync lookup for the outbound hot path.
- AircTransport gains an optional grant store (with_grant_store); on a peer-targeted dispatch it stamps the held grant onto HEADER_AIRC_CAPABILITY_GRANT. Room / wildcard targets have no single verifier, so nothing is stamped. None = present nothing (unchanged tier-gated behavior).

Pairs with the receive path (handler verifies) + issuance (airc#1278 sign_grant + the grid/grant/issue command, next).

Tests: store holds/presents/supersedes per target; airc_transport suite 16 green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
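The send-side rules above (latest-wins per target peer, stamp only on peer-targeted dispatch, no store means no header) can be sketched as follows. `GrantStore`, `Target`, `stamp_headers`, and the `GRANT_HEADER` string are hypothetical illustrations; the actual value of HEADER_AIRC_CAPABILITY_GRANT and the real trait signatures are not given in the commit message.

```rust
use std::collections::HashMap;

/// Placeholder header name; the real constant's value is not shown in the PR.
pub const GRANT_HEADER: &str = "x-capability-grant";

/// Hypothetical dispatch targets.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Target {
    Peer(String),
    Room(String),
    Wildcard,
}

/// Base64 grant blobs keyed by the peer that will verify them.
#[derive(Default)]
pub struct GrantStore {
    by_target: HashMap<String, String>,
}

impl GrantStore {
    /// Latest wins: a re-issued grant replaces the one previously held.
    pub fn insert(&mut self, target_peer: &str, grant_b64: &str) {
        self.by_target.insert(target_peer.to_string(), grant_b64.to_string());
    }

    /// Synchronous lookup for the outbound path.
    pub fn lookup(&self, target_peer: &str) -> Option<&str> {
        self.by_target.get(target_peer).map(String::as_str)
    }
}

/// Returns the extra headers for one dispatch. Only a single-peer target with
/// a held grant gets the grant header; rooms and wildcards have no single
/// verifier, and a missing store presents nothing.
pub fn stamp_headers(store: Option<&GrantStore>, target: &Target) -> Vec<(String, String)> {
    let mut headers = Vec::new();
    if let (Some(store), Target::Peer(peer)) = (store, target) {
        if let Some(grant) = store.lookup(peer) {
            headers.push((GRANT_HEADER.to_string(), grant.to_string()));
        }
    }
    headers
}
```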

…erify → run

The proof the grid is ready for compute. Two REAL airc peers over the loopback fixture, the production install/gate/send paths, no mocks:

- Owner exposes a tier-DENIED command (compute/echo) behind GridTrustAuthPolicy + the inbound pump with build_grant_authorizer.
- WITHOUT a grant: the remote peer is DENIED (the gate holds).
- Owner ISSUES a grant for the grantee conferring exactly compute/echo (Airc::sign_grant, airc#1278).
- Grantee PRESENTS it (InMemoryPresentedGrantStore + AircTransport stamps HEADER_AIRC_CAPABILITY_GRANT).
- Owner VERIFIES (handler → GrantAuthorizer, against the authenticated sender key + durable watermark) and RUNS the command, echoing the params back through the full chain.

Both halves asserted: no grant → denied (no auth hole), valid grant → runs (the grid can sell compute).

Also bumps airc 4aa717d → 55790e1 (airc#1278 sign_grant).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The reusable issuance primitive, symmetric to build_grant_authorizer (verify): issue_grant(airc, issued_at_ms, params) composes a CapabilityGrant + Airc::sign_grant + base64, returning the blob a grantee presents.

Binds the grantee's AUTHENTICATED key (from the owner's enrolment), the owner's mesh, and the owner's signature, all from the one airc handle so issuer / mesh / grantee-key can't drift from what the verifier checks. Typed IssueGrantError; fail-closed (never returns a partial grant).

Any surface holding an owner airc handle (a persona runtime, a future grid/grant/issue command) wraps this. The primitive is identity-agnostic; "which identity issues" stays the caller's decision.

Dogfooded: the end-to-end contracted-grid test now mints its grant via issue_grant (replacing the inline construction) and still proves issue → present → verify → run.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…y grants

The operator front door over routing::grant_issuance::issue_grant: closes the contracted-grid loop with a command instead of hand-written Rust.

- GrantIssuanceModule (modules/grant_issuance.rs): holds the PersonaAircRuntimeRegistry (shared with the instance manager); handle_command decodes { issuerPersonaId, grantee, capabilities, expiresAtMs?, epoch? }, resolves the issuing persona's live airc handle, and returns the base64 grant blob to deliver. A non-running issuer is a hard error (it owns the signing key, which is never fabricated).
- Registered at the live boot site (ipc/mod.rs) alongside the instance manager.
- OWNER-ONLY: grid/grant/issue is outside the cross-grid ACL allow-list, so it falls to the ""=Owner wildcard. Only the local operator can sell its personas' compute; a remote peer can never mint grants. Pinned with an acl regression.

Each persona is its own owner selling ITS compute, so the issuer is a persona's airc identity (issuerPersonaId names which).

Tests: malformed-request + issuer-not-running error paths (2); acl owner-only pin; the happy path is proven end-to-end in tests/capability_grant_e2e.rs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…self

Persistent identity worked for Asha (her seed.json persona_id matches her live peer 90e758b2) but two fragilities could re-mint a stranger and orphan her engrams, which is exactly how personas-archive/ filled with 9 strangers:

1. The seed was written ONLY on FreshlyMinted, best-effort + non-fatal. A single failed mint-write, or a later deleted/corrupt seed, left her home (engrams + airc key) on disk but unresumable → next boot minted a stranger.
2. The bootstrap WRITE path (citizen_home_path(..).parent()/seed.json) and the resumer READ path (citizens_kind_dir) agreement was untested. That exact divergence (resumer hard-coding `personas/` vs `citizens/personas/`) is what created the strangers originally.

Fixes:

- seed::ensure_seed(seed_path, persona_id, agent_name, fallback_created_at_ms): idempotent upsert that runs on EVERY bootstrap (mint AND resume). Self-heals a missing/corrupt seed from the live identity, and PRESERVES created_at_ms from an existing seed (her birth time is stable; a naive rewrite would reset her age every boot). persona_instance_manager now always calls it (drops the FreshlyMinted gate) + the stale `personas/` path comment is corrected to `citizens/personas/`.
- citizen_path: regression test pinning seed-write-path == resumer-scan-path for a Persona (the stranger-minting bug can never silently return).

Tests: ensure_seed creates-missing / preserves-birth-time-on-resume / heals-corrupt (3); the path-agreement pin; seed 8, citizen_path 6, instance_manager 5, resume_or_mint 6 all green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
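The upsert rule behind ensure_seed can be illustrated with a pure function that leaves out file I/O. The `Seed` struct and `reconcile_seed` are hypothetical; the sketch assumes the caller has already tried to read the seed file and passes `None` when it is missing or fails to parse.

```rust
/// Hypothetical in-memory shape of seed.json.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Seed {
    pub persona_id: String,
    pub agent_name: String,
    pub created_at_ms: u64,
}

/// Computes the seed to write on every bootstrap. `existing` is None when the
/// file is missing or corrupt. Identity fields always come from the live
/// persona; the birth time is kept from an existing seed and only falls back
/// to `fallback_created_at_ms` when there is nothing to preserve.
pub fn reconcile_seed(
    existing: Option<&Seed>,
    persona_id: &str,
    agent_name: &str,
    fallback_created_at_ms: u64,
) -> Seed {
    let created_at_ms = existing
        .map(|seed| seed.created_at_ms)
        .unwrap_or(fallback_created_at_ms);
    Seed {
        persona_id: persona_id.to_string(),
        agent_name: agent_name.to_string(),
        created_at_ms,
    }
}
```

Because the output depends only on the inputs, feeding the result back in as `existing` on the next boot yields the same seed, which is the idempotence the commit relies on.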

…apter boundary

Asha leaked her entire <think>… chain-of-thought into the room (and looped inside it without ever emitting an answer). Root cause: a reasoning model's reasoning was never separated from its user-facing text. unsloth's /v1 (llama.cpp backend) emits <think>…</think> INLINE in `content` (verified live: no `reasoning_content` field), the OpenAI adapter passed it straight into `text`, and the cognition cleaner only stripped `<thinking>` (with -ing), which never matched `<think>`.

Fix at the adapter boundary (where the model's output contract belongs):

- TextGenerationResponse gains `reasoning: Option<String>`: reasoning is captured (for the glass-box harness + memory) and stripped from `text`, so it can NEVER reach the room. Uniform across adapters; ts-rs binding regenerated.
- openai_adapter::extract_reasoning(content, reasoning_content), with precedence: (1) a server `reasoning_content` field (vLLM-style) wins; (2) inline <think>…</think> is split out, answer = text outside the block; (3) an UNCLOSED <think> (the runaway loop) yields EMPTY text so the caller refuses to post, never leaking raw reasoning. Wired into the response parse; OpenAIMessage now reads reasoning_content too.
- Other adapters set reasoning: None (anthropic: extended-thinking is a follow-up, doesn't leak; llamacpp: TODO to reuse extract_reasoning if it serves a reasoning model locally, which is not Asha's path).

Tests: extract_reasoning over well-formed / unclosed-runaway / server-field / plain+empty-think (4); adapter + response_validator suites green (9 + 7); all TextGenerationResponse constructors updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
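The three-step precedence can be sketched with plain string handling. This is an illustration of the rule as described, not the adapter's code: the `Split` return type is hypothetical, only the first `<think>` block is handled, and the sketch assumes that when the server field is present the content is passed through untouched.

```rust
/// Hypothetical result: user-facing text plus captured reasoning.
#[derive(Debug, PartialEq, Eq)]
pub struct Split {
    pub text: String,
    pub reasoning: Option<String>,
}

pub fn extract_reasoning(content: &str, reasoning_content: Option<&str>) -> Split {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";

    // (1) A non-empty server-provided reasoning field wins.
    if let Some(server) = reasoning_content.map(str::trim).filter(|r| !r.is_empty()) {
        return Split { text: content.trim().to_string(), reasoning: Some(server.to_string()) };
    }

    // No inline block at all: plain answer, no reasoning.
    let start = match content.find(OPEN) {
        Some(index) => index,
        None => return Split { text: content.trim().to_string(), reasoning: None },
    };
    let after_open = &content[start + OPEN.len()..];

    match after_open.find(CLOSE) {
        // (2) Closed block: reasoning is the inside, answer is everything outside.
        Some(end) => {
            let reasoning = after_open[..end].trim();
            let answer = format!("{}{}", &content[..start], &after_open[end + CLOSE.len()..]);
            Split {
                text: answer.trim().to_string(),
                reasoning: if reasoning.is_empty() { None } else { Some(reasoning.to_string()) },
            }
        }
        // (3) Unclosed block: empty text, so the caller can refuse to post.
        None => Split { text: String::new(), reasoning: Some(after_open.trim().to_string()) },
    }
}
```

Returning empty text in the unclosed case, rather than the raw content, is what keeps a runaway reasoning loop from ever being posted.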

…hink switch

The reasoning-strip (9d3f895) cleans the OUTPUT; this addresses the INPUT side: the model thinks on EVERY turn (even "say hello"), burning latency and feeding the runaway loop. Verified live which mechanism works: the gateway IGNORES chat_template_kwargs.enable_thinking for this forged model, but Qwen3's `/no_think` SOFT-SWITCH appended to the user turn works (empty <think></think> + direct answer).

- ThinkingMode { Default, Suppress } on the OpenAI adapter config; format_messages appends `/no_think` to the last user message when Suppress (apply_no_think_switch: a model-specific token owned at the adapter boundary; higher layers stay model-agnostic).
- The local unsloth/GGUF reasoning gateway defaults to Suppress (this 4B's thinking rambles + loops, and it answers correctly without it). Operator override `UNSLOTH_THINKING=on` re-enables it; the reasoning-strip still protects the room. Cloud providers keep their default.

Gateway-level for now; per-task/per-request thinking is the follow-up (a recipe that needs deliberation re-enables it).

Tests: apply_no_think_switch targets the last user turn + no-ops without one; adapter suite 11 green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
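A rough sketch of the soft-switch step, assuming a simple role/content message shape. `Message` and the function signature are hypothetical, and the guard against appending the token twice is an addition of this sketch, not something the commit describes.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ThinkingMode {
    Default,
    Suppress,
}

/// Hypothetical chat message shape.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Message {
    pub role: String,
    pub content: String,
}

/// Appends the `/no_think` soft switch to the LAST user message when thinking
/// is suppressed. No-op in Default mode or when there is no user turn.
pub fn apply_no_think_switch(messages: &mut [Message], mode: ThinkingMode) {
    if mode != ThinkingMode::Suppress {
        return;
    }
    if let Some(last_user) = messages.iter_mut().rev().find(|m| m.role == "user") {
        // Sketch-only guard: avoid stacking the token if it is already there.
        if !last_user.content.trim_end().ends_with("/no_think") {
            last_user.content.push_str(" /no_think");
        }
    }
}
```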

joelteply and others added 30 commits June 21, 2026 16:39

joelteply and others added 2 commits June 22, 2026 00:29

github-actions Bot added the size: XL label Jun 22, 2026

joelteply merged commit cd7d655 into canary Jun 22, 2026
6 checks passed

joelteply deleted the feat/persona-seed-self-heal branch June 22, 2026 05:53

joelteply merged 32 commits into canary from feat/persona-seed-self-heal
