feat: contracted-grid auth + self-routing commands + persona persistence/cognition fixes #1726
Merged
…outing foundation The routing-side erasure + the first base-trait shape, so a command becomes a self-contained routable object and a command author writes only a `run` body. - `DynCommand`: object-safe, type-erased command the kernel can hold in a flat `name -> Arc<dyn DynCommand>` map and route to directly (no per-module match arm, no prefix double-routing). Blanket impl makes EVERY `CommandHandler` a `DynCommand` for free; `invoke` delegates to the existing `dispatch`, so the routing side and the typed authoring side share one `CommandSpec` and can't drift. - `ActionCommand`: fire-and-forget verb shape with blanket `CommandSpec` (Bare wire) + `CommandHandler` impls. Implementing the shape IS implementing the command — the chain `ActionCommand ⟹ CommandSpec ⟹ CommandHandler ⟹ DynCommand` means declare the shape, get the routable object. Cross-cutting policy (`ACCESS` default AiSafe) is declared per command, not re-implemented. Validated against two outliers in isolation (not yet wired into the executor hot path): a stateless action (ping-shaped, captures no deps) and a stateful, dep-holding action (owns an Arc'd counter, tightens ACCESS to Privileged) — both route identically through the type-erased object. Error-mapping at the erased boundary preserved (bad params → named `invalid` refusal). Anchoring design: docs/architecture/COMMAND-ORGANIZATION.md — self-routing map (typed-path-wins, prefix/ServiceModule fallback during migration), composition via `ctx.call` through the same chain, and machine+environment-agnostic execution (cross-tower routing + `Provided` adapters) with latency as a first-class constraint. Slice 1 of #42. Next: boot-time command_map + executor consult (typed path wins, fallback preserved), then QueryCommand/CrudCommand/SessionCommand. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
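The blanket chain `ActionCommand ⟹ CommandSpec ⟹ CommandHandler ⟹ DynCommand` can be illustrated with a minimal, std-only sketch. The trait and constant names follow the commit text, but every signature here is an assumption: params and results are plain strings instead of typed serde values, and `Ping`, `Bump`, `demo_map`, `command_name`, and `required_access` are hypothetical names for illustration only.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Access { AiSafe, Privileged }

/// Authoring shape: the command author writes only `run` (plus NAME).
pub trait ActionCommand: Send + Sync + 'static {
    const NAME: &'static str;
    const ACCESS: Access = Access::AiSafe; // default, tightened per command
    fn run(&self, params: &str) -> Result<String, String>;
}

/// Typed spec, blanket-derived from the shape.
pub trait CommandSpec {
    fn name(&self) -> &'static str;
    fn access(&self) -> Access;
}
impl<T: ActionCommand> CommandSpec for T {
    fn name(&self) -> &'static str { T::NAME }
    fn access(&self) -> Access { T::ACCESS }
}

/// Typed handler, also blanket-derived.
pub trait CommandHandler: CommandSpec {
    fn dispatch(&self, params: &str) -> Result<String, String>;
}
impl<T: ActionCommand> CommandHandler for T {
    fn dispatch(&self, params: &str) -> Result<String, String> { self.run(params) }
}

/// Object-safe, type-erased form the kernel can hold in a flat map.
pub trait DynCommand: Send + Sync {
    fn command_name(&self) -> &'static str;
    fn required_access(&self) -> Access;
    fn invoke(&self, params: &str) -> Result<String, String>;
}
impl<T: CommandHandler + Send + Sync> DynCommand for T {
    fn command_name(&self) -> &'static str { self.name() }
    fn required_access(&self) -> Access { self.access() }
    // `invoke` delegates to the typed `dispatch`, so both sides share one spec.
    fn invoke(&self, params: &str) -> Result<String, String> { self.dispatch(params) }
}

/// Stateless outlier: captures no deps.
pub struct Ping;
impl ActionCommand for Ping {
    const NAME: &'static str = "ping";
    fn run(&self, _params: &str) -> Result<String, String> { Ok("pong".to_string()) }
}

/// Stateful outlier: owns an Arc'd counter and tightens ACCESS.
pub struct Bump { pub hits: Arc<AtomicU64> }
impl ActionCommand for Bump {
    const NAME: &'static str = "bump";
    const ACCESS: Access = Access::Privileged;
    fn run(&self, _params: &str) -> Result<String, String> {
        Ok((self.hits.fetch_add(1, Ordering::SeqCst) + 1).to_string())
    }
}

/// Builds the flat `name -> Arc<dyn DynCommand>` map from erased objects.
pub fn demo_map() -> HashMap<&'static str, Arc<dyn DynCommand>> {
    let objects: Vec<Arc<dyn DynCommand>> = vec![
        Arc::new(Ping),
        Arc::new(Bump { hits: Arc::new(AtomicU64::new(0)) }),
    ];
    objects.into_iter().map(|c| (c.command_name(), c)).collect()
}
```

Both outliers route identically through the erased object: the map lookup never needs to know which concrete type sits behind the `Arc`.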
…re-Rust `cu` client Slice 2 of #42 — the DynCommand object map is now consulted on EVERY live dispatch path, and `ping` is migrated end-to-end onto it (proven via a pure-Rust `cu ping`, no Node). Runtime wiring (typed-path-wins, prefix/ServiceModule fallback preserved): - `ServiceModule::commands()` — default-empty hook; a module contributes its self-routing DynCommand objects (each owns its deps), so the kernel routes a name straight to the object with no per-module match arm. - `ModuleRegistry` — `command_objects: name -> Arc<dyn DynCommand>` map, populated at register() from each module's commands(), with a duplicate-name panic (the registry is the backstop the "no central list" design removes). New `route_object()` (O(1), lock-free) + `list_command_objects()`. - `dispatch_object_with_panic_guard()` — catch_unwind guard for object dispatch, mirroring the module path (persona tool calls converge here). - Consult added to ALL three live paths: `CommandExecutor::execute_inner`, `Runtime::route_command` (the IPC/`cu` socket route), and `route_command_sync` (rayon). Object map wins before prefix routing. (Unifying these three into one path is the COMMAND-ORGANIZATION.md follow-up.) ping migrated: `PingCommand` is now an `ActionCommand` (one type + a `run` body; CommandSpec/CommandHandler/DynCommand all blanket-derived), removed from HealthModule's command_prefixes and match arm, exposed via commands(). Off the prefix table, onto the typed object map. `cu` — the pure-Rust CLI client (`src/bin/cu.rs`), replacing the legacy Node `./jtag`. `cu <command> [json]` dispatches through the SAME uniform Connection every client uses (CLI/persona/web/mobile), over the core IPC socket via CoreIpcTransport (same transport as continuum-mcp). No tsx, no bundle, no Node. Validated live: built the core + cu directly with cargo (no npm start), ran the core on its socket, `cu ping` → {"ok":true,"roundTripMs":0} through route_object. 
Plus 10 unit/integration tests green (blanket-chain outliers, registry routing, executor + health typed-path). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
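The "object map wins before prefix routing" rule and the duplicate-name backstop might look roughly like the sketch below. It is an illustration under stated assumptions: the real `ModuleRegistry` is populated from `ServiceModule::commands()` and is lock-free, whereas here `register_object`, `register_prefix`, `dispatch`, and the `LegacyHandler` alias are hypothetical helpers and params are plain strings.

```rust
use std::collections::HashMap;
use std::sync::Arc;

pub trait DynCommand: Send + Sync {
    fn invoke(&self, params: &str) -> Result<String, String>;
}

pub struct Ping;
impl DynCommand for Ping {
    fn invoke(&self, _params: &str) -> Result<String, String> { Ok("pong".to_string()) }
}

/// Stand-in for a module's legacy prefix handler (hypothetical shape).
pub type LegacyHandler = fn(&str, &str) -> Result<String, String>;

pub fn legacy_health(command: &str, _params: &str) -> Result<String, String> {
    Ok(format!("legacy:{command}"))
}

#[derive(Default)]
pub struct ModuleRegistry {
    command_objects: HashMap<String, Arc<dyn DynCommand>>,
    prefixes: Vec<(String, LegacyHandler)>,
}

impl ModuleRegistry {
    /// Registers a self-routing object; a duplicate name is a hard error.
    pub fn register_object(&mut self, name: &str, cmd: Arc<dyn DynCommand>) {
        let previous = self.command_objects.insert(name.to_string(), cmd);
        assert!(previous.is_none(), "duplicate command object: {name}");
    }

    pub fn register_prefix(&mut self, prefix: &str, handler: LegacyHandler) {
        self.prefixes.push((prefix.to_string(), handler));
    }

    /// O(1) lookup on the typed object map.
    pub fn route_object(&self, name: &str) -> Option<Arc<dyn DynCommand>> {
        self.command_objects.get(name).cloned()
    }

    /// Typed path wins; prefix routing is the fallback during migration.
    pub fn dispatch(&self, name: &str, params: &str) -> Result<String, String> {
        if let Some(cmd) = self.route_object(name) {
            return cmd.invoke(params);
        }
        for (prefix, handler) in &self.prefixes {
            if name.starts_with(prefix.as_str()) {
                return handler(name, params);
            }
        }
        Err(format!("unknown command: {name}"))
    }
}
```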
…start is THE start Move-first excision of the Node `npm start` poison (Joel: "move shit first, compilation blows up, makes it easy to find all the smell"). - `git mv tools/scripts/parallel-start.sh → legacy/node-startup/`. `legacy/` is NOT a cargo workspace member, NOT referenced by any npm script / Dockerfile / CI workflow, with a README marking it dead and off-limits to editing. - Both `start` scripts now point at the EXISTING pure-Rust `start-server.sh` (root package.json already did; src/package.json was the poison path → parallel-start.sh). Dropped the `desktop:legacy` pointer. Verified no live (non-comment) consumers of parallel-start.sh remain. - `start-server.sh` now also builds the `cu` CLI client alongside continuum-mcp, so the headless start produces core + mcp + cu — pure Rust, no Node. - .gitignore: add `tools/models/` (the current voice/avatar model download path; the workers→core/tools restructure left the old `src/workers/models/` rule stale, so large model binaries were no longer ignored). `npm start` (from root or src) is now the headless Rust core via cargo run; the Node orchestrator that broke on stale `cd workers` / scene-gen is out of the path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…deterministic) The "working persona on the clean command infra" regression guard, with zero live deps (no inference, no airc, no models). A persona's CommandToolExecutor routes a `ping` tool call through the uniform Connection → InProcessTransport → CommandExecutor → execute_inner → route_object → the ping DynCommand (migrated via ActionCommand, off the prefix table), and the bare PingResult comes back. Proves the command-infra cleanup didn't break the persona's ability to ACT, and that the self-routing typed path serves personas — not just internal callers. The existing suite covered the prefix path (test/echo); this covers the new object path end-to-end on the persona's real dispatch route. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ommands
`cu` is now the reliable upstart, not just the client (Joel: "make it cu",
"reliable upstart", "cu-driven start"):
- `cu start` — locate `tools/scripts/start-server.sh` (env override or walk up
from cwd), spawn it in its OWN session (setsid) so the core outlives the CLI,
log to /tmp/continuum-core-start.log, write a pidfile, and poll `ping` until the
core is ready (or fail loud with the log tail). Idempotent: no-op if a core
already answers. start-server.sh stays the pure-Rust implementation detail
(cargo run, per-platform GPU features, no Node).
- `cu stop` — SIGTERM the recorded process group (setsid made the core a group
leader, so cargo + core are reaped together), pkill fallback if no pidfile,
remove the socket.
- `cu <command> [json]` — unchanged dispatch through the uniform Connection.
Validated live: `cu stop` (clean) → `cu start` (core ready in 28s, detached) →
`cu ping` {"ok":true} → `cu start` idempotent ("already running"). No npm, no
Node, no manual launch/poll dance.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
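The idempotent start-then-poll flow of `cu start` can be sketched without any process handling by passing the readiness probe and the spawn step in as closures. This is only an illustration: `start`, `wait_until_ready`, and `StartOutcome` are hypothetical names, and the real client spawns `start-server.sh` in its own session and probes with `ping` over the IPC socket.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
pub enum StartOutcome {
    AlreadyRunning,
    Started { probes: u32 },
}

/// Calls `probe` until it reports ready or the timeout elapses.
pub fn wait_until_ready<F: FnMut() -> bool>(
    mut probe: F,
    timeout: Duration,
    interval: Duration,
) -> Result<u32, String> {
    let deadline = Instant::now() + timeout;
    let mut attempts: u32 = 0;
    loop {
        attempts += 1;
        if probe() {
            return Ok(attempts);
        }
        if Instant::now() >= deadline {
            // Fail loud; the real CLI would append the log tail here.
            return Err(format!("core not ready after {attempts} probes"));
        }
        sleep(interval);
    }
}

/// Idempotent start: no-op if a core already answers, else spawn and poll.
pub fn start<P, S>(
    mut probe: P,
    spawn: S,
    timeout: Duration,
    interval: Duration,
) -> Result<StartOutcome, String>
where
    P: FnMut() -> bool,
    S: FnOnce() -> Result<(), String>,
{
    if probe() {
        return Ok(StartOutcome::AlreadyRunning);
    }
    spawn()?;
    wait_until_ready(probe, timeout, interval).map(|probes| StartOutcome::Started { probes })
}
```

Keeping the probe and spawn steps injectable makes the idempotence and timeout behavior testable without launching a core.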
…sting/packaging)
The strict, design-first contract that makes startup + testing reliable (the two
weakest links) so personas-on-airc is a dependable foundation to build clients on
and roll out as dockerized/k8s nodes (Joel: "adherence to strict principles and
design first").
Defines:
- the foundation thesis (install → start → personas on airc → iterate WITH them →
clients on top → dockerized nodes → k8s);
- 8 strict principles (one pure-Rust startup, headless core + equal clients,
modular units = build units = containers, deterministic layered testing, no Node in
the foundation, move-first excision, and a SINGLE DYNAMIC command surface: cu calls
every command, no duplicated lists / switch-on-name);
- the modular unit table (core/mcp/cu/inference/livekit/unsloth/clients);
- the cu-driven startup;
- the three test layers;
- the Docker/k8s rollout shape (existing compose + per-unit Dockerfiles → continuum
node as the k8s unit);
- the Node boundary (web client only);
- status + next slices.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ts (13GB) were unignored The cargo workspace root is the repo root, so `cargo run --manifest-path core/continuum-core/...` (start-server.sh / cu start) builds into /target — and .gitignore had NO `target` pattern at all (the workers→core restructure left it uncovered), so 13GB of build artifacts were staged-able. Add /target/ and **/target/. Canonical build target stays $HOME/.continuum/cache/cargo-target. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…emony, single-source discovery) Kills the "every command needs a host module to expose it" friction and makes the command catalog dynamically discoverable from the ONE registry. - `register_stateless_command!(T)` — a stateless command (no deps, `Default`) self-registers BOTH its static descriptor AND a runtime constructor via inventory. `ModuleRegistry::new()` seeds the typed object map from these (`stateless_command_objects()`), so the command is live on the typed path with NO host module, NO `commands()` override, NO match arm. Dep-holding commands still come from a module's `commands()` (their deps must be constructed). Duplicate-name panic guards both paths. - `commands/` tree (per COMMAND-ORGANIZATION.md): self-contained command files, no central list. First inhabitant: `commands/catalog.rs`. - `commands/list` — dynamic, single-source command discovery: returns a snapshot of `command_registry()` (name, description, access, wire, params type), optional name filter. Clients/trays/cu never hardcode a catalog — they call this and adapt. It's itself a zero-ceremony stateless command (dogfoods the mechanism). - `ping` migrated to `register_stateless_command!` — dropped HealthModule's `commands()` override; ping is now a pure stateless command, no ceremony. ts-rs bindings generated (protocol/typescript/commands/). 14 tests green incl. the catalog self-listing + filter, ping still routing via the typed object map, and the persona executing ping through it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…, no per-command code
The CLI edge of the uniform param-adaptation principle (Joel: "ideally it
naturally translates or adapts all params, so meet humans or AIs in the middle at
every interface… automatically though, not switch statements, procedural").
`cu <command> [args]` adapts params with ONE generic rule for ALL commands — never
a per-command switch:
- nothing → `{}`
- a single positional JSON object/array → verbatim (the AI / tool-call path)
- `--key value` / `--flag` → a JSON object built by one loop: keys normalized
kebab/snake → camelCase (`--round-trip-ms` → `roundTripMs`, matching the
canonical wire fields), values coerced by trying JSON first (`5`→number,
`true`→bool, `{…}`→object) then falling back to string, bare flag → true.
So a human types `cu ping --message hi` and the typed command receives
`{"message":"hi"}`; an AI sends the JSON object directly; both hit the same
command. Schema-AWARE coercion/validation lands when the registry exposes param
JSON schemas via commands/list — same single source, every interface adapts.
Validated live: `cu ping --message hi` → {"ok":true}; `cu commands/list --filter
commands/` → the live catalog. 2 adapter unit tests green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
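The single generic adaptation rule could be sketched as one loop over the arguments. This is a simplified, std-only illustration: the `Value` enum stands in for a real JSON value (so only booleans and numbers are coerced, not objects or arrays), the positional-JSON path is omitted, and `adapt`, `coerce`, and `camel_case` are hypothetical names rather than the actual `cu` internals.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a JSON value (assumption: no serde_json here).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Bool(bool),
    Number(f64),
    Text(String),
}

/// Normalizes kebab/snake case to camelCase: `round-trip-ms` -> `roundTripMs`.
pub fn camel_case(flag: &str) -> String {
    let mut out = String::new();
    let mut upper_next = false;
    for ch in flag.chars() {
        if ch == '-' || ch == '_' {
            upper_next = !out.is_empty();
            continue;
        }
        if upper_next {
            out.extend(ch.to_uppercase());
            upper_next = false;
        } else {
            out.push(ch);
        }
    }
    out
}

/// Tries the structured readings first, then falls back to a string.
pub fn coerce(raw: &str) -> Value {
    match raw {
        "true" => Value::Bool(true),
        "false" => Value::Bool(false),
        _ => {
            // Require a digit so words like "inf" or "nan" stay strings.
            let numeric = raw.chars().any(|c| c.is_ascii_digit());
            match raw.parse::<f64>() {
                Ok(n) if numeric => Value::Number(n),
                _ => Value::Text(raw.to_string()),
            }
        }
    }
}

/// One loop for every command: `--key value` pairs and bare `--flag`s.
pub fn adapt(args: &[&str]) -> Result<BTreeMap<String, Value>, String> {
    let mut out = BTreeMap::new();
    let mut i = 0;
    while i < args.len() {
        let Some(flag) = args[i].strip_prefix("--") else {
            return Err(format!("unexpected positional argument: {}", args[i]));
        };
        let key = camel_case(flag);
        match args.get(i + 1) {
            Some(next) if !next.starts_with("--") => {
                out.insert(key, coerce(next));
                i += 2;
            }
            _ => {
                out.insert(key, Value::Bool(true)); // bare flag
                i += 1;
            }
        }
    }
    Ok(out)
}
```

Nothing in the loop names a command, which is the point: a new command gets CLI flags without any adapter change.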
…e adapts (symmetry)
Each command's params now carry a JSON Schema derived AUTOMATICALLY from the Rust
type, so every interface handles any command from the one canonical schema — "all
SDKs automatically handle the rust command, across environments, symmetry" (Joel).
It's adapters everywhere over one source, thin code each.
- `CommandSpec::params_schema()` — provided method, default `Null`. The base
traits override it to derive the schema via `schemars` (ActionCommand::Params:
JsonSchema). So a command declared (or ported) onto a base trait gains a real
schema with ZERO extra code; manual CommandSpec impls stay `Null` until migrated
— breakage-free. CommandDescriptor carries `params_schema`.
- The adapters (one schema → each paradigm):
- AI / RAG: `persona_tools` projects the schema into the tool `input_schema`
(was an open object — the reasoner now sees real fields).
- cu / CLI: `cu <cmd> --help` renders the manual as bash flags (property →
--kebab, type, description, required) from the same schema — "the manual
matches the paradigm." Plus the existing procedural `--key value` adapter.
- web / mobile / RAG: `commands/list` returns `paramsSchema` to build forms /
tool-schemas — single source, no per-command code.
- schemars dep added (uuid1). PingParams/CommandsListParams/EchoParams derive
JsonSchema.
Validated live: `cu commands/list --filter ping` returns the derived schema
(Option<String> → ["string","null"], doc-comments as descriptions); `cu ping
--help` renders `--message <string>`. 12 tests green (schema projection, catalog,
cu adapters + help renderer).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…tity, every interface Threads the authenticated caller through the typed dispatch path so a command can gate/scope/compose BY identity — and makes discovery (commands/list) honor it, so "what's listed == what you can call" holds at the CLI and persona, matching the call gate that was already enforced. Cross-grid identity (airc-verified sender) keeps flowing the same way. - `caller_trust(caller)` — ONE source for the caller→trust rule (local/substrate → Owner; airc-sourced → Provisional ceiling). `GridTrustAuthPolicy::gate` refactored to use it (behavior preserved; tests green) so the gate and any trust-aware consumer can't drift. - `Ctx.caller: Option<CallerIdentity>` threaded via `dispatch_with_caller` → `DynCommand::invoke(params, caller)` → `dispatch_object_with_panic_guard`. The executor passes the identity it just gated on (persona / cross-grid airc sender); local in-process + IPC pass `None` (owner). Module `handle_command` path unchanged (legacy, owner-local). - `commands/list` filters by `caller_trust(ctx.caller)` + `is_command_authorized` — the SAME rule the gate uses. Local owner sees all; a Provisional persona/peer sees only its authorized surface. Test: provisional ⊆ owner, and every listed command is callable at its trust. Status: identity is now available in Ctx for handlers and gates discovery. Full composition-propagation (a handler re-dispatching via `ctx.call` as the same caller) is the next step — the caller is now in Ctx to enable it. Gating the local IPC path through the executor is NOT needed for correctness (local == owner by policy); it'd only matter for scoped local identities (future). 19 tests green (gate refactor, identity-gated list, command/handler/catalog/persona). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
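The "what's listed == what you can call" property follows from the list and the gate sharing one trust rule. A minimal sketch, assuming a simplified `CallerIdentity` and a per-command minimum trust (the `Descriptor` type, its `min_trust` field, and `list_commands` are hypothetical; the real `is_command_authorized` signature is not shown in this PR text):

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TrustLevel { Blocked, Provisional, Trusted, Owner }

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CallerSource { Local, Airc }

#[derive(Debug, Clone)]
pub struct CallerIdentity {
    pub source: CallerSource,
    pub peer_id: String,
}

/// ONE source for the caller -> trust rule: no caller or a local caller is
/// the owner; an airc-sourced caller gets the Provisional ceiling.
pub fn caller_trust(caller: Option<&CallerIdentity>) -> TrustLevel {
    match caller.map(|c| c.source) {
        None | Some(CallerSource::Local) => TrustLevel::Owner,
        Some(CallerSource::Airc) => TrustLevel::Provisional,
    }
}

/// Hypothetical descriptor: each command declares the trust it requires.
pub struct Descriptor {
    pub name: &'static str,
    pub min_trust: TrustLevel,
}

pub fn is_command_authorized(command: &Descriptor, trust: TrustLevel) -> bool {
    trust >= command.min_trust
}

/// Discovery applies the same rule as the call gate.
pub fn list_commands(catalog: &[Descriptor], caller: Option<&CallerIdentity>) -> Vec<&'static str> {
    let trust = caller_trust(caller);
    catalog
        .iter()
        .filter(|d| is_command_authorized(d, trust))
        .map(|d| d.name)
        .collect()
}
```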
…ty-propagation status
A command composing another propagates the ORIGINAL caller via
`execute_with_caller(sub, params, ctx.caller.clone())`, and the gate enforces that
caller's trust on the sub-call. New test
`composed_call_propagates_caller_no_escalation`: an airc/Provisional caller
composing into `data/delete` (Owner-only) is gate-FORBIDDEN; the local owner passes
the gate. No escalation: identity flows through composition (and, by the same
mechanism, across the grid via the airc-verified caller).
COMMAND-ORGANIZATION.md updated to state the real status: identity propagation
works today (ctx.caller + execute_with_caller); the typed `ctx.call::<C>(p)` sugar
(an executor handle on Ctx so a handler can't forget to pass ctx.caller) is the
remaining ergonomic follow-up on this foundation.
19 executor tests green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
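Caller propagation through composition can be illustrated with a toy executor. Everything below is a sketch under assumptions: the `match` on the command name merely stands in for the typed object map, `cleanup` is a hypothetical composing command, and the gate treats any remote caller as forbidden on the one Owner-only command.

```rust
/// `None` as the caller means the local owner (as in the commit text).
#[derive(Debug, Clone, PartialEq)]
pub enum Caller {
    Airc(String),
}

pub struct Ctx<'a> {
    pub caller: Option<Caller>,
    pub executor: &'a Executor,
}

pub struct Executor;

impl Executor {
    /// Toy gate: `data/delete` is Owner-only, so any remote caller is refused.
    fn gate(command: &str, caller: &Option<Caller>) -> Result<(), String> {
        if command == "data/delete" && caller.is_some() {
            return Err(format!("forbidden: {command}"));
        }
        Ok(())
    }

    /// Every dispatch, including a composed sub-call, passes the gate
    /// with the caller it was handed.
    pub fn execute_with_caller(
        &self,
        command: &str,
        params: &str,
        caller: Option<Caller>,
    ) -> Result<String, String> {
        Self::gate(command, &caller)?;
        let ctx = Ctx { caller, executor: self };
        match command {
            "data/delete" => Ok(format!("deleted {params}")),
            "cleanup" => cleanup(&ctx, params),
            other => Err(format!("unknown command: {other}")),
        }
    }
}

/// A composing handler: it re-dispatches as the ORIGINAL caller, so the
/// sub-call is gated at that caller's trust rather than silently as owner.
fn cleanup(ctx: &Ctx<'_>, params: &str) -> Result<String, String> {
    ctx.executor
        .execute_with_caller("data/delete", params, ctx.caller.clone())
}
```

If `cleanup` passed `None` instead of `ctx.caller.clone()`, the remote caller would reach `data/delete` as owner; that is the author-discipline risk the typed `ctx.call` sugar is meant to remove.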
…y command `system/info` (version + pid) as a stateless ActionCommand in its own file: register_stateless_command! and it's instantly callable via cu/persona/SDKs with a derived param schema + ACL gating, no wiring elsewhere. The "minimal code per command" shape the ported catalog will follow. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…composition test, schema/edge notes
Two adversarial reviewers (security + correctness) audited the branch. Security
verdict: NO exploitable escalation; gate refactor behavior-preserving. Fixes for
the actionable findings:
- H3 (perf): `command_registry()` was rebuilding the descriptor Vec + running
schemars reflection (schema_for!) for EVERY command on EVERY call — so
commands/list and the persona tool surface were O(commands × reflection) per
call. Now built once into a OnceLock and cloned out.
- H1 (cu correctness): `cu cmd --key=value` was mis-parsed into a junk
`{"key=value": true}` key. Now splits on the first `=` (both `--key value` and
`--key=value` work). Extracted `coerce()`. Test added.
- L1 (test): the composition test only exercised the gate. Replaced with a REAL
composing handler (`Composer: ActionCommand`) that composes `data/delete` with
`ctx.caller.clone()` — proving identity propagates through a handler and an
airc/Provisional caller can't escalate (owner can).
- L2 (test): commands/list identity-gating test now asserts the Provisional surface
is non-empty (subset check no longer vacuous) and ≤ owner surface.
- M3 (schema): cu `--help` renders a nested-type `$ref` as its type name instead of
`<value>`; persona_tools tool_input_schema_from carries a TODO for nested
$defs/$ref (latent — all current params are flat).
- M2 (doc): caller_trust carries a TODO that every airc caller maps to Provisional
(Blocked peers not yet distinguished — needs the airc↔grid trust bridge).
- C1 (doc): the typed object path on the IPC route deliberately bypasses per-MODULE
metrics/concurrency (objects are module-independent); documented at the site +
flagged per-command observability as the command-framework's slice.
Not fixed here (reported, tracked): TCP IPC listener treats remote connections as
local Owner (pre-existing CRITICAL, config-gated to 127.0.0.1 by default — needs a
non-Owner caller for TCP-sourced requests); composition propagation is
author-discipline (the `ctx.compose` helper that forces it is the next slice);
AllowAllPolicy default (one refactor from bypass — consider GridTrust default).
55 tests green (52 lib + 3 cu).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
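The H3 fix (build the descriptor list once, clone it out afterwards) maps directly onto `std::sync::OnceLock`. A minimal sketch, assuming a simplified `CommandDescriptor` whose schema is a plain string; `build_registry` and `registry_builds` are hypothetical helpers, with the counter standing in for the expensive schema reflection:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

#[derive(Debug, Clone, PartialEq)]
pub struct CommandDescriptor {
    pub name: &'static str,
    pub params_schema: String,
}

/// Counts how often the expensive build actually runs.
static BUILDS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for the per-command reflection that used to run on every call.
fn build_registry() -> Vec<CommandDescriptor> {
    BUILDS.fetch_add(1, Ordering::SeqCst);
    ["ping", "commands/list"]
        .iter()
        .map(|&name| CommandDescriptor {
            name,
            params_schema: format!("{{\"title\":\"{name}\"}}"),
        })
        .collect()
}

/// Built once into a OnceLock; later calls only clone the cached Vec.
pub fn command_registry() -> Vec<CommandDescriptor> {
    static REGISTRY: OnceLock<Vec<CommandDescriptor>> = OnceLock::new();
    REGISTRY.get_or_init(build_registry).clone()
}

pub fn registry_builds() -> usize {
    BUILDS.load(Ordering::SeqCst)
}
```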
…nticated-Owner hole Adversarial security review found: the TCP IPC listener funneled into the same `handle_client` → `route_command(caller=None)` = local Owner, UNGATED. With the Docker `0.0.0.0` bind, anyone who could open the port got unauthenticated Owner command execution (data/delete, grid/trust, …). Pre-existing, but `route_object` now rides that path too. Closed: - New `CallerSource::Tcp` (honest provenance — an unauthenticated remote socket, distinct from airc's verified envelope) + `CallerIdentity::tcp(peer_id)`. - `caller_trust(Tcp)` = Provisional ceiling (remote, never Owner) — same one-source rule the gate uses. So TCP can run the AiSafe surface + ai/generate but is FORBIDDEN every Owner-gated command. - `handle_client` now takes the connection's `caller`: the Unix socket passes `None` (owner-by-locality — the operator on the box), the TCP listener stamps `CallerIdentity::tcp(nil)`. A boundary ACL-gate (`caller_trust` + `is_command_authorized`) refuses Owner-gated commands for remote callers before dispatch. The caller is also threaded into `Runtime::route_command` so the typed object path / composition sees the REMOTE identity (no escalation via a composing command over TCP), not silently Owner. Unix-socket behavior unchanged (local owner). Test: `tcp_caller_is_remote_not_owner` (Provisional, ai/generate allowed, data/delete|grid/trust|grid/pair forbidden). 58 lib tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…open-TCP residual
Second adversarial pass on the TCP fix: verdict = Owner-execution hole genuinely
closed, no bypass. Addressing the two residual risks it surfaced:
- Default-AiSafe migration footgun: destructive `data/*` are safe only because
unregistered (unclassified→Owner default-deny). `ActionCommand` defaults ACCESS to
AiSafe, so migrating one to a command object and forgetting
`const ACCESS = Privileged` would silently expose it at Provisional (i.e. over TCP /
to cross-grid peers). New regression test
`destructive_data_commands_stay_owner_only`
(data/delete|update|truncate|clear-all) trips CI if that ever happens.
- Open-TCP residual: documented at the TCP listener that the Provisional AiSafe
surface (arbitrary data/list reads, chat/send writes, ai/generate) is reachable
UNauthenticated over a non-loopback bind. TODO(authenticated-tcp): shared-secret /
signed handshake (+ optional sub-Provisional read-only ceiling) before relying on
0.0.0.0; pairs with the airc↔grid per-peer trust bridge. Until then: don't bind
0.0.0.0 on an untrusted network.
acl tests green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…(NodeRegistry is wrong key space) The airc↔grid trust bridge mechanism (task #38): a gate that resolves a remote caller's REAL grid TrustLevel instead of the flat Provisional ceiling — built as a validated SEAM, but deliberately NOT wired in production yet. - `PeerTrustSource` trait (airc peer_id → TrustLevel) — the abstraction the gate depends on, so it's not coupled to any concrete store; mock-tested. - `GridTrustAuthPolicy::with_trust_source(..)` + `resolve_trust`: a remote (Airc/Tcp) caller's registered trust CAPPED at Trusted (REMOTE_TRUST_CEILING — Owner is local-only, a remote peer can never reach Owner-gated commands); Blocked → denied; unknown → Provisional. `new()` keeps the flat ceiling. - Test `per_peer_trust_bridge_blocks_blocked_and_caps_remote_at_trusted` proves the logic with a mock source: Blocked denied everything, Trusted graduated but data/delete still local-only, Owner-registered peer capped at Trusted, unknown → Provisional. WHY NOT WIRED: adversarial self-review caught that the grid `NodeRegistry` is keyed by transport ADDRESS (`address_to_node_id` → Tailscale IP / Reticulum hash), NOT by the airc `peer_id` the `CallerIdentity` carries — different identity spaces. Wiring it would silently no-op (every airc caller → "unknown" → Provisional) AND mislead (grid/trust by address wouldn't gate airc callers). So `NodeRegistry` does NOT impl `PeerTrustSource`, and the IPC gate keeps `new()` (flat ceiling — honest, zero behavior change). The seam activates when a real peer_id-keyed airc trust source exists — the airc↔grid identity unification (task #38). Net: the gate's behavior is unchanged in production; the bridge is a tested seam ready for the airc-side trust source. 12 trust/acl tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
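The trust-bridge rules (remote trust capped at Trusted, Blocked stays Blocked, unknown peers fall to Provisional, and `new()` keeps the flat ceiling) can be sketched as follows. The names `PeerTrustSource`, `with_trust_source`, `resolve_trust`, and `REMOTE_TRUST_CEILING` come from the commit text; the method `trust_for`, the `MockTrustSource` type, and the simplified `Option<&str>` caller are assumptions for illustration.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TrustLevel { Blocked, Provisional, Trusted, Owner }

/// Owner is local-only: a remote peer can never resolve above this.
pub const REMOTE_TRUST_CEILING: TrustLevel = TrustLevel::Trusted;

/// Abstraction the gate depends on (airc peer_id -> registered trust).
pub trait PeerTrustSource {
    fn trust_for(&self, peer_id: &str) -> Option<TrustLevel>;
}

pub struct MockTrustSource(pub HashMap<String, TrustLevel>);

impl PeerTrustSource for MockTrustSource {
    fn trust_for(&self, peer_id: &str) -> Option<TrustLevel> {
        self.0.get(peer_id).copied()
    }
}

pub struct GridTrustAuthPolicy {
    source: Option<Box<dyn PeerTrustSource>>,
}

impl GridTrustAuthPolicy {
    /// No source wired: every remote caller gets the flat Provisional ceiling.
    pub fn new() -> Self {
        Self { source: None }
    }

    pub fn with_trust_source(source: Box<dyn PeerTrustSource>) -> Self {
        Self { source: Some(source) }
    }

    /// `None` = local caller (Owner). A remote peer's registered trust is
    /// capped at the ceiling; an unknown peer resolves to Provisional.
    pub fn resolve_trust(&self, remote_peer: Option<&str>) -> TrustLevel {
        let Some(peer_id) = remote_peer else {
            return TrustLevel::Owner;
        };
        self.source
            .as_ref()
            .and_then(|s| s.trust_for(peer_id))
            .map_or(TrustLevel::Provisional, |level| level.min(REMOTE_TRUST_CEILING))
    }
}
```

Because the gate depends only on the trait, a source keyed by the wrong identity space (the `NodeRegistry` case) simply must not implement it, which is what keeps the seam honest.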
…y unification (design) The keystone design (Joel: "unification is everything, do it right"): identity, authorization, and the grid economy unified into ONE cryptographically-signed object — airc's `grid_auth` SignedCapabilityGrant / SignedMeshMembership. A peer doesn't assert who/what it is — it PRESENTS a grant the owner signed, and the executing node VERIFIES it (issuer-pin → sig → key-binding → mesh → expiry, stateless) and authorizes iff `grant.grants(command)` (capabilities use the SAME vocabulary as command names). This DISSOLVES the two-identity-space problem (the grant binds peer_id + pubkey, verified against the owner's key — no shared trust store, no address↔peer mismatch) and IS the contracted/for-sale grid (a paid grant = capabilities + expiry, signed, revocable by epoch). Specifies: the model (membership→tier→ACL + capability→grants(command), one verifier); the airc primitives (have, public: grid_auth); the continuum gate integration (verify on dispatch, Owner stays local-only, composition propagates the verdict); why it's the identity unification; issuance + transport + consumer-side epoch anti-replay; 3-phase plan (membership-tier → capability grants → economy); the cross-repo split (airc: envelope transport + issuance; continuum: verifying gate + epoch store + capability map); open questions. The continuum gate seam (GridTrustAuthPolicy/resolve_trust/cap) stays valid as the gate shape — this is what it verifies against. Next: Phase 1 (verify SignedMeshMembership → tier), a joint continuum+airc slice. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…cted-grid gate core)
Phase 2 core of the identity unification (docs/grid/GRID-CAPABILITY-AUTH.md): the
continuum-side engine that verifies an airc `SignedCapabilityGrant` and authorizes
a command from it. Identity + authorization + contract = one signed object.
- `Ed25519GrantVerifier`: impl of airc's `grid_auth::GrantVerifier` using the
substrate's ed25519 (verify_strict — same primitive as L1-6 envelope sigs).
- `GrantAuthorizer::authorize_command(signed, presenting_pubkey, command, now)`:
verify via grid_auth (issuer-pin → sig → key-binding → mesh → expiry, stateless)
→ consumer-side epoch anti-replay (reject a superseded lower epoch; revocation =
higher-epoch empty-caps grant) → `grant.grants(command)`. Returns a TYPED
`GrantAuthOutcome` (Authorized / Invalid(GrantVerdict) / Superseded / NotGranted)
so the gate + audit see exactly why.
- The capability vocabulary IS the command vocabulary (`grants("ai/generate")`) —
no parallel namespace. Owner-gated commands are never delegated (a grant confers
only its named capabilities).
This is the verification CORE — the heart of the contracted/for-sale gate. It's a
tested SEAM, NOT yet wired to live dispatch: the airc command envelope doesn't carry
grants yet (the airc-side transport slice). When it does, CommandRequestHandler
extracts the grant + presenting key and calls authorize_command from the gate.
4 tests green: valid-grant-authorizes (+ NotGranted for others), typed rejections
(UntrustedIssuer/BadSignature/KeyMismatch — stolen grant can't ride another peer),
epoch anti-replay + revocation, and the REAL ed25519 signature verify + tamper
rejection.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ngine Adversarial review of be5254d (verdict: crypto/trust-roots/key-binding sound; not exploitable today — unwired). Must-fixes folded: - TOCTOU (2.1): epoch check + advance are now ONE atomic `entry()` critical section — a superseded epoch can't pass its check while a higher epoch commits in the gap. Multi-thread stress test (gated `stress-tests`) proves monotonicity under concurrent same-grantee presentation. - Revocation actually revokes (2.2): the watermark advances on ANY valid grant (latest-epoch-authoritative, airc's model), so a higher-epoch empty-caps grant supersedes the old real-caps grant. Fixed the test that had enshrined the broken behavior — it now asserts the revoked grant returns Superseded. - Boundary-aware capability match (4): `confers()` matches exact OR on a `/` boundary (`ai/generate` confers `ai/generate/stream`, NOT `ai/generatex`) — consistent with the command-ACL's prefix rules, never a bare starts_with. - Test integrity (7.1): verifier is injectable (`with_verifier`); tests now drive the REAL `authorize_command` with a stub (no duplicated logic). Added malformed-proof reject vectors (wrong-length key/sig). Hard gates documented before live wiring (2.3/5.1/3.1): persist + bound the epoch watermark (volatile/unbounded today), and the presenting key MUST come from the authenticated sender, never the grant body. 6 tests green (5 + stress). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
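Two of the must-fixes lend themselves to a short sketch: the boundary-aware capability match and the single-critical-section epoch check. The version below is an assumption-laden illustration: it uses a `Mutex<HashMap>` with the std `entry()` API rather than whatever concurrent map the engine uses, and `EpochWatermark`, `EpochCheck`, and `check_and_advance` are hypothetical names.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::sync::Mutex;

/// Exact match, or a match that ends on a `/` boundary.
/// Never a bare `starts_with`.
pub fn confers(capability: &str, command: &str) -> bool {
    if capability.is_empty() {
        return false;
    }
    match command.strip_prefix(capability) {
        Some(rest) => rest.is_empty() || rest.starts_with('/'),
        None => false,
    }
}

#[derive(Debug, PartialEq)]
pub enum EpochCheck {
    Fresh,
    Superseded,
}

#[derive(Default)]
pub struct EpochWatermark {
    latest: Mutex<HashMap<String, u64>>,
}

impl EpochWatermark {
    /// Check and advance under ONE lock, so a lower epoch cannot pass its
    /// check while a higher epoch commits in between.
    pub fn check_and_advance(&self, grantee: &str, epoch: u64) -> EpochCheck {
        let mut latest = self.latest.lock().unwrap();
        match latest.entry(grantee.to_string()) {
            Entry::Occupied(mut seen) => {
                if epoch < *seen.get() {
                    return EpochCheck::Superseded;
                }
                // Advance on ANY valid grant: latest epoch is authoritative.
                seen.insert(epoch);
                EpochCheck::Fresh
            }
            Entry::Vacant(slot) => {
                slot.insert(epoch);
                EpochCheck::Fresh
            }
        }
    }
}
```

Revocation falls out of the same rule: presenting a higher-epoch grant with no capabilities advances the watermark, after which the older real-caps grant returns `Superseded`.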
…transport Adopts airc#1276 (the capability-grant transport continuum's GrantAuthorizer verifies against): grid_auth::SignedCapabilityGrant::sign, Airc::peer_public_key, HEADER_AIRC_CAPABILITY_GRANT. Pulls the ~50-commit canary delta since 72824ba — the ai/generate 5090 compute-lease facility (#1242), the ai/embedding grid facility (#1239-1241), TranscriptKind::ChannelPurposePublished + channel_purpose (the typed room-purpose seam for RoomPurposeSource), relay self-election + stream-plane crypto, and StatusResponse.connected_lan_peers. ABI deltas are additive to types continuum only decodes (connected_lan_peers is #[serde(default)]; the new TranscriptKind variant has no exhaustive match against it), so the bump is decode-compatible. Validated: cargo check -p continuum-core --features metal,accelerate clean; routing::{grid_capability, grid_trust_policy, command_handler} tests green (5 + 5 + 13). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The gate-side half of the capability-grant wiring — the receiving seam a verified grant flows into, built + tested independently of the airc producer. - CallerIdentity gains `granted_capabilities`: the capability tags a transport boundary CRYPTOGRAPHICALLY VERIFIED for this dispatch (conferred by an owner-signed SignedCapabilityGrant, populated ONLY after GrantAuthorizer::authorize_command returns Authorized against the authenticated sender key). Default empty; `with_granted_capabilities` builder for the boundary. - GridTrustAuthPolicy::gate adds the contracted-grid fast-path: if a caller's verified granted_capabilities confer the command, it's authorized regardless of the tier ceiling — the explicit signed contract overrides the coarse default trust. Gated on trust > Blocked so a grant can't resurrect a Blocked peer. - grid_capability::confers is now pub(crate) — the gate re-checks granted caps through the SAME boundary-aware match rule (one source of truth, no divergent copy). Sound because the field is populated ONLY post-verification by a boundary; no local/Tcp constructor sets it. The airc command handler is the producer (next slice — needs Airc::own_public_key + owner-key provenance + epoch-watermark persistence). Until then the field stays empty and the gate is unchanged in behavior. Tests: grid_trust_policy verified_grant_overrides_tier_ceiling_for_conferred_command + verified_grant_does_not_resurrect_a_blocked_peer; auth_policy (9) + grid_capability (5) + grid_trust_policy (7) all green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
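The contracted-grid fast-path in the gate might be sketched like this. It is an illustration only: the real `CallerIdentity` and `GridTrustAuthPolicy::gate` carry more state, `required` is a hypothetical per-command minimum trust, and the local `confers` copy repeats the boundary-aware rule so the block stands alone.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TrustLevel { Blocked, Provisional, Trusted, Owner }

pub struct CallerIdentity {
    pub trust: TrustLevel,
    /// Populated ONLY by a boundary after a grant verified; empty by default.
    pub granted_capabilities: Vec<String>,
}

/// Boundary-aware match: exact, or ending on a `/` boundary.
fn confers(capability: &str, command: &str) -> bool {
    !capability.is_empty()
        && command
            .strip_prefix(capability)
            .map_or(false, |rest| rest.is_empty() || rest.starts_with('/'))
}

/// Tier check first; otherwise a verified grant may confer the command,
/// but never for a Blocked caller.
pub fn gate(caller: &CallerIdentity, command: &str, required: TrustLevel) -> bool {
    if caller.trust >= required {
        return true;
    }
    caller.trust > TrustLevel::Blocked
        && caller
            .granted_capabilities
            .iter()
            .any(|cap| confers(cap, command))
}
```

The soundness argument in the commit rests on provenance, not on this function: the field must only ever be filled in after verification against the authenticated sender key.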
Satisfies the review HARD GATE: the consumer-side anti-replay watermark was
in-memory + unbounded, so a node restart reopened the entire replay window (a
peer could re-present a grant the owner already superseded). The grid expects
mundane restarts, so this must be durable + bounded before grants gate live
traffic.
- New EpochWatermarkStore trait (routing/epoch_watermark.rs) behind which the
anti-replay state lives:
- InMemoryEpochWatermark — DashMap, atomic per-grantee (default for tests).
- SqliteEpochWatermark — durable (survives restart), bounded (evict_older_than
drops entries no live grant could reference, expiry-aligned by updated_at_ms).
Atomic check-and-advance runs in a serialized write transaction via
spawn_blocking, off the async executor (substrate concurrency style).
- GrantAuthorizer holds Arc<dyn EpochWatermarkStore>; authorize_command is now
async and consults the store. new() keeps in-memory; with_watermark() /
with_verifier_and_watermark() inject the durable store for the live path.
A store error fails CLOSED → GrantAuthOutcome::WatermarkUnavailable (deny),
never authorizes a grant whose replay status is unknown.
- VerifyContext (holding a non-Sync &dyn GrantVerifier) is scoped to drop before
the await so authorize_command's future is Send — required for the
multi-threaded handler runtime.
Tests: epoch_watermark anti-replay on BOTH impls + durability-across-reopen +
bounded eviction (4); grid_capability decision path migrated to async (5);
both stress concurrency proofs through the REAL SQLite path. Full routing suite
260 green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
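The fail-closed behavior behind the store trait can be sketched synchronously. This is a simplification under stated assumptions: the real `authorize_command` is async and verifies the grant signature first, the SQLite store is not shown, and `UnavailableStore`, the `Result<bool, String>` return shape, and the reduced `GrantAuthOutcome` are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

pub trait EpochWatermarkStore: Send + Sync {
    /// Ok(true): epoch accepted and recorded. Ok(false): superseded.
    /// Err: the store could not answer.
    fn check_and_advance(&self, grantee: &str, epoch: u64) -> Result<bool, String>;
}

#[derive(Default)]
pub struct InMemoryEpochWatermark {
    latest: Mutex<HashMap<String, u64>>,
}

impl EpochWatermarkStore for InMemoryEpochWatermark {
    fn check_and_advance(&self, grantee: &str, epoch: u64) -> Result<bool, String> {
        let mut latest = self.latest.lock().map_err(|e| e.to_string())?;
        let seen = latest.entry(grantee.to_string()).or_insert(epoch);
        if epoch < *seen {
            return Ok(false);
        }
        *seen = epoch;
        Ok(true)
    }
}

/// Simulates a durable store whose backing file is unreachable.
pub struct UnavailableStore;

impl EpochWatermarkStore for UnavailableStore {
    fn check_and_advance(&self, _grantee: &str, _epoch: u64) -> Result<bool, String> {
        Err("watermark store offline".to_string())
    }
}

#[derive(Debug, PartialEq)]
pub enum GrantAuthOutcome {
    Authorized,
    Superseded,
    WatermarkUnavailable,
}

pub struct GrantAuthorizer {
    store: Arc<dyn EpochWatermarkStore>,
}

impl GrantAuthorizer {
    pub fn new() -> Self {
        Self { store: Arc::new(InMemoryEpochWatermark::default()) }
    }

    pub fn with_watermark(store: Arc<dyn EpochWatermarkStore>) -> Self {
        Self { store }
    }

    /// A store error DENIES: unknown replay status never authorizes.
    pub fn authorize(&self, grantee: &str, epoch: u64) -> GrantAuthOutcome {
        match self.store.check_and_advance(grantee, epoch) {
            Ok(true) => GrantAuthOutcome::Authorized,
            Ok(false) => GrantAuthOutcome::Superseded,
            Err(_) => GrantAuthOutcome::WatermarkUnavailable,
        }
    }
}
```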
The producer half of the capability-grant wiring: built + tested against the gate
seam, not yet installed on the live boot path (that needs the GrantAuthorizer
constructed from airc identity + mesh + a durable watermark, next slice).
- CommandRequestHandler gains an optional GrantAuthorizer (with_grant_authorizer);
new() keeps the tier-only default (grants ignored).
- parse_envelope decodes the optional base64 HEADER_AIRC_CAPABILITY_GRANT into a
typed SignedCapabilityGrant (ParsedEnvelope.presented_grant). A
present-but-undecodable header is surfaced loudly, never silently dropped.
- process_request verifies a presented grant via the authorizer against the
AUTHENTICATED sender key (airc.peer_public_key(sender): the enrolled key from the
same registry that signature-verified the envelope, NOT the grant's self-asserted
grantee_pubkey; this is the review's hard gate #3). On Authorized, the grant's
conferred capabilities ride into the gate via
CallerIdentity::with_granted_capabilities; otherwise the caller falls back to tier
gating.
- Dispatch refactored into dispatch_request(executor, parsed, caller); the static
process_request_via keeps its exact prior behavior (plain authenticated caller, no
grant) for tests + the LocalGridTransport fixture.
Tests: parse_envelope decodes a presented grant + rejects a malformed grant header;
the no-grant default stays None. command_handler suite 15 green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…sona path
Closes the loop: every persona that boots now VERIFIES presented capability grants. A visiting peer presenting an owner-signed grant gets the conferred command past its tier ceiling; absent/invalid grants fall back to tier gating.
- airc bump a7ae4f4 → 4aa717d (airc#1277): Airc::mesh_identity is now pub.
- build_grant_authorizer(airc, home): constructs the per-persona GrantAuthorizer. "This node is the owner": trusted issuer = the node's OWN enrolled ed25519 key (self-enrolled at Airc::open; it signs the grants it hands out); expected mesh = the node's own mesh (airc.mesh_identity()); anti-replay = a DURABLE SqliteEpochWatermark under <persona-home>/grant_watermark.sqlite (survives restart, the review hard gate). Typed GrantAuthorizerBuildError; provider≠owner (pinned-issuer-key distribution) is the deferred generalization.
- PersonaCommandInboundPump::spawn takes the authorizer and builds the handler via with_grant_authorizer. Both PersonaAircRuntime install sites (bootstrap + install_command_pump) build it first; a build failure is a typed bootstrap failure (PersonaAircRuntimeError::GrantAuthorizerBuild), never a silent fall-through to an unverified path.
Validated: cargo check (metal,accelerate) clean; the production-shape persona_command_inbound_pump integration test passes (persona answers a tier-gated command through the installed pump + authorizer); routing lib suite 262 green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The grantee side of the contracted grid: the SEND half. A node holds grants an owner issued it and presents them so the owner can authorize otherwise-tier-denied commands.
- PresentedGrantStore trait + InMemoryPresentedGrantStore (routing/presented_grant_store.rs): base64 grants keyed by TARGET peer (the owner that will verify). Latest-wins on insert so a re-issued / higher-epoch grant supersedes; sync lookup for the outbound hot path.
- AircTransport gains an optional grant store (with_grant_store); on a peer-targeted dispatch it stamps the held grant onto HEADER_AIRC_CAPABILITY_GRANT. Room / wildcard targets have no single verifier, so nothing is stamped. None = present nothing (unchanged tier-gated behavior).
Pairs with the receive path (handler verifies) + issuance (airc#1278 sign_grant + the grid/grant/issue command, next).
Tests: store holds/presents/supersedes per target; airc_transport suite 16 green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
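A rough sketch of the store and stamping behavior described above, using only the standard library. The type names `PresentedGrantStore` and `InMemoryPresentedGrantStore` come from the commit message, but the method names (`insert`, `grant_for`), the `Target` enum, and `grant_header_for` are assumptions invented for this example; the real trait may look different.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Assumed trait shape: base64 grants keyed by the TARGET peer that will verify.
pub trait PresentedGrantStore: Send + Sync {
    fn insert(&self, target_peer: &str, grant_b64: String);
    fn grant_for(&self, target_peer: &str) -> Option<String>;
}

#[derive(Default)]
pub struct InMemoryPresentedGrantStore {
    grants: RwLock<HashMap<String, String>>,
}

impl PresentedGrantStore for InMemoryPresentedGrantStore {
    /// Latest wins: a re-issued grant for the same target replaces the old one.
    fn insert(&self, target_peer: &str, grant_b64: String) {
        self.grants
            .write()
            .expect("grant store lock poisoned")
            .insert(target_peer.to_string(), grant_b64);
    }

    /// Synchronous lookup, suitable for an outbound hot path.
    fn grant_for(&self, target_peer: &str) -> Option<String> {
        self.grants
            .read()
            .expect("grant store lock poisoned")
            .get(target_peer)
            .cloned()
    }
}

/// Hypothetical dispatch target, reduced to what the stamping rule needs.
pub enum Target<'a> {
    Peer(&'a str),
    Room,
    Wildcard,
}

/// Value to stamp into the grant header, if any. Rooms and wildcards have no
/// single verifier, and a transport without a store presents nothing.
pub fn grant_header_for(store: Option<&dyn PresentedGrantStore>, target: &Target) -> Option<String> {
    match (store, target) {
        (Some(store), Target::Peer(peer)) => store.grant_for(peer),
        _ => None,
    }
}
```

Keeping the lookup synchronous and keyed by target means the transport can decide what to stamp without awaiting anything during dispatch.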
…erify → run
The proof the grid is ready for compute. Two REAL airc peers over the loopback fixture, the production install/gate/send paths, no mocks:
- Owner exposes a tier-DENIED command (compute/echo) behind GridTrustAuthPolicy + the inbound pump with build_grant_authorizer.
- WITHOUT a grant: the remote peer is DENIED (the gate holds).
- Owner ISSUES a grant for the grantee conferring exactly compute/echo (Airc::sign_grant, airc#1278).
- Grantee PRESENTS it (InMemoryPresentedGrantStore + AircTransport stamps HEADER_AIRC_CAPABILITY_GRANT).
- Owner VERIFIES (handler → GrantAuthorizer, against the authenticated sender key + durable watermark) and RUNS the command, echoing the params back through the full chain.
Both halves asserted: no grant → denied (no auth hole), valid grant → runs (the grid can sell compute).
Also bumps airc 4aa717d → 55790e1 (airc#1278 sign_grant).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The reusable issuance primitive, symmetric to build_grant_authorizer (verify): issue_grant(airc, issued_at_ms, params) composes a CapabilityGrant + Airc::sign_grant + base64, returning the blob a grantee presents. Binds the grantee's AUTHENTICATED key (from the owner's enrolment), the owner's mesh, and the owner's signature, all from the one airc handle so issuer / mesh / grantee-key can't drift from what the verifier checks. Typed IssueGrantError; fail-closed (never returns a partial grant).
Any surface holding an owner airc handle (a persona runtime, a future grid/grant/issue command) wraps this: the primitive is identity-agnostic; "which identity issues" stays the caller's decision.
Dogfooded: the end-to-end contracted-grid test now mints its grant via issue_grant (replacing the inline construction) and still proves issue → present → verify → run.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…y grants
The operator front door over routing::grant_issuance::issue_grant — closes the
contracted-grid loop with a command instead of hand-written Rust.
- GrantIssuanceModule (modules/grant_issuance.rs): holds the
PersonaAircRuntimeRegistry (shared with the instance manager); handle_command
decodes { issuerPersonaId, grantee, capabilities, expiresAtMs?, epoch? },
resolves the issuing persona's live airc handle, and returns the base64 grant
blob to deliver. A non-running issuer is a hard error (it owns the signing key —
never fabricated).
- Registered at the live boot site (ipc/mod.rs) alongside the instance manager.
- OWNER-ONLY: grid/grant/issue is outside the cross-grid ACL allow-list, so it
falls to the ""=Owner wildcard — only the local operator can sell its personas'
compute; a remote peer can never mint grants. Pinned with an acl regression.
Each persona is its own owner selling ITS compute, so the issuer is a persona's
airc identity (issuerPersonaId names which). Tests: malformed-request +
issuer-not-running error paths (2); acl owner-only pin; the happy path is proven
end-to-end in tests/capability_grant_e2e.rs.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…self
Persistent identity worked for Asha (her seed.json persona_id matches her live peer 90e758b2) but two fragilities could re-mint a stranger and orphan her engrams, which is exactly how personas-archive/ filled with 9 strangers:
1. The seed was written ONLY on FreshlyMinted, best-effort + non-fatal. A single failed mint-write, or a later deleted/corrupt seed, left her home (engrams + airc key) on disk but unresumable → next boot minted a stranger.
2. Agreement between the bootstrap WRITE path (citizen_home_path(..).parent()/seed.json) and the resumer READ path (citizens_kind_dir) was untested; that exact divergence (resumer hard-coding `personas/` vs `citizens/personas/`) is what created the strangers originally.
Fixes:
- seed::ensure_seed(seed_path, persona_id, agent_name, fallback_created_at_ms): idempotent upsert that runs on EVERY bootstrap (mint AND resume). Self-heals a missing/corrupt seed from the live identity, and PRESERVES created_at_ms from an existing seed (her birth time is stable; a naive rewrite would reset her age every boot). persona_instance_manager now always calls it (drops the FreshlyMinted gate) + the stale `personas/` path comment is corrected to `citizens/personas/`.
- citizen_path: regression test pinning seed-write-path == resumer-scan-path for a Persona (the stranger-minting bug can never silently return).
Tests: ensure_seed creates-missing / preserves-birth-time-on-resume / heals-corrupt (3); the path-agreement pin; seed 8, citizen_path 6, instance_manager 5, resume_or_mint 6 all green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
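The upsert decision inside ensure_seed can be sketched as a pure function, leaving file I/O and JSON parsing aside. This is an illustration under assumptions: the `Seed` struct fields and the `reconcile_seed` helper are hypothetical, and `existing = None` is used here to stand for both a missing and a corrupt seed file.

```rust
/// Assumed shape of the persisted seed (field names are illustrative).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Seed {
    pub persona_id: String,
    pub agent_name: String,
    pub created_at_ms: u64,
}

/// Decide what to write on every bootstrap.
/// `existing` is the seed read from disk, or None if it was missing or
/// failed to parse. The live identity always wins for id and name; the
/// birth time is kept from the existing seed whenever one was readable.
pub fn reconcile_seed(
    existing: Option<&Seed>,
    persona_id: &str,
    agent_name: &str,
    fallback_created_at_ms: u64,
) -> Seed {
    let created_at_ms = match existing {
        // Resume: preserve the original birth time.
        Some(seed) => seed.created_at_ms,
        // Mint, or heal a missing/corrupt seed: use the fallback timestamp.
        None => fallback_created_at_ms,
    };
    Seed {
        persona_id: persona_id.to_string(),
        agent_name: agent_name.to_string(),
        created_at_ms,
    }
}
```

Because the function is idempotent (feeding its own output back in yields the same seed), it is safe to run on every boot rather than only on first mint.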
…apter boundary
Asha leaked her entire <think>… chain-of-thought into the room (and looped inside it without ever emitting an answer). Root cause: a reasoning model's reasoning was never separated from its user-facing text. unsloth's /v1 (llama.cpp backend) emits <think>…</think> INLINE in `content` (verified live: no `reasoning_content` field), the OpenAI adapter passed it straight into `text`, and the cognition cleaner only stripped `<thinking>` (with -ing), which never matched `<think>`.
Fix at the adapter boundary (where the model's output contract belongs):
- TextGenerationResponse gains `reasoning: Option<String>`: reasoning is captured (for the glass-box harness + memory) and stripped from `text`, so it can NEVER reach the room. Uniform across adapters; ts-rs binding regenerated.
- openai_adapter::extract_reasoning(content, reasoning_content), in order of precedence: (1) a server `reasoning_content` field (vLLM-style) wins; (2) inline <think>…</think> is split out, answer = text outside the block; (3) an UNCLOSED <think> (the runaway loop) yields EMPTY text so the caller refuses to post, never leaking raw reasoning. Wired into the response parse; OpenAIMessage now reads reasoning_content too.
- Other adapters set reasoning: None (anthropic: extended-thinking is a follow-up, doesn't leak; llamacpp: TODO to reuse extract_reasoning if it serves a reasoning model locally, which is not Asha's path).
Tests: extract_reasoning over well-formed / unclosed-runaway / server-field / plain+empty-think (4); adapter + response_validator suites green (9 + 7); all TextGenerationResponse constructors updated.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
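The three-step precedence for extract_reasoning could look roughly like the sketch below. It follows the order stated in the commit message, but the return type (`Extracted`), the trimming behavior, and the handling of an empty `<think></think>` block as "no reasoning" are assumptions for illustration, not the adapter's actual code.

```rust
/// Hypothetical result type: the user-facing text plus any captured reasoning.
#[derive(Debug, PartialEq, Eq)]
pub struct Extracted {
    pub text: String,
    pub reasoning: Option<String>,
}

pub fn extract_reasoning(content: &str, reasoning_content: Option<&str>) -> Extracted {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";

    // (1) A non-empty server-provided reasoning field wins outright.
    if let Some(server) = reasoning_content.map(str::trim).filter(|r| !r.is_empty()) {
        return Extracted {
            text: content.trim().to_string(),
            reasoning: Some(server.to_string()),
        };
    }

    // No inline block at all: plain answer, no reasoning.
    let Some(open_at) = content.find(OPEN) else {
        return Extracted { text: content.trim().to_string(), reasoning: None };
    };

    let after_open = &content[open_at + OPEN.len()..];
    match after_open.find(CLOSE) {
        // (2) Well-formed block: reasoning is inside, the answer is outside.
        Some(close_at) => {
            let reasoning = after_open[..close_at].trim();
            let before = &content[..open_at];
            let after = &after_open[close_at + CLOSE.len()..];
            Extracted {
                text: format!("{before}{after}").trim().to_string(),
                reasoning: (!reasoning.is_empty()).then(|| reasoning.to_string()),
            }
        }
        // (3) Unclosed block (runaway loop): emit EMPTY text so the caller
        // can refuse to post, and keep the raw reasoning only for the harness.
        None => Extracted {
            text: String::new(),
            reasoning: Some(after_open.trim().to_string()),
        },
    }
}
```

Returning empty text in the unclosed case, instead of a best-effort guess at the answer, is what keeps raw reasoning from ever reaching the room.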
…hink switch
The reasoning-strip (9d3f895) cleans the OUTPUT; this addresses the INPUT side: the model thinks on EVERY turn (even "say hello"), burning latency and feeding the runaway loop. Verified live which mechanism works: the gateway IGNORES chat_template_kwargs.enable_thinking for this forged model, but Qwen3's `/no_think` SOFT-SWITCH appended to the user turn works (empty <think></think> + direct answer).
- ThinkingMode { Default, Suppress } on the OpenAI adapter config; format_messages appends `/no_think` to the last user message when Suppress (apply_no_think_switch: model-specific token owned at the adapter boundary; higher layers stay model-agnostic).
- The local unsloth/GGUF reasoning gateway defaults to Suppress (this 4B's thinking rambles + loops, and it answers correctly without it). Operator override `UNSLOTH_THINKING=on` re-enables it; the reasoning-strip still protects the room. Cloud providers keep their default. Gateway-level for now; per-task/per-request thinking is the follow-up (a recipe that needs deliberation re-enables it).
Tests: apply_no_think_switch targets the last user turn + no-ops without one; adapter suite 11 green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
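A small sketch of the soft-switch behavior the tests describe (targets the last user turn, no-ops without one). `ThinkingMode` and the function name come from the commit message; the `ChatMessage` struct, the function signature, and the guard against appending the token twice are assumptions added for the example.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ThinkingMode {
    Default,
    Suppress,
}

/// Hypothetical minimal chat message (role + content strings).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ChatMessage {
    pub role: String,
    pub content: String,
}

/// Append the Qwen3 `/no_think` soft-switch to the LAST user message when
/// thinking is suppressed. Does nothing in Default mode or when the
/// conversation has no user turn.
pub fn apply_no_think_switch(messages: &mut [ChatMessage], mode: ThinkingMode) {
    const SWITCH: &str = "/no_think";
    if mode != ThinkingMode::Suppress {
        return;
    }
    // Search from the end so only the most recent user turn is touched.
    if let Some(last_user) = messages.iter_mut().rev().find(|m| m.role == "user") {
        // Assumed guard: avoid stacking the token if it is already present.
        if !last_user.content.trim_end().ends_with(SWITCH) {
            last_user.content.push(' ');
            last_user.content.push_str(SWITCH);
        }
    }
}
```

Keeping the token inside the adapter means callers only choose a `ThinkingMode`; they never need to know which model-specific string implements it.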
The full stretch since canary (32 commits, each individually validated). Three coherent bodies of work, all on headless Rust, all proven live where applicable.
1. Self-routing command infrastructure
`DynCommand` object + `ActionCommand` base trait → stateless self-registration, one dynamic registry (no switch/list duplication), `commands/list`. Procedural param adaptation + schema exposure so every interface (`cu` CLI, persona tools, SDKs) adapts from one source. Identity flows into `Ctx` (listed == callable, per identity). Pure-Rust `cu` start/stop + client; legacy Node start orchestrator quarantined. Docs: COMMAND-ORGANIZATION, BUILD-AND-PACKAGING.
2. The contracted grid: capability-grant auth (READY FOR COMPUTE)
End-to-end signed-grant authorization so the grid can sell compute. Proven E2E with two real airc peers (issue → present → verify → run; tier-deny holds without a grant):
- airc: `HEADER_AIRC_CAPABILITY_GRANT`, `SignedCapabilityGrant::sign`, `Airc::peer_public_key`; #1277 `mesh_identity` pub; #1278 `Airc::sign_grant`. continuum airc pin → 55790e1.
- `GrantAuthorizer` (ed25519, durable `SqliteEpochWatermark` anti-replay, authenticated presenting key) + gate fast-path + handler, installed on the live persona path. All 3 review hard gates satisfied.
- `issue_grant` primitive, `PresentedGrantStore` + `AircTransport` stamping, `grid/grant/issue` (Owner-only) operator command.
3. Persona persistence + cognition (Asha, proven live)
- `seed::ensure_seed` self-heals the seed on every bootstrap + preserves birth time; regression test pins write-path == resumer-scan-path. Live: Asha resumes as herself (resumed_count=1, same id `90e758b2`, 12 engrams intact) across a restart.
- `TextGenerationResponse.reasoning` + `extract_reasoning` strip `<think>` at the adapter boundary (server `reasoning_content` → inline split → unclosed-runaway → empty text). Fixed the leak where the persona dumped its whole chain-of-thought; reasoning captured for the harness, room sees clean text.
- `ThinkingMode` + Qwen3 `/no_think` soft-switch; the local unsloth reasoning gateway defaults to Suppress (env override `UNSLOTH_THINKING=on`). Live: Asha answers clean + correct ("144", "Blue.", "4:30pm").
Validation
Workspace `cargo check` clean; touched-module sweep green (ai::openai_adapter 11, routing::grid_capability 5, epoch_watermark 4, persona::seed 8, citizen_path 6, grant_issuance 2, grid_trust_policy 7, command_handler 15, …); E2E + persona integration tests green; all three live behaviors proven on the rebuilt core.
🤖 Generated with Claude Code