Define the receiver-side, end-state UX contract for Codex integration with
agentmemory.
This document is about surface shape and user experience, not sender-side
payload details. The main native Codex sender contract belongs in the Codex
repo. The narrow ingest companion remains
docs/codex_payload_quality_spec.md.
For backend performance and quality hardening of the direct TUI path, see
docs/codex_tui_hardening_spec.md.
The repository already documents two useful but incomplete views:
- generic Codex CLI setup as an MCP client in `README.md`
- narrow native-payload ingest guarantees in `docs/codex_payload_quality_spec.md`
That still leaves one important gap: the active Codex integration has two different live surfaces, but the repo does not say so clearly enough.
Without that split, three UX mistakes become likely:
- MCP-only setup gets described as equivalent to native lifecycle capture.
- The always-on runtime lane gets treated like a grab bag of optional tools.
- Future additions to the explicit command surface risk being mistaken for baseline runtime dependencies.
Document Codex integration as three distinct levels:
- Generic Codex CLI
  - MCP-only setup via `.codex/config.yaml`
  - no native lifecycle capture implied
- Codex-native runtime lane
  - always-on REST-backed lifecycle and retrieval path
  - small, stable, latency-sensitive
- Codex explicit memory lane
  - broader human-invoked memory, planning, and review surface
  - useful, but not required for baseline automatic capture and resume
This document describes the receiver-side backend contract.
- the runtime-critical native lane is REST-backed in `agentmemory`
- the explicit memory lane may be presented inside Codex as tools, slash commands, prompts, or other adapter-owned UX
- docs in this repo should not imply that every Codex-facing command has a one-to-one MCP tool defined here
- generic MCP-only Codex setup remains a separate, thinner integration level
This is the receiver-side always-on lane that should remain small and stable.
- `POST /agentmemory/session/start`
- `GET /agentmemory/handoffs`
Expected UX:
- starting or resuming a session should return immediate context
- resume should be able to review the latest durable handoff packet without requiring a human recap
- `POST /agentmemory/observe`
- `POST /agentmemory/context/refresh`
- `POST /agentmemory/context`
- `POST /agentmemory/enrich`
Expected UX:
- prompt submit should prefer query-aware `context/refresh` when the adapter has retrieval intent
- `context` remains the fallback path and also the explicit recall path
- `observe` is the canonical capture sink for native lifecycle events
- `enrich` remains a supporting retrieval surface for file-touching/tool-time UX
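
The preference between `context/refresh` and `context` can be sketched as a small adapter-side helper. This is a minimal sketch, not the adapter's actual code: the `PromptSubmit` shape, its field names, and the function name are hypothetical, and "retrieval intent" is assumed to mean a non-empty query string.

```typescript
// Hypothetical adapter-side helper; the request shape is an assumption.
type RetrievalEndpoint =
  | "/agentmemory/context/refresh"
  | "/agentmemory/context";

interface PromptSubmit {
  // Query text the adapter derived from the prompt, if any (assumed field).
  query?: string;
  // True when the user explicitly asked to recall memory (assumed field).
  explicitRecall?: boolean;
}

function chooseRetrievalEndpoint(submit: PromptSubmit): RetrievalEndpoint {
  // Explicit recall goes through the plain context path.
  if (submit.explicitRecall) return "/agentmemory/context";
  // A non-empty query is treated as retrieval intent and prefers refresh.
  const hasIntent =
    typeof submit.query === "string" && submit.query.trim().length > 0;
  // Without intent, context is the fallback.
  return hasIntent ? "/agentmemory/context/refresh" : "/agentmemory/context";
}
```

Keeping the choice in one pure function makes the branch easy to test and easy to delete if a unified retrieval endpoint later replaces it.
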
- `POST /agentmemory/summarize`
- `POST /agentmemory/session/end`
- `POST /agentmemory/crystals/auto`
- `POST /agentmemory/consolidate-pipeline`
Expected UX:
- shutdown should distill useful state without requiring a human-written recap
- maintenance work should be best-effort and bounded, not a fragile hard block on session close
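
One way to keep shutdown work best-effort and bounded is to race each step against a time budget and record failures instead of throwing. The sketch below assumes a caller-supplied list of steps. The `MaintenanceStep` shape, the function names, and the per-step budget are illustrative, not part of the current adapter or backend.

```typescript
// Illustrative only: step names and the budget are chosen by the caller.
type StepResult = { name: string; status: "ok" | "failed" | "timed_out" };

interface MaintenanceStep {
  name: string;
  run: () => Promise<unknown>;
}

async function runBounded(
  step: MaintenanceStep,
  budgetMs: number,
): Promise<StepResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  // Resolves (never rejects) once the budget is spent.
  const timeout = new Promise<"timed_out">((resolve) => {
    timer = setTimeout(() => resolve("timed_out"), budgetMs);
  });
  try {
    const work = step.run().then(() => "ok" as const);
    const outcome = await Promise.race([work, timeout]);
    return { name: step.name, status: outcome };
  } catch {
    // A failing step is recorded, not rethrown, so close is never blocked.
    return { name: step.name, status: "failed" };
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}

async function closeSessionBestEffort(
  steps: MaintenanceStep[],
  budgetMs: number,
): Promise<StepResult[]> {
  const results: StepResult[] = [];
  for (const step of steps) {
    results.push(await runBounded(step, budgetMs));
  }
  return results;
}
```

A timed-out step is abandoned rather than cancelled here. A real implementation would probably also pass an abort signal so the underlying request stops.
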
This is the broader Codex command/tool/slash surface. It should stay available, but it must not be treated as a prerequisite for baseline automatic capture.
- `POST /agentmemory/remember`
- `POST /agentmemory/consolidate`
- `GET /agentmemory/lessons`
- `POST /agentmemory/lessons/search`
- `GET /agentmemory/crystals`
- `POST /agentmemory/crystals/create`
- `POST /agentmemory/reflect`
- `GET /agentmemory/insights`
- `POST /agentmemory/insights/search`

- `GET /agentmemory/actions`
- `POST /agentmemory/actions`
- `POST /agentmemory/actions/update`
- `GET /agentmemory/frontier`
- `GET /agentmemory/next`
- `GET /agentmemory/missions`
- `GET /agentmemory/missions/:id`
- `GET /agentmemory/handoffs`
- `GET /agentmemory/handoffs/:id`
- `POST /agentmemory/handoffs/generate`
- `GET /agentmemory/branch-overlays`

- `GET /agentmemory/guardrails`
- `POST /agentmemory/guardrails/search`
- `GET /agentmemory/decisions`
- `POST /agentmemory/decisions/search`
- `GET /agentmemory/dossiers`
- `GET /agentmemory/dossiers/get`
- `GET /agentmemory/routine-candidates`
POST /agentmemory/forget exists in the adapter/backend surface, but it should
not be described as part of the active Codex native lane unless the live Codex
path actually routes delete semantics through that endpoint.
- Do not describe MCP-only Codex setup as equivalent to native lifecycle capture.
- Do not let the explicit memory lane become an implicit dependency of the always-on runtime lane.
- Keep the runtime lane centered on capture, query-aware recall, resume, and session-end distillation.
- Prefer the smallest stable runtime contract over exposing every backend primitive as "required for Codex."
- When docs mention mission or handoff detail routes, prefer the real REST shape (`:id`) instead of inventing placeholder names that differ from the current API.
- Keep sender-side payload evolution and receiver-side ingest compatibility as separate documents and responsibilities.
This section scopes a host-local diet for the current environment where the native Codex adapter is the only real client. It is not a public product position and should not be applied to upstream packaging without an explicit distribution decision.
Diagnostics from 2026-04-29 show that the installed runtime's size does not come from MCP storing a large database. MCP, plugin, and Claude integration code mostly add registered functions, handlers, docs, and package surface.
The larger storage/RSS contributors are active StateKV scopes and loaded indexes:
- active StateKV data is about 785 MB across about 18k files
- observation/retrieval indexes account for about 174 MB by manifest size
- turn capsules plus working sets account for about 124 MB
- Codex project entries inside turn capsules plus working sets are only about 1.7 MB of that 124 MB
- compaction dry-run reported 0 removable index bytes, so the immediate issue is active retained data, not orphaned shards
Implication: cutting MCP will simplify the process surface and reduce startup registration/attack area, but it will not by itself halve the database. Halving the database requires retention and project-scope policy, especially for old or non-Codex projects.
Keep these backend surfaces until Codex has migrated to any replacement contract:
- `GET /agentmemory/health`
- `POST /agentmemory/session/start`
- `POST /agentmemory/session/closeout`
- `POST /agentmemory/session/end` until closeout fully replaces direct end calls
- `POST /agentmemory/observe`
- `POST /agentmemory/context`
- `POST /agentmemory/context/refresh` until unified retrieval replaces the caller branch
- `POST /agentmemory/enrich` until file-enrich is folded into unified retrieval
- `POST /agentmemory/smart-search`
- `POST /agentmemory/summarize` until closeout fully owns summarization
- `POST /agentmemory/crystals/auto` until closeout fully owns crystallization
- `POST /agentmemory/consolidate-pipeline` until closeout fully owns bounded distillation
- `GET /agentmemory/handoffs` and `GET /agentmemory/handoffs/:id` until bootstrap returns latest handoff inline
- `POST /agentmemory/handoffs/generate`
- `GET /agentmemory/actions`, `POST /agentmemory/actions`, `POST /agentmemory/actions/update`, `GET /agentmemory/frontier`, and `GET /agentmemory/next` if Codex continues to expose explicit work-item memory tools
- guardrail, decision, dossier, lesson, insight, crystal, and branch-overlay reads that are consumed by explicit Codex memory commands
- operational proof/repair endpoints: `/agentmemory/codex-integration/proof`, `/agentmemory/retrieval-proof`, `/agentmemory/retrieval-index/verify`, `/agentmemory/index-persistence/compact`, `/agentmemory/active-scopes/diagnostics`, `/agentmemory/retrieval-blocks/diagnostics`, `/agentmemory/retrieval-blocks/retry`, and `/agentmemory/compress-retry`
For this host, prefer deletion and direct pruning over a compatibility profile. The operating assumption is that native Codex is the only real client, so extra runtime branches are more complexity than value.
Cut directly:
- Remove MCP endpoint/resource/prompt registration from the main worker.
- Remove the MCP tool/resource/prompt registry if no standalone package remains. If a standalone `agentmemory mcp` command is kept, it must be isolated from the live worker startup path and from the native Codex proof.
- Remove Claude bridge runtime registration, config loading, log messages, and its StateKV write path.
- Remove the shipped Claude plugin package, hook scripts, hook build outputs, plugin skills, and package entries that publish them.
- Remove multi-client setup/docs from the host-local operator path.
- Delete tests whose only purpose is proving removed client surfaces, and keep only contract tests for native Codex and operator diagnostics.
Known registration/file touch points:
- worker registration: `src/index.ts` registers Claude bridge when enabled, team memory, governance, orchestration families, API triggers, event triggers, and MCP endpoints before startup reports `143 REST + 44 MCP tools + 6 MCP resources + 3 MCP prompts`
- API/MCP surface: `src/triggers/api.ts`, `src/mcp/server.ts`, `src/mcp/tools-registry.ts`, and `src/mcp/standalone.ts`
- Claude/plugin surface: `src/functions/claude-bridge.ts`, `src/hooks/*`, `plugin/hooks.json`, `plugin/scripts/*`, `plugin/skills/*`, `plugin/.claude-plugin/plugin.json`, and `package.json`
- docs/tests to narrow: `README.md`, MCP standalone tests, plugin tests, and any count assertions tied to removed tools/endpoints
Expected impact:
- lower function/trigger registration count
- smaller active API surface
- less confusion around Codex-native versus MCP-only behavior
- modest process memory reduction
- little direct database reduction
Guardrail:
- the native Codex proof must still pass after MCP registration is disabled
- `npm test` should pass after removed-surface tests are deleted or narrowed
- package/export cleanup should happen in the same cut so dead files are not left behind
- the startup endpoint/tool count log must be updated in the same commit as registration changes
- no compatibility stub should remain for deleted host-local surfaces unless an external consumer is found by live config/log evidence
The following primitives are not required for the current native Codex hot path unless the Codex explicit memory lane is actively using them:
- team memory
- mesh sync
- signals
- checkpoints
- sentinels
- sketches
- routines and routine compiler
- snapshots
- Obsidian export
- Claude bridge
- generic MCP governance wrappers
- generic import/export endpoints, except for operator backup/restore
Treat these as feature-family lanes, not one giant edit. The safer grouping is:
- client adapters: MCP, Claude bridge, plugin, hooks
- collaboration/runtime coordination: team, mesh, signals, checkpoints, sentinels, leases
- planning/editorial extras: routines, routine compiler, sketches, snapshots, Obsidian export
- operator backup exceptions: import/export only if still used for archive or rollback of destructive retention runs
Implementation shape:
- Remove registration, endpoint wrappers, docs, tests, and package entries in one lane per feature family.
- Keep StateKV schemas readable for one cleanup release only if old data needs migration.
- Delete disabled endpoints instead of returning compatibility stubs.
- Add one native Codex contract test that proves the reduced worker still registers every endpoint Codex needs.
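
The contract test in the last item could take roughly the following shape. This is a sketch under stated assumptions: routes are modelled as `"METHOD path"` strings, the required list is an illustrative subset of the endpoints named in this document, and `missingNativeEndpoints` is a hypothetical helper. The real worker registry in `src/index.ts` may expose registrations differently.

```typescript
// Assumption: route keys are "METHOD path" strings; the real registry shape
// may differ. The list is an illustrative subset, not the full contract.
const NATIVE_CODEX_REQUIRED: readonly string[] = [
  "GET /agentmemory/health",
  "POST /agentmemory/session/start",
  "POST /agentmemory/session/end",
  "POST /agentmemory/observe",
  "POST /agentmemory/context",
  "POST /agentmemory/context/refresh",
  "POST /agentmemory/enrich",
  "POST /agentmemory/summarize",
  "POST /agentmemory/crystals/auto",
  "POST /agentmemory/consolidate-pipeline",
  "GET /agentmemory/handoffs",
];

// Returns every required route that the reduced worker no longer registers.
function missingNativeEndpoints(registered: Iterable<string>): string[] {
  const present = new Set(registered);
  return NATIVE_CODEX_REQUIRED.filter((route) => !present.has(route));
}
```

A test would collect the reduced worker's registered routes and assert that `missingNativeEndpoints(...)` returns an empty array, so a cut that removes a needed endpoint fails with the names of the missing routes.
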
Expected impact:
- meaningful complexity reduction
- less iii-engine function registry churn
- smaller viewer/API surface
- database reduction only after a retention migration deletes their stored scopes
This is the lane that can actually cut the database.
Codex-only mode should define a retained project allowlist:
- `/home/ericjuta/.openclaw/workspace/repos/codex`
- `/home/ericjuta/.openclaw/workspace/repos/agentmemory`
- optionally `/home/ericjuta/.openclaw/workspace` for operator control-plane context
- optionally sibling runtime repos such as `codex-lb` only if Codex queries them often
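
A membership check against the allowlist might look like the sketch below. The two required paths come from the list above. The matching rule (exact root or a path nested under it) and the function name are assumptions, chosen so that a sibling such as `codex-lb` does not match the `codex` root by string prefix alone.

```typescript
// Paths come from the allowlist above; the matching rule is an assumption.
const RETAINED_PROJECTS: readonly string[] = [
  "/home/ericjuta/.openclaw/workspace/repos/codex",
  "/home/ericjuta/.openclaw/workspace/repos/agentmemory",
];

function isRetainedProject(
  projectPath: string,
  allowlist: readonly string[] = RETAINED_PROJECTS,
): boolean {
  // Ignore trailing slashes so "/a/b/" and "/a/b" compare equal.
  const normalized = projectPath.replace(/\/+$/, "");
  return allowlist.some((root) => {
    const base = root.replace(/\/+$/, "");
    // Match the root itself or a nested path, never a sibling that only
    // shares a string prefix (for example "codex-lb" versus "codex").
    return normalized === base || normalized.startsWith(base + "/");
  });
}
```

Adding the optional workspace root to the allowlist would retain every repo beneath it under this rule, which is one reason to keep it optional.
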
For all other projects:
- Preserve durable, high-signal memories first: decisions, guardrails, lessons, handoffs, crystals, summaries, and explicit remembered facts.
- Drop or archive raw observations, old turn capsules, working sets, access logs, and per-session transient state.
- Rebuild retrieval indexes from the retained set.
- Run index compaction and restart iii-engine once to measure cold RSS.
Scope priority:
- largest likely savings: `mem:obs:<session>`, `mem:turn-capsules`, `mem:working-sets`, `mem:access`, `mem:context-injections`, stale `mem:enriched:<session>`, and retry/maintenance transient scopes
- rebuildable index storage: `mem:index:bm25`, `mem:index:retrieval-blocks`, their manifests, and their sharded physical scopes after the retained record set is finalized
- durable keep set: `mem:memories`, `mem:summaries`, `mem:retrieval-blocks:*` for retained projects, `mem:handoff-packets`, `mem:crystals`, `mem:lessons`, `mem:insights`, `mem:decisions`, `mem:guardrails`, and `mem:component-dossiers`
- removable only after feature deletion: `mem:claude-bridge`, `mem:team:*`, `mem:mesh`, `mem:signals`, `mem:checkpoints`, `mem:sentinels`, `mem:sketches`, `mem:routines`, `mem:routine-runs`, `mem:leases`, `mem:mission-runs`, and related audit rows
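
The scope priority above can be expressed as a lookup from scope name to retention class. This is a rough sketch: the class names and `classifyScope` are hypothetical, and the prefix matching is an assumption about how scope keys are laid out. It also ignores two qualifiers from the list: `mem:enriched:<session>` is only a savings candidate when stale, and `mem:retrieval-blocks:*` is only durable for retained projects.

```typescript
type RetentionClass =
  | "transient"
  | "rebuildable"
  | "durable"
  | "feature-gated"
  | "unknown";

// Scope names are taken from the list above; matching rules are assumed.
const RULES: ReadonlyArray<[RetentionClass, readonly string[]]> = [
  ["rebuildable", ["mem:index:bm25", "mem:index:retrieval-blocks"]],
  ["transient", ["mem:obs:", "mem:turn-capsules", "mem:working-sets",
    "mem:access", "mem:context-injections", "mem:enriched:"]],
  ["durable", ["mem:memories", "mem:summaries", "mem:retrieval-blocks:",
    "mem:handoff-packets", "mem:crystals", "mem:lessons", "mem:insights",
    "mem:decisions", "mem:guardrails", "mem:component-dossiers"]],
  ["feature-gated", ["mem:claude-bridge", "mem:team:", "mem:mesh",
    "mem:signals", "mem:checkpoints", "mem:sentinels", "mem:sketches",
    "mem:routines", "mem:routine-runs", "mem:leases", "mem:mission-runs"]],
];

function classifyScope(scope: string): RetentionClass {
  for (const [cls, names] of RULES) {
    const hit = names.some((name) => {
      // Names ending in ":" are families; others match exactly or as a
      // parent of ":"-separated sub-scopes (for example index shards).
      const prefix = name.endsWith(":") ? name : name + ":";
      return scope === name || scope.startsWith(prefix);
    });
    if (hit) return cls;
  }
  // Anything unlisted stays untouched until someone classifies it.
  return "unknown";
}
```

Returning `"unknown"` for unlisted scopes keeps a retention run from deleting data whose purpose has not been classified.
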
Expected impact:
- likely the largest storage win
- likely reduces loaded index and StateKV scan pressure
- direct Codex recall quality should improve if stale non-Codex project material stops competing for rank
Guardrail:
- dry-run must report bytes by scope and project before mutation
- dry-run must distinguish archiveable, deletable, rebuildable, and must-keep bytes
- destructive deletion must require an explicit `force: true` request
- export/archive must be available before the first destructive run
- Codex integration proof and a project-scoped recall probe must pass after rebuild
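
The dry-run and `force: true` guardrails can be illustrated with a small aggregation plus a pre-flight check. Everything here is a sketch: the `ScopeUsage` row shape, the report layout, and both function names are hypothetical, and how bytes are measured per scope and project is left to the real StateKV tooling.

```typescript
type ByteClass = "archiveable" | "deletable" | "rebuildable" | "must-keep";

// Hypothetical input row: one measured (scope, project) pair.
interface ScopeUsage {
  scope: string;
  project: string;
  bytes: number;
  kind: ByteClass;
}

interface DryRunReport {
  byKind: Record<ByteClass, number>;
  // Keyed as "scope|project" so both dimensions stay visible.
  byScopeAndProject: Record<string, number>;
}

function buildDryRunReport(rows: ScopeUsage[]): DryRunReport {
  const byKind: Record<ByteClass, number> = {
    archiveable: 0,
    deletable: 0,
    rebuildable: 0,
    "must-keep": 0,
  };
  const byScopeAndProject: Record<string, number> = {};
  for (const row of rows) {
    byKind[row.kind] += row.bytes;
    const key = `${row.scope}|${row.project}`;
    byScopeAndProject[key] = (byScopeAndProject[key] ?? 0) + row.bytes;
  }
  return { byKind, byScopeAndProject };
}

// Refuses destructive work unless the caller opted in and an archive exists.
function assertDestructiveAllowed(request: {
  force?: boolean;
  archiveAvailable: boolean;
}): void {
  if (request.force !== true) {
    throw new Error("destructive retention requires force: true");
  }
  if (!request.archiveAvailable) {
    throw new Error("export/archive must be available before deletion");
  }
}
```

Requiring `force` to be exactly `true` means a missing or truthy-but-wrong value, such as the string `"true"`, still falls through to the dry-run path.
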
After the backend contracts in docs/codex_tui_hardening_spec.md land, cut the
old generic endpoints from the native Codex path:
- replace `GET /agentmemory/handoffs` at startup with inline `session/start.bootstrap.latestHandoff`
- replace the `context` versus `context/refresh` branch with one unified retrieval endpoint
- replace `summarize` + `session/end` + `crystals/auto` + `consolidate-pipeline` with `session/closeout`
- keep the old endpoints only as compatibility/operator surfaces, then gate them out of `codex-native` after Codex no longer calls them
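
During the migration window, a caller could prefer the inline handoff and fall back to the legacy route only when it is absent. The sketch below assumes the `session/start` response carries `bootstrap.latestHandoff` as described above. The rest of the response shape, the handoff payload type, and `resolveStartupHandoff` are hypothetical.

```typescript
// Assumed response shape; only the bootstrap.latestHandoff path comes from
// the text above. The handoff payload itself is left opaque.
interface SessionStartResponse {
  bootstrap?: { latestHandoff?: unknown };
}

type HandoffSource =
  | { source: "inline"; handoff: unknown }
  | { source: "legacy"; fetch: "GET /agentmemory/handoffs" };

function resolveStartupHandoff(response: SessionStartResponse): HandoffSource {
  const inline = response.bootstrap?.latestHandoff;
  if (inline !== undefined && inline !== null) {
    // New contract: no second round trip at startup.
    return { source: "inline", handoff: inline };
  }
  // Old contract: the caller still has to fetch the handoff list.
  return { source: "legacy", fetch: "GET /agentmemory/handoffs" };
}
```

Once every start response carries the inline handoff, the legacy branch becomes dead code and can be removed together with the startup call.
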
Expected impact:
- lower latency and fewer failure modes
- fewer live endpoints needed for the only client
- easier future deletion because Codex has one contract per lifecycle phase
The current active-scope diagnostics show no stale candidates under the default 30-day policy, but the host-local Codex-only posture can be more aggressive.
New policy:
- keep working sets for active projects only
- keep turn capsules for non-allowlisted projects only when they contain decisions, failures, handoff-worthy summaries, or high importance
- cap per-session capsule and working-set payload size
- shorten access-log retention
- decay or compact insights that are not referenced by active projects
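
The capsule and working-set rules in this policy reduce to two small predicates. This is a sketch only: the `TurnCapsule` fields, the 0 to 1 importance scale, and the `HIGH_IMPORTANCE` threshold are all hypothetical stand-ins for whatever the stored records actually carry.

```typescript
// Hypothetical record shape; real capsules may encode these signals differently.
interface TurnCapsule {
  project: string;
  hasDecision: boolean;
  hasFailure: boolean;
  hasHandoffSummary: boolean;
  importance: number; // assumed 0..1 scale
}

const HIGH_IMPORTANCE = 0.8; // hypothetical threshold

function shouldKeepCapsule(
  capsule: TurnCapsule,
  allowlisted: ReadonlySet<string>,
): boolean {
  // Capsules for allowlisted projects are always kept.
  if (allowlisted.has(capsule.project)) return true;
  // Elsewhere, keep only capsules that carry a high-signal marker.
  return (
    capsule.hasDecision ||
    capsule.hasFailure ||
    capsule.hasHandoffSummary ||
    capsule.importance >= HIGH_IMPORTANCE
  );
}

function shouldKeepWorkingSet(
  project: string,
  activeProjects: ReadonlySet<string>,
): boolean {
  // Working sets are kept for active projects only.
  return activeProjects.has(project);
}
```

Size caps, access-log retention, and insight decay are separate concerns and are not modelled here.
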
Expected impact:
- cuts the current active working set/capsule footprint
- reduces repeated context scans
- avoids weakening Codex recall because Codex project data is a small minority of the current active-scope bytes

Do not cut the following:
- retrieval-block storage and indexes: Codex recall quality depends on them
- observations for active Codex sessions: needed for freshness
- handoff packets: startup/resume depends on them
- summaries/crystals/lessons/decisions/guardrails: these are the high-signal durable memory layer
- health, proof, diagnostics, retry, and compaction endpoints: needed for operator confidence while slimming the runtime
- iii-engine itself: project rules require StateKV through iii-engine
Before deleting data or permanently disabling surfaces, collect:
- endpoint/function registration count before and after each cut
- Docker RSS after cold start before and after each cut
- StateKV bytes by scope and by project
- index bytes before and after rebuild
- Codex proof latency and quality before and after
- top recall examples for Codex before and after, to prove no useful memory was lost
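
To make the before/after comparison mechanical, each cut could record the same snapshot twice and report the deltas. The field names and units in this sketch are assumptions. The real numbers would come from the startup log, Docker stats, and the diagnostics endpoints listed earlier.

```typescript
// Hypothetical snapshot shape; fields mirror the measurements listed above.
interface CutMetrics {
  registrations: number; // endpoint/function registration count
  coldRssMb: number; // Docker RSS after cold start
  stateKvBytes: number;
  indexBytes: number;
  proofLatencyMs: number; // Codex proof latency
}

// Negative values mean the cut reduced that metric.
function diffMetrics(before: CutMetrics, after: CutMetrics): CutMetrics {
  return {
    registrations: after.registrations - before.registrations,
    coldRssMb: after.coldRssMb - before.coldRssMb,
    stateKvBytes: after.stateKvBytes - before.stateKvBytes,
    indexBytes: after.indexBytes - before.indexBytes,
    proofLatencyMs: after.proofLatencyMs - before.proofLatencyMs,
  };
}
```

Recall-quality evidence, such as the top recall examples, does not fit a numeric delta and would still need a side-by-side review.
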
- Add a contract test/proof fixture that defines the exact native Codex endpoints and operator diagnostics that must survive.
- Remove MCP/plugin/Claude registration and packaging from the live worker.
- Remove team/mesh/signals/checkpoints/sentinels/sketches/routines/snapshots in feature-family lanes only after the explicit Codex command surface no longer calls them.
- Add a dry-run retention endpoint that reports deletable bytes by project and scope for the Codex-only allowlist.
- Add archive-then-delete support for non-allowlisted raw observations, turn capsules, working sets, and transient scopes.
- Rebuild retrieval indexes from retained data and compact.
- Restart iii-engine during a quiet window and compare cold RSS.
- Migrate Codex to the unified bootstrap/retrieval/closeout contracts.
- Gate now-dead generic lifecycle endpoints from `codex-native`.
The repo should present Codex using the same UX clarity already used for OpenClaw:
- generic client setup
- deeper native lifecycle path
- explicit note that the deeper path depends on a compatible host adapter or fork, not on a plugin shipped by this repo