Define the backend hardening work that makes direct Codex TUI use of
agentmemory faster, higher-signal, and more reliable.
This spec is intentionally about the native Codex TUI path only:
- startup and resume
- prompt-time retrieval
- tool-time enrichment
- session closeout
- operator-facing memory reads
It is not about MCP coverage, generic external clients, or public integration marketing.
Non-goals:
- adding more MCP tools
- pushing ranking policy into Codex
- making Codex own more orchestration logic than necessary
- redesigning the current retrieval engine from scratch
- widening the always-on runtime lane without a latency reason
Today the direct Codex TUI path is functional, but it still has several backend seams that leak into the caller:
- POST /agentmemory/session/start creates a session and returns only session + context.
- resume handoff lookup requires a separate GET /agentmemory/handoffs read.
- prompt-time retrieval is split across:
  - POST /agentmemory/context/refresh
  - POST /agentmemory/context
- shutdown is fragmented across:
  - POST /agentmemory/summarize
  - POST /agentmemory/session/end
  - POST /agentmemory/crystals/auto
  - POST /agentmemory/consolidate-pipeline
- richer coordination surfaces exist, but startup/retrieval do not yet return them as a small structured operating picture.
That shape works, but it is not the best long-term backend contract for a TUI.
Codex should not need to assemble its startup and closeout lifecycle from a chain of loosely related endpoints.
Codex currently has to know when to use context/refresh versus context, and
how to stitch together resume state from multiple backend reads.
The backend is still too text-centric in places where the TUI wants structured state:
- why something was retrieved
- freshness and confidence
- blockers and next step
- relevant files and concepts
- whether a result is a resume artifact, a guardrail, a decision, or general context
Multiple independent shutdown calls increase the chance of partial success, duplicate work, or invisible failure.
The direct Codex TUI backend contract should converge on three main calls:
- session bootstrap
- prompt-time retrieval
- bounded closeout
Everything else can remain as supporting internal surfaces or explicit operator-facing reads.
Turn startup and resume into one high-signal backend read.
session/start currently returns only the new session record and plain context.
Resume-specific state requires additional reads.
POST /agentmemory/session/start should return a bootstrap payload shaped more
like:

    {
      session: Session;
      bootstrap: {
        context: string;
        latestHandoff?: HandoffPacket | null;
        nextAction?: Action | null;
        guardrails: Guardrail[];
        activeDecisions: Decision[];
        branchOverlaySummary?: string | null;
        retrievalTrace?: RetrievalTrace;
      };
    }

- startup should produce a usable operator picture in one call
- resume should not need a second generic handoff listing call for the common path
- the returned set should stay small and latency-sensitive
- absence of one component should not fail the full bootstrap
- keep the existing session creation semantics
- add a backend-owned latest-handoff selection path instead of forcing the caller through generic list/filter logic
- cap supporting surfaces aggressively so bootstrap remains fast
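
The requirements above can be read as a bootstrap assembler in which every supporting surface loads independently, a failing loader degrades to an empty value, and list surfaces are capped. The sketch below is illustrative only: the loader functions, the `degraded` field, and the cap of 5 are assumptions made for the example, and the item types are reduced to placeholders rather than the real agentmemory types.

```typescript
// Placeholder shapes; the real agentmemory types are richer.
type HandoffPacket = { id: string; summary: string };
type Guardrail = { id: string; rule: string };
type Decision = { id: string; title: string };

interface Bootstrap {
  context: string;
  latestHandoff: HandoffPacket | null;
  guardrails: Guardrail[];
  activeDecisions: Decision[];
  degraded: string[]; // hypothetical: names of components that failed to load
}

// Hypothetical loader set; each function may throw independently.
interface BootstrapLoaders {
  context: () => string;
  latestHandoff: () => HandoffPacket | null;
  guardrails: () => Guardrail[];
  activeDecisions: () => Decision[];
}

const MAX_ITEMS = 5; // assumed cap, chosen only for illustration

// Run one loader; on failure record the component name and use the fallback.
function tryLoad<T>(name: string, load: () => T, fallback: T, degraded: string[]): T {
  try {
    return load();
  } catch {
    degraded.push(name);
    return fallback;
  }
}

function buildBootstrap(loaders: BootstrapLoaders): Bootstrap {
  const degraded: string[] = [];
  return {
    context: tryLoad("context", loaders.context, "", degraded),
    latestHandoff: tryLoad("latestHandoff", loaders.latestHandoff, null, degraded),
    guardrails: tryLoad("guardrails", loaders.guardrails, [], degraded).slice(0, MAX_ITEMS),
    activeDecisions: tryLoad("activeDecisions", loaders.activeDecisions, [], degraded).slice(0, MAX_ITEMS),
    degraded,
  };
}
```

With this shape, a handoff store outage yields `latestHandoff: null` plus an entry in `degraded`, while the session and context still reach the caller.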
Make prompt-time recall one semantic backend action, not a caller-side branch.
The direct TUI path currently has to reason about:
- context/refresh for query-aware recall
- context for fallback and explicit recall
- enrich for file/tool-local help
The backend should support one retrieval contract with explicit intent:

    {
      sessionId: string;
      project: string;
      intent:
        | "resume"
        | "user_turn"
        | "manual_recall"
        | "file_enrich"
        | "next_action";
      query?: string;
      filePath?: string;
      budget?: number;
    }

- Codex should call one backend retrieval surface for prompt-time recall
- the backend decides whether query-aware ranking, hot-path continuity, or file-local enrichment dominates
- short or noisy queries should degrade gracefully instead of producing an empty special-case branch that the caller must interpret
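
As a rough illustration of backend-owned routing, the sketch below maps a request to an ordered list of retrieval lanes. The lane names, the `chooseLanes` function, and the three-character minimum query length are assumptions made for this example, not part of the existing engine.

```typescript
type Intent = "resume" | "user_turn" | "manual_recall" | "file_enrich" | "next_action";

interface RetrievalRequest {
  sessionId: string;
  project: string;
  intent: Intent;
  query?: string;
  filePath?: string;
  budget?: number;
}

// Hypothetical lane names; the real engine may split work differently.
type Lane = "query_ranking" | "hot_path_continuity" | "file_local";

const MIN_QUERY_CHARS = 3; // assumed threshold for a "too short" query

// Returns lanes in priority order; the first lane dominates the result.
function chooseLanes(req: RetrievalRequest): Lane[] {
  const query = (req.query ?? "").trim();
  const hasQuery = query.length >= MIN_QUERY_CHARS;

  // File-local enrichment dominates only when a file is actually named.
  if (req.intent === "file_enrich" && req.filePath) {
    return ["file_local", "hot_path_continuity"];
  }
  // Resume and next-action lean on continuity state first.
  if (req.intent === "resume" || req.intent === "next_action") {
    return hasQuery ? ["hot_path_continuity", "query_ranking"] : ["hot_path_continuity"];
  }
  // Remaining intents: query-aware ranking when the query is usable,
  // otherwise fall back to continuity instead of returning nothing.
  return hasQuery ? ["query_ranking", "hot_path_continuity"] : ["hot_path_continuity"];
}
```

The point of the fallback branch is that a short or noisy query still produces a continuity-based result, so the caller never has to interpret an empty special case.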
Return structured result metadata alongside text:

    {
      context: string;
      items: Array<{
        sourceType: string;
        sourceId: string;
        title: string;
        why: string;
        freshness: "hot" | "warm" | "cold";
        confidence: number;
        relevantFiles: string[];
        concepts: string[];
        blocker?: string | null;
        recommendedNextStep?: string | null;
      }>;
      trace?: RetrievalTrace;
    }

- fresh same-session truth wins by default
- resume artifacts should only dominate when intent is clearly resume-oriented
- decisions and guardrails should be promoted when they directly constrain the current request
- old durable memory should not swamp active turn state
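
The ranking rules above could be expressed as a small additive score. This is a sketch under stated assumptions: the weights, the `sameSession` and `constrainsRequest` flags, and the `sourceType` labels ("turn", "handoff", "guardrail", "decision", "memory") are invented for illustration and are not the current engine's scoring.

```typescript
type Freshness = "hot" | "warm" | "cold";
type Intent = "resume" | "user_turn" | "manual_recall" | "file_enrich" | "next_action";

interface Candidate {
  sourceType: string; // assumed labels: "turn", "handoff", "guardrail", "decision", "memory"
  sameSession: boolean; // hypothetical flag: item was produced in the current session
  freshness: Freshness;
  confidence: number; // expected range 0..1
  constrainsRequest: boolean; // hypothetical flag set by an earlier matching step
}

// Illustrative weights only.
const FRESHNESS_WEIGHT: Record<Freshness, number> = { hot: 1.0, warm: 0.6, cold: 0.3 };

function score(c: Candidate, intent: Intent): number {
  let s = c.confidence * FRESHNESS_WEIGHT[c.freshness];
  // Fresh same-session truth wins by default.
  if (c.sameSession && c.freshness === "hot") s += 1.0;
  // Resume artifacts dominate only for resume-oriented intent.
  if (c.sourceType === "handoff" && intent === "resume") s += 1.5;
  // Decisions and guardrails are promoted when they constrain the request.
  if ((c.sourceType === "guardrail" || c.sourceType === "decision") && c.constrainsRequest) s += 0.75;
  return s;
}

// Highest score first; the input array is not mutated.
function rank(cands: Candidate[], intent: Intent): Candidate[] {
  return [...cands].sort((a, b) => score(b, intent) - score(a, intent));
}
```

Because cold items are scaled down before any boost applies, old durable memory with high confidence still ranks below a hot same-session item.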
Keep enrich useful for Codex tool-time help without letting it become a noisy
general retrieval path.
- treat enrich as an internal specialization of the shared retrieval engine
- bias results toward the touched file, nearby files, and relevant failures
- keep token budgets smaller than prompt-time recall
- return structured file-local signals, not only appended prose
- do not expose generic memory spray during file-touching operations
- do not let enrich outrank fresh turn context for non-file intents
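
One possible reading of the file-local bias is a proximity filter over candidate signals. The sketch below assumes POSIX-style relative paths and a hypothetical `FileSignal` shape; the proximity tiers and the default limit of 3 are illustrative choices, not existing behavior.

```typescript
interface FileSignal {
  filePath: string;
  kind: "failure" | "note"; // assumed signal kinds
  text: string;
}

// Directory part of a POSIX-style path ("src/a/b.ts" -> "src/a").
function dirOf(path: string): string {
  const i = path.lastIndexOf("/");
  return i === -1 ? "" : path.slice(0, i);
}

// 3 = the touched file, 2 = same directory, 1 = same top-level segment, 0 = unrelated.
function proximity(touched: string, other: string): number {
  if (touched === other) return 3;
  if (dirOf(touched) === dirOf(other)) return 2;
  if (touched.split("/")[0] === other.split("/")[0]) return 1;
  return 0;
}

function enrichSignals(touched: string, signals: FileSignal[], limit = 3): FileSignal[] {
  return signals
    .map((s) => ({ s, p: proximity(touched, s.filePath) }))
    .filter((x) => x.p > 0) // drop unrelated memory entirely: no generic spray
    .sort((a, b) => b.p - a.p || Number(b.s.kind === "failure") - Number(a.s.kind === "failure"))
    .slice(0, limit) // keep the result smaller than prompt-time recall
    .map((x) => x.s);
}
```

Unrelated items are filtered out rather than merely ranked low, which keeps the tool-time payload small regardless of how much durable memory exists.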
Replace the current multi-call shutdown choreography with one backend-owned closeout pipeline.
Add a closeout operation such as:

    POST /agentmemory/session/closeout

with semantics equivalent to:
- summarize session
- end session
- auto-crystallize
- run bounded consolidation maintenance

    {
      success: boolean;
      steps: {
        summarize: "ok" | "skipped" | "failed";
        endSession: "ok" | "skipped" | "failed";
        crystallize: "ok" | "skipped" | "failed";
        consolidate: "ok" | "skipped" | "failed";
      };
      errors?: Array<{ step: string; message: string }>;
    }

- closeout must be idempotent
- partial success must be visible
- a failed maintenance step must not erase a successful session end
- the backend should own retries/bounding instead of making Codex orchestrate each substep separately
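
A minimal sketch of such a pipeline follows. It assumes injected step functions, an in-memory map standing in for durable per-session state, and a definition of `success` as "no step failed on this call"; none of these are confirmed design decisions. Steps are synchronous here for brevity, whereas real steps would be asynchronous.

```typescript
type StepStatus = "ok" | "skipped" | "failed";
type StepName = "summarize" | "endSession" | "crystallize" | "consolidate";

interface CloseoutResult {
  success: boolean;
  steps: Record<StepName, StepStatus>;
  errors?: Array<{ step: string; message: string }>;
}

// Hypothetical step implementations, injected so the sketch stays testable.
type StepFns = Record<StepName, (sessionId: string) => void>;

const ORDER: StepName[] = ["summarize", "endSession", "crystallize", "consolidate"];

// Per-session record of finished steps (in-memory stand-in for durable state).
const completed = new Map<string, Set<StepName>>();

function closeout(sessionId: string, fns: StepFns): CloseoutResult {
  const done = completed.get(sessionId) ?? new Set<StepName>();
  completed.set(sessionId, done);
  const steps = {} as Record<StepName, StepStatus>;
  const errors: Array<{ step: string; message: string }> = [];

  for (const step of ORDER) {
    if (done.has(step)) {
      steps[step] = "skipped"; // already succeeded on an earlier call: idempotent retry
      continue;
    }
    try {
      fns[step](sessionId);
      done.add(step);
      steps[step] = "ok";
    } catch (e) {
      steps[step] = "failed"; // keep going: a failure never undoes earlier steps
      errors.push({ step, message: e instanceof Error ? e.message : String(e) });
    }
  }
  return errors.length > 0 ? { success: false, steps, errors } : { success: true, steps };
}
```

Calling `closeout` a second time for the same session reruns only the steps that have not yet succeeded, so a retry after a failed crystallize step does not end the session twice.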
Lower latency for the Codex TUI hot path without regressing retrieval quality.
- precompute a latest-handoff pointer for resume
- cache small bootstrap bundles for immediate resume cases
- extend scoped retrieval indexing for direct intent-aware lookups
- keep repeated identical recall requests under short TTL caching
- prefer partial-good results when one retrieval lane is slow
- no global scans on the hot path
- no Git shell-outs on the hot path
- no all-or-nothing failure when one secondary surface is degraded
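
The short-TTL caching of repeated identical recall requests might look like the sketch below. The `TtlCache` class, the `recallKey` format, and the injectable clock are assumptions for illustration; the spec does not prescribe a TTL value or a cache key.

```typescript
// Minimal TTL cache with an injectable clock (timestamps in milliseconds).
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: evict lazily on read
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Hypothetical cache key: identical requests must map to identical strings.
function recallKey(req: { sessionId: string; intent: string; query?: string; filePath?: string }): string {
  return JSON.stringify([req.sessionId, req.intent, req.query ?? "", req.filePath ?? ""]);
}

// Return the cached value when present and unexpired; otherwise compute and store it.
function cachedRecall<V>(cache: TtlCache<V>, key: string, compute: () => V): V {
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const value = compute();
  cache.set(key, value);
  return value;
}
```

Keying on session, intent, query, and file path keeps the cache scoped to truly identical requests, and a short TTL limits how long a cached answer can lag behind fresh same-session state.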
Prove direct Codex TUI behavior, not just narrow native payload acceptance.
Expand the Codex compatibility lane to cover:
- startup bootstrap
- resume with latest handoff selection
- prompt-time retrieval with intent-aware ranking
- file-local enrich behavior
- closeout pipeline success and partial failure cases
- idempotent retry behavior for startup and closeout
- degraded partial-success retrieval when one secondary lane fails
This hardening lane is done when:
- Codex startup gets a one-call bootstrap view
- prompt-time retrieval is exposed as one semantic backend action
- closeout is backend-owned and idempotent
- structured retrieval metadata is available for the TUI
- direct-TUI compatibility tests cover the full lifecycle, not just ingest
This spec is backend-first, but not every improvement is backend-only.
Some wins land immediately behind the current Codex client contract:
- faster retrieval/indexing/caching on the hot path
- better latest-handoff selection behind existing resume reads
- improved degrade-soft behavior for retrieval and closeout internals
The full UX win requires Codex-side adoption work after the backend changes land.
- update startup flow to consume the richer session/start bootstrap payload instead of assuming session + context only
- remove the extra generic resume handoff fetch on the common path once the backend exposes latest-handoff selection in bootstrap
- replace caller-side branching between context/refresh and context with the unified retrieval contract
- route file-local tool-time help through the unified retrieval intent instead of preserving separate enrich-specific decision logic in Codex
- replace the current multi-call shutdown choreography with the bounded backend session/closeout operation
- render or otherwise consume structured retrieval metadata such as:
  - source type
  - why retrieved
  - freshness
  - confidence
  - blockers
  - recommended next step
  - relevant files
- add direct Codex-side lifecycle validation for:
  - startup bootstrap parsing
  - prompt-time retrieval intent routing
  - closeout path migration
  - partial-success handling
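
During migration, a client may see either the legacy session + context response or the richer bootstrap payload. The sketch below shows one tolerant way to parse both. It assumes the session record carries a string `id` field and that the new fields are nested under `bootstrap`; `BootstrapView` and `parseSessionStart` are hypothetical names, and the actual Codex client is not written in TypeScript, so this only illustrates the parsing logic.

```typescript
interface BootstrapView {
  sessionId: string;
  context: string;
  latestHandoff: Record<string, unknown> | null;
  guardrails: unknown[];
  activeDecisions: unknown[];
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function asArray(v: unknown): unknown[] {
  return Array.isArray(v) ? v : [];
}

function parseSessionStart(payload: unknown): BootstrapView {
  if (!isRecord(payload)) throw new Error("session/start response is not an object");
  const session = payload["session"];
  // Assumption: the session record exposes a string `id`.
  const id = isRecord(session) ? session["id"] : undefined;
  if (typeof id !== "string") throw new Error("session/start response has no session.id");

  // New shape nests fields under `bootstrap`; legacy shape has top-level `context`.
  const rawBoot = payload["bootstrap"];
  const boot = isRecord(rawBoot) ? rawBoot : {};
  const rawContext = "context" in boot ? boot["context"] : payload["context"];
  const handoff = boot["latestHandoff"];

  return {
    sessionId: id,
    context: typeof rawContext === "string" ? rawContext : "",
    latestHandoff: isRecord(handoff) ? handoff : null,
    guardrails: asArray(boot["guardrails"]),
    activeDecisions: asArray(boot["activeDecisions"]),
  };
}
```

Missing optional components resolve to empty values rather than errors, so the client can adopt the new payload before every backend surface is populated.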
Codex does not need a broad client rewrite.
The intended client-side end state is:
- fewer branches
- fewer round trips
- less backend choreography in Codex
- richer rendering from backend-owned state
The Codex-side follow-up is done when:
- Codex no longer assembles startup/resume from multiple generic reads
- Codex no longer owns the context/refresh versus context branch
- Codex no longer owns the 4-step shutdown sequence
- Codex can display or consume the structured retrieval fields returned by the backend
- add bootstrap payload to session/start
- add backend latest-handoff selection
- unify prompt-time retrieval intent handling
- add bounded session/closeout
- expand direct Codex lifecycle tests
- land backend contract changes in agentmemory
- add backend validation for the new bootstrap, retrieval, and closeout surfaces
- switch Codex to the new bootstrap path
- switch Codex to the unified retrieval path
- switch Codex to the bounded closeout path
- remove now-dead Codex-side orchestration branches
- add final end-to-end Codex lifecycle validation