feat(codex): native app-server plan and chat controls #3
Draft
SYU8384 wants to merge 44 commits into
When Codex's request_user_input tool sends 2-3 questions in one call, this renders Q1 first and posts Q2 as a brand-new reply after Q1 is answered (sequential path). The wire protocol still treats this as one request to one merged response; only the channel rendering policy changes.
The legacy one-shot path (createCodexUserInputPrompt) is unchanged for N==1 and for channels without presentation support. The discord/telegram/slack adapters that consume the result see consumed=false for partial clicks on the legacy combined card so the user can keep clicking buttons for remaining questions, and consumed=true on the sequential path so the just-answered row gets disabled before the next question is posted.
Covers:
- per-question render (no header prefix when only one question on screen)
- partial-click advance + emit next prompt closure
- typed 'Other:' text advances similarly
- out-of-order clicks rejected as 'awaiting Q[n] header'
- partial answers dropped on cancel/abort per PR decision
- legacy multi-question (no sequential emit) freeform merge path preserved
Codex upstream's request_user_input tool spec allows 1-3 questions per call (codex-rs/core/src/tools/handlers/request_user_input_spec.rs:55-67). This change is purely channel-side rendering.
(cherry picked from commit d9d42613f3a59cc2778ae44031517512176b9c74) (cherry picked from commit 5dd347b)
(cherry picked from commit 5e637a88841bb160b7d3a88a89e4a6aebdb2bbe7) (cherry picked from commit c4b232d)
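The sequential path described above can be sketched as a small state machine. This is a minimal illustration only: `SequentialPrompt`, `Question`, and `AnswerResult` are hypothetical names, and returning consumed=false for an out-of-order click is an assumption, not something the commit message states.

```typescript
// Hypothetical types for illustration; not the PR's actual implementation.
interface Question {
  header: string;
  options: string[];
}

type AnswerResult =
  | { consumed: true; next?: Question; merged?: Record<string, string> }
  | { consumed: false; message: string };

class SequentialPrompt {
  private readonly questions: Question[];
  private readonly answers: Record<string, string> = {};
  private index = 0;

  constructor(questions: Question[]) {
    this.questions = questions;
  }

  // The question currently on screen, or undefined once all are answered.
  current(): Question | undefined {
    return this.questions[this.index];
  }

  // Record an answer; clicks on any question other than the shown one are rejected.
  answer(questionIndex: number, value: string): AnswerResult {
    const shown = this.current();
    if (shown === undefined || questionIndex !== this.index) {
      return { consumed: false, message: `awaiting ${shown?.header ?? "nothing"}` };
    }
    this.answers[shown.header] = value;
    this.index += 1;
    const next = this.current();
    // One request, one merged response: `merged` appears only after the last answer.
    return next ? { consumed: true, next } : { consumed: true, merged: { ...this.answers } };
  }
}
```

The merged answers are produced only once, after the final question, which mirrors the "one request to one merged response" wire behaviour; dropping the instance on cancel/abort would discard partial answers.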
…ion freeform
- Telegram/Discord/Slack interactive handlers now skip the reply call when result.message is empty, so the sequential partial-click (which intentionally sends no acknowledgment because the next question is posted as a new reply) does not send an empty message to the chat API. disableComponents / clearButtons / editMessage still run, so the used row is correctly locked before Q2 arrives.
- The freeform path now matches sequential pending entries by the currently-shown question's isOther flag instead of falling through to the legacy all-questions merge rule. Upstream Codex normalizes request_user_input questions to isOther=true and the prompt tells users they may reply with their own answer, so a user typing a custom answer for Q1 of a 2-3 question sequential prompt now correctly advances to Q2 instead of being rejected as 'matched: false'.
(cherry picked from commit 8b043d18abffb4ec4be3ba158b92e8ec8ab0615a) (cherry picked from commit 2f4c9d1)
…mo overview widgets
Two follow-ups to the codex plan controls PR (openclaw#88446), both surfaced by autoreview on the rebased feat/codex-plan-controls branch:
(1) buildCommandInboundEvent / buildCommandInboundContext now thread the original conversation target through the synthetic inbound event/ctx so the follow-up turn's progress and any Codex request_user_input prompts it raises are deliverable to the chat that approved the plan. Previously the synthetic event dropped metadata.to / conversationId / parentConversationId and the synthetic ctx dropped pluginBinding, so the progress sender silently returned without sending and any request_user_input prompt would have sat undelivered until the 10-minute timeout expired. The two call sites in approveCurrentContextPlan and approveConversationPlanWithCleanContext read the current binding via ctx.getCurrentConversationBinding() and pass the routing fields from ctx (to, threadParentId, sessionId). Adds a unit test asserting the synthetic event + ctx carry the three fields.
(2) ui/src/ui/views/overview.ts no longer renders the openclaw-demo-button and openclaw-demo-status-widget placeholder elements. These were introduced by "fix: preserve codex telegram plan context" on the production overview route and shipped to users as product UI noise unrelated to the Codex plan controls change. The two element imports + the two render usages are removed, and ui/src/ui/views/overview-render.test.ts (which only asserted the demo widgets' presence) is deleted. The standalone custom element source files in ui/src/ui/components/ are kept for parity with the rest of the custom-element set but are no longer referenced from the production overview route.
(cherry picked from commit 34b681e)
…tParams
The previous patch narrowed buildTurnStartParams.collaborationMode to the Codex wire object (CodexTurnCollaborationMode), but local callers and tests still pass stored string modes (e.g. 'plan', 'default'). The TypeScript build failed at extensions/codex/src/app-server/run-attempt.test.ts, where the test passes collaborationMode: 'plan' to buildTurnStartParams. Match the same fix applied to buildTurnCollaborationMode earlier in this rebase: accept CodexTurnCollaborationMode | string at the outer option, and let buildTurnCollaborationMode normalize the string into the wire object.
(cherry picked from commit 5eea14e)
…ad resume
The thread-resume writeCodexAppServerBinding call only carried forward the original binding's collaborationMode and legacy reasoningEffort. The new chat plan controls added two more persisted per-binding preferences (reasoningEffortDefaults for /codex think and liveProgress for /codex live) that are not part of the resume's overridden runtime fingerprints, so a later app-server resume on the same session silently dropped them and the user had to re-set /codex think and /codex live after every lifecycle operation. Include both fields in the resumed binding write so per-binding preferences survive across resume / reconnect operations.
(cherry picked from commit 17d80cd)
The progress-reply path calls adapter.sendPayload directly without
invoking the adapter's afterDeliverPayload hook. Discord's
afterDeliverPayload registers the delivered message id with the
Codex user-input control tracker, so a typed/freeform answer can
resolve the Codex request but the original Discord buttons stay
live and stale until someone clicks them.
Invoke adapter.afterDeliverPayload?.({ cfg, target, payload,
results }) immediately after sendPayload returns, matching the
shape used by core's outbound deliver pipeline. The Discord
afterDeliverPayload then registers the delivered message id so a
later freeform answer disables the corresponding control token.
(cherry picked from commit 3b84c03)
Wrap the new direct adapter.afterDeliverPayload call in try/catch so a hook crash after a successful platform send does not fail the whole sendProgressReply. This matches the shared outbound pipeline's maybeNotifyAfterDeliveredPayload helper which isolates and logs hook failures, preserving successful delivery semantics for downstream Codex user-input prompts. Without this guard, a Discord afterDeliverPayload failure would cancel the pending Codex user-input token even though the prompt had already been delivered, leaving a visible prompt whose buttons and freeform answers no longer resolve the Codex request. (cherry picked from commit 0150b7b)
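A minimal sketch of the hook isolation described in the two commits above. The `Adapter` shape is hypothetical and reduced for illustration (the commit passes `{ cfg, target, payload, results }`; only `payload` and `results` are modelled here), and `logWarn` stands in for whatever logger the real code uses.

```typescript
// Hypothetical adapter shape for illustration only.
interface DeliveryResult {
  messageId: string;
}

interface Adapter {
  sendPayload(payload: string): Promise<DeliveryResult[]>;
  afterDeliverPayload?(args: { payload: string; results: DeliveryResult[] }): Promise<void>;
}

async function sendProgressReplySketch(
  adapter: Adapter,
  payload: string,
  logWarn: (message: string) => void,
): Promise<DeliveryResult[]> {
  // A failed send still propagates: only the post-delivery hook is isolated.
  const results = await adapter.sendPayload(payload);
  try {
    // Optional call: adapters without the hook are simply skipped.
    await adapter.afterDeliverPayload?.({ payload, results });
  } catch (error) {
    // The message is already visible to the user, so log instead of failing the reply.
    logWarn(`afterDeliverPayload failed: ${String(error)}`);
  }
  return results;
}
```

The design choice is that delivery success is decided by the platform send alone; a crashing hook degrades to a log line rather than cancelling a prompt the user can already see.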
buildBoundConversationCollaborationMode emitted settings.model: null whenever the bound CodexAppServerThreadBinding did not have a stored model field. Earlier versions of the binding type treated model as optional, so legacy session files (and any binding created without a model) would now fail the next turn/start with an invalid collaborationMode payload because the Codex app-server contract requires Settings.model to be a string. Return undefined from the helper when binding.model is missing so the turn request omits the collaboration mode object entirely. The user can re-bind or set /codex model to pick a model; the existing turn semantics (collaborationMode + reasoningEffort) continue to work for bindings that have a model. Adds a regression test that asserts the turn/start request has no collaborationMode field when the binding omits model. (cherry picked from commit 41bd8fd)
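The omission rule above might look like the following sketch. `Binding`, `CollaborationModeWire`, and both function names are hypothetical stand-ins; the real wire shape is defined by the Codex app-server protocol and is only assumed here to carry a `settings.model` string.

```typescript
// Hypothetical shapes; the real wire types live in the Codex app-server protocol.
interface Binding {
  model?: string;
  collaborationMode?: string;
}

interface CollaborationModeWire {
  mode: string;
  settings: { model: string };
}

function buildCollaborationModeSketch(binding: Binding): CollaborationModeWire | undefined {
  // Without a model we cannot build a valid settings object, so omit the whole mode.
  if (!binding.model) return undefined;
  return {
    mode: binding.collaborationMode ?? "default",
    settings: { model: binding.model },
  };
}

function buildTurnStartSketch(binding: Binding, prompt: string): Record<string, unknown> {
  const collaborationMode = buildCollaborationModeSketch(binding);
  // Spread conditionally so the key is absent rather than present with null/undefined.
  return { prompt, ...(collaborationMode ? { collaborationMode } : {}) };
}
```

For a legacy binding such as `{ collaborationMode: "plan" }` with no model, the resulting request has no `collaborationMode` key at all, which is what the regression test in the commit asserts.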
…l prompts
answerCodexUserInputFreeform rejected every pending request whose currently-shown question lacked isOther, even when the user's typed reply was a numeric prefix (e.g. '1') or the exact option label. Channels that cannot render or keep buttons (plain text relays, accessibility contexts) relied on this fallback to resolve the active request_user_input; otherwise the message was routed to a new bound turn while the original Codex turn waited until the 10-minute timeout.
Add resolveFreeformOptionAnswer which normalizes the typed reply against the rendered options: numeric prefix -> option label, case-insensitive exact match -> option label, otherwise the raw text. The sequential filter accepts the entry when normalization matches or the question is isOther; replies that do not normalize to any option stay rejected so stray chat messages do not consume the request. The sequential branch records the normalized label on the pending entry so the resolved merge uses the canonical option label.
Updates the existing test that was encoding the regression to exercise the correct fallback (numeric prefix on a labeled question) and the rejection path (unrelated reply on a labeled question stays matched: false). Adds a new test covering both numeric and label forms for sequential prompts.
(cherry picked from commit c2ff571)
The previous commit returned a string from resolveFreeformOptionAnswer
and used string equality as a sentinel for 'no match'. That missed
exact same-case label replies because resolveFreeformOptionAnswer
returns the same label string when the user types the label
verbatim, so the filter treated it as 'no option matched' and
rejected a perfectly valid fallback.
Return { matched, answer } from the normalizer so callers can
distinguish a real match from a raw-typed reply. Update both the
sequential filter and the sequential branch to use the new shape.
Also fix a related gate: the legacy 'some question is isOther'
prefilter still ran before the sequential numeric/label
normalization, so all-option sequential prompts (where isOther
is false for every question) were rejected before the fallback
could try to resolve the typed reply. Skip the isOther prefilter
for sequential pending entries; the option-match check below
already validates the reply.
Add a regression test covering all-option sequential prompts
(where no question has isOther) and the case-insensitive exact
label fallback.
(cherry picked from commit ca50543)
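A sketch of the `{ matched, answer }` normalizer shape described above. The function names carry a `Sketch` suffix because this is not the PR's code; treating "1", "2." or "3)" as a 1-based option index is an assumption about what "numeric prefix" means.

```typescript
interface OptionMatch {
  matched: boolean;
  answer: string;
}

function resolveFreeformOptionAnswerSketch(reply: string, labels: string[]): OptionMatch {
  const text = reply.trim();
  // Assumption: "1", "2." or "3)" selects the option at that 1-based position.
  const numeric = /^(\d+)[.)]?$/.exec(text);
  if (numeric) {
    const label = labels[Number(numeric[1]) - 1];
    if (label !== undefined) return { matched: true, answer: label };
  }
  // Case-insensitive exact match resolves to the canonical label.
  const exact = labels.find((label) => label.toLowerCase() === text.toLowerCase());
  if (exact !== undefined) return { matched: true, answer: exact };
  // No option matched: hand back the raw text and let the caller decide.
  return { matched: false, answer: text };
}

// Sequential filter: accept a real option match, or any text when the question is isOther.
function acceptsSequentialReplySketch(match: OptionMatch, isOther: boolean): boolean {
  return match.matched || isOther;
}
```

Because the match flag is explicit, a verbatim same-case label such as "Yes" is reported as matched even though `answer` equals the input string, which is the case the earlier string-equality sentinel missed.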
…+ show active think defaults in /codex binding
Two P2/P3 follow-ups flagged by autoreview on the rebased
feat/codex-plan-controls branch:
(1) /codex plan and /codex think were added to
CODEX_NATIVE_EXECUTION_SUBCOMMANDS, which made every
'/codex plan ...' and '/codex think ...' invocation pass through
resolveCodexNativeExecutionBlock. In sessions where native
Codex execution is sandbox-blocked, users could not run local
preference forms like '/codex plan off', '/codex plan status',
'/codex plan stay <token>', or '/codex think status' even
though those forms only read or update the stored binding.
Existing controls (model, fast, permissions) explicitly return
before the native execution gate for status/invalid/local
forms. Match that pattern: plan [on|off|status|empty|stay
<token>] is local; only plan [approve|approve-clean] <token>
triggers a native Codex turn and must stay behind the guard.
think is always local (it is a preference write against the
bound binding).
(2) /codex binding still formatted threadBinding.reasoningEffort
directly, but the new /codex think command stores into
reasoningEffortDefaults and clears the legacy reasoningEffort
field. After running /codex think xhigh, the binding status
reported 'Think: default', making the new control appear not
to work. Switch the binding status path to the same
resolveCodexAppServerConversationReasoningEffort helper the
turn start path uses, so /codex binding and /codex think agree
on the active effort including the new per-mode defaults.
Adds two regression tests: one extends the existing 'local Codex
binding status forms in sandboxed sessions' test to cover
'/codex plan status' and '/codex think status'; one asserts the
binding status reports the active think effort resolved from the
new reasoningEffortDefaults field.
(cherry picked from commit 49d73e9)
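The local-versus-native split in item (1) can be expressed as a small predicate. This is a sketch under stated assumptions: `needsNativeExecutionGate` and `NATIVE_PLAN_ACTIONS` are hypothetical names, and only the `plan` and `think` subcommands from the commit message are modelled.

```typescript
// Only these plan actions start a native Codex turn; every other form reads or
// updates the stored binding and can run in sandbox-blocked sessions.
const NATIVE_PLAN_ACTIONS = new Set(["approve", "approve-clean"]);

function needsNativeExecutionGate(subcommand: "plan" | "think", args: string[]): boolean {
  // think is always a local preference write against the bound binding.
  if (subcommand === "think") return false;
  // plan with no argument, on/off/status, or stay <token> is local.
  const action = args[0] ?? "";
  return NATIVE_PLAN_ACTIONS.has(action);
}
```

With this predicate, `/codex plan status` and `/codex think xhigh` would return before the native execution guard, while `/codex plan approve <token>` would still pass through it.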
…ng configured think defaults
The previous commit passed the full Codex plugin config to
readCodexAppServerConversationReasoningDefaults, but that helper
only reads a flat { execute, plan } object. Configured defaults
live at plugins.entries.codex.config.appServer.conversationReasoningDefaults
(not at the top level of the plugin config), so the configured
default was silently dropped on /codex binding status while the
turn start path used the unwrapped value via runtime config.
Read the configured defaults via readCodexPluginConfig(pluginConfig)
.appServer?.conversationReasoningDefaults before passing to the
helper, matching how /codex think status resolves them.
Adds a regression test that passes a pluginConfig with
appServer.conversationReasoningDefaults and asserts the binding
status reflects it.
(cherry picked from commit fe9d2b6)
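To illustrate why the configured default was dropped, here is a sketch with a stand-in flat reader. `readFlatDefaults` and `PluginConfigSketch` are hypothetical; they only mimic the behaviour the commit message attributes to the real helper (reading top-level `execute` / `plan` keys).

```typescript
interface ReasoningDefaults {
  execute?: string;
  plan?: string;
}

// Stand-in for the flat reader: it only looks at top-level execute/plan keys.
function readFlatDefaults(value: unknown): ReasoningDefaults {
  const record = (typeof value === "object" && value !== null ? value : {}) as Record<string, unknown>;
  const result: ReasoningDefaults = {};
  const execute = record["execute"];
  const plan = record["plan"];
  if (typeof execute === "string") result.execute = execute;
  if (typeof plan === "string") result.plan = plan;
  return result;
}

// Hypothetical plugin config shape, nested as described above.
interface PluginConfigSketch {
  appServer?: { conversationReasoningDefaults?: ReasoningDefaults };
}

const pluginConfig: PluginConfigSketch = {
  appServer: { conversationReasoningDefaults: { plan: "high" } },
};

// Bug: passing the whole plugin config finds no top-level plan/execute keys.
const dropped = readFlatDefaults(pluginConfig);
// Fix: unwrap the nested object before handing it to the flat reader.
const resolved = readFlatDefaults(pluginConfig.appServer?.conversationReasoningDefaults);
```

Here `dropped` is an empty object while `resolved.plan` is "high", which matches the symptom: the binding status silently ignored a default that the turn start path honoured.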
…request_user_input prompts
Plan-decision callbacks (Approve / Approve with clean context) ran the follow-up turn via runCodexBoundConversationPrompt, but no sendProgressReply was wired through. runBoundTurn returns emptyUserInputResponse() when sendProgressReply is undefined, so any Codex request_user_input prompt in the follow-up turn timed out after 10 minutes and the user never saw the question. CodexCommandDeps.buildPlanApprovalProgressReply now takes the channel of the originating callback so the progress sender is correctly targeted for telegram, discord, and slack. The factory is threaded through all three handleCodexPlanDecisionCallbackLazy call sites in extensions/codex/index.ts and the slash command path.
(cherry picked from commit 3d77573)
…presentation
- Use the originating event.channel for the inbound_claim progress sender so Discord/Slack bound conversations no longer get their request_user_input prompts routed through the Telegram adapter.
- Render the portable Codex presentation via normalizeMessagePresentation + adaptMessagePresentationForChannel + channel renderPresentation before sendPayload, so plan-approval follow-up prompts reach the user as native buttons/components instead of text-only payloads.
(cherry picked from commit c78067a)
…nversations
When the user types a freeform reply to a Codex request_user_input
prompt on a bound conversation and answerCodexUserInputFreeform
returns matched: false, the inbound_claim falls through to
`return { handled: true }` (the non-command-authorized path) and the
user sees no response. The most likely cause is a scope mismatch
between the pending user input (queued by sendProgressReply on the
synthetic inbound event from the plan-approval follow-up) and the
inbound ctx (the user's typed text dispatch context) on channel /
senderId / sessionKey / messageThreadId.
Add a single embeddedAgentLog.warn after the freeform check that
records:
- the typed prompt preview
- inputResult.matched and the resulting message
- the inbound event's channel / senderId / accountId / sessionKey /
messageThreadId / commandAuthorized
- the binding's kind and sessionFile
This is a diagnostic-only change: no behavior is altered. After the
user re-tests on Discord, the log line will identify which field
mismatches and we can land the actual fix in a follow-up commit.
(cherry picked from commit cc97d6a)
…orm does not match
Previously, a non-command-authorized typed message on a bound
codex-app-server conversation was silently swallowed by
`return { handled: true }` after the freeform matcher check. The
matcher could return matched: false for a typed reply that did not
exactly match a pending user_input option label, scope-mismatched
on sessionKey / messageThreadId, or had no pending at all. The
result: the user typed text into a bound Codex chat, saw no
response, and the bound turn never started.
A bound chat session should always reach Codex as a fresh turn
prompt so the user sees a response, even if the typed text is plain
prose. Slash commands are still protected upstream by
answerCodexUserInputFreeform's "/" check (line 322 of
conversation-chat-controls.ts), and the codex plugin's own /codex
command router handles explicit /codex <verb> commands before this
inbound_claim hook is reached.
Update the two existing tests that were asserting the old
silent-drop behavior to assert the new fall-through-to-turn-start
behavior, including a binding sidecar file so the new turn can
locate the thread.
A diagnostic log line is kept for the case where the freeform
matcher returns matched: false on an app-server bound conversation.
This logs only scope fields (channel / senderId / accountId /
sessionKey / messageThreadId / commandAuthorized and the binding's
kind / sessionFile) — never prompt content — so future debugging
does not require a code change and authorized prompts (which can
contain secrets) are not captured.
(cherry picked from commit ac62d6d)
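One way to guarantee that only scope fields reach the log is to pick them from an explicit allow-list. This is a sketch only: `InboundEventSketch`, `SCOPE_FIELDS`, and `pickScopeFields` are hypothetical names, and the field types are assumptions based on the names listed in the commit message.

```typescript
// Hypothetical event shape; field names follow the commit message above.
interface InboundEventSketch {
  channel?: string;
  senderId?: string;
  accountId?: string;
  sessionKey?: string;
  messageThreadId?: string;
  commandAuthorized?: boolean;
  prompt?: string; // may contain secrets, so it is never copied into the log record
}

const SCOPE_FIELDS = [
  "channel",
  "senderId",
  "accountId",
  "sessionKey",
  "messageThreadId",
  "commandAuthorized",
] as const;

function pickScopeFields(event: InboundEventSketch): Record<string, string | boolean | undefined> {
  const scope: Record<string, string | boolean | undefined> = {};
  // Allow-list copy: anything not named in SCOPE_FIELDS cannot leak into the log.
  for (const field of SCOPE_FIELDS) {
    scope[field] = event[field];
  }
  return scope;
}
```

An allow-list fails closed: a field added to the event later stays out of the log until someone adds it to `SCOPE_FIELDS` deliberately.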
Previously, a typed freeform reply that was a prefix of the option label but not the exact label was rejected by resolveFreeformOptionAnswer. For example, a user who types "CLI Cleanup" against a rendered option labeled "CLI Cleanup (Recommended) - Keeps the fake plan small and engineering-shaped." would not match (case-insensitive exact match only), so the matcher returned matched: false and the user's typed text was treated as an unmatched freeform.
Add a single-option prefix match: if exactly one option's label starts with the typed text (case-insensitive), resolve to that option. If two or more options share a common prefix, fall through to the caller's freeform fallback (or the new inbound_claim "couldn't match your reply" guard) so ambiguity is handled elsewhere. This unblocks the common pattern of typing a shortened option label without copying the (Recommended) suffix or description that the chat UI renders alongside the button.
(cherry picked from commit 0ce4984)
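The single-option prefix rule can be sketched as follows. `matchUniquePrefix` is a hypothetical helper name, and rejecting an empty reply is an added assumption (an empty string is a prefix of every label).

```typescript
function matchUniquePrefix(reply: string, labels: string[]): string | undefined {
  const text = reply.trim().toLowerCase();
  // Assumption: an empty reply never selects an option.
  if (text.length === 0) return undefined;
  const hits = labels.filter((label) => label.toLowerCase().startsWith(text));
  // Resolve only when exactly one label starts with the typed text; ambiguity falls through.
  return hits.length === 1 ? hits[0] : undefined;
}
```

Returning `undefined` for two or more hits leaves ambiguous replies to the caller's freeform fallback instead of guessing an option.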
…m the chat UI
The buildCurrentContextPlanApprovalPrompt and buildCleanContextPlanApprovalPrompt builders used a generic "The user approved the plan below" framing that did not name Codex or attribute the approval to the chat-UI button click. That left Codex without a clear mental model of how the approval reached it, and led to a follow-up reply where Codex mis-attributed the button to the surrounding OpenClaw shell rather than accepting it as a real plan approval. Update both builders to use first-person framing: "I (Codex) just received an ... button click from the OpenClaw chat UI. That button is OpenClaw routing the user's plan approval back into this Codex thread — it is not a command the user typed." The remainder of the prompt (execute the plan, re-read files, verify) is unchanged. Add tests that pin the new wording for both the current-context and clean-context approval flows.
(cherry picked from commit 32272cf)
…atch hook
The 15eda36 commit (resolve typed input replies) removed a closing paren and trailing comma in dispatch-from-config.ts, breaking the traceReplyPhase arrow-function call site. Restore the missing paren to make the function-call expression syntactically valid.
The 15eda36 commit (resolve typed input replies) added a before_dispatch handler in codex/index.ts that calls answerCodexUserInputFreeform with channel: event.channel ?? ctx.channelId. The freeform helper requires a non-null channel string. The current branch has an explicit channel guard; the cherry-pick lost the guard during conflict resolution. Restore the guard, and reformat the test assertion block to match.
Summary
Consolidates the Codex plan-controls work into a single PR covering the native app-server conversation binding plus all chat-side plan and user-input controls. A follow-up PR (#4) adds interactive approval routing.
Companion PR: #4 — bound-conversation approval routing through the OpenClaw gateway.
Features
Native conversation binding
- …approvalPolicy, sandbox, serviceTier, model-backed reviewer)
Plan mode and reasoning effort
- /codex plan [on|off|status] toggles collaborationMode between default and plan for bound threads
- /codex think [plan|execute] [default|minimal|low|medium|high|xhigh|status] sets per-mode reasoning defaults
Live progress and progress delivery
- /codex live [on|off|status] causes bound turns to emit standalone progress/commentary messages
Chat plan approval controls
- …<proposed_plan> in assistant text
User-input chat controls
- …item/userInput (or request_user_input) prompts to chat buttons and freeform replies
- before_dispatch hook routes typed input replies before normal steering
Changes
- extensions/codex/src/conversation-control.ts: plan/think/live command parsing and binding state
- extensions/codex/src/conversation-binding.ts: bound turn loop, progress delivery, chat controls wiring
- extensions/codex/src/conversation-chat-controls.ts: plan approval and user-input control state
- extensions/codex/src/conversation-progress-reply.ts: standalone progress message delivery
- extensions/codex/src/app-server/reasoning-defaults.ts: per-mode reasoning defaults
- extensions/codex/src/app-server/session-binding.ts: collaboration mode and reasoning persistence
- extensions/codex/src/app-server/user-input-bridge.ts: request_user_input to chat bridge
- extensions/codex/src/app-server/user-input-shared.ts: shared user-input helpers
- extensions/codex/src/command-handlers.ts: local plan/think command handlers
- extensions/codex/index.ts: bound inbound claim, freeform dispatch hook, progress reply channel
- src/auto-reply/reply/dispatch-from-config.ts: verbose progress lane, before_dispatch freeform hook
- ui/src/ui/views/overview.ts: removed Codex demo widget usage (widgets preserved)
Verification
- pnpm tsgo:extensions passes
- pnpm build passes
- …extensions/codex/src/conversation-binding.test.ts, extensions/codex/src/app-server/user-input-bridge.test.ts, and related modules
- …extensions/codex/index.test.ts, extensions/codex/src/app-server/config.test.ts, extensions/codex/src/app-server/thread-lifecycle.test.ts are documented as unrelated to this change (same on the current branch)