Summary
Audit of the two upstream CLIs we wrap — google-gemini/gemini-cli and openai/codex — against our current executor code (packages/gemini-mcp/src/utils/geminiExecutor.ts, packages/codex-mcp/src/utils/codexExecutor.ts, plus their constants.ts).
Bottom line: no immediate breaking changes. Our flags and JSON schemas still match the upstream contracts. But there are several places where we are leaving value on the table, and one or two items that will bite us the next time Google or OpenAI does a rename.
Versions checked:
- gemini-cli: latest stable v0.38.2 (2026-04-17); preview v0.39.0-preview.0 (2026-04-14); nightly v0.40.0-nightly (2026-04-15)
- codex: latest stable rust-v0.122.0 (2026-04-20); alphas through v0.123.0-alpha.6 (2026-04-21)
1. Gemini CLI
1.1 No breaking changes to flags we use
We currently pass -m, -s, -p, --output-format stream-json, --resume, --include-directories. All still supported in v0.38.2 / v0.39.0-preview. The headless mode contract (JSON object with response / stats / error) is unchanged.
1.2 Breaking-change watchlist (not yet triggered)
- Model alias graduation. Our default is hard-coded to gemini-3.1-pro-preview, with a fallback of gemini-3-flash-preview (packages/gemini-mcp/src/constants.ts:25). Upstream discussion #19724 shows that gemini-3.1-pro / gemini-3-flash (no -preview suffix) are already referenced in Auto routing. When Google drops the preview suffix, our hard-coded defaults will silently keep hitting a deprecated endpoint until it is retired, then break. Mitigation: document the ASK_GEMINI_MODEL / ASK_GEMINI_FALLBACK_MODEL env vars more prominently and bump the defaults once the non-preview aliases become the stable names.
- Tool-call events in stream-json. Per the headless mode docs, the stream-json event stream includes tool_use and tool_result events in addition to init / message / result / error. Our parser (parseGeminiStreamJsonl, geminiExecutor.ts:265) silently drops those event types. Not broken today — we only care about the final assistant text — but if Google promotes tool events to required for progress reporting, we'll lose visibility.
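A tolerant line parser is cheap insurance here. A minimal sketch, assuming only that each stream-json line is a JSON object with a string type field (this is not the real parseGeminiStreamJsonl, and payload fields beyond type are not modeled):

```typescript
// Hypothetical sketch: accept any JSON object with a string `type`,
// so tool_use / tool_result events survive instead of being dropped.
interface GeminiStreamEvent {
  type: string; // e.g. "init", "message", "result", "error", "tool_use", "tool_result"
  [k: string]: unknown;
}

function parseStreamJsonLine(line: string): GeminiStreamEvent | null {
  const trimmed = line.trim();
  if (!trimmed) return null;
  try {
    const event = JSON.parse(trimmed);
    if (event && typeof event.type === "string") {
      return event as GeminiStreamEvent;
    }
  } catch {
    // Non-JSON noise on stdout: skip the line rather than abort the stream.
  }
  return null;
}
```

Returning typed-but-unknown events instead of null means later consumers (e.g. progress forwarding) can opt in without another parser change.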
1.3 Improvement opportunities
- Surface structured exit codes. Gemini CLI headless mode defines 0 = success, 1 = general error, 42 = input error, 53 = turn-limit exceeded. commandExecutor.ts:136 swallows the code and just returns a generic "Failed with exit code N" string. We could map these to typed errors so ask-gemini callers get a meaningful TurnLimitExceeded vs InputError distinction instead of string matching.
- Auto routing. gemini -m auto is now the default interactive behaviour and picks between Pro and Flash per prompt. Exposing auto as a model choice would let users opt into upstream routing without having to pick the preview alias. It would also cleanly sidestep the preview-suffix graduation issue above.
- Tool event forwarding. Extending makeStreamingProgressForwarder (geminiExecutor.ts:334) to emit tool_use / tool_result events as progress lines would give a much better live UX in /multi-review and /brainstorm when Gemini uses its built-in tools.
- /memory inbox + background skill extraction (v0.38/v0.39). Gemini now extracts reusable "skills" in the background and surfaces them via /memory inbox. For our review/brainstorm flows this is a free quality boost with no code change — but we should document it so users enable it, and potentially plumb the session feature through includeDirs so skills persist across review invocations.
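The exit-code mapping from the first bullet could look like the following sketch. The class names are ours (hypothetical); only the numeric codes come from the headless docs:

```typescript
// Map Gemini CLI headless exit codes (0/1/42/53 per the docs) to typed
// errors so callers can branch on instanceof instead of string matching.
class GeminiInputError extends Error {}
class GeminiTurnLimitError extends Error {}

function classifyGeminiExit(code: number): Error | null {
  switch (code) {
    case 0:
      return null; // success
    case 42:
      return new GeminiInputError("gemini exited 42 (input error)");
    case 53:
      return new GeminiTurnLimitError("gemini exited 53 (turn limit exceeded)");
    default:
      return new Error(`gemini exited ${code} (general error)`);
  }
}
```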
1.4 Known upstream bug we should track
- Historical flakiness of --output-format json (issue #11184, closed): response sometimes contained escaped-string JSON rather than a JSON object. We already dodge this by using stream-json by default, but ask-gemini-edit and ask-gemini fall back to parseGeminiJsonOutput in the !parsedAnyEvent branch (geminiExecutor.ts:312). Worth a regression test against the latest CLI to confirm the fallback still works.
2. Codex CLI
2.1 No breaking changes to flags or JSONL schema
We pass exec, resume, --skip-git-repo-check, --ephemeral, --full-auto, --json, -m. All present in v0.122.0. The JSONL event shape we parse — thread.started → thread_id, item.completed with item.type === "agent_message" + item.text, turn.completed.usage with input_tokens / cached_input_tokens / output_tokens — is documented as stable in the current exec --json cheatsheet.
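As a reference for that contract, here is a standalone toy that reduces a JSONL transcript to the three fields we rely on (not our production parser; field names are the ones documented above):

```typescript
// Reduce a codex exec --json transcript to thread id, final agent
// message, and token usage, per the documented event shapes.
interface CodexRunSummary {
  threadId?: string;
  agentText?: string;
  usage?: { input_tokens: number; cached_input_tokens: number; output_tokens: number };
}

function summarizeCodexJsonl(lines: string[]): CodexRunSummary {
  const out: CodexRunSummary = {};
  for (const line of lines) {
    let ev: any;
    try {
      ev = JSON.parse(line);
    } catch {
      continue; // skip non-JSON noise
    }
    if (ev.type === "thread.started") out.threadId = ev.thread_id;
    else if (ev.type === "item.completed" && ev.item?.type === "agent_message") out.agentText = ev.item.text;
    else if (ev.type === "turn.completed") out.usage = ev.usage;
  }
  return out;
}
```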
2.2 Useful new flags in v0.122.0 we are not using
--ignore-user-config and --ignore-rules on codex exec. These make an exec run deterministic and independent of the user's ~/.codex/config.toml / project AGENTS.md — exactly what an MCP wrapper wants for reproducible review output. I'd recommend adding these to buildArgs in codexExecutor.ts:129 for the ephemeral (non-sessionId) path, behind an opt-out env var so users who deliberately customize config keep their overrides.
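The change could be as small as a guard in the arg builder. A sketch of the proposal; both the helper name and the ASK_CODEX_RESPECT_USER_CONFIG variable are ours, not existing code:

```typescript
// Proposed: add determinism flags on the ephemeral path unless the
// user opts out via ASK_CODEX_RESPECT_USER_CONFIG=1 (a name we'd introduce).
function buildEphemeralFlags(env: Record<string, string | undefined>): string[] {
  if (env.ASK_CODEX_RESPECT_USER_CONFIG === "1") {
    return []; // user explicitly wants their ~/.codex/config.toml honoured
  }
  return ["--ignore-user-config", "--ignore-rules"];
}
```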
2.3 Missing metadata in exec --json (upstream issue)
- openai/codex#14736 (still open) points out that exec --json does not emit the resolved model name anywhere in the event stream. Our buildUsageStats (codexExecutor.ts:41) compensates by using the requested model (or our hard-coded fallback) as model. This is correct only because we control fallback client-side; if OpenAI ever adds server-side routing (it already exists for ChatGPT accounts), our usage.model will be wrong. Worth following this issue; when it lands, switch buildUsageStats to prefer the event-reported model.
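When the issue lands, the switch is one nullish coalescing away. A hypothetical shape (not the real buildUsageStats signature):

```typescript
// Prefer an event-reported model over the requested one once the
// JSONL stream carries it (see openai/codex#14736); fall back today.
interface UsageEvent {
  model?: string; // not emitted by exec --json today
  input_tokens: number;
  output_tokens: number;
}

function buildUsageStatsSketch(requestedModel: string, ev: UsageEvent) {
  return {
    model: ev.model ?? requestedModel,
    inputTokens: ev.input_tokens,
    outputTokens: ev.output_tokens,
  };
}
```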
2.4 Model catalog is still in sync
Our defaults (gpt-5.4 default, gpt-5.4-mini fallback — codex-mcp/src/constants.ts:16) match the current Codex stable catalog. Note: the gpt-5.1-codex-* family was deprecated across GitHub Copilot on 2026-04-01, but those were never our defaults, so no action needed. The ChatGPT-account-only restriction on gpt-5.4 (issue #14181) is a pre-existing constraint for users on API-key auth — no CLI change needed on our side.
2.5 Chat Completions API deprecation (medium-term)
Codex docs note Chat Completions API support will be removed "in future releases." Since we only shell out to codex exec and let the CLI handle API transport, this should not affect us — but worth a note in CHANGELOG.md so contributors know not to reach past the CLI.
2.6 Neutral / TUI-only (no action)
v0.121.0 / v0.122.0 additions that do NOT affect our wrapper: codex marketplace add, codex app desktop integration, /side TUI command, tabbed plugin browsing, memory mode TUI controls, devcontainer bubblewrap profile, reverse-search prompt history. These are all interactive-mode features and codex exec --json ignores them.
3. Proposed action items
Ordered by effort/payoff ratio:
- [Easy, high value] Add --ignore-user-config + --ignore-rules to the ephemeral codex exec path in codex-mcp/src/utils/codexExecutor.ts behind an opt-out (e.g. ASK_CODEX_RESPECT_USER_CONFIG=1). Makes review output reproducible across dev machines.
- [Easy, medium value] Surface Gemini's documented exit codes (42 = input error, 53 = turn limit) as typed errors in packages/shared/src/commandExecutor.ts.
- [Medium, medium value] Teach parseGeminiStreamJsonl and makeStreamingProgressForwarder about tool_use / tool_result events so progress output reflects what Gemini is actually doing during long runs.
- [Easy, low value today, avoids future breakage] Document the ASK_GEMINI_MODEL / ASK_CODEX_MODEL overrides prominently in the README. Add a CI check that pulls Gemini's supported-model list and warns if our default is no longer in it.
- [Tracking only] Watch openai/codex#14736; when model is added to the JSONL events, have buildUsageStats prefer the event-reported model over the requested one.
- [Tracking only] Watch for Gemini 3.1 Pro / Gemini 3 Flash graduating out of -preview; bump the default in gemini-mcp/constants.ts when that happens.
No hotfix required — none of these are user-visible regressions today.
Report generated from an upstream release audit. Sources: gemini-cli releases, codex releases, codex changelog, upstream issues linked inline.