Upstream CLI drift report: gemini-cli v0.38/v0.39 &amp; codex v0.122 — no hard breakage, several improvement opportunities

## Summary

Audit of the two upstream CLIs we wrap — [`google-gemini/gemini-cli`](https://github.com/google-gemini/gemini-cli) and [`openai/codex`](https://github.com/openai/codex) — against our current executor code (`packages/gemini-mcp/src/utils/geminiExecutor.ts`, `packages/codex-mcp/src/utils/codexExecutor.ts`, plus their `constants.ts`).

**Bottom line: no immediate breaking changes.** Our flags and JSON schemas still match the upstream contracts. But there are several places where we are leaving value on the table, and one or two items that will bite us the next time Google or OpenAI does a rename.

Versions checked:
- `gemini-cli` latest stable **v0.38.2** (2026-04-17); preview **v0.39.0-preview.0** (2026-04-14); nightly **v0.40.0-nightly** (2026-04-15)
- `codex` latest stable **rust-v0.122.0** (2026-04-20); alphas through **v0.123.0-alpha.6** (2026-04-21)

---

## 1. Gemini CLI

### 1.1 No breaking changes to flags we use
We currently pass `-m`, `-s`, `-p`, `--output-format stream-json`, `--resume`, `--include-directories`. All still supported in v0.38.2 / v0.39.0-preview. The headless mode contract (JSON object with `response` / `stats` / `error`) is unchanged.

### 1.2 Breaking-change watchlist (not yet triggered)
- **Model alias graduation.** Our default is hard-coded to `gemini-3.1-pro-preview` and fallback to `gemini-3-flash-preview` (`packages/gemini-mcp/src/constants.ts:25`). Upstream [discussion #19724](https://github.com/google-gemini/gemini-cli/discussions/19724) shows that `gemini-3.1-pro` / `gemini-3-flash` (no `-preview` suffix) are already referenced in Auto routing. When Google drops the preview suffix, our hard-coded defaults will silently keep hitting a deprecated endpoint until it's retired, then break. **Mitigation:** document the env vars `ASK_GEMINI_MODEL` / `ASK_GEMINI_FALLBACK_MODEL` more prominently and bump defaults once the non-preview aliases are the stable names.
- **Tool-call events in `stream-json`.** Per the [headless mode docs](https://github.com/google-gemini/gemini-cli/blob/HEAD/docs/cli/headless.md), the `stream-json` event stream includes `tool_use` and `tool_result` events in addition to `init` / `message` / `result` / `error`. Our parser (`parseGeminiStreamJsonl`, `geminiExecutor.ts:265`) silently drops those event types. Not broken today — we only care about final assistant text — but if Google promotes tool events to required for progress reporting, we'll lose visibility.

### 1.3 Improvement opportunities
- **Surface structured exit codes.** Gemini CLI headless mode defines: `0` success, `1` general error, `42` input error, `53` turn-limit exceeded. `commandExecutor.ts:136` swallows the code and just returns a generic `"Failed with exit code N"` string. We could map these to typed errors so `ask-gemini` callers get a meaningful `TurnLimitExceeded` vs `InputError` distinction instead of string matching.
- **Auto routing.** `gemini -m auto` is now the default interactive behaviour and picks between Pro and Flash per-prompt. Exposing `auto` as a model choice would let users opt into upstream routing without having to pick the preview alias. This would also cleanly sidestep the preview-suffix graduation issue above.
- **Tool event forwarding.** Extending `makeStreamingProgressForwarder` (`geminiExecutor.ts:334`) to emit `tool_use` / `tool_result` events as progress lines would give a much better live UX in `/multi-review` and `/brainstorm` when Gemini uses its built-in tools.
- **`/memory inbox` + background skill extraction (v0.38/v0.39).** Gemini now extracts reusable "skills" in the background and surfaces them via `/memory inbox`. For our review/brainstorm flows this is a free quality boost with no code change — but we should document it so users enable it, and potentially plumb the session feature through `includeDirs` so skills persist across review invocations.

### 1.4 Known upstream bug we should track
- Historical flakiness of `--output-format json` ([issue #11184](https://github.com/google-gemini/gemini-cli/issues/11184) — closed): `response` sometimes contained escaped-string JSON rather than a JSON object. We already dodge this by using `stream-json` by default, but `ask-gemini-edit` and `ask-gemini` fall back to `parseGeminiJsonOutput` in the `!parsedAnyEvent` branch (`geminiExecutor.ts:312`). Worth a regression test with the latest CLI to confirm the fallback still works.

---

## 2. Codex CLI

### 2.1 No breaking changes to flags or JSONL schema
We pass `exec`, `resume`, `--skip-git-repo-check`, `--ephemeral`, `--full-auto`, `--json`, `-m`. All present in v0.122.0. The JSONL event shape we parse — `thread.started` → `thread_id`, `item.completed` with `item.type === "agent_message"` + `item.text`, `turn.completed.usage` with `input_tokens` / `cached_input_tokens` / `output_tokens` — is documented as stable in the current [exec --json cheatsheet](https://takopi.dev/reference/runners/codex/exec-json-cheatsheet/).

### 2.2 Useful new flags in v0.122.0 we are not using
- **`--ignore-user-config`** and **`--ignore-rules`** on `codex exec`. These make an exec run deterministic and independent of the user's `~/.codex/config.toml` / project AGENTS.md — exactly what an MCP wrapper wants for reproducible review output. I'd recommend adding these to `buildArgs` in `codexExecutor.ts:129` for the ephemeral (non-`sessionId`) path, behind an opt-out env var so users who deliberately customize config keep their overrides.

### 2.3 Missing metadata in `exec --json` (upstream issue)
- [openai/codex#14736](https://github.com/openai/codex/issues/14736) — still open — points out that `exec --json` does not emit the resolved model name anywhere in the event stream. Our `buildUsageStats` (`codexExecutor.ts:41`) compensates by using the requested model (or our hard-coded fallback) as `model`. This is correct only because we control fallback client-side; if OpenAI ever adds server-side routing (they already have it for ChatGPT accounts), our `usage.model` will be wrong. Worth following this issue; when it lands, switch `buildUsageStats` to prefer the event-reported model.

### 2.4 Model catalog is still in sync
Our defaults (`gpt-5.4` default, `gpt-5.4-mini` fallback — `codex-mcp/src/constants.ts:16`) match the current Codex stable catalog. Note: `gpt-5.1-codex-*` family was [deprecated across GitHub Copilot on 2026-04-01](https://github.blog/changelog/2026-04-03-gpt-5-1-codex-gpt-5-1-codex-max-and-gpt-5-1-codex-mini-deprecated/) but those were never our defaults, so no action needed. ChatGPT-account-only restriction on `gpt-5.4` ([issue #14181](https://github.com/openai/codex/issues/14181)) is a pre-existing constraint our users hit via API-key auth — no CLI change.

### 2.5 Chat Completions API deprecation (medium-term)
Codex [docs](https://developers.openai.com/codex/changelog) note Chat Completions API support will be removed "in future releases." Since we only shell out to `codex exec` and let the CLI handle API transport, this should not affect us — but worth a note in `CHANGELOG.md` so contributors know not to reach past the CLI.

### 2.6 Neutral / TUI-only (no action)
v0.121.0 / v0.122.0 additions that do NOT affect our wrapper: `codex marketplace add`, `codex app` desktop integration, `/side` TUI command, tabbed plugin browsing, memory mode TUI controls, devcontainer bubblewrap profile, reverse-search prompt history. These are all interactive-mode features and `codex exec --json` ignores them.

---

## 3. Proposed action items

Ordered by effort/payoff ratio:

1. **[Easy, high value]** Add `--ignore-user-config` + `--ignore-rules` to the ephemeral `codex exec` path in `codex-mcp/src/utils/codexExecutor.ts` behind an opt-out (e.g. `ASK_CODEX_RESPECT_USER_CONFIG=1`). Makes review output reproducible across dev machines.
2. **[Easy, medium value]** Surface Gemini's documented exit codes (42 = input error, 53 = turn limit) as typed errors in `packages/shared/src/commandExecutor.ts`.
3. **[Medium, medium value]** Teach `parseGeminiStreamJsonl` and `makeStreamingProgressForwarder` about `tool_use` / `tool_result` events so progress output reflects what Gemini is actually doing during long runs.
4. **[Easy, low value today, avoids future breakage]** Document `ASK_GEMINI_MODEL` / `ASK_CODEX_MODEL` overrides prominently in README. Add CI check that pulls Gemini's supported-model list and warns if our default is no longer in it.
5. **[Tracking only]** Watch [openai/codex#14736](https://github.com/openai/codex/issues/14736); when model is added to JSONL events, have `buildUsageStats` prefer the event-reported model over the requested one.
6. **[Tracking only]** Watch for Gemini 3.1 Pro / Gemini 3 Flash graduating out of `-preview`; bump the default in `gemini-mcp/constants.ts` when that happens.

No hotfix required — none of these are user-visible regressions today.

---

*Report generated from an upstream release audit. Sources: [gemini-cli releases](https://github.com/google-gemini/gemini-cli/releases), [codex releases](https://github.com/openai/codex/releases), [codex changelog](https://developers.openai.com/codex/changelog), upstream issues linked inline.*


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Upstream CLI drift report: gemini-cli v0.38/v0.39 & codex v0.122 — no hard breakage, several improvement opportunities #24

Summary

1. Gemini CLI

1.1 No breaking changes to flags we use

1.2 Breaking-change watchlist (not yet triggered)

1.3 Improvement opportunities

1.4 Known upstream bug we should track

2. Codex CLI

2.1 No breaking changes to flags or JSONL schema

2.2 Useful new flags in v0.122.0 we are not using

2.3 Missing metadata in `exec --json` (upstream issue)

2.4 Model catalog is still in sync

2.5 Chat Completions API deprecation (medium-term)

2.6 Neutral / TUI-only (no action)

3. Proposed action items

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Upstream CLI drift report: gemini-cli v0.38/v0.39 &amp; codex v0.122 — no hard breakage, several improvement opportunities #24

Description

Summary

1. Gemini CLI

1.1 No breaking changes to flags we use

1.2 Breaking-change watchlist (not yet triggered)

1.3 Improvement opportunities

1.4 Known upstream bug we should track

2. Codex CLI

2.1 No breaking changes to flags or JSONL schema

2.2 Useful new flags in v0.122.0 we are not using

2.3 Missing metadata in exec --json (upstream issue)

2.4 Model catalog is still in sync

2.5 Chat Completions API deprecation (medium-term)

2.6 Neutral / TUI-only (no action)

3. Proposed action items

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

Upstream CLI drift report: gemini-cli v0.38/v0.39 & codex v0.122 — no hard breakage, several improvement opportunities #24

2.3 Missing metadata in `exec --json` (upstream issue)