fix(ai): clamp GPT-5.5 context cap#767
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4675946895
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| } | ||
| out.push(enrichModelThinking(item as Model<TApi>)); | ||
| } | ||
| applyGeneratedModelPolicies(out as Model<Api>[]); |
There was a problem hiding this comment.
Apply GPT-5.5 clamp after all merges
This call only normalizes the static list (and the fresh-cache fast path). In the online/offline merge path, fetched dynamic rows and stale cache rows are normalized with normalizeModelList and merged afterward; mergeDynamicModel then takes dynamicModel.contextWindow via preferDiscoveryLimit and only calls enrichModelThinking, so a discovered or cached gpt-5.5 with 1_100_000 tokens overwrites the 400_000 value set here. That leaves the resolved model over cap for the exact online discovery scenario covered by the new test, so the policy needs to run on the final merged result or inside the dynamic/cache normalization path.
Useful? React with 👍 / 👎.
|
REQUEST_CHANGES Red-team review found merge blockers:
Not merge-ready until CI is green and the clamp is applied after dynamic merge or otherwise made authoritative without clobbering static transport/baseUrl overrides or pro models. — |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: dbd88d3a3c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| } | ||
| out.push(enrichModelThinking(item as Model<TApi>)); | ||
| } | ||
| applyGeneratedModelPolicies(out as Model<Api>[]); |
There was a problem hiding this comment.
Preserve cached explicit thinking metadata
When the fresh-cache fingerprint fast path returns cached dynamic rows, this new policy pass routes them through applyGeneratedModelPolicies, which calls refreshModelThinking and ignores any explicit thinking saved in the cache. For dynamic providers such as Ollama that discover and cache endpoint-provided thinking ranges, a warm cache can widen a model from its advertised range (for example minimal..high) to the generic inferred OpenAI Responses range (minimal..xhigh), causing later requests to send unsupported effort levels. Apply the generated context correction without refreshing explicit thinking for cache/custom rows.
Useful? React with 👍 / 👎.
| contextPromotionTarget: dynamicModel.contextPromotionTarget ?? existingModel.contextPromotionTarget, | ||
| }); | ||
| const policyModels = [merged as Model<Api>]; | ||
| applyGeneratedModelPolicies(policyModels); |
There was a problem hiding this comment.
Clamp dynamic-only GPT-5.5 rows
This policy call only runs when a dynamic row matches an existing model and reaches mergeDynamicModel; providers with no static row for the id still return or insert the dynamic model directly, so a discovered/cache-only gpt-5.5 from configured providers or empty-catalog providers like vLLM/LM Studio can keep the stale context window. Fresh evidence in this revision is that the new policy application is scoped to the matched-model helper rather than the final merged list or dynamic normalization path.
Useful? React with 👍 / 👎.
|
REQUEST_CHANGES Revalidated follow-up commit Resolved:
Still blocked:
Required fix: make the GPT-5.5 cap apply only to the exact base model (not This supersedes my previous REQUEST_CHANGES comment only for the dynamic-merge clamp and transport/baseUrl items; the pro/unknown-suffix blocker remains. — |
|
REQUEST_CHANGES Final revalidation of fixed head
Required before merge: make the GPT-5.5 context cap exact-base-only, and add regression coverage that — |
|
Pushed follow-up commit Evidence:
— |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6a1ee281ad
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| } | ||
| out.push(enrichModelThinking(item as Model<TApi>)); | ||
| } | ||
| applyGeneratedModelPolicies(out as Model<Api>[]); |
There was a problem hiding this comment.
Apply the clamp to bundled startup models
This policy pass only covers model lists that go through resolveProviderModels. The coding-agent startup path loads built-ins directly in #loadBuiltInModels via getBundledModels (packages/coding-agent/src/config/model-registry.ts:1149-1155), and the bundled openai/gpt-5.5 row still has contextWindow: 1050000 plus a promotion target to openai/gpt-5.4, so before discovery runs (or when discovery is skipped for missing credentials) users selecting openai/gpt-5.5 still hit the stale cap/demotion behavior this change is meant to fix. Apply the policy at bundled-model load time or regenerate the catalog as well.
Useful? React with 👍 / 👎.
|
|
||
| function parseOpenAIModel(modelId: string): OpenAIModel | null { | ||
| const match = /gpt-(\d+(?:\.\d+){0,2})(?:-(codex-spark|codex-mini|codex-max|codex|mini|max|nano))?\b/.exec(modelId); | ||
| const match = /gpt-(\d+(?:\.\d+){0,2})(?:-(codex-spark|codex-mini|codex-max|codex|mini|max|nano))?$/.exec(modelId); |
There was a problem hiding this comment.
Preserve parsing for date-pinned GPT models
Anchoring the OpenAI parser to the end of the id makes date-stamped aliases such as gpt-5-mini-2025-08-07 and gpt-5.1-2025-11-13 stop parsing as GPT-5/GPT-5.1 models, even though the bundled catalog declares their thinking range only up to high. When enrichModelThinking sees them as unknown, the openai-responses fallback advertises xhigh, so these catalog rows can now allow and send unsupported reasoning levels; keep accepting release-date suffixes while still excluding the gpt-5.5 variants that should not be clamped.
Useful? React with 👍 / 👎.
6a1ee28 to
8d2912c
Compare
|
Pushed follow-up commit Evidence:
— |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8d2912c573
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| function parseOpenAIModel(modelId: string): OpenAIModel | null { | ||
| const match = /gpt-(\d+(?:\.\d+){0,2})(?:-(codex-spark|codex-mini|codex-max|codex|mini|max|nano))?\b/.exec(modelId); | ||
| const match = /gpt-(\d+(?:\.\d+){0,2})(?:-(codex-spark|codex-mini|codex-max|codex|mini|max|nano))?$/.exec(modelId); |
There was a problem hiding this comment.
Preserve parsing for suffixed GPT aliases
Beyond the already noted date-stamped IDs, this anchor also makes bundled non-date aliases such as openai/gpt-5-pro and cursor/gpt-5.2-high parse as unknown whenever provider model resolution runs. Because passModelList now refreshes thinking via applyGeneratedModelPolicies, those rows lose their explicit catalog ranges (gpt-5-pro max high, gpt-5.2-high max xhigh) and fall back to API defaults, so selection can advertise/send unsupported effort levels or clamp away supported ones for those built-in aliases.
Useful? React with 👍 / 👎.
|
Pushed follow-up commit Evidence:
— |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 06ded22496
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| } | ||
| out.push(enrichModelThinking(item as Model<TApi>)); | ||
| } | ||
| applyGeneratedModelPolicies(out as Model<Api>[]); |
There was a problem hiding this comment.
Clear stale GPT-5.5 promotion targets
When this new policy pass clamps a bundled/cache gpt-5.5 row, it leaves any existing contextPromotionTarget untouched; the bundled OpenAI/LiteLLM/OpenCode Zen rows still point at */gpt-5.4. With contextPromotion.enabled, AgentSession.#resolveContextPromotionConfiguredTarget follows that explicit target and accepts it if the target window is larger than the now-400K current window, so an overflow on GPT-5.5 can still switch users to GPT-5.4 instead of staying on GPT-5.5 or compacting. Clear the generated GPT-5.4 target along with the cap unless it was user-overridden.
Useful? React with 👍 / 👎.
|
MERGE_READY Revalidated exact PR head This supersedes my stale REQUEST_CHANGES comments at Verified:
— |
Summary
Verification