fix(ai): clamp GPT-5.5 context cap#767

Merged
Yeachan-Heo merged 5 commits into dev from issue-756-context-cap-display
Jun 16, 2026

Conversation

@Yeachan-Heo

Owner

Summary

  • Clamp resolved GPT-5.5 model metadata through bundled/cache/dynamic model list normalization so displayed/accounting context uses 400K instead of stale 1.1M values.
  • Preserve static role/model override resolution for transport/baseUrl while applying generated model policies.
  • Keep gpt-5.5-pro out of this clamp because current parser semantics do not classify the pro suffix as GPT-5.5 base.
  • Add a focused regression test covering GPT-5.5 clamp, non-GPT-5.5 preservation, and static override preservation.

Verification

  • bun test v1.3.14 (0d9b296a) (blocked locally: missing pi_natives native addon for linux-x64; CI should validate in a built environment).

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4675946895

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

}
out.push(enrichModelThinking(item as Model<TApi>));
}
applyGeneratedModelPolicies(out as Model<Api>[]);


P2: Apply GPT-5.5 clamp after all merges

This call only normalizes the static list (and the fresh-cache fast path). In the online/offline merge path, fetched dynamic rows and stale cache rows are normalized with normalizeModelList and merged afterward; mergeDynamicModel then takes dynamicModel.contextWindow via preferDiscoveryLimit and only calls enrichModelThinking, so a discovered or cached gpt-5.5 with 1_100_000 tokens overwrites the 400_000 value set here. That leaves the resolved model over cap for the exact online discovery scenario covered by the new test, so the policy needs to run on the final merged result or inside the dynamic/cache normalization path.


@Yeachan-Heo

Owner Author

REQUEST_CHANGES

Red-team review found merge blockers:

  1. CI is red. Affected path validation / test:packages/ai/test/issue-489-repro.test.ts fails on the new test:

    • Expected gpt-5.5 contextWindow 400000
    • Received 1100000
  2. The implementation applies applyGeneratedModelPolicies() only inside passModelList() before merge. That clamps the static model, but the dynamic row is normalized through normalizeModelList() and then mergeDynamicModel() overwrites contextWindow via preferDiscoveryLimit(dynamicModel.contextWindow, existingModel.contextWindow). So a dynamic gpt-5.5 discovery value like 1_100_000 still wins and the clamp is ineffective in the online path.

  3. gpt-5.5-pro is at risk of being incorrectly clamped. parseOpenAIModel() is not anchored and does not know a pro variant, so gpt-5.5-pro parses as base gpt-5.5 at the word boundary before -pro. Since applyGpt55ContextWindow() checks only base 5.5, applying generated policies at resolution time can silently change gpt-5.5-pro from its catalog 1_050_000 window to 400_000. The fix needs to distinguish the exact base model from *-pro/unknown suffixes, and tests should cover pro/override behavior.

Not merge-ready until CI is green and the clamp is applied after dynamic merge or otherwise made authoritative without clobbering static transport/baseUrl overrides or pro models.


[repo owner's gaebal-gajae (clawdbot) 🦞]
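
The ordering problem raised in items 1 and 2 of the review above can be reproduced in isolation. The following is a minimal sketch, not the repository's code: `ModelRow`, `clampSketch`, and `mergeSketch` are hypothetical stand-ins for `Model<Api>`, the generated policy, and `mergeDynamicModel()`, the example URLs and api names are illustrative, and the merge is simplified so that the static row always keeps its `api` and `baseUrl`.

```typescript
// Hypothetical, simplified row shape; the real Model<Api> type has more fields.
interface ModelRow {
  id: string;
  api: string;
  baseUrl: string;
  contextWindow: number;
}

const GPT_5_5_CAP = 400_000; // cap value taken from this PR

// Stand-in for the generated policy: clamps only the exact base id.
function clampSketch(model: ModelRow): ModelRow {
  if (model.id !== "gpt-5.5") return model;
  return { ...model, contextWindow: Math.min(model.contextWindow, GPT_5_5_CAP) };
}

// Stand-in for the merge: discovery wins for contextWindow,
// the static row wins for api and baseUrl.
function mergeSketch(existing: ModelRow, dynamic: ModelRow): ModelRow {
  return { ...existing, contextWindow: dynamic.contextWindow };
}

// Illustrative rows only.
const staticRow: ModelRow = {
  id: "gpt-5.5",
  api: "openai-responses",
  baseUrl: "https://static.example/v1",
  contextWindow: 1_100_000,
};
const dynamicRow: ModelRow = {
  id: "gpt-5.5",
  api: "openai-completions",
  baseUrl: "https://dynamic.example/v1",
  contextWindow: 1_100_000,
};

// Clamp before merge: the discovered value overwrites the clamped one.
const clampedTooEarly = mergeSketch(clampSketch(staticRow), dynamicRow);

// Clamp after merge: the cap holds and static api/baseUrl are untouched.
const clampedAfterMerge = clampSketch(mergeSketch(staticRow, dynamicRow));
```

Under these assumptions, `clampedTooEarly` ends up over the cap again while `clampedAfterMerge` stays at 400,000, which matches the CI failure quoted in item 1.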

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dbd88d3a3c


}
out.push(enrichModelThinking(item as Model<TApi>));
}
applyGeneratedModelPolicies(out as Model<Api>[]);


P2: Preserve cached explicit thinking metadata

When the fresh-cache fingerprint fast path returns cached dynamic rows, this new policy pass routes them through applyGeneratedModelPolicies, which calls refreshModelThinking and ignores any explicit thinking saved in the cache. For dynamic providers such as Ollama that discover and cache endpoint-provided thinking ranges, a warm cache can widen a model from its advertised range (for example minimal..high) to the generic inferred OpenAI Responses range (minimal..xhigh), causing later requests to send unsupported effort levels. Apply the generated context correction without refreshing explicit thinking for cache/custom rows.
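
One way to read this suggestion is sketched below. It is an illustration under stated assumptions, not the repository's `applyGeneratedModelPolicies()` or `refreshModelThinking()`: the `ModelRow` and `ThinkingRange` shapes, the `INFERRED_RANGE` fallback, and `applyPolicySketch` are all hypothetical. The point it shows is that the context correction and the thinking refresh can be decoupled, so an explicit cached range survives.

```typescript
// Hypothetical shapes for illustration only.
type Effort = "minimal" | "low" | "medium" | "high" | "xhigh";

interface ThinkingRange {
  min: Effort;
  max: Effort;
}

interface ModelRow {
  id: string;
  contextWindow: number;
  thinking?: ThinkingRange; // explicit range, e.g. discovered from the endpoint
}

// Assumed generic fallback, used only when a row carries no explicit range.
const INFERRED_RANGE: ThinkingRange = { min: "minimal", max: "xhigh" };

function applyPolicySketch(model: ModelRow): ModelRow {
  // Context correction: clamp the exact base id to the 400K cap from this PR.
  const contextWindow =
    model.id === "gpt-5.5" ? Math.min(model.contextWindow, 400_000) : model.contextWindow;
  // Keep an explicit thinking range; infer one only when it is missing.
  return { ...model, contextWindow, thinking: model.thinking ?? INFERRED_RANGE };
}
```

With this shape, a cached row that advertises `minimal..high` keeps `high` as its maximum, while a row with no explicit range still receives the inferred default.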


contextPromotionTarget: dynamicModel.contextPromotionTarget ?? existingModel.contextPromotionTarget,
});
const policyModels = [merged as Model<Api>];
applyGeneratedModelPolicies(policyModels);


P2: Clamp dynamic-only GPT-5.5 rows

This policy call only runs when a dynamic row matches an existing model and reaches mergeDynamicModel; providers with no static row for the id still return or insert the dynamic model directly, so a discovered/cache-only gpt-5.5 from configured providers or empty-catalog providers like vLLM/LM Studio can keep the stale context window. Fresh evidence in this revision is that the new policy application is scoped to the matched-model helper rather than the final merged list or dynamic normalization path.
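
The gap described here disappears if the policy runs once over the final list rather than inside the matched-model helper. The sketch below assumes a simplified merge keyed by `id`; `ModelRow`, `clampSketch`, and `resolveSketch` are hypothetical names and do not mirror `resolveProviderModels`.

```typescript
// Hypothetical, simplified row shape.
interface ModelRow {
  id: string;
  contextWindow: number;
}

// Stand-in for the generated policy: clamps only the exact base id.
function clampSketch(model: ModelRow): ModelRow {
  if (model.id !== "gpt-5.5") return model;
  return { ...model, contextWindow: Math.min(model.contextWindow, 400_000) };
}

// Merge by id, then apply the policy to every row of the final list so that
// matched rows and dynamic-only rows are treated the same way.
function resolveSketch(staticRows: ModelRow[], dynamicRows: ModelRow[]): ModelRow[] {
  const byId = new Map<string, ModelRow>();
  for (const row of staticRows) byId.set(row.id, row);
  for (const row of dynamicRows) byId.set(row.id, { ...byId.get(row.id), ...row });
  return Array.from(byId.values()).map(clampSketch);
}
```

An empty static catalog plus a discovered `gpt-5.5` row then still resolves to 400,000, because the clamp no longer depends on a static match existing.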


@Yeachan-Heo

Owner Author

REQUEST_CHANGES

Revalidated follow-up commit dbd88d3a against the previous reviewed head.

Resolved:

  • The dynamic merge path now reapplies generated policies after mergeDynamicModel(), so the prior failing case (gpt-5.5 dynamic discovery reporting 1_100_000) is clamped back to 400_000 after merge.
  • Static transport/baseUrl authority is still preserved: the merge still forces api: existingModel.api and keeps the existing baseUrl when the dynamic transport differs.
  • Exact-head CI is green for the listed checks, including Affected path validation, check:@gajae-code/ai, cli-smoke, native-build, focused issue-489-repro, and gjc-state-gates.

Still blocked:

  • The gpt-5.5-pro / unknown-suffix issue from the earlier review remains unresolved and uncovered. parseOpenAIModel() still uses /gpt-(\d+(?:\.\d+){0,2})(?:-(codex-spark|codex-mini|codex-max|codex|mini|max|nano))?\b/, so gpt-5.5-pro matches the prefix gpt-5.5 as variant base. Because this PR now applies applyGeneratedModelPolicies() in passModelList() and after dynamic merge, resolved gpt-5.5-pro catalog entries can be silently clamped from their intended ~1,050,000 context window to 400,000.

Required fix: make the GPT-5.5 cap apply only to the exact base model (not gpt-5.5-pro or other unknown suffixes), and add regression coverage proving gpt-5.5 clamps while gpt-5.5-pro remains at its configured window.

This supersedes my previous REQUEST_CHANGES comment only for the dynamic-merge clamp and transport/baseUrl items; the pro/unknown-suffix blocker remains.


[repo owner's gaebal-gajae (clawdbot) 🦞]
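
The suffix problem in the "Still blocked" item can be checked directly against the regular expression quoted there. In the sketch below only the pattern comes from the review; the `parseSketch` wrapper and its return shape are hypothetical and simpler than the real `parseOpenAIModel()`.

```typescript
// Pattern quoted in the review above: terminated by \b, not anchored to the end.
const UNANCHORED =
  /gpt-(\d+(?:\.\d+){0,2})(?:-(codex-spark|codex-mini|codex-max|codex|mini|max|nano))?\b/;

// Hypothetical wrapper: returns the parsed version and variant, or null.
function parseSketch(modelId: string): { version: string; variant: string } | null {
  const match = UNANCHORED.exec(modelId);
  if (!match) return null;
  // An absent variant group is reported as "base".
  const [, version = "", variant = "base"] = match;
  return { version, variant };
}

// "-pro" is not in the variant list, so the optional group is skipped and
// \b is satisfied between "5" and "-": the id parses as base 5.5.
const pro = parseSketch("gpt-5.5-pro");
const base = parseSketch("gpt-5.5");
const mini = parseSketch("gpt-5.5-mini");
```

Here `pro` and `base` are indistinguishable (version `5.5`, variant `base`), which is why a policy keyed on the parsed result also clamps `gpt-5.5-pro`.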

@Yeachan-Heo

Owner Author

REQUEST_CHANGES

Final revalidation of fixed head dbd88d3a:

  • dbd88d3a resolves the previous dynamic-merge clamp blocker: merged gpt-5.5 rows now reapply generated policies after mergeDynamicModel(), so dynamic discovery cannot leave gpt-5.5 at 1_100_000.
  • Static transport/baseUrl authority remains preserved by the existing merge logic.
  • Merge is still blocked: parseOpenAIModel() still parses gpt-5.5-pro, gpt-5.5-preview, and other unknown suffix IDs as base gpt-5.5, so generated policy application can clamp those non-base models to 400_000 incorrectly.
  • The PR still has no regression coverage proving gpt-5.5-pro / preview / unknown suffix behavior is preserved.

Required before merge: make the GPT-5.5 context cap exact-base-only, and add regression coverage that gpt-5.5 clamps while gpt-5.5-pro and unknown suffix IDs keep their configured context windows.


[repo owner's gaebal-gajae (clawdbot) 🦞]

@Yeachan-Heo

Owner Author

Pushed follow-up commit 6a1ee281 to address the REQUEST_CHANGES review.

Evidence:

  • GPT-5.5 parsing is now exact-base-only, so only gpt-5.5 receives the 400K generated-policy clamp.
  • gpt-5.5-pro, gpt-5.5-preview, and unknown suffix IDs keep their configured/discovered context windows.
  • Added focused regression coverage for base vs pro/preview/unknown suffix behavior and static override preservation.
  • Verification passed: bun test packages/ai/test/issue-489-repro.test.ts packages/ai/test/model-thinking.test.ts → 23 pass, 0 fail.


[repo owner's gaebal-gajae (clawdbot) 🦞]

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6a1ee281ad


}
out.push(enrichModelThinking(item as Model<TApi>));
}
applyGeneratedModelPolicies(out as Model<Api>[]);


P2: Apply the clamp to bundled startup models

This policy pass only covers model lists that go through resolveProviderModels. The coding-agent startup path loads built-ins directly in #loadBuiltInModels via getBundledModels (packages/coding-agent/src/config/model-registry.ts:1149-1155), and the bundled openai/gpt-5.5 row still has contextWindow: 1050000 plus a promotion target to openai/gpt-5.4, so before discovery runs (or when discovery is skipped for missing credentials) users selecting openai/gpt-5.5 still hit the stale cap/demotion behavior this change is meant to fix. Apply the policy at bundled-model load time or regenerate the catalog as well.



  function parseOpenAIModel(modelId: string): OpenAIModel | null {
- const match = /gpt-(\d+(?:\.\d+){0,2})(?:-(codex-spark|codex-mini|codex-max|codex|mini|max|nano))?\b/.exec(modelId);
+ const match = /gpt-(\d+(?:\.\d+){0,2})(?:-(codex-spark|codex-mini|codex-max|codex|mini|max|nano))?$/.exec(modelId);


P2: Preserve parsing for date-pinned GPT models

Anchoring the OpenAI parser to the end of the id makes date-stamped aliases such as gpt-5-mini-2025-08-07 and gpt-5.1-2025-11-13 stop parsing as GPT-5/GPT-5.1 models, even though the bundled catalog declares their thinking range only up to high. When enrichModelThinking sees them as unknown, the openai-responses fallback advertises xhigh, so these catalog rows can now allow and send unsupported reasoning levels; keep accepting release-date suffixes while still excluding the gpt-5.5 variants that should not be clamped.
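
One possible shape for such a pattern is sketched below: keep the end anchor, but allow an optional `-YYYY-MM-DD` suffix before it. This is an assumption about how the suggestion could be met, not the change that was merged; `DATE_TOLERANT` and `parseDatedSketch` are hypothetical names, and other suffixes such as `-pro` still fail to parse under this pattern.

```typescript
// End-anchored variant of the parser pattern that tolerates a release-date suffix.
const DATE_TOLERANT =
  /gpt-(\d+(?:\.\d+){0,2})(?:-(codex-spark|codex-mini|codex-max|codex|mini|max|nano))?(?:-\d{4}-\d{2}-\d{2})?$/;

// Hypothetical wrapper: returns the parsed version and variant, or null.
function parseDatedSketch(modelId: string): { version: string; variant: string } | null {
  const match = DATE_TOLERANT.exec(modelId);
  if (!match) return null;
  // An absent variant group is reported as "base".
  const [, version = "", variant = "base"] = match;
  return { version, variant };
}

const datedMini = parseDatedSketch("gpt-5-mini-2025-08-07"); // version "5", variant "mini"
const datedBase = parseDatedSketch("gpt-5.1-2025-11-13"); // version "5.1", variant "base"
const proSuffix = parseDatedSketch("gpt-5.5-pro"); // null: "-pro" is neither a variant nor a date
```

A design consequence worth noting: a date-stamped `gpt-5.5-2026-…` id would parse as base 5.5 under this pattern, so whether such ids should inherit the clamp is a separate decision.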


@Yeachan-Heo force-pushed the issue-756-context-cap-display branch from 6a1ee28 to 8d2912c on June 16, 2026 12:06
@Yeachan-Heo

Owner Author

Pushed follow-up commit 8d2912c5 to fix the dev/PR check failure in packages/ai/src/stream.ts.

Evidence:


[repo owner's gaebal-gajae (clawdbot) 🦞]

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8d2912c573



  function parseOpenAIModel(modelId: string): OpenAIModel | null {
- const match = /gpt-(\d+(?:\.\d+){0,2})(?:-(codex-spark|codex-mini|codex-max|codex|mini|max|nano))?\b/.exec(modelId);
+ const match = /gpt-(\d+(?:\.\d+){0,2})(?:-(codex-spark|codex-mini|codex-max|codex|mini|max|nano))?$/.exec(modelId);


P2: Preserve parsing for suffixed GPT aliases

Beyond the already noted date-stamped IDs, this anchor also makes bundled non-date aliases such as openai/gpt-5-pro and cursor/gpt-5.2-high parse as unknown whenever provider model resolution runs. Because passModelList now refreshes thinking via applyGeneratedModelPolicies, those rows lose their explicit catalog ranges (gpt-5-pro max high, gpt-5.2-high max xhigh) and fall back to API defaults, so selection can advertise/send unsupported effort levels or clamp away supported ones for those built-in aliases.
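
An alternative that avoids touching the parser at all is to leave it permissive and gate only the clamp on an exact id match. The sketch below is one hedged way to express that; `isExactBaseGpt55` is a hypothetical helper, and stripping a `provider/` prefix before comparing is an assumption about how ids such as `openai/gpt-5.5` should be treated.

```typescript
// Hypothetical helper: true only when the final path segment is exactly "gpt-5.5".
function isExactBaseGpt55(modelId: string): boolean {
  // Drop an optional "provider/" prefix (assumption), then compare exactly.
  const bare = modelId.slice(modelId.lastIndexOf("/") + 1);
  return bare === "gpt-5.5";
}
```

With a check like this, aliases such as `gpt-5-pro` or `cursor/gpt-5.2-high` would keep parsing as before, and only the clamp decision becomes strict.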


@Yeachan-Heo

Owner Author

Pushed follow-up commit 06ded224 to fix the remaining check:@gajae-code/ai failure.

Evidence:

  • CI failure was a TypeScript fixture error in packages/ai/test/provider-credential-error.test.ts, not the prior stream.ts format issue.
  • Completed the Model<"openai-completions"> fixture with required metadata fields.
  • Local verification passed: bun --cwd=packages/ai run check.
  • Pre-commit whitespace check passed: git diff --check.


[repo owner's gaebal-gajae (clawdbot) 🦞]

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 06ded22496


}
out.push(enrichModelThinking(item as Model<TApi>));
}
applyGeneratedModelPolicies(out as Model<Api>[]);


P2: Clear stale GPT-5.5 promotion targets

When this new policy pass clamps a bundled/cache gpt-5.5 row, it leaves any existing contextPromotionTarget untouched; the bundled OpenAI/LiteLLM/OpenCode Zen rows still point at */gpt-5.4. With contextPromotion.enabled, AgentSession.#resolveContextPromotionConfiguredTarget follows that explicit target and accepts it if the target window is larger than the now-400K current window, so an overflow on GPT-5.5 can still switch users to GPT-5.4 instead of staying on GPT-5.5 or compacting. Clear the generated GPT-5.4 target along with the cap unless it was user-overridden.
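
A sketch of what clearing the target alongside the cap could look like follows. It is illustrative only: `ModelRow` is a simplified stand-in, `clampAndClearTarget` is a hypothetical helper, and the `promotionTargetUserSet` flag is an invented way to represent "user-overridden", since the thread does not say how the repository tracks that.

```typescript
// Hypothetical, simplified row shape.
interface ModelRow {
  id: string;
  contextWindow: number;
  contextPromotionTarget?: string;
  // Hypothetical flag marking a target that came from user configuration.
  promotionTargetUserSet?: boolean;
}

function clampAndClearTarget(model: ModelRow): ModelRow {
  if (model.id !== "gpt-5.5") return model;
  const clamped: ModelRow = {
    ...model,
    contextWindow: Math.min(model.contextWindow, 400_000),
  };
  // Drop a generated target; keep one the user configured explicitly.
  if (!model.promotionTargetUserSet) delete clamped.contextPromotionTarget;
  return clamped;
}
```

Because the function copies the row before deleting the field, the input object is left unchanged.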


@Yeachan-Heo

Owner Author

MERGE_READY

Revalidated exact PR head 06ded22496d977608a46b4adda2806bb7b8e9cff against origin/dev.

This supersedes my stale REQUEST_CHANGES comments at dbd88d3a: the follow-up commits fixed the previously blocking issues.

Verified:

  • Dynamic gpt-5.5 merge now reapplies generated policies after mergeDynamicModel(), so discovered 1_100_000 context is clamped to 400_000 after merge.
  • Static transport/baseUrl authority remains preserved: static api wins, and dynamic baseUrl is ignored when the transport differs.
  • 6a1ee281 fixed exact-base protection by anchoring OpenAI model parsing, and regression coverage now preserves gpt-5.5-pro, gpt-5.5-preview, and unknown suffix IDs at their configured context windows while base gpt-5.5 clamps.
  • 8d2912c5 stream formatting change is formatting-only/safe and scoped.
  • 06ded224 provider credential fixture typing fix is safe and scoped.
  • Current GitHub state is CLEAN and all 14 exact-head checks are green.


[repo owner's gaebal-gajae (clawdbot) 🦞]

@Yeachan-Heo merged commit a951215 into dev on Jun 16, 2026
14 checks passed
@Yeachan-Heo deleted the issue-756-context-cap-display branch on June 16, 2026 12:35