
fix: make max_tokens configurable (closes #29) #78

Open
lefarcen wants to merge 5 commits into nexu-io:main from lefarcen:fix/configurable-max-tokens-29

Conversation

@lefarcen (Contributor)

Summary

Fixes an issue where users with Anthropic-compatible APIs (e.g., Xiaomi MiMo) hit the hardcoded 8192-token limit, causing design tasks that require large artifacts (15-30k tokens) to be truncated mid-generation.

Problem

User @majiabin2020 reported (#29) that when using Anthropic API + Xiaomi MiMo v2.5:

  • Design tasks would be interrupted mid-generation
  • Artifacts were incomplete (stop_reason: max_tokens)
  • Previews showed only background color, no prototype content

Root cause: src/providers/anthropic.ts:44 hardcoded max_tokens: 8192, but design tasks often require 15-30k tokens for complete artifacts.

Solution

Make max_tokens configurable via Settings:

  • Backend: Add AppConfig.maxTokens?: number field (defaults to 8192)
  • Provider: Update anthropic.ts to use cfg.maxTokens || 8192
  • UI: Add "Max Tokens" input in Settings (API mode only)
  • i18n: Add translations (EN/ZH-CN) with helpful hint explaining when to increase

Users can now configure maxTokens to match their provider's limits (e.g., 32768+ for large design tasks).
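
For context, here is a minimal sketch of the provider-side change; only cfg.maxTokens and the 8192 default come from this PR, the other names are illustrative:

```ts
// Sketch only: AppConfig.maxTokens and the 8192 fallback are from this
// PR; buildStreamBody and the payload shape are assumed for illustration.
interface AppConfig {
  model: string;
  maxTokens?: number; // new optional field; unset keeps old behavior
}

function buildStreamBody(cfg: AppConfig, messages: unknown[]) {
  return {
    model: cfg.model,
    messages,
    // Fall back to the previous hardcoded limit so existing configs
    // that omit maxTokens behave exactly as before.
    max_tokens: cfg.maxTokens || 8192,
    stream: true,
  };
}
```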

Testing

  • Existing behavior preserved: omitting maxTokens still defaults to 8192
  • New behavior works: setting maxTokens: 32768 in Settings allows larger artifacts
  • UI validation: input accepts 1024-200000, step 1024
  • i18n: both EN and ZH-CN translations display correctly

Closes

Closes #29

cc @PerishCode @majiabin2020

@lefarcen added the enhancement (New feature or request) label on Apr 29, 2026

@chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 50a9d145e5


(Outdated inline comment thread on src/components/SettingsDialog.tsx)
@lefarcen (Contributor, Author) left a comment


Hi @lefarcen! 👋 Nice fix for issue #29 — making max_tokens configurable is the right approach for users with Anthropic-compatible APIs that support higher limits.

Overall

Correctness: Logic is sound. Fallback to 8192 preserves existing behavior.
Security: No new attack surface (input validation covers the range).
UI/UX: Clear hint text, sensible min/max/step.
i18n: Both EN and ZH-CN translations present.

Lens A findings (code correctness)

Two P3 suggestions inline — both non-blocking, but would improve robustness.

(Outdated inline comment threads on src/components/SettingsDialog.tsx and src/providers/anthropic.ts)

@roberthgnz

Merge gate review summary:

Status: GO

Why:

  • Fixes real truncation issue caused by hardcoded max_tokens.
  • Change is localized and backward-compatible (default remains 8192).

Required checks before merge:

  1. pnpm typecheck
  2. API mode with default max tokens (8192) still works.
  3. API mode with custom max tokens (e.g. 32768) works.
  4. Validate settings input bounds and persistence behavior.

Risk:

  • Low to medium (API provider path only).

Rollback:

  • Revert this PR commit; fallback returns to hardcoded value.

@lefarcen (Contributor, Author)

Thanks for the structured review summary, @roberthgnz! 👍

Agreed on GO status — the two P3 suggestions I flagged earlier are nice-to-haves, not blockers. The core fix is sound and addresses the real truncation issue.

@lefarcen This is ready to merge from my side. If you want to address the P3 input validation suggestions (clamping in JS + guarding against negative values), feel free, but not required.
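
For the record, that P3 suggestion could look roughly like the following (hypothetical helper, not code from this PR; the bounds mirror the 1024-200000 UI validation described above):

```ts
// Hypothetical sanitizer for the Settings value; not code from this PR.
const MIN_TOKENS = 1024;
const MAX_TOKENS = 200_000;
const DEFAULT_TOKENS = 8192;

function sanitizeMaxTokens(raw: unknown): number {
  const n = typeof raw === "number" ? raw : Number(raw);
  // Guard against NaN, zero, and negative values by using the default.
  if (!Number.isFinite(n) || n <= 0) return DEFAULT_TOKENS;
  // Clamp into the range the input element already advertises.
  return Math.min(MAX_TOKENS, Math.max(MIN_TOKENS, Math.floor(n)));
}
```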

BYOK users on custom Anthropic-compatible providers (e.g. Xiaomi MiMo)
hit the hardcoded 8192 cap and saw artifacts truncated mid-stream.

- AppConfig.maxTokens with Settings input (EN/CN + 8 other locales)
- ProxyStreamRequest.maxTokens contract field
- anthropic, anthropic-compatible, and openai-compatible providers all
  forward cfg.maxTokens
- /api/proxy/anthropic/stream and /api/proxy/stream payloads honor it,
  defaulting to 8192 when unset so prior clients are unaffected

Original sketch by @mashu in nexu-io#78 (50a9d14); rebased to the apps/web
layout and extended to the proxy paths actually used when baseUrl is
set, which is the path nexu-io#29's user actually exercises.
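
A rough sketch of the proxy contract this describes (the maxTokens field name and the 8192 default follow the commit message; the handler shape is assumed):

```ts
// Sketch: ProxyStreamRequest.maxTokens is the contract field named above;
// toAnthropicPayload is an assumed name for the proxy's payload builder.
interface ProxyStreamRequest {
  model: string;
  messages: unknown[];
  maxTokens?: number; // optional, so prior clients remain valid
}

function toAnthropicPayload(req: ProxyStreamRequest) {
  return {
    model: req.model,
    messages: req.messages,
    // Older clients never send maxTokens; default to 8192 rather than
    // forwarding undefined, so their behavior is unchanged.
    max_tokens: req.maxTokens ?? 8192,
    stream: true,
  };
}
```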
@lefarcen force-pushed the fix/configurable-max-tokens-29 branch from 50a9d14 to 6a3ae5f on May 2, 2026 at 03:52
lefarcen added 4 commits on May 2, 2026 at 12:16
Adds a hand-maintained MODEL_MAX_TOKENS table (Claude 4.5 line → 64k,
mimo-v2.5-pro → 32k) and an effectiveMaxTokens helper layered over the
override field added in 6a3ae5f, so nexu-io#29's user — and others on supported
models — don't have to discover Settings to avoid mid-stream truncation.

- apps/web/src/state/maxTokens.ts: lookup + helpers
- providers/{anthropic,anthropic-compatible,openai-compatible}.ts:
  forward effectiveMaxTokens(cfg) instead of cfg.maxTokens ?? 8192
- SettingsDialog: input becomes an optional override (blank = default,
  shown as placeholder)
- 10 locale hint strings updated to the new semantics
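
The layered lookup might look like this (table contents follow the commit message; model ids and helper shape are illustrative):

```ts
// Sketch of the hand-maintained table plus helper; ids are illustrative.
const MODEL_MAX_TOKENS: Record<string, number> = {
  "claude-sonnet-4-5": 64_000, // Claude 4.5 line per the commit message
  "mimo-v2.5-pro": 32_000,
};

const FALLBACK_MAX_TOKENS = 8192;

interface AppConfig {
  model: string;
  maxTokens?: number; // Settings override; blank/unset means "use default"
}

// Priority: explicit user override → per-model table → 8192 fallback.
function effectiveMaxTokens(cfg: AppConfig): number {
  if (cfg.maxTokens && cfg.maxTokens > 0) return cfg.maxTokens;
  return MODEL_MAX_TOKENS[cfg.model] ?? FALLBACK_MAX_TOKENS;
}
```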
Replaces the 4-entry hand-rolled MODEL_MAX_TOKENS map from 544e67e with
a vendored slice of BerriAI/litellm's model_prices_and_context_window
JSON (1970 chat models, ~97KB raw / ~25KB gzip). Future model launches
land in maxTokens.ts via `pnpm sync-litellm-models` instead of manual
edits.

- scripts/sync-litellm-models.ts: fetches the upstream JSON, filters to
  chat-mode entries, projects each entry to its max_output_tokens (or
  max_tokens fallback), and writes a sorted, license-attributed JSON
- apps/web/src/state/litellm-models.json: generated artifact, committed
- apps/web/src/state/maxTokens.ts: lookup is now
  OVERRIDES → LITELLM_MODELS → FALLBACK_MAX_TOKENS. The OVERRIDES table
  shrinks to just `mimo-v2.5-pro` (LiteLLM only ships MiMo via
  OpenRouter/Novita aliases, not the canonical id Xiaomi's API uses).

LiteLLM is MIT-licensed (BerriAI/litellm/blob/main/LICENSE); attribution
is preserved in both the script header and the generated JSON's
_license field.
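
Condensed, the script's shape is roughly as follows (the upstream URL points at the real LiteLLM file; filenames and the filter/projection rules follow the commit message, everything else is assumed):

```ts
// Sketch of scripts/sync-litellm-models.ts; only the upstream file and
// the filter/projection rules are taken from the commit message.
import { writeFileSync } from "node:fs";

const UPSTREAM =
  "https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json";

interface LiteLLMEntry {
  mode?: string;
  max_output_tokens?: number;
  max_tokens?: number;
}

async function main() {
  const raw: Record<string, LiteLLMEntry> =
    await (await fetch(UPSTREAM)).json();
  const models: Record<string, number> = {};
  for (const [id, entry] of Object.entries(raw)) {
    if (entry.mode !== "chat") continue; // chat-mode entries only
    const limit = entry.max_output_tokens ?? entry.max_tokens;
    if (typeof limit === "number") models[id] = limit;
  }
  // Sort keys so regeneration produces stable, reviewable diffs.
  const sorted = Object.fromEntries(
    Object.entries(models).sort(([a], [b]) => a.localeCompare(b)),
  );
  writeFileSync(
    "apps/web/src/state/litellm-models.json",
    JSON.stringify({ _license: "MIT (BerriAI/litellm)", ...sorted }, null, 2),
  );
}

main();
```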
- apps/web/src/state/maxTokens.test.ts: six vitest cases pinning the
  three-tier lookup (override → LiteLLM → fallback) and the
  effectiveMaxTokens user-override path. Guards against a future sync
  silently dropping the Anthropic 4.5 entries we rely on.
- CONTRIBUTING.md / CONTRIBUTING.zh-CN.md: new "Updating model
  max_tokens metadata" section pointing future maintainers at
  scripts/sync-litellm-models.ts and explaining when OVERRIDES is
  appropriate (it's the rare exception, not the default).
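
The pinned cases might look like this (assumed shapes, mirroring the helper sketched above):

```ts
// Hypothetical vitest cases for the three-tier lookup; names mirror
// apps/web/src/state/maxTokens.ts but are assumptions here.
import { describe, expect, it } from "vitest";
import { effectiveMaxTokens } from "./maxTokens";

describe("effectiveMaxTokens", () => {
  it("prefers an explicit user override", () => {
    expect(
      effectiveMaxTokens({ model: "claude-sonnet-4-5", maxTokens: 32768 }),
    ).toBe(32768);
  });

  it("falls back to the per-model default when no override is set", () => {
    expect(effectiveMaxTokens({ model: "mimo-v2.5-pro" })).toBe(32_000);
  });

  it("uses the 8192 fallback for unknown models", () => {
    expect(effectiveMaxTokens({ model: "totally-unknown" })).toBe(8192);
  });
});
```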
The Settings field is optional (blank means "use the per-model default")
but the label gave no visual cue, breaking the implicit pattern that
every other API-mode field (key/model/baseUrl) is required. Append
"(optional)" — using the locale's natural parenthetical convention
(Chinese full-width brackets, Japanese 任意, Russian опционально, etc.)
— so the field reads as discretionary at a glance.
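
The resulting strings could be shaped like this (hypothetical keys; only the per-locale parenthetical conventions are taken from the commit message):

```ts
// Hypothetical i18n entries; only the conventions named above (full-width
// brackets, 任意, опционально) come from the commit message.
const maxTokensLabel: Record<string, string> = {
  en: "Max Tokens (optional)",
  "zh-CN": "最大 Token 数（可选）",
  ja: "最大トークン数（任意）",
  ru: "Максимум токенов (опционально)",
};
```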

Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

Why does generation keep getting interrupted during coding? (为什么编码过程中一直会被中断)

3 participants