Add model context length and manual model deletion #520
- New modelContextLengthCache service for in-memory model→context_length mapping
- Platform adapters (newApi, standardApiProvider) extract context_length from upstream /v1/models
- modelsSurface injects context_length into both OpenAI and Claude response formats
- Default 1,000,000 when upstream does not provide context_length
- Supports field names: context_length, contextLength, max_context_length, contextWindow, etc.
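The extraction and fallback behaviour described above could look roughly like the following. This is a minimal sketch, not the PR's code: `readContextLength`, `extractContextLengths`, and `contextLengthOrDefault` are hypothetical names, and the `{ data: [...] }` payload shape is assumed from the OpenAI-style `/v1/models` response.

```typescript
// Hypothetical sketch: all names below are illustrative, not the PR's actual code.
const DEFAULT_CONTEXT_LENGTH = 1_000_000;

// Field names listed in the PR description. The description ends with "etc.",
// so this array is deliberately incomplete.
const CONTEXT_LENGTH_FIELDS = [
  "context_length",
  "contextLength",
  "max_context_length",
  "contextWindow",
] as const;

type UpstreamModel = { id?: unknown } & Record<string, unknown>;

// Return the first finite positive number found under a known field name.
function readContextLength(model: UpstreamModel): number | undefined {
  for (const field of CONTEXT_LENGTH_FIELDS) {
    const value = model[field];
    if (typeof value === "number" && Number.isFinite(value) && value > 0) {
      return value;
    }
  }
  return undefined;
}

// Build a model -> context length map from a /v1/models-style payload
// (assumed shape: { data: [{ id, ...fields }] }).
function extractContextLengths(payload: { data?: UpstreamModel[] }): Map<string, number> {
  const result = new Map<string, number>();
  for (const model of payload.data ?? []) {
    const id = typeof model.id === "string" ? model.id.trim() : "";
    const length = readContextLength(model);
    if (id && length !== undefined) result.set(id, length);
  }
  return result;
}

// Fall back to the default when upstream gave nothing for this model.
function contextLengthOrDefault(lengths: Map<string, number>, modelId: string): number {
  return lengths.get(modelId) ?? DEFAULT_CONTEXT_LENGTH;
}
```

Models without a usable field are simply absent from the map, so the default is applied at read time rather than stored.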
- Backend: DELETE /api/accounts/:id/models/manual endpoint
- Only deletes models where isManual=true (safe against auto-discovered models)
- Frontend: AccountModelsModal shows '✕ 删除' ("✕ Delete") button next to each manual model
- Frontend: api.removeAccountManualModels() function
- Accounts.tsx: wires up delete handler with toast feedback
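As an illustration of the request the frontend helper might send, here is a small sketch. `buildRemoveManualModelsRequest` is a hypothetical name, and the `Content-Type` header is an assumption. The route path and the `{ models: [...] }` body follow the description and the sequence diagram in this conversation. The trim, dedupe, and drop-empty steps are the normalization the review describes on the server side. Applying them in the client as well is an assumption.

```typescript
// Hypothetical helper: builds the DELETE request without sending it,
// so the shape can be inspected or passed to fetch() by the caller.
function buildRemoveManualModelsRequest(accountId: number, models: string[]) {
  // Trim, drop empty names, and dedupe before sending.
  const normalized = Array.from(
    new Set(models.map((m) => m.trim()).filter((m) => m.length > 0)),
  );
  return {
    method: "DELETE" as const,
    url: `/api/accounts/${accountId}/models/manual`,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ models: normalized }),
  };
}
```

The server only deletes rows where isManual=true, so sending the name of an auto-discovered model should leave that model in place.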
📝 Walkthrough
Adds a scoped in-memory model context-length cache and threads extraction/caching through platform adapters and model discovery (per-refresh and per-scan scopes). Exposes cache APIs and payload parsing, surfaces per-model
Changes
Model Context Length Cache & Tests
Context-Length Extraction & Platform Integration
Manual Model Removal: Backend Service, Route, Tests
Sequence Diagram(s)
sequenceDiagram
participant Client as Client
participant Server as Server\n(Model Listing)
participant Adapter as Platform\nAdapter
participant Cache as Context Length\nCache
participant DB as Database
Client->>Server: GET /models
activate Server
Server->>Adapter: adapter.getModels(..., contextSourceScope?)
activate Adapter
Adapter->>Adapter: fetch upstream models payload
Adapter->>Cache: extractContextLengthsFromPayload(payload) -> setModelContextLengths(scope)
Cache-->>Adapter: ack
Adapter-->>Server: model id list
Server->>Cache: getModelContextLength(modelId, accountScope?)
Cache-->>Server: contextLength (or default)
Server-->>Client: models + context_length
deactivate Server
sequenceDiagram
participant Client as Client
participant UI as Accounts UI
participant API as Server API
participant Service as AccountManualModelService
participant DB as Database
participant Router as RouteRebuilder
Client->>UI: click delete manual model
UI->>API: DELETE /api/accounts/:id/models/manual {models:[name]}
activate API
API->>Service: removeManualModelsFromAccount(accountId, [name])
activate Service
Service->>DB: DELETE FROM modelAvailability WHERE accountId & isManual & name IN (...)
DB-->>Service: changes count
Service->>Router: rebuildRoutesBestEffort()
Router-->>Service: done
Service-->>API: { deletedCount }
API-->>UI: { success: true }
UI-->>Client: show toast, refresh modal list
deactivate API
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: af9aac3ad7
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
const DEFAULT_CONTEXT_LENGTH = 1_000_000;

const cache = new Map<string, number>();
Scope context-length cache by account or endpoint
modelContextLengthCache stores values in a single process-wide map keyed only by model name, but the values are populated from per-account upstream /v1/models payloads (via setModelContextLengths in platform adapters). In a multi-account deployment where the same model ID has different limits (or one provider omits the field), the most recently refreshed account overwrites the value for all others, so /v1/models can return incorrect context_length metadata and cause clients to size prompts incorrectly for the actual selected channel.
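One way to address this is to key the cache by scope first and model name second. The sketch below is an assumption about how that could look, not the merged implementation: the function names echo the ones mentioned in the review, the `(modelName, scope)` argument order follows the later test text, and the bodies are illustrative. It also rejects names that are empty after normalization, which covers the whitespace-only case raised in the nitpick.

```typescript
// Hypothetical sketch; names mirror the review text but the bodies are assumptions.
const DEFAULT_CONTEXT_LENGTH = 1_000_000;

// Outer key: scope (for example "account:42"); inner key: normalized model name.
const scopedCache = new Map<string, Map<string, number>>();

const normalizeKey = (value: string): string => value.trim();

// Replace the entries of one scope only, skipping empty names and invalid lengths.
function setModelContextLengths(scope: string, entries: Record<string, number>): void {
  const scopeKey = normalizeKey(scope);
  if (!scopeKey) return;
  const next = new Map<string, number>();
  for (const [name, length] of Object.entries(entries)) {
    const key = normalizeKey(name);
    // Validate the normalized key, not the raw name, so "   " never becomes "".
    if (key && Number.isFinite(length) && length > 0) next.set(key, length);
  }
  scopedCache.set(scopeKey, next);
}

// Look up a model inside one scope; other scopes never influence the result.
function getModelContextLength(modelName: string, scope: string): number {
  const scoped = scopedCache.get(normalizeKey(scope));
  return scoped?.get(normalizeKey(modelName)) ?? DEFAULT_CONTEXT_LENGTH;
}
```

With this layout, refreshing one account cannot overwrite the value another account reported for the same model ID.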
Actionable comments posted: 4
🧹 Nitpick comments (1)
src/server/services/modelContextLengthCache.test.ts (1)
33-45: Add a whitespace-only model-name test case.
Current invalid-name assertions cover empty string but not whitespace-only input (e.g. `' '`). Adding that case would lock in the expected behavior and catch empty-key regressions.
Also applies to: 67-78
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/server/routes/api/accounts.ts`:
- Around line 1969-1983: Extract the deletion loop into a new service function
(e.g., removeManualModelsFromAccount in
src/server/services/accountManualModelsService.ts) that accepts { accountId,
modelNames } and runs the deletes inside a single db.transaction using the
transaction handle (tx) for the .delete(schema.modelAvailability) calls and the
same where(...) predicate (use input.accountId and each modelName); then replace
the loop in the route with a call to this service and only call
rebuildRoutesBestEffort() after the service resolves successfully; ensure errors
are propagated so the route can return the appropriate error response.
In `@src/server/services/modelContextLengthCache.ts`:
- Around line 13-43: The cache is currently process-global and keyed only by
model name, causing cross-source collisions; change the cache scheme to be
scoped by source by updating the cache variable and normalizeKey to incorporate
a source identifier (e.g., normalizeKey(source, modelName) or use a compound key
`${source}:${modelName}`), update setModelContextLength(source, modelName,
contextLength) and getModelContextLength(source, modelName) signatures to use
the scoped key, and modify setModelContextLengths to accept a source and
replace/refresh only that source's entries (or clear existing entries for that
source before bulk-setting) so stale or missing upstream context_length values
from one source don't affect others.
- Around line 22-34: The validation currently checks raw names but not the
normalized result, allowing whitespace-only names to become an empty key; update
setModelContextLength to compute const key = normalizeKey(modelName) and only
call cache.set(key, ...) if key is non-empty (truthy) and contextLength is
valid, and similarly update setModelContextLengths to compute const key =
normalizeKey(name) inside the loop and only write cache.set(key, ...) when key
is non-empty and length is a finite positive number.
In `@src/web/pages/Accounts.tsx`:
- Around line 3480-3486: The async handler onRemoveManualModel can reopen the
modal after the user has closed or changed it; to fix, capture the current modal
state/account (e.g., const currentAccountId = modelModal?.account?.id and const
wasOpen = modelModal?.open) before awaiting api.removeAccountManualModels, then
after the await verify that modelModal is still open and modelModal.account.id
=== currentAccountId (or wasOpen is true) before calling
loadModelModalModels(modelModal.account, {}); return early if the modal was
closed or the account changed to avoid re-opening stale UI.
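The stale-modal guard described for `onRemoveManualModel` can be sketched without React. Everything here is hypothetical: `ModalState`, `isSameOpenModal`, and `makeRemoveHandler` are illustrative names, and the injected callbacks stand in for `api.removeAccountManualModels` and `loadModelModalModels`. In a real component the "latest state" getter would likely read from a ref.

```typescript
type ModalState = { open: boolean; accountId: number } | null;

// True only if the modal is still open for the account the action started on.
function isSameOpenModal(current: ModalState, accountIdAtStart: number): boolean {
  return current !== null && current.open && current.accountId === accountIdAtStart;
}

// Hypothetical factory: getModal must return the latest modal state each time
// it is called, not a value captured when the handler was created.
function makeRemoveHandler(deps: {
  getModal: () => ModalState;
  removeModels: (accountId: number, models: string[]) => Promise<void>;
  reloadModels: (accountId: number) => Promise<void>;
}) {
  return async function onRemoveManualModel(modelName: string): Promise<boolean> {
    const before = deps.getModal();
    if (before === null || !before.open) return false;
    const accountId = before.accountId;

    await deps.removeModels(accountId, [modelName]);

    // Re-check after the await: the user may have closed or switched the modal.
    if (!isSameOpenModal(deps.getModal(), accountId)) return false;

    await deps.reloadModels(accountId);
    return true;
  };
}
```

Returning early leaves the deletion in effect but skips the reload, so a closed modal is not reopened with stale data.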
---
Nitpick comments:
In `@src/server/services/modelContextLengthCache.test.ts`:
- Around line 33-45: Add tests that assert whitespace-only model names are
treated as invalid: update the 'ignores invalid values' case to call
setModelContextLength(' ', 128000) and expect hasModelContextLength(' ')
toBe(false), and likewise add the same whitespace-only check in the other
invalid-name test block (the one around lines 67-78). This ensures
setModelContextLength and hasModelContextLength both trim/validate names (or
reject whitespace-only keys) consistent with the empty-string behavior.
📒 Files selected for processing (9)
- src/server/proxy-core/surfaces/modelsSurface.ts
- src/server/routes/api/accounts.ts
- src/server/services/modelContextLengthCache.test.ts
- src/server/services/modelContextLengthCache.ts
- src/server/services/platforms/newApi.ts
- src/server/services/platforms/standardApiProvider.ts
- src/web/api.ts
- src/web/pages/Accounts.tsx
- src/web/pages/accounts/AccountModelsModal.tsx
  try {
    for (const modelName of normalizedModels) {
      await db
        .delete(schema.modelAvailability)
        .where(
          and(
            eq(schema.modelAvailability.accountId, accountId),
            eq(schema.modelAvailability.modelName, modelName),
            eq(schema.modelAvailability.isManual, true),
          ),
        )
        .run();
    }
    await rebuildRoutesBestEffort();
Extract deletion workflow to a service and execute it atomically.
Lines 1970-1981 perform DB mutation orchestration directly in the route, and a mid-loop failure can leave partial deletions while returning an error.
Proposed refactor direction
-    try {
-      for (const modelName of normalizedModels) {
-        await db
-          .delete(schema.modelAvailability)
-          .where(
-            and(
-              eq(schema.modelAvailability.accountId, accountId),
-              eq(schema.modelAvailability.modelName, modelName),
-              eq(schema.modelAvailability.isManual, true),
-            ),
-          )
-          .run();
-      }
+    try {
+      await removeManualModelsFromAccount({
+        accountId,
+        modelNames: normalizedModels,
+      });
       await rebuildRoutesBestEffort();
       return { success: true };
     } catch (err: any) {

// src/server/services/accountManualModelsService.ts
export async function removeManualModelsFromAccount(input: {
  accountId: number;
  modelNames: string[];
}) {
  await db.transaction(async (tx) => {
    for (const modelName of input.modelNames) {
      await tx
        .delete(schema.modelAvailability)
        .where(
          and(
            eq(schema.modelAvailability.accountId, input.accountId),
            eq(schema.modelAvailability.modelName, modelName),
            eq(schema.modelAvailability.isManual, true),
          ),
        )
        .run();
    }
  });
}

As per coding guidelines: "Route files in src/server/routes/** are adapters, not owners... must not own ... persistence."
Please address the comments from CodeRabbit AI and the CI errors.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0b2201e8c2
    return;
  }

  cache.set(scopeKey, nextScopeCache);
Preserve context-length entries across credential scans
setModelContextLengths overwrites the whole scope cache each time it is called, but refreshModelsForAccount now passes the same account scope into multiple adapter.getModels(...) calls (account credential plus each managed token). If a later call returns only a subset of models or omits context_length, this replacement drops entries discovered earlier in the same refresh, so /v1/models reports fallback 1_000_000 for still-routable models from earlier credentials. This is reproducible on accounts with multiple tokens that expose different model metadata.
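A possible shape for the fix is to collect each credential's result separately and write the account scope once per refresh. The sketch below is hypothetical: `mergeScanResults` and `commitRefresh` are illustrative names, and "first value seen wins" for models reported by several credentials is an assumption, not something the review specifies.

```typescript
type ContextLengths = Map<string, number>;

// Merge per-credential scan results. For a model seen more than once,
// the first value seen is kept (an assumed policy).
function mergeScanResults(scans: ContextLengths[]): ContextLengths {
  const merged: ContextLengths = new Map();
  for (const scan of scans) {
    for (const [model, length] of scan) {
      if (!merged.has(model)) merged.set(model, length);
    }
  }
  return merged;
}

const accountCache = new Map<string, ContextLengths>();

// One replace per refresh: entries from every credential survive, while
// models absent from the whole refresh are dropped instead of going stale.
function commitRefresh(scope: string, scans: ContextLengths[]): void {
  accountCache.set(scope, mergeScanResults(scans));
}
```

Merging inside a refresh and replacing between refreshes keeps both properties the reviews ask for: no lost entries across credentials, and no stale entries after a later refresh returns fewer models.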
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/server/routes/api/accounts.ts (1)
1951-1958: 💤 Low value
Consider removing duplicate normalization logic.
The normalization at lines 1951-1955 (trim, dedupe, filter empty) is duplicated in `removeManualModelsFromAccount` at lines 9-11 of `src/server/services/accountManualModelService.ts`. Since the service already handles this defensively, the route could pass the raw `models` array directly and let the service normalize it, reducing duplication.
♻️ Simplified route validation
-    const normalizedModels = Array.from(
-      new Set(
-        models.map((m) => String(m).trim()).filter((m) => m.length > 0),
-      ),
-    );
-    if (normalizedModels.length === 0) {
-      return reply.code(400).send({ message: "模型列表不能为空" });
-    }
-
     const account = await db
       .select()
       .from(schema.accounts)
       .where(eq(schema.accounts.id, accountId))
       .get();
     if (!account) {
       return reply.code(404).send({ message: "账号不存在" });
     }

     try {
-      await removeManualModelsFromAccount(accountId, normalizedModels);
+      const result = await removeManualModelsFromAccount(accountId, models);
+      if (result.deletedCount === 0) {
+        return reply.code(400).send({ message: "没有匹配的手动模型可删除" });
+      }
       return { success: true };
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/server/services/modelContextLengthCache.ts`:
- Around line 52-54: The function buildEndpointModelContextLengthScope currently
uses the raw baseUrl via normalizeKey, causing semantically-equivalent URLs
(trailing slashes, default ports, case differences) to map to different cache
keys; update it to first canonicalize the endpoint (e.g., parse with the URL
constructor, lowercase scheme/host, remove default ports like :80/:443, strip
trailing slash, and include pathname consistently) and then call normalizeKey on
that canonicalized string (you can add a helper like canonicalizeEndpointUrl or
extend normalizeKey usage) so equivalent endpoints produce the same scope key.
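The canonicalization suggested here could be sketched with the standard `URL` constructor, which already lowercases the scheme and host and drops default ports. `canonicalizeEndpointUrl` is the helper name proposed in the comment. Its body, the `endpoint:` prefix, and the decision to ignore query strings and fragments are assumptions.

```typescript
// Hypothetical helper named after the suggestion in the review text.
function canonicalizeEndpointUrl(baseUrl: string): string {
  const raw = baseUrl.trim();
  try {
    const url = new URL(raw);
    // URL parsing lowercases the scheme and host and removes default ports
    // (:80 for http, :443 for https). Only trailing slashes remain to strip.
    const path = url.pathname.replace(/\/+$/, "");
    return `${url.protocol}//${url.host}${path}`;
  } catch {
    // Not a parseable absolute URL: fall back to a trimmed, lowercased string.
    return raw.toLowerCase().replace(/\/+$/, "");
  }
}

// Assumed scope format; the real prefix used by the PR is not shown in the review.
function buildEndpointModelContextLengthScope(baseUrl: string): string {
  return `endpoint:${canonicalizeEndpointUrl(baseUrl)}`;
}
```

Path case is preserved on purpose, since URL paths are case-sensitive on many servers.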
---
Nitpick comments:
In `@src/server/routes/api/accounts.ts`:
- Around line 1951-1958: Remove the duplicated normalization block that builds
normalizedModels and instead pass the raw models array to the service;
specifically delete the Array.from(new Set(...trim...)) logic and any use of
normalizedModels, replace it with a simple sanity check that models is an array
(e.g., if (!Array.isArray(models) || models.length === 0) return
reply.code(400)...), and call removeManualModelsFromAccount(accountId, models)
so the service (removeManualModelsFromAccount) performs the
trimming/deduping/filtering.
📒 Files selected for processing (19)
- src/server/proxy-core/surfaces/modelsSurface.test.ts
- src/server/proxy-core/surfaces/modelsSurface.ts
- src/server/routes/api/accounts.manual-models.test.ts
- src/server/routes/api/accounts.ts
- src/server/services/accountManualModelService.ts
- src/server/services/modelContextLengthCache.test.ts
- src/server/services/modelContextLengthCache.ts
- src/server/services/modelService.ts
- src/server/services/platforms/base.ts
- src/server/services/platforms/claude.ts
- src/server/services/platforms/cliproxyapi.ts
- src/server/services/platforms/gemini.ts
- src/server/services/platforms/newApi.ts
- src/server/services/platforms/oneApi.ts
- src/server/services/platforms/oneHub.ts
- src/server/services/platforms/openai.ts
- src/server/services/platforms/standardApiProvider.ts
- src/server/services/platforms/veloera.ts
- src/web/pages/Accounts.tsx
🚧 Files skipped from review as they are similar to previous changes (4)
- src/server/proxy-core/surfaces/modelsSurface.ts
- src/server/services/platforms/standardApiProvider.ts
- src/web/pages/Accounts.tsx
- src/server/services/modelContextLengthCache.test.ts
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 039d0d92dc
  }
};

const beginModelContextScanScope = () => `${modelContextScope}:scan:${modelContextScanCounter += 1}`;
Make per-scan context cache keys globally unique
The temporary scope key generated for each credential scan is only account:<id>:scan:<counter> where counter is local to a single refreshModelsForAccount invocation, so two concurrent refreshes for the same account will reuse identical scope names (e.g. both start at scan:1). Because each scan unconditionally calls clearModelContextLengthCache(sourceScope) in finally, one refresh can clear the other refresh's in-flight context-length data before it is collected, producing missing or mixed context_length values in the account cache and incorrect /v1/models metadata under concurrent refresh triggers.
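A minimal way to make the scan keys unique across concurrent refreshes is to move the counter to module level, so every call in the process gets a fresh number. This is a sketch under that assumption. The helper keeps the name quoted above, but taking the account scope as a parameter is hypothetical.

```typescript
// Module-level counter: shared by every refresh running in this process.
let modelContextScanCounter = 0;

// Hypothetical variant of the helper quoted above. JavaScript runs this
// synchronously, so two concurrent refreshes can never observe the same value.
function beginModelContextScanScope(accountScope: string): string {
  modelContextScanCounter += 1;
  return `${accountScope}:scan:${modelContextScanCounter}`;
}
```

Because each scan now owns a distinct key, clearing one scan's scope in `finally` cannot remove another refresh's in-flight data.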
🧹 Nitpick comments (1)
src/server/services/modelService.discovery.test.ts (1)
147-210: ⚡ Quick win
Add a second-refresh regression for account-scope replacement.
This only proves scan-scope merging within a single refresh. A follow-up successful refresh that returns fewer models is the case that will catch stale account-scope entries if the final write merges instead of replaces.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/server/services/modelService.discovery.test.ts`:
- Around line 147-210: The test only exercises merging within a single refresh;
to catch the regression where a later successful refresh that returns fewer
models might replace (not preserve) account-scoped context-length entries,
update this test to perform a second refresh: after the first refresh (which
returns model-a and model-b) change the mocked getModels behavior (or rely on
the existing branch for 'managed-token-subset') to simulate a subsequent refresh
that returns only model-a, call refreshModelsForAccount(account.id) again, and
then assert that getModelContextLength('model-b',
buildAccountModelContextLengthScope(account.id)) still returns 256000; use the
existing helpers refreshModelsForAccount, buildAccountModelContextLengthScope,
getModelContextLength and setModelContextLengths to implement the two-step
scenario.
📒 Files selected for processing (2)
- src/server/services/modelService.discovery.test.ts
- src/server/services/modelService.ts
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: eaa3279833
const decision = await input.tokenRouter.explainSelection(modelName, [], input.downstreamPolicy);
if (typeof decision.selectedChannelId === 'number') {
  allowed.push(modelName);
  allowed.push({
    id: modelName,
    selectedAccountId: decision.selectedAccountId,
Derive context_length from all eligible channels
/v1/models now picks context_length from a single explainSelection result, but route selection is strategy-dependent (round-robin/weighted/stable-first) and can choose different channels/accounts for the same model across requests. In multi-channel routes where the same model has different limits (or one channel has no cached metadata and falls back to 1_000_000), this can overstate the safe window and cause intermittent upstream failures when a lower-limit channel is chosen at dispatch time. The value should be computed from all eligible routing candidates (e.g., conservative minimum) rather than one sampled account.
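The "conservative minimum" idea can be sketched as follows. `Candidate` and `conservativeContextLength` are hypothetical names, and skipping channels with no cached value is one possible policy. A stricter variant could treat unknown channels differently.

```typescript
const DEFAULT_CONTEXT_LENGTH = 1_000_000;

// Hypothetical shape: one entry per channel that could serve the model.
type Candidate = { accountId: number; contextLength?: number };

// Report the smallest known limit across all eligible channels.
// Channels without cached metadata are skipped rather than counted as 1,000,000.
function conservativeContextLength(candidates: Candidate[]): number {
  const known = candidates
    .map((c) => c.contextLength)
    .filter((n): n is number => typeof n === "number" && Number.isFinite(n) && n > 0);
  return known.length > 0 ? Math.min(...known) : DEFAULT_CONTEXT_LENGTH;
}
```

Reporting the minimum means a client that sizes its prompt from `/v1/models` stays within the limit of whichever channel the router picks at dispatch time.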
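Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

All this change does is pass a context length downstream, yet it has been revised this many times, always over some so-called design issue. Hermes defaults to 256k; am I supposed to go fix Hermes instead?

Granted, there is a design problem: this is a relay service, and in theory several copies of the same model could have different context lengths. But in practice, how could the same model's context length differ? Put another way: I set a default of 1M; if the real limit is smaller, the call fails and the user can compact the context manually. Reporting a theoretical maximum is fine.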
Title
Add model context length and manual model deletion
Body
Summary
context_length to /v1/models responses
Test plan
/v1/models includes context_length
Summary by CodeRabbit
New Features
Tests