refactor(rate-limit): key free-model limit on user id for authenticated requests #3004
Open
kilo-code-bot[bot] wants to merge 11 commits into main from …
…ed requests
Authenticated free-model requests are now rate-limited per user id regardless of feature or source IP. Anonymous requests continue to be rate-limited per IP, counting only anonymous usage so they aren't skewed by authenticated users on shared IPs. This removes the feature/Cloudflare-IP special case that existed for cloud-agent, code-review and app-builder.
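The keying rule in this commit message can be pictured with a small in-memory sketch. It is not the code from this PR: `selectRateLimitKey`, `countUsage`, and the `UsageRow` shape are hypothetical names chosen for illustration, and the one-hour window is taken from the reviewer notes in the PR description.

```typescript
// Hypothetical types and helpers; the real route's implementation is not shown here.
type RateLimitKey =
  | { kind: "user"; userId: string }
  | { kind: "anonymous-ip"; ipAddress: string };

interface UsageRow {
  ipAddress: string;
  userId: string | null; // null marks an anonymous request
  createdAt: number; // epoch milliseconds
}

// Authenticated callers are keyed on user id; everyone else falls back to IP.
function selectRateLimitKey(userId: string | null, ipAddress: string): RateLimitKey {
  return userId !== null ? { kind: "user", userId } : { kind: "anonymous-ip", ipAddress };
}

// Count rows inside the window that belong to the given key.
// For the IP key only anonymous rows count, so authenticated traffic
// on a shared IP does not consume the anonymous budget.
function countUsage(rows: UsageRow[], key: RateLimitKey, now: number, windowMs: number): number {
  const since = now - windowMs;
  return rows.filter((row) => {
    if (row.createdAt < since) return false;
    return key.kind === "user"
      ? row.userId === key.userId
      : row.userId === null && row.ipAddress === key.ipAddress;
  }).length;
}

// Usage: three requests from one shared IP, two of them authenticated.
const HOUR_MS = 60 * 60 * 1000;
const now = Date.now();
const sharedIp = "203.0.113.7";
const rows: UsageRow[] = [
  { ipAddress: sharedIp, userId: "user-a", createdAt: now - 1000 },
  { ipAddress: sharedIp, userId: "user-a", createdAt: now - 2000 },
  { ipAddress: sharedIp, userId: null, createdAt: now - 3000 },
];
const anonymousCount = countUsage(rows, selectRateLimitKey(null, sharedIp), now, HOUR_MS);
const userCount = countUsage(rows, selectRateLimitKey("user-a", sharedIp), now, HOUR_MS);
```

In this sketch the anonymous gate for the shared IP sees one row and the gate for `user-a` sees two, whereas a purely IP-keyed count would have charged all three requests to the same bucket.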
Code Review Summary
Status: No Issues Found | Recommendation: Merge
Files Reviewed (4 files)
Reviewed by gpt-5.5-2026-04-23 · 1,170,243 tokens
… limit
- Rate Limit Testing now operates on the admin's user id (authenticated) instead of their IP; previously the inserted anonymous rows had no effect on the admin's own per-user limit.
- Stats endpoint splits "at limit" into anonymous IPs (anonymous-only count) and authenticated users (per-user count), matching the two gates.
- Updated page and card copy to describe the hybrid user/IP semantics.
…uter route
Remove the `noFreeModelsAvailableResponse` import and the corresponding check after `applyResolvedAutoModel` in the OpenRouter API route. Also includes minor formatting updates to admin components.
Merge with main dropped the autoResult.kind check, so kilo-auto requests with no free candidates continued through rate limiting and were sent upstream as the synthetic auto-model id instead of returning the 503. Also re-apply oxfmt formatting (repo oxfmt 0.40.0).
Use the actual x-forwarded-for IP instead of a sentinel string so the inserted rows remain useful in analytics and match the shape of normal free-model usage rows.
The two rate-limit gate cards (anonymous IPs and users) are the actionable signal on this page. Accent them with a primary border at rest and a destructive border + red number when the count is non-zero.
@chrarnoldus - maybe we could have the user-based sign-in rate limit be its own little section, so you can see how many users are affected and then which actual user ids, like you can with IPs?
Per Josh's review feedback, surface the per-user rate-limit signal as its own section showing both the count of users at the limit and a table of the actual user ids (with name/email/avatar via the existing UserAvatarLink). Removes the now-duplicate 'Users at Limit' card from the aggregate stats grid.
@chrarnoldus - I'm still seeing a similar set of cards?


Summary
- … `free_model_usage` so the limit is not skewed by authenticated users sharing the same IP.
- … `USER_RATE_LIMITED_FEATURES` / `isUserRateLimitedFeature` / Cloudflare-IP special case for `cloud-agent`, `code-review`, and `app-builder`; they are handled uniformly now.
- … `/admin/free-model-usage` panel to reflect the new semantics (see below).

Behavioural notes:
- `resolveRateLimit` now awaits auth before deciding the key. The pre-auth IP fast-path for authenticated callers is gone; the trade-off is correct per-user accounting across shared-infra IPs.
- `checkFreeModelRateLimit(ipAddress)` is now anonymous-only; `checkFreeModelRateLimitByUser(userId)` is unchanged and used for every authenticated request.
- `checkPromotionLimit` (10k/24h anonymous gate) is unaffected.

Admin panel updates
- `RateLimitTesting` now rate-limits the admin's own user id (`getMyUsage` / `rateLimitMe`). The previous version inserted anonymous rows for the admin's IP, which no longer affects the admin's per-user limit after this change.
- … `windowIpsAtRequestLimit` with two separate counters matching the two gates:
  - `windowAnonymousIpsAtRequestLimit`: anonymous IPs whose anonymous-only count has reached the limit.
  - `windowUsersAtRequestLimit`: authenticated users whose per-user count has reached the limit.

Verification
- … `apps/web/src/app/api/openrouter/[...path]/route.ts` to confirm anonymous free-model requests still run through the promotion limit, and authenticated ones hit the user-keyed counter.
- … `kilo_user_id`, so it actually triggers a 429 for the admin's own subsequent authenticated requests.
- … `feature-detection.test.ts` to confirm only the removed helper's tests needed deletion.

Full `pnpm typecheck` / `pnpm test` / `pnpm format` were skipped locally (sandbox has no `node_modules`); CI will run them.

Visual Changes
Admin panel: the "IPs at Request Limit" card is replaced with two cards, "Anonymous IPs at Limit" and "Users at Limit". Intro copy updated to describe the per-user / per-IP split.
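As a rough illustration of what the two cards count, the sketch below derives both numbers from a list of usage rows that are already inside the window. Only the two result field names come from the PR description; `WindowRow`, `countAtLimit`, and the `requestLimit` parameter are assumptions, and the real stats endpoint presumably computes this in SQL rather than in memory.

```typescript
// Hypothetical row shape for rows that already fall inside the rate-limit window.
interface WindowRow {
  ipAddress: string;
  userId: string | null; // null marks an anonymous request
}

interface AtLimitStats {
  windowAnonymousIpsAtRequestLimit: number;
  windowUsersAtRequestLimit: number;
}

// Tally anonymous rows per IP and authenticated rows per user, then count
// how many keys in each group have reached the (assumed) request limit.
function countAtLimit(rowsInWindow: WindowRow[], requestLimit: number): AtLimitStats {
  const anonymousByIp = new Map<string, number>();
  const byUser = new Map<string, number>();
  for (const row of rowsInWindow) {
    if (row.userId === null) {
      anonymousByIp.set(row.ipAddress, (anonymousByIp.get(row.ipAddress) ?? 0) + 1);
    } else {
      byUser.set(row.userId, (byUser.get(row.userId) ?? 0) + 1);
    }
  }
  const keysAtLimit = (counts: Map<string, number>): number =>
    Array.from(counts.values()).filter((count) => count >= requestLimit).length;
  return {
    windowAnonymousIpsAtRequestLimit: keysAtLimit(anonymousByIp),
    windowUsersAtRequestLimit: keysAtLimit(byUser),
  };
}

// Usage with an illustrative limit of 2 requests per window.
const exampleStats = countAtLimit(
  [
    { ipAddress: "198.51.100.1", userId: null },
    { ipAddress: "198.51.100.1", userId: null },
    { ipAddress: "198.51.100.1", userId: "user-1" },
    { ipAddress: "198.51.100.1", userId: "user-1" },
    { ipAddress: "198.51.100.1", userId: "user-1" },
    { ipAddress: "198.51.100.2", userId: null },
    { ipAddress: "198.51.100.2", userId: "user-2" },
  ],
  2,
);
```

With this input one anonymous IP and one user are at the limit; the authenticated rows on `198.51.100.1` do not push that IP any further toward the anonymous gate.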
Reviewer Notes
Database indexes on `free_model_usage`
No migration is introduced; the existing indexes cover the new access pattern:
- `idx_free_model_usage_user_created_at` on `(kilo_user_id, created_at) WHERE kilo_user_id IS NOT NULL`: already serves `checkFreeModelRateLimitByUser`. The partial predicate matches the query predicate, so read and write costs are unchanged; the index simply becomes the hot read path for all authenticated features instead of only the three server-side ones.
- `idx_free_model_usage_ip_created_at` on `(ip_address, created_at)`: still used for `checkFreeModelRateLimit`, but the query now adds `kilo_user_id IS NULL`. Postgres will range-scan the index and filter out non-null rows. Given the 1-hour window and per-IP volumes this is fine, but a follow-up could make this a partial index `WHERE kilo_user_id IS NULL` (or drop it in favor of one) to avoid reading rows that will be filtered out. Not required for correctness.
- `idx_free_model_usage_created_at`: unaffected (admin analytics / cleanup cron).

No write-amplification change: every insert already updates the same set of indexes (the partial user index only fires for authenticated rows, which is unchanged).
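To make the cost of the extra `kilo_user_id IS NULL` filter concrete, here is a toy model of how many index entries the anonymous check would touch under the existing index versus the suggested partial one. It is a simplification written for illustration, not a description of the Postgres planner; `IndexedRow` and both functions are hypothetical.

```typescript
// Hypothetical in-memory stand-in for rows of free_model_usage.
interface IndexedRow {
  ipAddress: string;
  userId: string | null; // mirrors kilo_user_id; null marks an anonymous row
  createdAt: number; // epoch milliseconds
}

// Existing index on (ip_address, created_at): the range scan visits every
// row for the IP inside the window, and authenticated rows are filtered afterwards.
function entriesVisitedWithFullIndex(rows: IndexedRow[], ipAddress: string, since: number): number {
  return rows.filter((row) => row.ipAddress === ipAddress && row.createdAt >= since).length;
}

// Suggested partial index WHERE kilo_user_id IS NULL: authenticated rows
// are never in the index, so only anonymous rows are visited.
function entriesVisitedWithPartialIndex(rows: IndexedRow[], ipAddress: string, since: number): number {
  return rows.filter(
    (row) => row.userId === null && row.ipAddress === ipAddress && row.createdAt >= since,
  ).length;
}

// Usage: one shared IP with two authenticated rows and one anonymous row in the window.
const windowStart = 1000;
const tableRows: IndexedRow[] = [
  { ipAddress: "192.0.2.10", userId: "user-x", createdAt: 1500 },
  { ipAddress: "192.0.2.10", userId: "user-y", createdAt: 1600 },
  { ipAddress: "192.0.2.10", userId: null, createdAt: 1700 },
  { ipAddress: "192.0.2.10", userId: null, createdAt: 500 }, // outside the window
];
const visitedFull = entriesVisitedWithFullIndex(tableRows, "192.0.2.10", windowStart);
const visitedPartial = entriesVisitedWithPartialIndex(tableRows, "192.0.2.10", windowStart);
```

Here the full index visits three entries to return one anonymous row, while the partial index visits only that row. The gap grows with the share of authenticated traffic on an IP, which is consistent with treating the partial index as an optional follow-up rather than a requirement.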