feat: make Gemini flash lite models the default across tools #2990
google-labs-jules[bot] wants to merge 6 commits into
- Updates `geminiModelPicker.ts` to default to `lite` instead of `flash`.
- Adjusts `modelPicker.ts` to fall back to `gemini-3.1-flash-lite`.
- Modifies `geminiVisualReviewClient` and `geminiCodeReviewClient` to prioritize `lite`/`flash` appropriately.
- Updates URL endpoints in `useUXAuditor.ts` and `ai_service.py` to point to `gemini-3.1-flash-lite`.
- Reduces baseline costs for automated tooling.
🚀 Deployment Details (Last updated: Jun 26, 2026, 3:59 PM PST) 🚀 Pushed to gh-pages; publish in progress
👁️ Gemini Code Review Agent
Reviewing: PR #2990

Code Review Feedback

[ARCHITECTURE] Review
Error: failed to execute ARCHITECTURE review. Details: [GoogleGenerativeAI Error]: Error fetching from https://generativelanguage.googleapis.com/v1beta/models/gemini-3.1-flash-lite:generateContent: [429 Too Many Requests] Your project has exceeded its monthly spending cap. Please go to AI Studio at https://ai.studio/spend to manage your project spend cap. Learn more at https://ai.google.dev/gemini-api/docs/billing#project-spend-caps.

[PERFORMANCE] Review
Error: failed to execute PERFORMANCE review. Details: [GoogleGenerativeAI Error]: Error fetching from https://generativelanguage.googleapis.com/v1beta/models/gemini-3.1-flash-lite:generateContent: [429 Too Many Requests] Your project has exceeded its monthly spending cap. Please go to AI Studio at https://ai.studio/spend to manage your project spend cap. Learn more at https://ai.google.dev/gemini-api/docs/billing#project-spend-caps.

[SECURITY] Review
Error: failed to execute SECURITY review. Details: [GoogleGenerativeAI Error]: Error fetching from https://generativelanguage.googleapis.com/v1beta/models/gemini-3.1-flash-lite:generateContent: [429 Too Many Requests] Your project has exceeded its monthly spending cap. Please go to AI Studio at https://ai.studio/spend to manage your project spend cap. Learn more at https://ai.google.dev/gemini-api/docs/billing#project-spend-caps.

[STYLE] Review
Error: failed to execute STYLE review. Details: [GoogleGenerativeAI Error]: Error fetching from https://generativelanguage.googleapis.com/v1beta/models/gemini-3.1-flash-lite:generateContent: [429 Too Many Requests] Your project has exceeded its monthly spending cap. Please go to AI Studio at https://ai.studio/spend to manage your project spend cap. Learn more at https://ai.google.dev/gemini-api/docs/billing#project-spend-caps.

Generated by gemini-code-review
🐙 GitHub Models Code Review
Reviewing: PR #2990
Model: gpt-4o

Code Review Feedback

[ARCHITECTURE] Review

Review Summary
The PR introduces changes to default Gemini model selection across multiple files, prioritizing the

High Severity Issues
Questions
Nitpicks

Final Verdict
The PR introduces significant changes to the model selection logic and API calls. While the changes align with the stated goal of prioritizing cost-effective models, there are high-severity issues related to security, logic, and error handling that must be addressed before merging.

[PERFORMANCE] Review
Error: failed to execute PERFORMANCE review. Details: GitHub Models API error: 429 Too Many Requests - {"error":{"code":"RateLimitReached","message":"Rate limit of 2 per 0s exceeded for UserConcurrentRequests. Please wait 0 seconds before retrying.","details":"Rate limit of 2 per 0s exceeded for UserConcurrentRequests. Please wait 0 seconds before retrying."}}

[SECURITY] Review

Review Summary
This pull request modifies the default Gemini model selection across multiple files in the repository, switching from higher-tier models (

Findings
HIGH Severity Issues
MEDIUM Severity Issues
LOW Severity Issues
Recommendations
Final Verdict
Structured JSON Summary

[STYLE] Review
Error: failed to execute STYLE review. Details: GitHub Models API error: 429 Too Many Requests - {"error":{"code":"RateLimitReached","message":"Rate limit of 2 per 0s exceeded for UserConcurrentRequests. Please wait 0 seconds before retrying.","details":"Rate limit of 2 per 0s exceeded for UserConcurrentRequests. Please wait 0 seconds before retrying."}}

Generated by github-models-code-review
- Updates model fallbacks to `gemini-2.5-flash-lite` across all files to reduce costs.
- Fixes a security vulnerability by moving `effectiveApiKey` and `self.gemini_api_key` from the URL string to the `x-goog-api-key` header in both the TS and Python scripts.
- Adjusts tiering logic in `modelPicker.ts` to favor `lite` for up to 1M tokens.
- Adjusts `thinkingBudget` scaling in `geminiCodeReviewClient.ts` based on token estimates.
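For reviewers, the tiering rule described in this commit might look roughly like the sketch below. This is not the code in `modelPicker.ts`: the names `pickTier`, `pickModel`, `LITE_MAX_TOKENS`, and `MODEL_BY_TIER` are hypothetical, and the choice of `gemini-2.5-flash` above the ceiling is an assumption. Only the 1M-token `lite` ceiling and the `gemini-2.5-flash-lite` name come from the commit message.

```typescript
// Minimal sketch of token-based tier selection (hypothetical names throughout).
type GeminiTier = "lite" | "flash";

// From the commit message: favor `lite` for up to 1M tokens.
const LITE_MAX_TOKENS = 1_000_000;

// Assumption: the tier above `lite` maps to gemini-2.5-flash.
const MODEL_BY_TIER: Record<GeminiTier, string> = {
  lite: "gemini-2.5-flash-lite",
  flash: "gemini-2.5-flash",
};

// Returns the cheapest tier whose ceiling covers the estimated input size.
function pickTier(estimatedTokens: number): GeminiTier {
  if (!Number.isFinite(estimatedTokens) || estimatedTokens < 0) {
    throw new RangeError(`invalid token estimate: ${estimatedTokens}`);
  }
  return estimatedTokens <= LITE_MAX_TOKENS ? "lite" : "flash";
}

// Resolves the tier to a concrete model name.
function pickModel(estimatedTokens: number): string {
  return MODEL_BY_TIER[pickTier(estimatedTokens)];
}
```

Keeping the ceiling in a named constant means a later change to the threshold touches one line instead of every call site.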
🤖 AI Technical Audit
ANTI-AI-SLOP
The pull request demonstrates a clear and targeted effort to shift default model selection to more cost-effective 'lite' models, which is a sensible architectural decision. The changes are focused and avoid unnecessary complexity or verbose comments. The pattern of updating model defaults, fallback logic, and API key handling (from URL parameter to header) is consistently applied where needed across Python and TypeScript, indicating a well-understood strategy rather than duplicated slop. There are no signs of over-engineering or gratuitous additions. The net line changes are minimal, well below the threshold that would require active line removals, reinforcing the perception of efficient and purposeful modifications.
FINAL RECOMMENDATION
Approved with Minor Changes
DEFINITION OF DONE
Review automatically published via RepoAuditor.
Acknowledged.
arii
left a comment
PR Review
Summary: This PR updates the default AI models across the application to prioritize gemini-2.5-flash-lite. It also updates the Python HTTP requests to Gemini to use the x-goog-api-key header rather than passing the API key in the URL.
Findings:
- Model Selection: The changes successfully default to the `lite` tier, specifically `gemini-2.5-flash-lite`, across Python services (`ai_service.py`, `utils.py`), TypeScript scripts (`geminiModelPicker.ts`, `geminiCodeReviewClient.ts`), and the frontend `useUXAuditor.ts`. This aligns perfectly with the repository's constraint to minimize costs using the lite models.
- Security Posture: In `ai_service.py` and `useUXAuditor.ts`, moving the Gemini API key from the query string (`?key=...`) to the HTTP header (`x-goog-api-key`) resolves a security concern where sensitive credentials could be exposed in request URLs/logs, strictly adhering to the repository's convention.
- Testing: All CI checks, including deployment impact analysis, build, and linters, passed successfully. The model fallbacks are correctly implemented based on token estimations.
Recommendation: Approved. The changes effectively implement the cost-saving directive by defaulting to Flash Lite and improve security by properly passing API keys via headers.
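To illustrate the header change this review refers to, here is a minimal sketch of building a `generateContent` request with the key in `x-goog-api-key` instead of the query string. The helper name `buildGenerateContentRequest` and the `GeminiRequest` shape are hypothetical and are not taken from `useUXAuditor.ts` or `ai_service.py`. The endpoint path matches the one shown in the error output earlier in this thread.

```typescript
// Hypothetical helper: returns the URL and fetch options without sending anything.
interface GeminiRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildGenerateContentRequest(
  apiKey: string,
  model: string,
  prompt: string,
): GeminiRequest {
  if (!apiKey) {
    throw new Error("missing Gemini API key");
  }
  // No `?key=` parameter: the URL is safe to log.
  const url =
    "https://generativelanguage.googleapis.com/v1beta/models/" +
    `${encodeURIComponent(model)}:generateContent`;
  return {
    url,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // The credential travels in a header instead of the request URL.
        "x-goog-api-key": apiKey,
      },
      body: JSON.stringify({ contents: [{ parts: [{ text: prompt }] }] }),
    },
  };
}
```

A caller would pass the result straight to `fetch(request.url, request.init)`. Separating construction from sending also makes it easy to assert in a unit test that the key never appears in the URL.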
🤖 AI Technical Audit
ANTI-AI-SLOP
The transition to
FINAL RECOMMENDATION
Approved with Minor Changes
DEFINITION OF DONE
Review automatically published via RepoAuditor.
arii
left a comment
ANTI-AI-SLOP
The changes successfully default to the lite tier, specifically gemini-2.5-flash-lite, across the application. Moving the Gemini API key from the query string to the HTTP header (x-goog-api-key) resolves a security concern.
FINDINGS
This PR updates the default AI models across the application to prioritize gemini-2.5-flash-lite and updates the Python HTTP requests to Gemini to use the x-goog-api-key header rather than passing the API key in the URL. The changes effectively implement the cost-saving directive and improve security.
FINAL RECOMMENDATION
Approved
🤖 AI Technical Audit
ANTI-AI-SLOP
The pull request demonstrates a clear and targeted effort to optimize costs by shifting Gemini model defaults to the
FINAL RECOMMENDATION
Approved with Minor Changes
DEFINITION OF DONE
To finalize this PR, the following tasks must be completed:
Review automatically published via RepoAuditor.
…led ai reviews
- Adds logic to `scripts/lib/codeReviewOrchestrator.ts` to skip appending the error stack trace to the markdown output if an AI review fails completely due to limits.
- Extracts token bounds (`10000`, `15000`, `1000000`) into descriptive constants across `geminiCodeReviewClient.ts`, `geminiVisualReviewClient.ts`, and `modelPicker.ts`.
- Centralizes model selection by introducing `src/lib/geminiModelConfig.ts` for the frontend and `get_gemini_model()` in `dev-tools/utils.py` for Python tools, removing the hardcoded fallback strings from the fetch calls.
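One way the orchestrator change could work is sketched below, assuming each review yields either a markdown body or an error message. `ReviewResult`, `isLimitError`, `renderReviews`, and the matching pattern are all hypothetical names for illustration. The actual logic in `scripts/lib/codeReviewOrchestrator.ts` may differ.

```typescript
// Hypothetical result shape for one review category.
interface ReviewResult {
  category: string; // e.g. "ARCHITECTURE"
  body?: string;    // markdown feedback when the review succeeded
  error?: string;   // error message when the review failed
}

// Assumption: limit failures are recognizable from the error text,
// as in the 429 messages posted earlier in this thread.
const LIMIT_ERROR_PATTERN = /\b429\b|too many requests|rate limit|spending cap/i;

function isLimitError(message: string): boolean {
  return LIMIT_ERROR_PATTERN.test(message);
}

// Renders successful reviews in full, keeps non-limit errors visible,
// and collapses limit failures into a single short note.
function renderReviews(results: ReviewResult[]): string {
  const sections: string[] = [];
  const skipped: string[] = [];
  for (const result of results) {
    if (result.error !== undefined) {
      if (isLimitError(result.error)) {
        skipped.push(result.category);
      } else {
        sections.push(`[${result.category}] Review\nError: ${result.error}`);
      }
      continue;
    }
    sections.push(`[${result.category}] Review\n${result.body ?? ""}`);
  }
  if (skipped.length > 0) {
    sections.push(`Skipped (provider limit reached): ${skipped.join(", ")}`);
  }
  return sections.join("\n\n");
}
```

With this shape, four identical spending-cap failures would produce one "Skipped" line rather than four copies of the provider error.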
Changes Gemini model selection defaults across the repository to prioritize `lite` tier (or `gemini-3.1-flash-lite`) instead of standard `flash` or `pro` models to reduce cost. Updates the internal pickers, code/visual review clients, and raw API call endpoints in both TypeScript and Python.

PR created automatically by Jules for task 5137777364521997060 started by @arii