feat: v2.0.1 – OAuth wrappers, Antigravity, local worker GPU, budget limits, anomaly detection #197
Merged
- Shell parity and intelligence: CLI deep-links, suggestions, config workflows
- Local worker auto-discovery CLI (faigate-config discover)
- Complete provider coverage: all LLM AI Router custom endpoints now in catalog
- Added missing providers: xAI, Z.AI, Mistral, Groq, HuggingFace, MoonshotAI, MiniMax, Volcano Engine, BytePlus, Qwen, OpenAI Codex, OpenCode Zen, Cerebras, GitHub Copilot, Synthetic, Kimi Coding, Vercel AI Gateway
- KiloCode model-level access: individual catalog entries for kilo-auto/frontier, kilo-auto/balanced, kilo-auto/free
- Enhanced provider catalog: 43 curated provider entries (up from 17)
- Local worker examples and generic provider templates in config.yaml
- Updated roadmap and changelog for v2.0.0 release
- GitHub issues created for v2.1.0 OAuth wrapper functionality
- Token store with encrypted JSON storage
- Generic OAuth backend wrapping existing providers
- Provider factory integration (backend=oauth)
- CLI helper stub with Google ADC support
- Config.yaml examples for qwen-portal, claude-code, openai-codex
- Optional dependencies for OAuth (requests, google-auth)
- Updated roadmap and changelog
…v2.1.0)
- Add google-antigravity provider to registry, catalog, and lane registry with ag/ model family (Claude Opus/Sonnet 4.6, Gemini 3.x variants)
- Rename google-vertex → google-gemini-cli in registry and catalog
- Implement claude_code_oauth() reading token from ~/.config/claude/settings.json
- Add google_oauth_device_flow() for interactive OAuth flows (Gemini, Antigravity)
- Add antigravity provider config example to config.yaml

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…anomaly detection

Local worker completion:
- Add GpuInfo TypedDict and dynamic_models field to DiscoveredWorker
- Probe GPU/VRAM metrics from Ollama (/api/ps) and vLLM (/metrics)
- Complete discover_grid_workers: reads ~/.faigrid/config.json (JSON format) with fallback to legacy key=value state file
- Surface dynamically enumerated models in generate_provider_config

Enhanced client profiles – budget limits:
- Add cost_limit_usd_day and cost_limit_usd_month as optional profile fields
- Config validation normalizes and type-checks limit values
- Budget enforcement in main.py: returns HTTP 429 before routing when daily or monthly spend threshold is reached (checks /v1/chat and image routes)
- New MetricsStore.get_client_cost_since() for efficient spend queries

Observability – anomaly detection:
- New MetricsStore.get_anomalies(): detects error rate spikes, latency spikes, cost spikes, and traffic spikes vs. rolling baseline window
- New GET /api/alerts endpoint with lookback_hours and baseline_hours params

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
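The spike detection described in the commit above can be illustrated with a minimal sketch. It compares per-window aggregates for a recent lookback window against a longer baseline window and flags any metric that exceeds the baseline by a fixed factor. The `WindowStats` shape, the `find_anomalies` name, and the 2x threshold are assumptions for illustration; they are not taken from `MetricsStore.get_anomalies()`.

```python
from dataclasses import dataclass


@dataclass
class WindowStats:
    """Aggregated metrics for one time window (hypothetical shape)."""
    requests: int
    errors: int
    total_latency_ms: float
    total_cost_usd: float
    hours: float

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0

    @property
    def avg_latency_ms(self) -> float:
        return self.total_latency_ms / self.requests if self.requests else 0.0

    @property
    def cost_per_hour(self) -> float:
        return self.total_cost_usd / self.hours if self.hours else 0.0

    @property
    def requests_per_hour(self) -> float:
        return self.requests / self.hours if self.hours else 0.0


# One entry per spike type named in the commit message.
METRICS = ("error_rate", "avg_latency_ms", "cost_per_hour", "requests_per_hour")


def find_anomalies(recent: WindowStats, baseline: WindowStats, factor: float = 2.0) -> dict:
    """Flag metrics where the recent window exceeds factor x the baseline."""
    anomalies = []
    for kind in METRICS:
        now, base = getattr(recent, kind), getattr(baseline, kind)
        # Skip metrics with an empty baseline: there is nothing to compare against.
        if base > 0 and now > factor * base:
            anomalies.append({"kind": kind, "recent": now, "baseline": base})
    return {"anomalies": anomalies, "count": len(anomalies)}
```

Normalizing cost and traffic per hour lets windows of different lengths (for example, a 1-hour lookback against a 24-hour baseline) be compared directly.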
Mark all four v2.1.0 themes as implemented with details on what shipped and what was explicitly deferred (lifecycle hooks, policy UI, Prometheus export).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…dentials
- Replace placeholder Qwen device flow with real implementation:
  - Correct endpoints: chat.qwen.ai/api/v1/oauth2/{device/code,token}
  - Official client_id from qwen-code source (f0304373...)
  - Scope: openid profile email model.completion
- New qwen_oauth(): reads ~/.qwen/oauth_creds.json (shared with qwen-code CLI)
  - Dynamic base_url from resource_url field (portal.qwen.ai → compatible-mode/v1)
  - Expiry warning with refresh guidance
  - Fallback to dashscope.aliyuncs.com if no resource_url
- New qwen_refresh(): refresh token flow, writes back to ~/.qwen/oauth_creds.json
- qwen_device_code_flow(): stores token to shared ~/.qwen/oauth_creds.json (mode 0o600)
- CLI: faigate-auth qwen-portal reads existing creds or starts device flow; --refresh flag triggers token refresh
- Update registry: correct base_url, base_url_env, model (coder-model)
- Update provider_catalog: correct model, source URL, notes
- Update config.yaml: accurate setup instructions replacing placeholder comments
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
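As a rough illustration of the credential handling in this commit, the sketch below writes a credentials file with mode 0o600, reads it back, flags expiry, and derives a base URL with a fallback. The function names, the `expiry_date` field (assumed to be epoch milliseconds), the `https://{resource_url}/v1` mapping, and the full fallback URL are all assumptions; only the file mode, the `resource_url` field, and the dashscope.aliyuncs.com fallback host come from the commit message.

```python
import json
import os
import time
from pathlib import Path
from typing import Optional

# Assumed full form of the fallback host named in the commit message.
FALLBACK_BASE_URL = "https://dashscope.aliyuncs.com/compatible-mode/v1"


def save_creds(path: Path, creds: dict) -> None:
    """Write credentials so that only the owner can read or write them."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the file with 0o600 from the start instead of chmod-ing afterwards.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(creds, fh)
    # The mode argument above is ignored for pre-existing files, so enforce it.
    os.chmod(path, 0o600)


def load_creds(path: Path, now: Optional[float] = None) -> dict:
    """Return the derived base_url and whether the stored token has expired."""
    creds = json.loads(path.read_text(encoding="utf-8"))
    now = time.time() if now is None else now
    expires_at = creds.get("expiry_date", 0) / 1000  # assumption: epoch milliseconds
    resource_url = creds.get("resource_url")
    # Assumption: the resource host serves an OpenAI-compatible API under /v1.
    base_url = f"https://{resource_url}/v1" if resource_url else FALLBACK_BASE_URL
    return {"base_url": base_url, "expired": expires_at <= now}
```

A caller would typically print refresh guidance when `expired` is true rather than failing outright, which matches the "expiry warning" behaviour listed above.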
…tegration
- Extract real OAuth parameters from LLM AI Router connect URL:
  - client_id: 1071006060591-tmhssin2h21lcre235vtolojh4g403ep.apps.googleusercontent.com
  - scopes: cloud-platform, userinfo.email, userinfo.profile, cclog, experimentsandconfigs
- antigravity_oauth(): reads ~/.gemini/oauth_creds.json (shared with Antigravity IDE)
  - Expiry warning with refresh guidance
  - base_url from ANTIGRAVITY_BASE_URL env var (requires network discovery)
- antigravity_refresh(): token refresh via oauth2.googleapis.com, preserves existing fields
- antigravity_login(): full Authorization Code + PKCE flow
  - Generates code_verifier + S256 code_challenge
  - Opens browser to Google consent screen
  - Local HTTP server on :8080 captures callback
  - State parameter validated (CSRF protection)
  - Writes credentials to ~/.gemini/oauth_creds.json (mode 0o600)
- CLI: faigate-auth google-antigravity reads existing creds or starts browser login; --refresh flag triggers token refresh without browser
- Update registry: base_url_env=ANTIGRAVITY_BASE_URL, document pending discovery
- Update catalog: real client_id, correct signup_url, observed evidence_level
- Update config.yaml: accurate setup instructions, document endpoint discovery process

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
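The PKCE portion of the login flow above (code_verifier, S256 code_challenge, state check) follows RFC 7636 and can be sketched with the standard library alone. The helper names are hypothetical and this is not the code from `antigravity_login()`; it only shows how the two values relate and how a callback's state parameter might be compared.

```python
import base64
import hashlib
import secrets


def _b64url(raw: bytes) -> str:
    """Base64url-encode without '=' padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_verifier(n_bytes: int = 32) -> str:
    """High-entropy random verifier; 32 random bytes encode to 43 characters."""
    return _b64url(secrets.token_bytes(n_bytes))


def make_code_challenge(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(ASCII(verifier)))."""
    return _b64url(hashlib.sha256(verifier.encode("ascii")).digest())


def state_matches(expected: str, received: str) -> bool:
    """Constant-time comparison of the state echoed back on the callback."""
    return secrets.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

The client sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code is useless without the verifier.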
Network discovery confirmed: Antigravity's client interface is a local ephemeral gRPC language server (127.0.0.1:<port>/exa.language_server_pb.LanguageServerService/…), not a remote inference endpoint. The Google OAuth token grants direct access to the Google Generative Language API.

- registry.py: set base_url to generativelanguage.googleapis.com/v1beta/openai
- config.yaml: same default, document gRPC LS discovery finding
- oauth/cli.py: remove "unknown endpoint" warning, use default base_url
- provider_catalog.py: document gRPC LS fact, update recommended_model

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds DeepSeek, Together AI, Fireworks AI, Cohere, Nebius AI, SiliconFlow, Hyperbolic, Perplexity, and NVIDIA NIM to the provider registry. DeepSeek was already active in config.yaml (deepseek-chat/reasoner) but missing from the registry as a first-class entry. All 9 are added to BUILTIN in registry.py with correct base URLs, api_key_env vars, and pricing. config.yaml gets commented-out stubs ready to activate.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- backend.py: remove token_endpoint from log line (derived from tainted token_data, CodeQL correctly flags it as potentially sensitive)
- cli.py: redact access_token/refresh_token/id_token in stdout output; tokens are written to credential files and should not be printed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Only output non-sensitive metadata (base_url, scope, expiry); tokens are not printed to stdout under any code path. CodeQL taint analysis followed token_data values through the dict comprehension even when they were redacted via a conditional, so this switches to an explicit allowlist of safe keys.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CodeQL taint analysis marks every value from token_data as sensitive (because the dict contains access_token/refresh_token from HTTP responses), regardless of which key is accessed. The only compliant solution is to print nothing derived from token_data. Auth functions already write tokens to the credentials file; CLI now prints only a static success message.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
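The last two commits describe two ways of keeping tokens out of stdout, which the sketch below contrasts. Both function names and the exact key names in `SAFE_KEYS` are assumptions for illustration. The first function is the allowlist approach; the second takes no token-derived input at all, which is the shape the final commit settles on.

```python
# Assumed key names for the metadata listed in the commit (base_url, scope, expiry).
SAFE_KEYS = ("base_url", "scope", "expiry")


def safe_metadata(token_data: dict) -> dict:
    """Allowlist approach: copy only keys known to be non-sensitive."""
    return {key: token_data[key] for key in SAFE_KEYS if key in token_data}


def success_message(provider: str, creds_path: str) -> str:
    """Static approach: the message is built without touching token_data."""
    return f"{provider}: credentials saved to {creds_path}"
```

The allowlist is safe at runtime, but a taint tracker that treats the whole dict as sensitive still flags it, which is why the static message is the variant that passes analysis.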
Summary
- claude_code_oauth() from local claude CLI
- ag/ model family: Claude Opus/Sonnet 4.6, Gemini 3.x
- faigate-config CLI (preview/diff/apply/validate), CLI deep-links to dashboard, scope suggestions
- ~/.faigrid/config.json
- cost_limit_usd_day / cost_limit_usd_month in profile config; HTTP 429 enforcement before routing
- MetricsStore.get_anomalies() detects error rate / latency / cost / traffic spikes vs. rolling baseline; GET /api/alerts endpoint

Test plan
- faigate-config validate on existing config.yaml passes
- faigate-config discover detects local workers (Ollama if running)
- GET /api/alerts returns {"anomalies": [], "count": 0, ...} on fresh instance
- cost_limit_usd_day: 0.001 triggers HTTP 429 after spending exceeds limit
- faigate catalog shows google-antigravity and google-gemini-cli entries
- dynamic_models: true and gpu_info when Ollama running

🤖 Generated with Claude Code
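The budget item in the test plan can be pictured with a small pre-routing check. This is a sketch under assumptions: the function names are hypothetical, the spend totals are passed in directly rather than queried from `MetricsStore.get_client_cost_since()`, and a limit counts as reached when spend is greater than or equal to it. Only the two profile field names and the HTTP 429 status come from the PR.

```python
from typing import Optional

LIMIT_FIELDS = ("cost_limit_usd_day", "cost_limit_usd_month")


def budget_violation(profile: dict, spent_day: float, spent_month: float) -> Optional[str]:
    """Return a reason string if a configured limit is reached, else None."""
    for field, spent in zip(LIMIT_FIELDS, (spent_day, spent_month)):
        limit = profile.get(field)
        # Both fields are optional; a missing limit means no enforcement.
        if limit is not None and spent >= float(limit):
            return f"{field} reached: spent ${spent:.4f} of ${float(limit):.4f}"
    return None


def status_before_routing(profile: dict, spent_day: float, spent_month: float) -> int:
    """HTTP status to return before the request is routed to a provider."""
    return 429 if budget_violation(profile, spent_day, spent_month) else 200
```

Running the check before routing means a client that is over budget never incurs further provider cost, at the price of one spend query per request.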