
feat: v2.0.1 – OAuth wrappers, Antigravity, local worker GPU, budget limits, anomaly detection#197

Merged
typelicious merged 14 commits into main from shell-parity
Apr 4, 2026

Conversation

@typelicious
Collaborator

Summary

  • OAuth wrapper for managed providers: device-code flows for Google, Qwen, Antigravity; token store and generic backend; claude_code_oauth() from local claude CLI
  • Antigravity provider: full registry + catalog + lane-registry integration (ag/ model family: Claude Opus/Sonnet 4.6, Gemini 3.x)
  • Shell parity & config workflows: faigate-config CLI (preview/diff/apply/validate), CLI deep-links to dashboard, scope suggestions
  • Local worker completion: GPU/VRAM metrics from Ollama + vLLM, dynamic model enumeration, Grid config reader (~/.faigrid/config.json)
  • Client profile budget limits: cost_limit_usd_day / cost_limit_usd_month in profile config; HTTP 429 enforcement before routing
  • Anomaly detection: MetricsStore.get_anomalies() detects error rate / latency / cost / traffic spikes vs. rolling baseline; GET /api/alerts endpoint
  • Complete provider coverage: 41 curated providers, KiloCode model-level lanes, local worker examples in config.yaml

Test plan

  • faigate-config validate on existing config.yaml passes
  • faigate-config discover detects local workers (Ollama if running)
  • GET /api/alerts returns {"anomalies": [], "count": 0, ...} on fresh instance
  • Client profile with cost_limit_usd_day: 0.001 triggers HTTP 429 after spending exceeds limit
  • faigate catalog shows google-antigravity and google-gemini-cli entries
  • Local worker discovery shows dynamic_models: true and gpu_info when Ollama running

🤖 Generated with Claude Code

André Lange and others added 7 commits April 3, 2026 04:45
- Shell parity and intelligence: CLI deep-links, suggestions, config workflows
- Local worker auto-discovery CLI (faigate-config discover)
- Complete provider coverage: all LLM AI Router custom endpoints now in catalog
- Added missing providers: xAI, Z.AI, Mistral, Groq, HuggingFace, MoonshotAI, MiniMax, Volcano Engine, BytePlus, Qwen, OpenAI Codex, OpenCode Zen, Cerebras, GitHub Copilot, Synthetic, Kimi Coding, Vercel AI Gateway
- KiloCode model-level access: individual catalog entries for kilo-auto/frontier, kilo-auto/balanced, kilo-auto/free
- Enhanced provider catalog: 43 curated provider entries (up from 17)
- Local worker examples and generic provider templates in config.yaml
- Updated roadmap and changelog for v2.0.0 release
- GitHub issues created for v2.1.0 OAuth wrapper functionality
- Token store with encrypted JSON storage
- Generic OAuth backend wrapping existing providers
- Provider factory integration (backend=oauth)
- CLI helper stub with Google ADC support
- Config.yaml examples for qwen-portal, claude-code, openai-codex
- Optional dependencies for OAuth (requests, google-auth)
- Updated roadmap and changelog
…v2.1.0)

- Add google-antigravity provider to registry, catalog, and lane registry
  with ag/ model family (Claude Opus/Sonnet 4.6, Gemini 3.x variants)
- Rename google-vertex → google-gemini-cli in registry and catalog
- Implement claude_code_oauth() reading token from ~/.config/claude/settings.json
- Add google_oauth_device_flow() for interactive OAuth flows (Gemini, Antigravity)
- Add antigravity provider config example to config.yaml

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…anomaly detection

Local worker completion:
- Add GpuInfo TypedDict and dynamic_models field to DiscoveredWorker
- Probe GPU/VRAM metrics from Ollama (/api/ps) and vLLM (/metrics)
- Complete discover_grid_workers: reads ~/.faigrid/config.json (JSON format)
  with fallback to legacy key=value state file
- Surface dynamically enumerated models in generate_provider_config
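
The Ollama side of the GPU/VRAM probe can be sketched roughly as follows. This is a minimal illustration that works on an already-decoded `/api/ps` JSON body; the `GpuInfo` fields and the `gpu_info_from_ollama_ps` helper are hypothetical and are not taken from this PR. Only the `models`, `name`, `size`, and `size_vram` keys come from Ollama's response format.

```python
from typing import TypedDict


class GpuInfo(TypedDict):
    """Hypothetical shape; the PR's actual GpuInfo fields may differ."""

    loaded_models: list[str]
    vram_used_bytes: int
    fully_on_gpu: bool


def gpu_info_from_ollama_ps(payload: dict) -> GpuInfo:
    """Summarize a decoded Ollama /api/ps response body."""
    models = payload.get("models", [])
    return GpuInfo(
        loaded_models=[m.get("name", "") for m in models],
        # size_vram is the number of bytes of each loaded model resident in VRAM.
        vram_used_bytes=sum(int(m.get("size_vram", 0)) for m in models),
        # Treat a worker as fully offloaded only if every loaded model fits in VRAM.
        fully_on_gpu=bool(models)
        and all(m.get("size_vram", 0) >= m.get("size", 0) for m in models),
    )
```

With no models loaded the sketch reports zero VRAM use and `fully_on_gpu` as `False`, which keeps an idle worker from looking GPU-backed.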

Enhanced client profiles – budget limits:
- Add cost_limit_usd_day and cost_limit_usd_month as optional profile fields
- Config validation normalizes and type-checks limit values
- Budget enforcement in main.py: returns HTTP 429 before routing when
  daily or monthly spend threshold is reached (checks /v1/chat and image routes)
- New MetricsStore.get_client_cost_since() for efficient spend queries
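
The pre-routing budget check might look something like the sketch below. The `budget_exceeded` helper is hypothetical; only the `cost_limit_usd_day` and `cost_limit_usd_month` field names come from this PR. It assumes the spend totals have already been fetched (for example via `MetricsStore.get_client_cost_since()`) and that a limit counts as reached when spend is greater than or equal to it.

```python
from typing import Optional


def budget_exceeded(
    profile: dict,
    spent_day_usd: float,
    spent_month_usd: float,
) -> Optional[str]:
    """Return the name of the first limit that has been reached, else None."""
    checks = (
        ("cost_limit_usd_day", spent_day_usd),
        ("cost_limit_usd_month", spent_month_usd),
    )
    for field, spent in checks:
        limit = profile.get(field)
        # Both limits are optional; a missing field means "no cap".
        if limit is not None and spent >= float(limit):
            return field
    return None
```

A request handler would call this before routing and respond with HTTP 429 whenever the result is not `None`.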

Observability – anomaly detection:
- New MetricsStore.get_anomalies(): detects error rate spikes, latency spikes,
  cost spikes, and traffic spikes vs. rolling baseline window
- New GET /api/alerts endpoint with lookback_hours and baseline_hours params
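
One way to compare a lookback window against a rolling baseline is a simple ratio test, sketched below. The function names, metric keys, and the 2x threshold are assumptions made for illustration; the PR does not state how `get_anomalies()` scores a spike.

```python
from typing import Optional


def spike_ratio(recent: float, baseline: float, threshold: float = 2.0) -> Optional[float]:
    """Return recent/baseline when it meets the threshold, else None."""
    if baseline <= 0:
        return None  # no baseline yet, e.g. on a fresh instance
    ratio = recent / baseline
    return ratio if ratio >= threshold else None


def find_anomalies(recent: dict, baseline: dict, threshold: float = 2.0) -> list:
    """Check the four spike types named above against their baselines."""
    anomalies = []
    for metric in ("error_rate", "latency_ms", "cost_usd", "requests"):
        ratio = spike_ratio(recent.get(metric, 0.0), baseline.get(metric, 0.0), threshold)
        if ratio is not None:
            anomalies.append({"metric": metric, "ratio": round(ratio, 2)})
    return anomalies
```

Returning no anomalies when the baseline is empty matches the test plan's expectation of an empty `anomalies` list on a fresh instance.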

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mark all four v2.1.0 themes as implemented with details on what shipped
and what was explicitly deferred (lifecycle hooks, policy UI, Prometheus export).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment thread faigate/oauth/backend.py Fixed
…dentials

- Replace placeholder Qwen device flow with real implementation:
  - Correct endpoints: chat.qwen.ai/api/v1/oauth2/{device/code,token}
  - Official client_id from qwen-code source (f0304373...)
  - Scope: openid profile email model.completion
- New qwen_oauth(): reads ~/.qwen/oauth_creds.json (shared with qwen-code CLI)
  - Dynamic base_url from resource_url field (portal.qwen.ai → compatible-mode/v1)
  - Expiry warning with refresh guidance
  - Fallback to dashscope.aliyuncs.com if no resource_url
- New qwen_refresh(): refresh token flow, writes back to ~/.qwen/oauth_creds.json
- qwen_device_code_flow(): stores token to shared ~/.qwen/oauth_creds.json (mode 0o600)
- CLI: faigate-auth qwen-portal reads existing creds or starts device flow;
  --refresh flag triggers token refresh
- Update registry: correct base_url, base_url_env, model (coder-model)
- Update provider_catalog: correct model, source URL, notes
- Update config.yaml: accurate setup instructions replacing placeholder comments
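
Writing the shared credentials file with mode 0o600 can be sketched as below. The `write_credentials` helper is hypothetical and assumes a POSIX filesystem; it is not the PR's implementation.

```python
import json
import os
from pathlib import Path


def write_credentials(path: Path, creds: dict) -> None:
    """Write creds as JSON with owner-only read/write permissions."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Creating the file with 0o600 avoids a window where it is readable by others.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(creds, fh, indent=2)
    # The mode passed to os.open only applies on creation, so tighten existing files too.
    os.chmod(path, 0o600)
```

Setting the mode at creation time, rather than writing first and calling `chmod` afterwards, means the tokens are never on disk with looser permissions.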

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment thread faigate/oauth/cli.py Fixed
André Lange and others added 4 commits April 4, 2026 17:27
…tegration

- Extract real OAuth parameters from LLM AI Router connect URL:
  client_id: 1071006060591-tmhssin2h21lcre235vtolojh4g403ep.apps.googleusercontent.com
  scopes: cloud-platform, userinfo.email, userinfo.profile, cclog, experimentsandconfigs
- antigravity_oauth(): reads ~/.gemini/oauth_creds.json (shared with Antigravity IDE)
  - Expiry warning with refresh guidance
  - base_url from ANTIGRAVITY_BASE_URL env var (requires network discovery)
- antigravity_refresh(): token refresh via oauth2.googleapis.com, preserves existing fields
- antigravity_login(): full Authorization Code + PKCE flow
  - Generates code_verifier + S256 code_challenge
  - Opens browser to Google consent screen
  - Local HTTP server on :8080 captures callback
  - State parameter validated (CSRF protection)
  - Writes credentials to ~/.gemini/oauth_creds.json (mode 0o600)
- CLI: faigate-auth google-antigravity reads existing creds or starts browser login;
  --refresh flag triggers token refresh without browser
- Update registry: base_url_env=ANTIGRAVITY_BASE_URL, document pending discovery
- Update catalog: real client_id, correct signup_url, observed evidence_level
- Update config.yaml: accurate setup instructions, document endpoint discovery process
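
The `code_verifier` and S256 `code_challenge` step of the PKCE flow can be sketched with the standard library alone. The helper names here are hypothetical; the derivation follows RFC 7636 (the challenge is the base64url-encoded SHA-256 of the verifier, without padding).

```python
import base64
import hashlib
import secrets


def s256_challenge(verifier: str) -> str:
    """Derive the S256 code_challenge for a given code_verifier."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # Base64url without '=' padding, as RFC 7636 requires.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Return a fresh (code_verifier, code_challenge) pair."""
    # 32 random bytes yield a 43-character verifier (RFC 7636 allows 43 to 128).
    verifier = secrets.token_urlsafe(32)
    return verifier, s256_challenge(verifier)
```

The challenge goes into the authorization URL and the verifier is sent only with the token exchange. The `state` value used for CSRF protection can be generated in the same way, for example with `secrets.token_urlsafe(16)`.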

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Network discovery confirmed: Antigravity's client interface is a local
ephemeral gRPC language server
(127.0.0.1:<port>/exa.language_server_pb.LanguageServerService/…), not a
remote inference endpoint. The Google OAuth token grants direct access to
the Google Generative Language API.

- registry.py: set base_url to generativelanguage.googleapis.com/v1beta/openai
- config.yaml: same default, document gRPC LS discovery finding
- oauth/cli.py: remove "unknown endpoint" warning, use default base_url
- provider_catalog.py: document gRPC LS fact, update recommended_model

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds DeepSeek, Together AI, Fireworks AI, Cohere, Nebius AI,
SiliconFlow, Hyperbolic, Perplexity, and NVIDIA NIM to the
provider registry. DeepSeek was already active in config.yaml
(deepseek-chat/reasoner) but missing from the registry as a
first-class entry.

All 9 are added to BUILTIN in registry.py with correct base URLs,
api_key_env vars, and pricing. config.yaml gets commented-out
stubs ready to activate.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- backend.py: remove token_endpoint from log line (derived from tainted
  token_data, CodeQL correctly flags it as potentially sensitive)
- cli.py: redact access_token/refresh_token/id_token in stdout output;
  tokens are written to credential files and should not be printed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment thread faigate/oauth/cli.py Fixed
Only output non-sensitive metadata (base_url, scope, expiry); tokens
are not printed to stdout under any code path. CodeQL taint analysis
followed token_data values through the dict comprehension even when
they were redacted via a conditional, so this switches to an explicit
allowlist of safe keys.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment thread faigate/oauth/cli.py Fixed
CodeQL taint analysis marks every value from token_data as sensitive
(because the dict contains access_token/refresh_token from HTTP responses),
regardless of which key is accessed. The only compliant solution is to
print nothing derived from token_data. Auth functions already write tokens
to the credentials file; CLI now prints only a static success message.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@typelicious typelicious merged commit 4a1fb06 into main Apr 4, 2026
7 checks passed
@typelicious typelicious deleted the shell-parity branch April 4, 2026 16:01

2 participants