feat: add NVIDIA NIM (has free tier), Google Gemini (has free tier), local vLLM + embed providers by Marinski · Pull Request #1 · psyb0t/aigate

Marinski · 2026-06-01T11:19:43Z

Description

Adds four new provider fragments for models accessible outside the existing free-tier flags, wiring them into the build system, tests, docs, and .env.example.

Why

NVIDIA NIM (build.nvidia.com) and Google Gemini (aistudio.google.com) both offer free, rate-limited tiers, similar to Groq, Cerebras, and OpenRouter, which aigate already supports. Together they unlock 13 more models at no cost (7 on NVIDIA NIM, 6 on Gemini, counting one embedding model each), including reasoning (kimi-k2), finance-specialized (palmyra), large vision (llama-3.2-90b), MoE code models (qwen3-coder), and Google's latest Gemini 2.5/3 series.

Local vLLM and Local Embed are optional placeholders for users who self-host their own inference containers. They are wired with `custom_llm_provider: openai`, so no LiteLLM code changes are needed.
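As a rough illustration of the OpenAI-compatible wiring described above, the sketch below builds one LiteLLM `model_list` entry in Python and prints it as JSON. The alias `local-gemma`, the upstream model name, and the `LOCAL_VLLM_GEMMA_API_BASE` variable are assumptions made up for this example; only `LOCAL_VLLM_API_KEY` and `custom_llm_provider: openai` come from this PR, and the actual fragment in `providers/vllm-local.yaml` may be shaped differently.

```python
import json


def openai_compatible_entry(alias, upstream_model, api_base_var, api_key_var):
    """Build one LiteLLM model_list entry for an OpenAI-compatible server.

    The "os.environ/NAME" strings are LiteLLM's config syntax for reading a
    value from the environment when the config is loaded.
    """
    return {
        "model_name": alias,
        "litellm_params": {
            "model": upstream_model,
            "custom_llm_provider": "openai",
            "api_base": f"os.environ/{api_base_var}",
            "api_key": f"os.environ/{api_key_var}",
        },
    }


entry = openai_compatible_entry(
    alias="local-gemma",                       # hypothetical alias
    upstream_model="gemma",                    # hypothetical served model name
    api_base_var="LOCAL_VLLM_GEMMA_API_BASE",  # hypothetical variable name
    api_key_var="LOCAL_VLLM_API_KEY",          # named in this PR
)
print(json.dumps(entry, indent=2))
```

Because the entry only carries a base URL and a key, any server that speaks the OpenAI API shape could presumably be swapped in without touching LiteLLM itself.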

New providers

| Provider | File | Models | Auth | Tier |
| --- | --- | --- | --- | --- |
| NVIDIA NIM | `providers/nvidia.yaml` | 7 | `NVIDIA_API_KEY` + `NVIDIA_API_BASE` | free-rate-limited |
| Google Gemini | `providers/gemini.yaml` | 6 | `GEMINI_API_KEY` | free-rate-limited |
| Local vLLM | `providers/vllm-local.yaml` | 2 | `LOCAL_VLLM_API_KEY` + per-model base URLs | self-hosted |
| Local Embed | `providers/embed-local.yaml` | 1 | `EMBED_LOCAL_API_BASE` | self-hosted |

Build integration

- `build-config.py`: registers nvidia, gemini, vllm-local, embed-local in `active_providers()`
- `.env.example`: adds NVIDIA, GEMINI, VLLM_LOCAL, EMBED_LOCAL flags + credential variables
- `recommend-limits.sh`: detects the four new flags for the enabled summary line
- `tests/test_litellm.sh`: gated EXPECTED_MODELS blocks for all four providers
- `docs/providers.md`: provider documentation tables with model aliases, underlying models, and notes
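The flag-to-provider registration listed above can be sketched roughly as follows. This is not the code from `build-config.py`: the function name `active_new_providers` and the set of values treated as "enabled" are assumptions for illustration, while the flag names and provider names are the ones given in this PR.

```python
import os

# Flag name -> provider fragment name. Both columns are taken from this PR.
NEW_PROVIDER_FLAGS = {
    "NVIDIA": "nvidia",
    "GEMINI": "gemini",
    "VLLM_LOCAL": "vllm-local",
    "EMBED_LOCAL": "embed-local",
}

# Assumption: these spellings count as "enabled"; the real script may differ.
TRUTHY = {"1", "true", "yes", "on"}


def active_new_providers(env=None):
    """Return the new providers whose flag is set to a truthy value."""
    env = os.environ if env is None else env
    return [
        provider
        for flag, provider in NEW_PROVIDER_FLAGS.items()
        if str(env.get(flag, "")).strip().lower() in TRUTHY
    ]


print(active_new_providers({"NVIDIA": "1", "GEMINI": "0", "EMBED_LOCAL": "true"}))
# → ['nvidia', 'embed-local']
```

The same per-flag gating idea would apply to the `EXPECTED_MODELS` blocks in the test script: a provider's models are only expected when its flag is on.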

Notes

- NVIDIA's free tier is rate-limited. Set `NVIDIA_API_BASE` to https://integrate.api.nvidia.com/v1 (the free-tier endpoint). This PR adds only a handful of models; many more can be used within NVIDIA's free tier.
- `nvidia-kimi-k2` uses `moonshotai/kimi-k2-thinking`, which hit EOL on 2026-05-12. It is included as a placeholder; swap in a replacement model when one is available.
- Local providers assume existing Docker containers at the documented ports (default API bases use the Docker host gateway, 172.17.0.1).
- No changes to `Makefile`, `docker-compose.yml`, or existing provider fragments.
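To make the default API base behavior for the local providers concrete, here is a minimal sketch of how an override-or-default lookup might work. The helper `resolve_api_base`, the `http` scheme, and the `/v1` path suffix are assumptions; the gateway address 172.17.0.1, the `EMBED_LOCAL_API_BASE` variable, and port 8010 for the embed container are taken from this PR.

```python
DOCKER_HOST_GATEWAY = "172.17.0.1"  # default gateway named in the notes


def resolve_api_base(env, var_name, default_port):
    """Return env[var_name] if set, else a default on the Docker host gateway."""
    value = env.get(var_name, "").strip()
    if value:
        # Normalize a trailing slash so later path joins stay predictable.
        return value.rstrip("/")
    # Assumption: the local servers expose an OpenAI-style /v1 prefix over HTTP.
    return f"http://{DOCKER_HOST_GATEWAY}:{default_port}/v1"


print(resolve_api_base({}, "EMBED_LOCAL_API_BASE", 8010))
# → http://172.17.0.1:8010/v1
print(resolve_api_base(
    {"EMBED_LOCAL_API_BASE": "http://10.0.0.5:9000/v1/"},
    "EMBED_LOCAL_API_BASE",
    8010,
))
# → http://10.0.0.5:9000/v1
```

On hosts where the Docker bridge is not 172.17.0.1, setting the variable explicitly would be the safer route.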

Add four new provider fragments and wire them into the build system:

- **NVIDIA NIM** (`nvidia.yaml`): 7 models via api.nvidia.com (kimi-k2, palmyra-fin-70b, llama-3.2-90b, qwen3-80b, qwen3-coder, deepseek-v3.2, nv-embedqa-e5-v5)
- **Google Gemini** (`gemini.yaml`): 6 models via Gemini API (2.5-pro, 2.5-flash, 2.5-flash-lite, 3-flash-preview, 3.1-flash-lite-preview, embedding-001)
- **Local vLLM** (`vllm-local.yaml`): 2 existing Docker vLLM instances (Gemma 4 on :8000, Qwen 3.6 on :8001)
- **Local Embed** (`embed-local.yaml`): Nomic Embed v2 on :8010

Build integration:

- build-config.py: register all 4 providers in active_providers()
- .env.example: add flags (NVIDIA, GEMINI, VLLM_LOCAL, EMBED_LOCAL) and credential variables
- recommend-limits.sh: detect new flags for enabled summary
- tests/test_litellm.sh: add gated EXPECTED_MODELS blocks
- docs/providers.md: document all 4 provider sections

Marinski wants to merge 1 commit into psyb0t:main from Marinski:feat/nvidia-gemini-providers
