9 changes: 9 additions & 0 deletions .env.example
@@ -12,3 +12,12 @@ LOG_RESPONSE_BODY=false
# Providers are managed via the Admin API:
# POST /admin/providers — register a provider (openai, openrouter, dashscope)
# POST /admin/models — map a model name to a provider

# ─── Single-instance mode (SQLite + in-memory cache) ───
# Uncomment the line below and remove the DATABASE_URL / REDIS_URL
# above to run without PostgreSQL or Redis.
#
# DATABASE_URL=sqlite:llm_gateway.db?mode=rwc
#
# When DATABASE_URL starts with "sqlite:", Redis is not required.
# An in-memory key/route cache is used instead.
1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion Cargo.toml
@@ -11,7 +11,7 @@ tower-http = { version = "0.6", features = ["cors", "trace"] }
tower = { version = "0.5" }

# Database
-sqlx = { version = "0.8", features = ["runtime-tokio", "tls-rustls", "postgres", "migrate", "uuid", "chrono"] }
+sqlx = { version = "0.8", features = ["runtime-tokio", "tls-rustls", "postgres", "sqlite", "migrate", "uuid", "chrono"] }

# Redis
redis = { version = "0.27", features = ["tokio-comp", "aio", "connection-manager"] }
1 change: 1 addition & 0 deletions Dockerfile
@@ -13,6 +13,7 @@ RUN cargo build --release && rm -rf src
# Build real binary
COPY src/ src/
COPY migrations/ migrations/
COPY migrations_sqlite/ migrations_sqlite/
RUN touch src/main.rs && cargo build --release

# ---- Runtime stage ----
51 changes: 40 additions & 11 deletions README.md
@@ -11,23 +11,29 @@ Routes OpenAI-compatible `/v1/chat/completions` requests to multiple upstream pr
- **User Key management** — Generate `sk-{uuid}` keys, rotate (old key instantly invalidated), soft-delete
- **Streaming** — Full SSE streaming passthrough for `stream: true` requests
- **Two-tier caching** — Redis (hot) for O(1) key validation & model routing, PostgreSQL (cold) for persistence
- **Single-instance mode** — Run with SQLite + in-memory cache when Redis and PostgreSQL are unavailable
- **Admin API** — Protected by a static admin key; manage providers, models, and user keys

## Architecture

```text
Client ──► Gateway (/v1/chat/completions) ──► Provider (OpenAI / OpenRouter / DashScope / Ark)
- ├─ User Key auth (Redis SET → PG fallback)
- ├─ Model resolution (Redis HASH → PG fallback)
+ ├─ User Key auth (Cache → DB fallback)
+ ├─ Model resolution (Cache → DB fallback)
└─ Request rewrite (model name) + proxy

Full mode: Cache = Redis, DB = PostgreSQL
Single-instance: Cache = In-memory, DB = SQLite
```
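
The "Cache → DB fallback" step in the diagram above can be sketched roughly as follows. This is a minimal, std-only illustration rather than the gateway's actual code: `KeyCache`, `KeyDb`, and `is_key_valid` are hypothetical names, and backfilling the cache after a database hit is an assumption about the intended behavior.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the cache tier (Redis or in-memory in the gateway).
pub struct KeyCache {
    hashes: HashSet<String>,
}

impl KeyCache {
    pub fn new() -> Self {
        Self { hashes: HashSet::new() }
    }

    pub fn contains(&self, key_hash: &str) -> bool {
        self.hashes.contains(key_hash)
    }
}

/// Hypothetical stand-in for the database tier (PostgreSQL or SQLite in the gateway).
pub struct KeyDb {
    active_hashes: HashSet<String>,
}

impl KeyDb {
    pub fn with_hashes(hashes: &[&str]) -> Self {
        Self {
            active_hashes: hashes.iter().map(|h| h.to_string()).collect(),
        }
    }
}

/// Checks the cache first and falls back to the database on a miss.
/// A database hit is written back to the cache (an assumption, not confirmed
/// by the diff) so that the next lookup for the same hash stays in the cache.
pub fn is_key_valid(cache: &mut KeyCache, db: &KeyDb, key_hash: &str) -> bool {
    if cache.hashes.contains(key_hash) {
        return true;
    }
    if db.active_hashes.contains(key_hash) {
        cache.hashes.insert(key_hash.to_string());
        return true;
    }
    false
}
```

The same shape would apply to model resolution, with a hash map from model name to route in place of the set.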

```text
src/
├── main.rs # Entrypoint: init, migrations, server
├── config.rs # Env-based configuration
- ├── state.rs # Shared AppState (PgPool, Redis, HttpClient)
+ ├── state.rs # Shared AppState (DbPool, Cache, HttpClient)
├── db.rs # DbPool enum (PgPool | SqlitePool) + query macros
├── cache.rs # Cache enum (Redis | InMemory)
├── error.rs # Unified error type → HTTP responses
├── middleware/
│ └── auth.rs # Admin key + User key auth middleware
@@ -41,17 +47,19 @@ src/
└── services/
├── key_service.rs # Key generation, hashing, validation, rotation
├── provider_service.rs # Provider CRUD
- └── model_service.rs # Model CRUD, route resolution, Redis cache
+ └── model_service.rs # Model CRUD, route resolution, cache
```

## Quick Start

### Prerequisites

- Rust 1.75+
- - Docker & Docker Compose (for PostgreSQL and Redis)
+ - Docker & Docker Compose (for PostgreSQL and Redis — **not needed in single-instance mode**)

### Option A: Full deployment (PostgreSQL + Redis)

- ### 1. Clone and configure
+ #### 1. Clone and configure

```bash
git clone <repo-url> && cd llm-gateway-rs
@@ -67,20 +75,40 @@ ADMIN_KEY=my-secret-admin-key
LISTEN_ADDR=0.0.0.0:8080
```

- ### 2. Start dependencies
+ #### 2. Start dependencies

```bash
docker compose up -d
```

- ### 3. Run the gateway
+ #### 3. Run the gateway

```bash
cargo run
```

The server starts on `http://localhost:8080`. Database migrations run automatically on startup.

### Option B: Single-instance mode (SQLite, no external dependencies)

For lightweight or local deployments, the gateway can run with SQLite and an in-memory cache — no PostgreSQL or Redis required.

```bash
git clone <repo-url> && cd llm-gateway-rs

# Configure for SQLite
export DATABASE_URL="sqlite:llm_gateway.db?mode=rwc"
export ADMIN_KEY="my-secret-admin-key"
# REDIS_URL is not needed — in-memory cache is used automatically

cargo run
```

A `llm_gateway.db` file will be created in the working directory with all tables.

> **Note:** Single-instance mode stores the key/model route cache in process memory.
> It is designed for single-process deployments. For multi-instance or HA setups, use PostgreSQL + Redis.

## Admin API

All admin endpoints require `Authorization: Bearer <ADMIN_KEY>`.
@@ -262,8 +290,8 @@ The gateway will:

| Variable | Required | Default | Description |
| -------- | -------- | ------- | ----------- |
- | `DATABASE_URL` | Yes | — | PostgreSQL connection string |
- | `REDIS_URL` | No | `redis://127.0.0.1:6379` | Redis connection string |
+ | `DATABASE_URL` | Yes | — | PostgreSQL connection string, or `sqlite:<path>?mode=rwc` for SQLite |
+ | `REDIS_URL` | No | `redis://127.0.0.1:6379` (PG mode) / *none* (SQLite mode) | Redis connection string. Omit for in-memory caching with SQLite |
| `ADMIN_KEY` | Yes | — | Secret key for admin API access |
| `LISTEN_ADDR` | No | `0.0.0.0:8080` | Server listen address |
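
The mode-dependent default for `REDIS_URL` in the table above could be resolved along these lines. This is a sketch under stated assumptions: `resolve_redis_url` and `DEFAULT_REDIS_URL` are hypothetical names, and honoring an explicitly set `REDIS_URL` even in SQLite mode is a guess that the diff does not confirm.

```rust
/// Default taken from the configuration table (PostgreSQL mode only).
pub const DEFAULT_REDIS_URL: &str = "redis://127.0.0.1:6379";

/// Returns the Redis URL to connect to, or `None` when the in-memory cache
/// should be used instead.
///
/// Assumption: an explicitly provided REDIS_URL always wins; the table only
/// states what happens when it is omitted.
pub fn resolve_redis_url(database_url: &str, redis_url: Option<&str>) -> Option<String> {
    match redis_url {
        Some(url) => Some(url.to_string()),
        // SQLite mode without REDIS_URL: no Redis, in-memory cache.
        None if database_url.starts_with("sqlite:") => None,
        // PostgreSQL mode without REDIS_URL: fall back to the documented default.
        None => Some(DEFAULT_REDIS_URL.to_string()),
    }
}
```

In a real entrypoint the two arguments would presumably come from `std::env::var("DATABASE_URL")` and `std::env::var("REDIS_URL").ok()`.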

@@ -272,7 +300,8 @@ The gateway will:
- **Key format**: `sk-{uuid v4}` — 39 characters, recognizable prefix
- **Key storage**: Only SHA-256 hashes stored; plaintext returned once on create/rotate (like GitHub PATs)
- **Redis strategy**: `SET` for key hashes (`SISMEMBER` O(1)), `HASH` for model routes (`HGET` O(1))
- - **Cache warm-up**: On startup, all active keys and model routes are loaded from PG into Redis
+ - **Single-instance mode**: When `DATABASE_URL` starts with `sqlite:`, an in-memory cache replaces Redis, and SQLite replaces PostgreSQL — zero external dependencies
+ - **Cache warm-up**: On startup, all active keys and model routes are loaded from DB into cache (Redis or in-memory)
- **Streaming**: Raw byte-stream passthrough — no SSE parsing, minimal latency
- **Provider API keys**: Stored in PG, listed with masked preview (`sk-x...xxxx`), never cached in plaintext outside the routing lookup
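
As a rough illustration of the `sk-{uuid v4}` key format listed above, a shape check might look like the following. `looks_like_user_key` is a hypothetical helper, not part of the gateway; it only verifies the prefix, the 39-character length, and the hyphen positions, and it does not check the UUID version bits or whether the key exists.

```rust
/// Returns true if `key` has the shape `sk-` followed by a 36-character
/// hyphenated UUID (8-4-4-4-12 hex digits), for 39 characters in total.
pub fn looks_like_user_key(key: &str) -> bool {
    let Some(uuid) = key.strip_prefix("sk-") else {
        return false;
    };
    if uuid.len() != 36 {
        return false;
    }
    // Hyphens sit at byte offsets 8, 13, 18 and 23; everything else is hex.
    uuid.bytes().enumerate().all(|(i, b)| match i {
        8 | 13 | 18 | 23 => b == b'-',
        _ => b.is_ascii_hexdigit(),
    })
}
```

A cheap check like this could reject malformed tokens before any hashing or cache lookup, although the diff does not show whether the gateway does so.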

39 changes: 39 additions & 0 deletions migrations_sqlite/001_init.sql
@@ -0,0 +1,39 @@
-- User keys: gateway-issued API keys for end users
CREATE TABLE IF NOT EXISTS user_keys (
    id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    key_hash TEXT NOT NULL,
    key_prefix TEXT NOT NULL,
    is_active INTEGER NOT NULL DEFAULT 1,
    created_at TEXT NOT NULL DEFAULT (datetime('now')),
    updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE UNIQUE INDEX IF NOT EXISTS idx_user_keys_key_hash ON user_keys (key_hash);
CREATE INDEX IF NOT EXISTS idx_user_keys_is_active ON user_keys (is_active);

-- Providers: each represents an LLM API backend (OpenAI, OpenRouter, DashScope, etc.)
CREATE TABLE IF NOT EXISTS providers (
    id TEXT PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    kind TEXT NOT NULL DEFAULT 'openai',
    base_url TEXT NOT NULL,
    api_key TEXT NOT NULL,
    is_active INTEGER NOT NULL DEFAULT 1,
    created_at TEXT NOT NULL DEFAULT (datetime('now')),
    updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);

-- Models: maps user-facing model names to a provider
CREATE TABLE IF NOT EXISTS models (
    id TEXT PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    provider_id TEXT NOT NULL REFERENCES providers(id),
    provider_model_name TEXT,
    is_active INTEGER NOT NULL DEFAULT 1,
    created_at TEXT NOT NULL DEFAULT (datetime('now')),
    updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_models_name ON models (name) WHERE is_active = 1;
CREATE INDEX IF NOT EXISTS idx_models_provider_id ON models (provider_id);
26 changes: 26 additions & 0 deletions migrations_sqlite/002_request_logs.sql
@@ -0,0 +1,26 @@
-- Request logs for tracking all proxy calls
CREATE TABLE IF NOT EXISTS request_logs (
    id TEXT PRIMARY KEY,
    request_id TEXT,
    user_key_id TEXT,
    user_key_hash TEXT NOT NULL,
    model_requested TEXT NOT NULL,
    model_sent TEXT NOT NULL,
    provider_id TEXT,
    provider_kind TEXT,
    status_code INTEGER NOT NULL,
    is_error INTEGER NOT NULL DEFAULT 0,
    prompt_tokens INTEGER,
    completion_tokens INTEGER,
    total_tokens INTEGER,
    latency_ms INTEGER NOT NULL,
    is_stream INTEGER NOT NULL DEFAULT 0,
    request_body TEXT,
    response_body TEXT,
    error_message TEXT,
    created_at TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_request_logs_created_at ON request_logs (created_at DESC);
CREATE INDEX IF NOT EXISTS idx_request_logs_user_key ON request_logs (user_key_hash);
CREATE INDEX IF NOT EXISTS idx_request_logs_model ON request_logs (model_requested);
3 changes: 3 additions & 0 deletions migrations_sqlite/003_token_budget.sql
@@ -0,0 +1,3 @@
-- Add token budget columns to user_keys
ALTER TABLE user_keys ADD COLUMN token_budget INTEGER NULL;
ALTER TABLE user_keys ADD COLUMN tokens_used INTEGER NOT NULL DEFAULT 0;
4 changes: 4 additions & 0 deletions migrations_sqlite/004_token_coefficients.sql
@@ -0,0 +1,4 @@
-- Add input/output token cost coefficients to models
-- Default 1.0 means 1 raw token = 1 budget token
ALTER TABLE models ADD COLUMN input_token_coefficient REAL NOT NULL DEFAULT 1.0;
ALTER TABLE models ADD COLUMN output_token_coefficient REAL NOT NULL DEFAULT 1.0;
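
One plausible reading of how the coefficient columns interact with `token_budget` and `tokens_used` is sketched below. The function names are hypothetical, and both the rounding rule (round up) and the treatment of a `NULL` budget as unlimited are assumptions; the migrations only define the columns and their defaults.

```rust
/// Converts raw token counts into budget tokens using per-model coefficients.
/// With both coefficients at 1.0, one raw token costs one budget token.
/// Rounding up is an assumption made for this sketch.
pub fn budget_tokens(
    prompt_tokens: u64,
    completion_tokens: u64,
    input_coefficient: f64,
    output_coefficient: f64,
) -> u64 {
    let weighted = prompt_tokens as f64 * input_coefficient
        + completion_tokens as f64 * output_coefficient;
    weighted.ceil() as u64
}

/// Returns true if charging `cost` would keep the key within its budget.
/// A `None` budget (NULL in the table) is treated as unlimited here, which
/// is an assumption rather than documented behavior.
pub fn within_budget(token_budget: Option<u64>, tokens_used: u64, cost: u64) -> bool {
    match token_budget {
        None => true,
        Some(limit) => tokens_used.saturating_add(cost) <= limit,
    }
}
```

For example, 10 prompt tokens at coefficient 0.5 plus 3 completion tokens at coefficient 1.5 come to 9.5, which this sketch charges as 10 budget tokens.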
164 changes: 164 additions & 0 deletions src/cache.rs
@@ -0,0 +1,164 @@
use std::collections::{HashMap, HashSet};
use std::sync::Arc;
use tokio::sync::RwLock;

/// Cache abstraction: either Redis (for multi-instance deployments) or
/// an in-memory store (for single-instance / SQLite mode).
#[derive(Clone)]
pub enum Cache {
    Redis(Box<redis::aio::ConnectionManager>),
    InMemory(Arc<InMemoryCache>),
}

/// Simple in-memory cache that mirrors the Redis SET / HASH operations
/// used by key_service and model_service.
pub struct InMemoryCache {
    sets: RwLock<HashMap<String, HashSet<String>>>,
    hashes: RwLock<HashMap<String, HashMap<String, String>>>,
}

impl InMemoryCache {
    pub fn new() -> Self {
        Self {
            sets: RwLock::new(HashMap::new()),
            hashes: RwLock::new(HashMap::new()),
        }
    }
}

impl Cache {
    /// Create a new in-memory cache.
    pub fn in_memory() -> Self {
        Cache::InMemory(Arc::new(InMemoryCache::new()))
    }

    // ── SET operations ────────────────────────────────────────────────

    /// Add a member to a set.
    pub async fn sadd(&mut self, key: &str, member: &str) -> Result<(), crate::error::AppError> {
        match self {
            Cache::Redis(cm) => {
                let _: () = redis::AsyncCommands::sadd(cm.as_mut(), key, member).await?;
                Ok(())
            }
            Cache::InMemory(store) => {
                let mut sets = store.sets.write().await;
                sets.entry(key.to_string())
                    .or_default()
                    .insert(member.to_string());
                Ok(())
            }
        }
    }

    /// Check if a member exists in a set.
    pub async fn sismember(&mut self, key: &str, member: &str) -> Result<bool, crate::error::AppError> {
        match self {
            Cache::Redis(cm) => {
                let exists: bool = redis::AsyncCommands::sismember(cm.as_mut(), key, member).await?;
                Ok(exists)
            }
            Cache::InMemory(store) => {
                let sets = store.sets.read().await;
                Ok(sets.get(key).is_some_and(|s| s.contains(member)))
            }
        }
    }

    /// Remove a member from a set.
    pub async fn srem(&mut self, key: &str, member: &str) -> Result<(), crate::error::AppError> {
        match self {
            Cache::Redis(cm) => {
                let _: () = redis::AsyncCommands::srem(cm.as_mut(), key, member).await?;
                Ok(())
            }
            Cache::InMemory(store) => {
                let mut sets = store.sets.write().await;
                if let Some(set) = sets.get_mut(key) {
                    set.remove(member);
                }
                Ok(())
            }
        }
    }

    // ── HASH operations ───────────────────────────────────────────────

    /// Get a field from a hash.
    pub async fn hget(&mut self, key: &str, field: &str) -> Result<Option<String>, crate::error::AppError> {
        match self {
            Cache::Redis(cm) => {
                let val: Option<String> = redis::AsyncCommands::hget(cm.as_mut(), key, field).await?;
                Ok(val)
            }
            Cache::InMemory(store) => {
                let hashes = store.hashes.read().await;
                Ok(hashes
                    .get(key)
                    .and_then(|h| h.get(field))
                    .cloned())
            }
        }
    }

    /// Set a field in a hash.
    pub async fn hset(&mut self, key: &str, field: &str, value: &str) -> Result<(), crate::error::AppError> {
        match self {
            Cache::Redis(cm) => {
                let _: () = redis::AsyncCommands::hset(cm.as_mut(), key, field, value).await?;
                Ok(())
            }
            Cache::InMemory(store) => {
                let mut hashes = store.hashes.write().await;
                hashes
                    .entry(key.to_string())
                    .or_default()
                    .insert(field.to_string(), value.to_string());
                Ok(())
            }
        }
    }

    /// Remove a field from a hash.
    pub async fn hdel(&mut self, key: &str, field: &str) -> Result<(), crate::error::AppError> {
        match self {
            Cache::Redis(cm) => {
                let _: () = redis::AsyncCommands::hdel(cm.as_mut(), key, field).await?;
                Ok(())
            }
            Cache::InMemory(store) => {
                let mut hashes = store.hashes.write().await;
                if let Some(hash) = hashes.get_mut(key) {
                    hash.remove(field);
                }
                Ok(())
            }
        }
    }

    // ── Key-level operations ──────────────────────────────────────────

    /// Delete an entire key (set or hash).
    pub async fn del(&mut self, key: &str) -> Result<(), crate::error::AppError> {
        match self {
            Cache::Redis(cm) => {
                let _: () = redis::cmd("DEL")
                    .arg(key)
                    .query_async(cm.as_mut())
                    .await?;
                Ok(())
            }
            Cache::InMemory(store) => {
                {
                    let mut sets = store.sets.write().await;
                    sets.remove(key);
                }
                {
                    let mut hashes = store.hashes.write().await;
                    hashes.remove(key);
                }
                Ok(())
            }
        }
    }
}
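
To make the in-memory SET semantics of `src/cache.rs` easier to try in isolation, here is a simplified, synchronous analogue that depends only on the standard library. It is not the PR's code: `SyncSetCache` is a hypothetical type, it uses `std::sync::RwLock` instead of `tokio::sync::RwLock`, it returns plain values rather than `Result<_, AppError>`, and the `"user_keys"` set name in the usage note is made up.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::RwLock;

/// Std-only analogue of the in-memory SET store: a map from set name to members.
#[derive(Default)]
pub struct SyncSetCache {
    sets: RwLock<HashMap<String, HashSet<String>>>,
}

impl SyncSetCache {
    /// Adds `member` to the set at `key`, creating the set if needed (like SADD).
    pub fn sadd(&self, key: &str, member: &str) {
        let mut sets = self.sets.write().expect("lock poisoned");
        sets.entry(key.to_string())
            .or_default()
            .insert(member.to_string());
    }

    /// Returns true if `member` is in the set at `key` (like SISMEMBER).
    /// A missing set behaves like an empty one.
    pub fn sismember(&self, key: &str, member: &str) -> bool {
        let sets = self.sets.read().expect("lock poisoned");
        sets.get(key).is_some_and(|s| s.contains(member))
    }

    /// Removes `member` from the set at `key` if present (like SREM).
    pub fn srem(&self, key: &str, member: &str) {
        let mut sets = self.sets.write().expect("lock poisoned");
        if let Some(set) = sets.get_mut(key) {
            set.remove(member);
        }
    }
}
```

With a cache created via `SyncSetCache::default()`, calling `sadd("user_keys", "h1")` makes `sismember("user_keys", "h1")` return `true`, and a following `srem("user_keys", "h1")` makes it return `false` again. The read/write lock split mirrors the PR's choice: lookups take the shared lock and only mutations take the exclusive one.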