Roadmap

Planned features and improvements for Nenya. Items are grouped by domain — implementation order depends on user demand and technical feasibility.

API Surface Expansion

Non-Streaming Chat Completions ✅ Completed

Synchronous non-streaming responses (stream: false): the gateway buffers the upstream SSE stream into a complete JSON response before returning it.
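The buffering step can be illustrated with a minimal sketch. It assumes OpenAI-style streaming chunks in which each `data:` event carries a `choices[0].delta.content` fragment and the stream ends with `data: [DONE]`. The names `chunk`, `bufferSSE`, and `bufferSSEString` are hypothetical and are not taken from Nenya's code.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// chunk mirrors only the part of an OpenAI-style streaming chunk that this
// sketch reads (assumed wire format, not Nenya's internal type).
type chunk struct {
	Choices []struct {
		Delta struct {
			Content string `json:"content"`
		} `json:"delta"`
	} `json:"choices"`
}

// bufferSSE reads "data:" events until "[DONE]" or EOF and concatenates the
// content deltas of the first choice into one string.
func bufferSSE(r io.Reader) (string, error) {
	var out strings.Builder
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		payload, ok := strings.CutPrefix(sc.Text(), "data:")
		if !ok {
			continue // blank separators, comments, and other SSE fields
		}
		payload = strings.TrimSpace(payload)
		if payload == "" {
			continue
		}
		if payload == "[DONE]" {
			break
		}
		var c chunk
		if err := json.Unmarshal([]byte(payload), &c); err != nil {
			return "", fmt.Errorf("decode chunk: %w", err)
		}
		if len(c.Choices) > 0 {
			out.WriteString(c.Choices[0].Delta.Content)
		}
	}
	return out.String(), sc.Err()
}

// bufferSSEString is a convenience wrapper for in-memory streams.
func bufferSSEString(stream string) (string, error) {
	return bufferSSE(strings.NewReader(stream))
}

func main() {
	stream := "data: {\"choices\":[{\"delta\":{\"content\":\"Hel\"}}]}\n\n" +
		"data: {\"choices\":[{\"delta\":{\"content\":\"lo\"}}]}\n\n" +
		"data: [DONE]\n\n"
	text, err := bufferSSEString(stream)
	if err != nil {
		panic(err)
	}
	fmt.Println(text) // prints "Hello"
}
```

A real gateway would also carry over fields such as the finish reason and usage counts, and would raise the scanner's default 64 KiB line limit for large events.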

Responses API ✅ Completed

Full lifecycle (/v1/responses) with GET/POST/DELETE support.

Embeddings (Enhanced) ✅ Completed

Token counting, rate limiting, and usage tracking for embeddings requests.

File Operations ✅ Completed

File CRUD (/v1/files): create, list, get, delete, content download.

Batch Processing ✅ Completed

Batch API (/v1/batches): submit, check status, cancel, retrieve results.

Passthrough Proxy ✅ Completed

/proxy/{provider}/*: passthrough for arbitrary HTTP methods, with auth injection and automatic detection of SSE streaming.

OpenAPI Specification 🔜 Planned

Auto-generated spec served at /openapi.json.

Intelligence & Routing

Model Discovery ✅ Completed

Dynamic fetch of /v1/models from providers at startup and on reload.

Semantic Caching 🔜 Planned

Vector-based caching using local embeddings and cosine similarity (zero-dep, in-memory).
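Since this feature is still planned, the following is only a sketch of how a stdlib-only, in-memory cache keyed by cosine similarity might look. The `SemanticCache` type, its linear scan, and the 0.95 threshold are illustrative assumptions, and the embedding vectors are supplied by hand rather than by a local model.

```go
package main

import (
	"fmt"
	"math"
)

// entry pairs a stored embedding with the response cached for it.
type entry struct {
	vec      []float64
	response string
}

// SemanticCache is a hypothetical linear-scan cache. It is not safe for
// concurrent use; a real gateway would guard it with a mutex.
type SemanticCache struct {
	threshold float64 // minimum cosine similarity that counts as a hit
	entries   []entry
}

// cosine returns the cosine similarity of a and b, or 0 when the vectors
// differ in length, are empty, or have zero norm.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// Put stores a response under its embedding.
func (c *SemanticCache) Put(vec []float64, response string) {
	c.entries = append(c.entries, entry{vec: vec, response: response})
}

// Get returns the most similar cached response at or above the threshold.
func (c *SemanticCache) Get(vec []float64) (string, bool) {
	best, bestSim := -1, c.threshold
	for i, e := range c.entries {
		if sim := cosine(vec, e.vec); sim >= bestSim {
			best, bestSim = i, sim
		}
	}
	if best < 0 {
		return "", false
	}
	return c.entries[best].response, true
}

func main() {
	cache := &SemanticCache{threshold: 0.95}
	cache.Put([]float64{1, 0, 0}, "cached answer")

	if resp, ok := cache.Get([]float64{0.99, 0.05, 0}); ok {
		fmt.Println("hit:", resp)
	}
	_, ok := cache.Get([]float64{0, 1, 0})
	fmt.Println("orthogonal query hit:", ok)
}
```

A linear scan keeps the sketch dependency-free. Whether it stays fast enough depends on cache size, which the roadmap does not specify.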

Auto-Fallback Intelligence 🔜 Planned

Elo rank-based fallback with capability overlap scoring.
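The roadmap does not define the scoring, so the sketch below shows just one possible reading: rank fallback candidates by the fraction of required capabilities they support, and break ties by Elo rating. The `Model` type, the capability names, and the ratings are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Model is a hypothetical fallback candidate.
type Model struct {
	Name         string
	Elo          float64
	Capabilities map[string]bool
}

// overlap returns the fraction of required capabilities that m supports.
func overlap(m Model, required []string) float64 {
	if len(required) == 0 {
		return 1
	}
	n := 0
	for _, c := range required {
		if m.Capabilities[c] {
			n++
		}
	}
	return float64(n) / float64(len(required))
}

// rankFallbacks returns a sorted copy of candidates: highest capability
// overlap first, then highest Elo among equal overlaps.
func rankFallbacks(candidates []Model, required []string) []Model {
	ranked := append([]Model(nil), candidates...)
	sort.SliceStable(ranked, func(i, j int) bool {
		oi, oj := overlap(ranked[i], required), overlap(ranked[j], required)
		if oi != oj {
			return oi > oj
		}
		return ranked[i].Elo > ranked[j].Elo
	})
	return ranked
}

func main() {
	required := []string{"tools", "vision"}
	candidates := []Model{
		{Name: "model-a", Elo: 1300, Capabilities: map[string]bool{"tools": true}},
		{Name: "model-b", Elo: 1250, Capabilities: map[string]bool{"tools": true, "vision": true}},
		{Name: "model-c", Elo: 1200, Capabilities: map[string]bool{"tools": true, "vision": true}},
	}
	for _, m := range rankFallbacks(candidates, required) {
		fmt.Printf("%s overlap=%.2f elo=%.0f\n", m.Name, overlap(m, required), m.Elo)
	}
}
```

Here `model-a` has the highest rating but ranks last because it lacks `vision`. A weighted blend of the two signals would be an equally plausible design.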

Model Metadata 🔜 Planned

External model list with pricing, categories, and rankings, used for cost tracking.

Admin API (for External Dashboard)

Usage Analytics API 🔜 Planned

Detailed per-agent/model/provider usage breakdowns with time-series data.

Configuration Management API 🔜 Planned

CRUD via API with internal hot-reload — manage agents, providers, and keys without editing JSON.

Circuit Breaker Management API 🔜 Planned

Inspect and manually control circuit breaker state per target.
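As a rough sketch of what such an API might sit on top of, the registry below tracks one breaker state per target and allows it to be read or overridden. The three-state model (closed, open, half-open) is the conventional circuit breaker design and is assumed here. `Registry`, `Get`, and `Set` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// State is a circuit breaker state.
type State string

const (
	Closed   State = "closed"    // requests flow normally
	Open     State = "open"      // requests are rejected
	HalfOpen State = "half-open" // a trial request is allowed through
)

// Registry holds the breaker state for each upstream target.
type Registry struct {
	mu     sync.Mutex
	states map[string]State
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{states: make(map[string]State)}
}

// Get reports the state of target; unknown targets count as closed.
func (r *Registry) Get(target string) State {
	r.mu.Lock()
	defer r.mu.Unlock()
	if s, ok := r.states[target]; ok {
		return s
	}
	return Closed
}

// Set manually overrides the state of target, rejecting unknown states.
func (r *Registry) Set(target string, s State) error {
	switch s {
	case Closed, Open, HalfOpen:
	default:
		return fmt.Errorf("unknown state %q", s)
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	r.states[target] = s
	return nil
}

func main() {
	reg := NewRegistry()
	fmt.Println(reg.Get("provider-a")) // prints "closed"

	if err := reg.Set("provider-a", Open); err != nil {
		panic(err)
	}
	fmt.Println(reg.Get("provider-a")) // prints "open"

	fmt.Println(reg.Set("provider-a", "bogus")) // prints the validation error
}
```

An admin endpoint could expose `Get` for inspection and `Set` for manual control. The automatic transitions driven by failure counts and timeouts are left out of this sketch.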

Non-Goals

These features are explicitly out of scope for Nenya's single-user, local-first design:

Multi-tenancy — Designed for single-user, local deployment
Per-key budgets — No multi-user isolation needed
Cluster mode — Single-node by design
Admin UI — Admin APIs provided; UI is a separate project
Semantic search — Not relevant for gateway use case
Workflow engine — Agents serve a similar purpose

Implementation Principles

Zero-dependency: All features maintain Go stdlib-only policy
Backward compatibility: All new features preserve existing streaming behavior
Security: Admin APIs require client_token auth
Testing: Each feature includes unit, integration, and fuzz tests
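The security principle above could be enforced with a small stdlib-only middleware, sketched below. The roadmap does not say how `client_token` is transmitted, so the `Authorization: Bearer` header is an assumption. The `/admin/usage` path and the names `requireToken`, `newAdminHandler`, and `statusFor` are hypothetical.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requireToken rejects requests whose Authorization header does not match
// "Bearer <token>". The comparison is constant-time to avoid leaking how
// many leading bytes matched.
func requireToken(token string, next http.Handler) http.Handler {
	want := []byte("Bearer " + token)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got := []byte(r.Header.Get("Authorization"))
		if subtle.ConstantTimeCompare(got, want) != 1 {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// newAdminHandler wraps a placeholder admin endpoint with the token check.
func newAdminHandler(token string) http.Handler {
	return requireToken(token, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
}

// statusFor sends one in-process request and reports the response status.
func statusFor(h http.Handler, authHeader string) int {
	req := httptest.NewRequest(http.MethodGet, "/admin/usage", nil)
	if authHeader != "" {
		req.Header.Set("Authorization", authHeader)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	admin := newAdminHandler("s3cret")
	fmt.Println(statusFor(admin, "Bearer s3cret")) // prints 200
	fmt.Println(statusFor(admin, "Bearer wrong"))  // prints 401
	fmt.Println(statusFor(admin, ""))              // prints 401
}
```

A production version should refuse to start with an empty token, since `"Bearer "` alone would otherwise be accepted.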
