Feros Voice Agent OS is built with a clear goal: providing an open, airtight, and enterprise-grade infrastructure layer for production voice AI.
We built Feros to solve the structural problems of the current voice AI ecosystem. With a Rust runtime engineered for sub-second latency, an AI-driven builder, and a Python control plane—all in a single self-hostable monorepo—we address these barriers head-on:
| The Approach | The Barrier | The Feros Solution |
|---|---|---|
| Managed Platforms (Vapi, Retell) | Per-minute costs compound at scale, with no path to self-host or satisfy strict data residency requirements. | Deploy the complete platform in your own infrastructure: no per-minute taxes, full control over data residency and compliance. |
| Low-Level Frameworks (Pipecat, LiveKit) | Weeks spent building and maintaining the voice pipeline, rather than focusing on agent quality. | A production-ready voice pipeline (VAD, STT, LLM, TTS) ships on day one. Focus on the agent, not the plumbing. |
| Visual Node Builders (Legacy platforms) | Hand-wiring agent logic, dragging nodes, and stitching call flows step-by-step becomes unmaintainable. | Describe what you want your agent to do, and the AI autonomously provisions the tools, prompts, and routing logic. |
The control plane (Python) handles configuration and management. The voice runtime (Rust) handles every live call. The two layers scale independently, and every component — STT, LLM, TTS, telephony provider — is swappable without touching the rest.
flowchart LR
UI[Studio Web] --> API[Studio API]
API --> DB[(Postgres)]
TEL[Telephony / Browser] --> VS[voice-server]
VS --> VE[voice-engine]
VE --> STT[STT]
VE --> LLM[LLM]
VE --> TTS[TTS]
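
As a rough illustration of the swappable-component idea, the sketch below models each stage as a small Python interface and wires them in STT → LLM → TTS order. Every name here (`SpeechToText`, `LanguageModel`, `TextToSpeech`, `Pipeline`, and the stub classes) is hypothetical and chosen for illustration only. The actual runtime is written in Rust and streams audio, so its real interfaces may look quite different.

```python
from dataclasses import dataclass
from typing import Protocol


# Hypothetical stage interfaces; not taken from the Feros codebase.
class SpeechToText(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LanguageModel(Protocol):
    def reply(self, transcript: str) -> str: ...


class TextToSpeech(Protocol):
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class Pipeline:
    """Runs one conversational turn through three interchangeable stages."""

    stt: SpeechToText
    llm: LanguageModel
    tts: TextToSpeech

    def run_turn(self, audio: bytes) -> bytes:
        transcript = self.stt.transcribe(audio)
        answer = self.llm.reply(transcript)
        return self.tts.synthesize(answer)


# Stub providers standing in for real STT, LLM, and TTS services.
class FakeSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class EchoLLM:
    def reply(self, transcript: str) -> str:
        return f"You said: {transcript}"


class UpperLLM:
    def reply(self, transcript: str) -> str:
        return transcript.upper()


class FakeTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


pipeline = Pipeline(stt=FakeSTT(), llm=EchoLLM(), tts=FakeTTS())
# Swapping one stage leaves the other two untouched.
swapped = Pipeline(stt=pipeline.stt, llm=UpperLLM(), tts=pipeline.tts)
```

Because `Pipeline` depends only on the three interfaces, replacing `EchoLLM` with `UpperLLM` requires no change to the STT or TTS stubs, which is the property the architecture above aims for.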
| Layer | Component | Purpose |
|---|---|---|
| Dashboard | studio/web | Agent builder, call monitoring, in-browser voice testing |
| Control Plane | studio/api | Agent config, integrations, evaluations, session provisioning |
| Integrations | integrations | Encrypted credential vault for CRMs, calendars, etc. — third-party secrets never leave your infrastructure in plaintext |
| Voice Runtime | voice/server | Inbound telephony and WebSocket gateway |
| Voice Runtime | voice/engine | High-performance VAD → STT → LLM → TTS orchestration |
| Inference (optional) | inference | Self-hosted GPU STT/TTS — drop-in for cloud APIs |
All services live in a single repo and share a common Postgres database and config layer.
| Path | Purpose |
|---|---|
| studio/web | Next.js dashboard and AI-driven agent builder |
| studio/api | FastAPI control plane: agent config, integrations, evaluations, session setup |
| voice/server | Rust telephony gateway and session router |
| voice/engine | Rust runtime core: streaming STT/LLM/TTS orchestration at sub-second latency |
| integrations | Credential encryption, secret resolution, and automatic token refresh |
| inference | Optional self-hosted STT/TTS stack for cost control and data sovereignty |
| proto | Shared protobuf definitions for WebSocket message payloads |
Requirements: Docker and Docker Compose.
git clone https://github.com/ferosai/feros.git
cd feros
cp .env.example .env
# Run with prebuilt multi-arch images (Lightning fast startup)
docker compose up -d
# OR, if you need to build the Rust/Python core from source:
# docker compose -f docker-compose.yml -f docker-compose.source.yml up -d --build

Open http://localhost:3000 to access the dashboard.
Local services:
| Service | URL |
|---|---|
| Studio Web | http://localhost:3000 |
| Studio API | http://localhost:8000 |
| Voice Server | http://localhost:8300 |
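
To confirm the three services are listening after `docker compose up -d`, a small TCP probe can help. The sketch below is not part of the repository. It assumes only the host and port values from the table above, and the names `LOCAL_SERVICES`, `port_open`, and `report` are hypothetical. It checks that a port accepts connections, not that the service behind it is healthy.

```python
import socket

# Ports taken from the table above; adjust if your compose file maps them differently.
LOCAL_SERVICES = {
    "Studio Web": ("localhost", 3000),
    "Studio API": ("localhost", 8000),
    "Voice Server": ("localhost", 8300),
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(services: dict = LOCAL_SERVICES) -> dict:
    """Map each service name to whether its port currently accepts connections."""
    return {name: port_open(host, port) for name, (host, port) in services.items()}
```

Calling `report()` returns a dictionary keyed by service name, which you can print or feed into a startup script.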
AUTH__SECRET_KEY and DATABASE__URL must be set consistently across all services in your .env.
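
A quick pre-flight check can catch a missing or empty value before the containers start. The following is a minimal sketch, assuming a plain `KEY=value` file. The helper names `parse_env` and `missing_keys` are hypothetical, and the parser is deliberately simplified: it ignores `export` prefixes, inline comments, and multi-line values.

```python
REQUIRED_KEYS = ("AUTH__SECRET_KEY", "DATABASE__URL")


def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and full-line comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(text: str, required=REQUIRED_KEYS) -> list:
    """Return the required keys that are absent or set to an empty value."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]
```

For example, `missing_keys(open(".env").read())` returns an empty list when both keys are set. This only validates a single file, so if individual services load separate env files, each one would need the same check.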
- Outbound calls — agent-initiated dialing with retry and scheduling
- Dynamic Agent Variables — resolve runtime context at session start for personalized conversations
- Granular Usage Billing — step-level cost attribution across models and third-party services
- Gemini Live native audio — end-to-end multimodal backend
- Direct PSTN via SIP — no Twilio or Telnyx required
- Agent-to-agent evaluation — tester agent calling target agent over live audio
- Evaluation replay — run historical transcripts against new agent versions
- Audit logs — immutable trail of agent actions and config changes
- Usage analytics — per-agent cost tracking across STT, LLM, and TTS providers
Contributions are welcome. Before starting larger changes, please open an issue or discussion first — alignment before implementation saves time.
- Familiarize yourself with the voice runtime architecture before making changes to the Rust core; it has strict ordering and concurrency constraints.
- Run the affected service locally before opening a PR.
- Add or update tests when behavior changes; the voice pipeline has integration tests that catch regressions the unit tests miss.
Apache License 2.0. See LICENSE for details.
Third-party code vendored in this repository remains subject to its own license terms where noted in the source tree.