ferosai/feros: Open-source voice agent OS. Rust runtime, AI-driven builder, sub-second latency. Self-host everything.

Feros

Feros Voice Agent OS is built with a clear goal: providing an open, airtight, and enterprise-grade infrastructure layer for production voice AI.

We built Feros to solve the structural problems of the current voice AI ecosystem. With a Rust runtime engineered for sub-second latency, an AI-driven builder, and a Python control plane—all in a single self-hostable monorepo—we address these barriers head-on:

The Approach	The Barrier	The Feros Solution
Managed Platforms (Vapi, Retell)	Per-minute costs compound at scale, with no path to self-host or satisfy strict data residency requirements.	Deploy the complete platform in your own infrastructure — no per-minute taxes, full control over data residency and compliance.
Low-Level Frameworks (Pipecat, LiveKit)	Weeks spent building and maintaining the voice pipeline, rather than focusing on agent quality.	A production-ready voice pipeline — VAD, STT, LLM, TTS — ships on day one. Focus on the agent, not the plumbing.
Visual Node Builders (Legacy platforms)	Hand-wiring agent logic, dragging nodes, and stitching call flows step-by-step becomes unmaintainable.	Describe what you want your agent to do, and the AI autonomously provisions the tools, prompts, and routing logic.

Architecture

The control plane (Python) handles configuration and management. The voice runtime (Rust) handles every live call. The two layers scale independently, and every component — STT, LLM, TTS, telephony provider — is swappable without touching the rest.

flowchart LR
  UI[Studio Web] --> API[Studio API]
  API --> DB[(Postgres)]

  TEL[Telephony / Browser] --> VS[voice-server]
  VS --> VE[voice-engine]
  VE --> STT[STT]
  VE --> LLM[LLM]
  VE --> TTS[TTS]

Layer	Component	Purpose
Dashboard	studio/web	Agent builder, call monitoring, in-browser voice testing
Control Plane	studio/api	Agent config, integrations, evaluations, session provisioning
Integrations	integrations	Encrypted credential vault for CRMs, calendars, etc. — third-party secrets never leave your infrastructure in plaintext
Voice Runtime	voice/server	Inbound telephony and WebSocket gateway
Voice Runtime	voice/engine	High-performance VAD → STT → LLM → TTS orchestration
Inference (optional)	inference	Self-hosted GPU STT/TTS — drop-in for cloud APIs
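
The swappable-component idea described above can be illustrated with a minimal sketch. This is not the Feros engine, which is written in Rust and streams audio; it is a simplified, non-streaming Python illustration with VAD omitted. Every name here (`STT`, `LLM`, `TTS`, `Pipeline`, `handle_turn`, and the stub providers) is hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Protocol


# Hypothetical interfaces: each stage only needs to satisfy its protocol.
class STT(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LLM(Protocol):
    def reply(self, text: str) -> str: ...


class TTS(Protocol):
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class Pipeline:
    """One conversational turn: audio in, STT -> LLM -> TTS, audio out."""
    stt: STT
    llm: LLM
    tts: TTS

    def handle_turn(self, audio: bytes) -> bytes:
        text = self.stt.transcribe(audio)
        answer = self.llm.reply(text)
        return self.tts.synthesize(answer)


# Stub providers standing in for real STT/LLM/TTS backends.
class EchoSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class UpperLLM:
    def reply(self, text: str) -> str:
        return text.upper()


class ReverseLLM:
    def reply(self, text: str) -> str:
        return text[::-1]


class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


pipeline = Pipeline(stt=EchoSTT(), llm=UpperLLM(), tts=BytesTTS())

# Swapping the LLM leaves the STT and TTS stages untouched.
swapped = Pipeline(stt=pipeline.stt, llm=ReverseLLM(), tts=pipeline.tts)
```

Because `Pipeline` depends only on the three protocols, replacing one provider is a change to a constructor argument and nothing else.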

Repository Structure

All services live in a single repo and share a common Postgres database and config layer.

Path	Purpose
`studio/web`	Next.js dashboard and AI-driven agent builder
`studio/api`	FastAPI control plane — agent config, integrations, evaluations, session setup
`voice/server`	Rust telephony gateway and session router
`voice/engine`	Rust runtime core — streaming STT/LLM/TTS orchestration at sub-second latency
`integrations`	Credential encryption, secret resolution, and automatic token refresh
`inference`	Optional self-hosted STT/TTS stack for cost control and data sovereignty
`proto`	Shared protobuf definitions for WebSocket message payloads

Quickstart

Requirements: Docker and Docker Compose.

git clone https://github.com/ferosai/feros.git
cd feros
cp .env.example .env

# Run with prebuilt multi-arch images (fastest startup)
docker compose up -d

# OR, if you need to build the Rust/Python core from source:
# docker compose -f docker-compose.yml -f docker-compose.source.yml up -d --build

Open http://localhost:3000 to access the dashboard.

Local services:

Service	URL
Studio Web	`http://localhost:3000`
Studio API	`http://localhost:8000`
Voice Server	`http://localhost:8300`
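
To confirm that the three local services are listening after `docker compose up -d`, a small reachability check along these lines may help. It only tests that each TCP port accepts a connection; it does not call any Feros endpoint, since none are documented here. The helper names `is_port_open` and `check_services` are hypothetical.

```python
import socket
from urllib.parse import urlsplit

# URLs taken from the local services table above.
SERVICES = {
    "Studio Web": "http://localhost:3000",
    "Studio API": "http://localhost:8000",
    "Voice Server": "http://localhost:8300",
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(services: dict[str, str] = SERVICES) -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    results = {}
    for name, url in services.items():
        parts = urlsplit(url)
        results[name] = is_port_open(parts.hostname, parts.port)
    return results


# Example usage (run after the stack is up):
#   for name, up in check_services().items():
#       print(f"{name}: {'up' if up else 'down'}")
```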

`AUTH__SECRET_KEY` and `DATABASE__URL` must be set in your `.env`, with the same values for every service that reads them.
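
A quick pre-flight check for those two keys could look like the sketch below. It assumes a plain `KEY=value` file with `#` comments and does not handle every dotenv feature, such as `export` prefixes or multi-line values. The function names `parse_env` and `missing_keys` are hypothetical.

```python
from pathlib import Path

# The two keys the note above says must be set.
REQUIRED_KEYS = ("AUTH__SECRET_KEY", "DATABASE__URL")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str, required: tuple[str, ...] = REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


def check_env_file(path: str = ".env") -> list[str]:
    """Read the file at path and report which required keys are missing."""
    return missing_keys(Path(path).read_text())


# Example usage from the repository root:
#   problems = check_env_file(".env")
#   print("OK" if not problems else f"Missing or empty: {', '.join(problems)}")
```

This only verifies that the keys are present and non-empty in one file. If a deployment overrides either value per service, those overrides need to be compared separately.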

Roadmap

Outbound calls — agent-initiated dialing with retry and scheduling
Dynamic Agent Variables — resolve runtime context at session start for personalized conversations
Granular Usage Billing — step-level cost attribution across models and third-party services
Gemini Live native audio — end-to-end multimodal backend
Direct PSTN via SIP — no Twilio or Telnyx required
Agent-to-agent evaluation — tester agent calling target agent over live audio
Evaluation replay — run historical transcripts against new agent versions
Audit logs — immutable trail of agent actions and config changes
Usage analytics — per-agent cost tracking across STT, LLM, and TTS providers

Contributing

Contributions are welcome. Before starting larger changes, please open an issue or discussion first — alignment before implementation saves time.

Familiarize yourself with the voice runtime architecture before making changes to the Rust core; it has strict ordering and concurrency constraints.
Run the affected service locally before opening a PR.
Add or update tests when behavior changes; the voice pipeline has integration tests that catch regressions the unit tests miss.

License

Apache License 2.0. See LICENSE for details.

Third-party code vendored in this repository remains subject to its own license terms where noted in the source tree.
