Built for teams who want full control over their calling AI stack,
from infrastructure to data to cost.
Drop a star to help us grow!
Siphon is a Python framework that handles the hard parts of real-time voice AI:
- SIP + telephony integration: Connect to any SIP trunk (Twilio, Telnyx, SignalWire, etc.)
- Streaming audio pipelines: Sub-500ms latency powered by WebRTC (LiveKit)
- Interruptions & barge-in: Natural conversation flow with configurable turn detection
- Agent state management: Recording, transcription, metadata persistence
- Horizontal scaling: Run 1 or 1,000 workers with zero-config load balancing
So you can focus on agent behavior, not call plumbing.
- Your LLM (OpenAI, Anthropic, Google, DeepSeek, Groq, Cerebras, Mistral, etc.)
- Your STT/TTS providers (Deepgram, Cartesia, ElevenLabs, AssemblyAI, Sarvam, etc.)
- Your SIP trunk (Twilio, Telnyx, SignalWire, or self-hosted)
- Your infrastructure (LiveKit Cloud or self-hosted)
- Your margins: No per-minute markup on AI provider costs
- Your data: Runs on your infrastructure, all logs stay with you
- Your observability: Complete control over recording, transcription, metadata
- Your keys: Direct integration with AI providers, no middleman
- Not a SaaS platform: You host it, you control it
- Not a black box: Open-source (Apache 2.0), inspect and modify everything
- Not a per-minute tax: No markup on your AI provider costs
- Not vendor lock-in: Swap LLM/STT/TTS providers with a config change
Voice agents listen to everything.
Your customers' calls contain sensitive information: personal details, business data, private conversations.
Traditional managed platforms route every call through their infrastructure. You pay per minute and trust them with your data.
Siphon runs on your infrastructure.
You own the keys. You control the data. You keep the margins.
| Low Latency | Production Ready | Infinite Scale |
|---|---|---|
| Powered by WebRTC (LiveKit) for sub-500ms voice interactions that feel like real human conversation. | Handles the chaotic reality of phone networks: audio packet loss, SIP signaling, and interruptions. | Define your agent once and run it on 1 or 1,000 servers. It balances the load automatically. |
If you're new to Siphon, we recommend checking out:
- Documentation
- Quick Start Guide
pip install siphon-ai
Siphon requires LiveKit for real-time media and API keys for your AI providers.
Create a .env file:
# LiveKit (Cloud: https://cloud.livekit.io/ or Self-hosted)
LIVEKIT_URL=...
LIVEKIT_API_KEY=...
LIVEKIT_API_SECRET=...
# AI Providers
OPENAI_API_KEY=...
DEEPGRAM_API_KEY=...
CARTESIA_API_KEY=...
Create a file named agent.py. This simple agent acts as a helpful assistant.
from siphon.agent import Agent
from siphon.plugins import openai, cartesia, deepgram
from dotenv import load_dotenv
load_dotenv()
# Initialize your AI stack
llm = openai.LLM()
tts = cartesia.TTS()
stt = deepgram.STT()
# Define the Agent
agent = Agent(
    agent_name="Receptionist",
    llm=llm,
    tts=tts,
    stt=stt,
    system_instructions="You are a helpful receptionist. Answer succinctly.",
)
if __name__ == "__main__":
    # One-time setup: downloads required files (only needed on fresh machines)
    agent.download_files()
    # Start the agent worker in development mode
    agent.dev()
    # Start the agent worker in production mode
    # agent.start()
For more details on configuring your Agent (latency, interruptions, VAD, etc.) and exploring available Plugins (Deepgram, Cartesia, OpenAI, ElevenLabs, etc.), check out the documentation.
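The agent above depends on the six variables from the .env file, and a missing key usually surfaces only once the worker tries to reach a provider. A small preflight check can report gaps earlier. This is a minimal sketch using only the standard library; `REQUIRED_VARS` and `missing_env_vars` are hypothetical names for illustration and are not part of Siphon, and the variable list assumes the OpenAI, Deepgram, and Cartesia stack shown above.

```python
import os
from typing import List, Mapping, Optional

# Variables used by the example stack in this README
# (LiveKit + OpenAI + Deepgram + Cartesia). Adjust for other providers.
REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "OPENAI_API_KEY",
    "DEEPGRAM_API_KEY",
    "CARTESIA_API_KEY",
)


def missing_env_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_env_vars()` after `load_dotenv()` and stopping when the list is non-empty keeps configuration errors separate from runtime errors.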
Start your agent worker.
python agent.py
Horizontal Scaling: To scale, run this command on multiple servers. The worker architecture automatically detects new nodes and balances the load with zero configuration. Learn more about Scaling.
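To make the idea of balancing calls across workers concrete, the sketch below assigns each incoming call to the least-loaded worker that still has capacity. It is a conceptual illustration only, not Siphon's or LiveKit's actual dispatch logic; `Worker`, `pick_worker`, `assign_call`, and the capacity numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Worker:
    name: str
    active_calls: int
    max_calls: int

    @property
    def load(self) -> float:
        """Fraction of capacity in use (0.0 = idle, 1.0 = full)."""
        return self.active_calls / self.max_calls


def pick_worker(workers: List[Worker]) -> Optional[Worker]:
    """Return the least-loaded worker with spare capacity, or None if all are full."""
    available = [w for w in workers if w.active_calls < w.max_calls]
    if not available:
        return None
    return min(available, key=lambda w: w.load)


def assign_call(workers: List[Worker]) -> Optional[str]:
    """Assign one call and return the chosen worker's name (None if rejected)."""
    worker = pick_worker(workers)
    if worker is None:
        return None
    worker.active_calls += 1
    return worker.name
```

In this model, starting the agent on another server just adds one more candidate to the pool, which is why capacity can grow without changing the agent code.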
Bind a phone number to your agent using a Dispatch rule.
import os
from siphon.telephony.inbound import Dispatch
from dotenv import load_dotenv
load_dotenv()
dispatch = Dispatch(
    dispatch_name="customer-support",
    agent_name="Receptionist",  # Must match the name in agent.py
    sip_trunk_id=os.getenv("SIP_TRUNK_ID"),
    # Or: sip_number=os.getenv("SIP_NUMBER"),
)
dispatch.agent()
Note: For more details, check out the Inbound Documentation. To configure numbers with providers like Twilio, see the Twilio Setup Guide.
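The comment in the snippet suggests that a dispatch rule is bound either by trunk ID or by phone number. Assuming exactly one of the two should be passed (the README does not state this explicitly), a small helper can choose between them based on what is set in the environment. `dispatch_target` is a hypothetical helper, not part of Siphon; only the `SIP_TRUNK_ID` and `SIP_NUMBER` variable names and the two keyword names come from the example.

```python
import os
from typing import Dict, Mapping, Optional


def dispatch_target(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build the keyword argument that identifies the inbound target.

    Prefers SIP_TRUNK_ID, falls back to SIP_NUMBER, and raises if neither is set.
    """
    env = os.environ if env is None else env
    trunk_id = env.get("SIP_TRUNK_ID", "").strip()
    number = env.get("SIP_NUMBER", "").strip()
    if trunk_id:
        return {"sip_trunk_id": trunk_id}
    if number:
        return {"sip_number": number}
    raise ValueError("Set SIP_TRUNK_ID or SIP_NUMBER in your .env file")
```

The result could then be unpacked into the constructor, for example `Dispatch(dispatch_name="customer-support", agent_name="Receptionist", **dispatch_target())`, so a missing value fails with a clear message instead of passing `None`.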
Trigger calls programmatically from your code or API.
import os
from siphon.telephony.outbound import Call
from dotenv import load_dotenv
load_dotenv()
call = Call(
    agent_name="Receptionist",  # Must match the name in agent.py
    sip_trunk_setup={ ... },  # Your SIP credentials
    # Or: sip_trunk_id=os.getenv("SIP_TRUNK_ID"),
    number_to_call="+15550199",
)
call.start()
Note: For more details, check out the Outbound Documentation. To configure trunks with providers like Twilio, see the Twilio Setup Guide.
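The example number starts with `+` and a country code, which looks like E.164 formatting. Assuming your SIP trunk expects E.164 numbers (check your provider's documentation; this README does not say), validating the number before placing a call can catch typos early. `is_e164` and `normalize_number` are hypothetical helpers, not part of Siphon.

```python
import re

# E.164: a leading "+", then at most 15 digits, the first of which is not 0.
_E164 = re.compile(r"\+[1-9]\d{1,14}")


def is_e164(number: str) -> bool:
    """Return True if `number` has the shape of an E.164 phone number."""
    return _E164.fullmatch(number) is not None


def normalize_number(raw: str) -> str:
    """Strip common separators (spaces, dashes, dots, parentheses) and validate."""
    cleaned = re.sub(r"[\s\-().]", "", raw)
    if not is_e164(cleaned):
        raise ValueError(f"Not an E.164 number: {raw!r}")
    return cleaned
```

This only checks the format. Whether a number is actually reachable still depends on the trunk and the carrier.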
Siphon enables call recordings, transcriptions, and metadata persistence via environment variables.
# Enable saving features
CALL_RECORDING=true
SAVE_METADATA=true
SAVE_TRANSCRIPTION=true
# Configure storage location (local path, S3, Redis, Postgres, etc.)
METADATA_LOCATION=Metadata # saves locally
TRANSCRIPTION_LOCATION=postgresql://..... # saves to postgresql
# Configure S3 (Call Recordings are always saved to S3)
AWS_S3_ENDPOINT=
AWS_S3_ACCESS_KEY_ID=
AWS_S3_SECRET_ACCESS_KEY=
AWS_S3_BUCKET=
AWS_S3_REGION=
AWS_S3_FORCE_PATH_STYLE=true
Note: Siphon supports multiple storage backends. For detailed configuration instructions, see the Call Data Documentation.
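The settings above interact: recordings need the S3 variables, and each `*_LOCATION` value is either a local path or a connection URL. The sketch below checks a configuration for obvious gaps before the worker starts. It is an illustration under assumptions: how Siphon itself parses these values is not described here, the minimum set of S3 variables is a guess, and `location_kind` and `missing_s3_vars` are hypothetical helpers.

```python
from typing import List, Mapping
from urllib.parse import urlparse

# Assumption: these three are the minimum needed for recordings. The endpoint,
# region, and path-style settings may be optional depending on your S3 provider.
S3_VARS = ("AWS_S3_ACCESS_KEY_ID", "AWS_S3_SECRET_ACCESS_KEY", "AWS_S3_BUCKET")


def location_kind(value: str) -> str:
    """Classify a location as "local" or by its URL scheme (e.g. "postgresql")."""
    scheme = urlparse(value).scheme
    # A one-letter "scheme" is most likely a Windows drive letter such as C:\data.
    return scheme if len(scheme) > 1 else "local"


def missing_s3_vars(env: Mapping[str, str]) -> List[str]:
    """Return the S3 variables that are blank while CALL_RECORDING is enabled."""
    if env.get("CALL_RECORDING", "").strip().lower() != "true":
        return []
    return [name for name in S3_VARS if not env.get(name, "").strip()]
```

With the values from the example, `location_kind("Metadata")` classifies the metadata location as local, while a `postgresql://` URL is classified by its scheme.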
| Example | Description |
|---|---|
| A 24/7 AI Dental Receptionist in a few lines | A fully functional AI receptionist that handles appointment booking, modifications, and cancellations with Google Calendar integration. |
More examples are coming, so stay tuned!
For detailed documentation, visit the Siphon Documentation, which includes a Quickstart Guide.
We love contributions from the community. For details on contributing or running the project for development, check out our Contributing Guide.
We are constantly improving, and more features and examples are coming soon. If you love this project, please give us a star on the GitHub repo to stay tuned and help us grow.
Siphon is Apache 2.0 licensed.

