This repository is a Python open-source package built on LiveKit for voice AI applications. The package should be production-minded, easy to understand, well-typed, testable, and friendly for contributors.
When making changes, prioritize:
- Correctness and reliability in real-time voice flows
- Clear public APIs
- Backward compatibility for open-source users
- Strong typing and test coverage
- Minimal, maintainable abstractions
- Build reusable Python components for voice AI on LiveKit
- Keep framework-specific details isolated where possible
- Make real-time behavior predictable and observable
- Prefer explicit APIs over magic or hidden behavior
- Optimize for contributor readability, not cleverness
- Keep business logic separate from transport/runtime glue
- Keep LiveKit integration code isolated from domain logic
- Avoid tightly coupling high-level package APIs to internal implementation details
- Prefer composition over inheritance
- Keep modules focused and small
- Do not introduce global mutable state unless absolutely necessary
- Avoid singleton-heavy designs
Use this mental model when organizing code:
- `api/` or public package surface: stable interfaces intended for users
- `core/`: domain logic, orchestration, state handling
- `integrations/` or `livekit/`: LiveKit-specific adapters and runtime hooks
- `models/` or `types/`: typed shared schemas, events, config objects
- `utils/`: narrow helper functions only, no hidden business logic
Business logic must not live inside:
- CLI entrypoints
- example scripts
- callbacks that should delegate into core logic
- transport adapters
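As an illustration of keeping callbacks thin, here is a minimal sketch in which a transport callback only converts its arguments into a typed event and delegates to core logic. The names `TranscriptEvent`, `TurnTracker`, and `make_transcript_callback` are hypothetical and are not part of this package or of LiveKit.

```python
from collections.abc import Callable
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscriptEvent:
    """Typed payload handed from the transport layer to core logic (hypothetical)."""

    session_id: str
    text: str
    is_final: bool


class TurnTracker:
    """Core logic: collects final transcripts per session, with no transport imports."""

    def __init__(self) -> None:
        self._turns: dict[str, list[str]] = {}

    def record(self, event: TranscriptEvent) -> None:
        if not event.is_final:
            return  # interim results do not complete a turn
        self._turns.setdefault(event.session_id, []).append(event.text)

    def turns_for(self, session_id: str) -> list[str]:
        return list(self._turns.get(session_id, []))


def make_transcript_callback(tracker: TurnTracker) -> Callable[[str, str, bool], None]:
    """Adapter: build a callback that only translates arguments and delegates."""

    def on_transcript(session_id: str, text: str, is_final: bool) -> None:
        tracker.record(TranscriptEvent(session_id, text, is_final))

    return on_transcript
```

Because the decision about what counts as a turn lives in `TurnTracker`, it can be unit-tested without any transport or runtime present.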
- Target modern Python syntax supported by the project version
- Use type hints everywhere
- Prefer explicit return types on public functions and methods
- Prefer small pure functions where practical
- Prefer `dataclass` or clear typed classes over loose dictionaries for structured data
- Prefer `Enum` for constrained option sets
- Avoid boolean flag arguments when an enum or separate function would be clearer
- Avoid overly terse variable names
- Avoid surprising side effects
- Write code for maintainers and contributors unfamiliar with the internals
- Favor straightforward control flow over clever compact code
- Add comments only when the why is not obvious from the code
- Do not add noisy comments that restate the code
- Keep functions focused on one responsibility
- If a function needs extensive explanation, split it into smaller helpers
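The preference for dataclasses and enums over loose dictionaries and boolean flags might look like the sketch below. `InterruptionMode`, `PlaybackConfig`, and `keeps_audio_buffered` are hypothetical names chosen for illustration, not existing package APIs.

```python
from dataclasses import dataclass
from enum import Enum


class InterruptionMode(Enum):
    """How playback reacts when the user speaks over the agent (hypothetical)."""

    IGNORE = "ignore"
    PAUSE = "pause"
    STOP = "stop"


@dataclass(frozen=True)
class PlaybackConfig:
    """Structured settings instead of a dict or a pair of boolean flags."""

    sample_rate_hz: int = 24_000
    interruption_mode: InterruptionMode = InterruptionMode.STOP


def keeps_audio_buffered(config: PlaybackConfig) -> bool:
    """Return True when interrupted audio should stay queued for resumption."""
    return config.interruption_mode is InterruptionMode.PAUSE
```

A call such as `PlaybackConfig(interruption_mode=InterruptionMode.PAUSE)` documents itself at the call site, whereas something like `play(audio, True, False)` would not.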
- Use `snake_case` for variables, functions, and modules
- Use `PascalCase` for classes
- Use `UPPER_SNAKE_CASE` for constants
- Name functions after what they do, not how they do it
- Use domain-specific names such as `session`, `turn`, `utterance`, `participant`, `track`, `transcript`, `agent`, `pipeline`, `vad`, `stt`, `tts` when appropriate
- Prefer absolute imports within the package
- Keep imports grouped and sorted
- Avoid circular imports by improving module boundaries rather than using late imports unless necessary
- Do not introduce heavy dependencies in core modules unless justified
- Treat public APIs as stable unless the task explicitly allows a breaking change
- Avoid renaming or reshaping public interfaces without strong justification
- If a breaking change is necessary, update docs, changelog notes, and tests
- Public APIs should be easy to discover and hard to misuse
- Prefer explicit configuration objects over long argument lists
- Prefer sensible defaults
- Validate user input early with clear error messages
- Raise precise exceptions with actionable messages
- Avoid leaking internal implementation details through public return values
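One way to combine an explicit configuration object, sensible defaults, and early validation is sketched below. `TTSConfig` and `ConfigError` are hypothetical names, and the specific fields and limits are assumptions made for the example.

```python
from dataclasses import dataclass


class ConfigError(ValueError):
    """Raised when user-supplied configuration is invalid (hypothetical name)."""


@dataclass(frozen=True)
class TTSConfig:
    """Explicit config object with defaults, validated at construction time."""

    voice: str
    timeout_seconds: float = 10.0
    sample_rate_hz: int = 24_000

    def __post_init__(self) -> None:
        # Fail at the boundary, before any session or network work starts.
        if not self.voice.strip():
            raise ConfigError("TTSConfig.voice must be a non-empty string.")
        if self.timeout_seconds <= 0:
            raise ConfigError(
                f"TTSConfig.timeout_seconds must be > 0, got {self.timeout_seconds!r}."
            )
        if self.sample_rate_hz <= 0:
            raise ConfigError(
                f"TTSConfig.sample_rate_hz must be > 0, got {self.sample_rate_hz!r}."
            )
```

Validating in `__post_init__` means an invalid object can never exist, so downstream code does not need to re-check the same fields.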
- Use async only where it is justified by I/O or runtime integration
- Keep async boundaries clean and intentional
- Do not mix sync and async styles inconsistently in the same API surface
- Avoid blocking operations in async code
- Use cancellation-safe patterns where relevant
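A cancellation-safe pattern can be as simple as releasing resources in a `finally` block, which also runs when the task is cancelled. The sketch below uses a stand-in `AudioStream` class (hypothetical, not a LiveKit type) to show the shape.

```python
import asyncio


class AudioStream:
    """Stand-in for a resource that must always be closed (hypothetical)."""

    def __init__(self) -> None:
        self.closed = False

    async def read_frame(self) -> bytes:
        await asyncio.sleep(0)  # yield to the event loop like real I/O would
        return b"\x00" * 320

    async def aclose(self) -> None:
        self.closed = True


async def pump_frames(stream: AudioStream, sink: list[bytes]) -> None:
    """Forward frames until cancelled, and always release the stream."""
    try:
        while True:
            sink.append(await stream.read_frame())
    finally:
        # Runs on normal exit, on errors, and on cancellation alike.
        await stream.aclose()


async def run_until_first_frame() -> tuple[AudioStream, list[bytes]]:
    stream = AudioStream()
    sink: list[bytes] = []
    task = asyncio.create_task(pump_frames(stream, sink))
    while not sink:
        await asyncio.sleep(0)  # wait until at least one frame has arrived
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass  # expected here: this function requested the cancellation
    return stream, sink
```

Note that the `CancelledError` is only suppressed by the code that requested the cancellation; `pump_frames` itself lets it propagate.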
- Real-time audio paths should avoid unnecessary allocations, blocking calls, and hidden latency
- Be careful with backpressure, task buildup, and event storms
- Time-sensitive paths should be simple and observable
- Do not add logging noise in hot loops
- Make event handling explicit
- Prefer well-defined event payload types over loosely shaped dicts
- Guard against race conditions in session lifecycle logic
- Be careful around connect/disconnect, participant joins/leaves, track subscription changes, and stream interruptions
- Preserve clear boundaries between:
- audio input handling
- VAD / turn detection
- STT
- LLM or agent reasoning
- TTS
- playback/output
- Avoid coupling one stage tightly to another unless necessary
- Keep pipeline stages mockable for tests
- Make retries, fallbacks, and timeout behavior explicit
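The stage boundaries above could be expressed with narrow `Protocol` interfaces so that each stage can be replaced by a fake in tests. This is a sketch only: `SpeechToText`, `TextToSpeech`, `VoicePipeline`, and the fake classes are hypothetical, and the LLM stage is reduced to a placeholder string.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


class SpeechToText(Protocol):
    async def transcribe(self, audio: bytes) -> str: ...


class TextToSpeech(Protocol):
    async def synthesize(self, text: str) -> bytes: ...


@dataclass
class VoicePipeline:
    """Wires stages together; each stage is injected and has an explicit timeout."""

    stt: SpeechToText
    tts: TextToSpeech
    stage_timeout_seconds: float = 5.0

    async def respond(self, audio: bytes) -> bytes:
        text = await asyncio.wait_for(
            self.stt.transcribe(audio), timeout=self.stage_timeout_seconds
        )
        reply = f"You said: {text}"  # placeholder for the LLM / agent stage
        return await asyncio.wait_for(
            self.tts.synthesize(reply), timeout=self.stage_timeout_seconds
        )


class FakeSTT:
    """Test double: satisfies SpeechToText structurally, no inheritance needed."""

    async def transcribe(self, audio: bytes) -> str:
        return "hello"


class FakeTTS:
    async def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")
```

With fakes like these, `asyncio.run(VoicePipeline(stt=FakeSTT(), tts=FakeTTS()).respond(b"\x00"))` exercises the orchestration without any provider or network.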
- Network/runtime failures should degrade gracefully where possible
- Never swallow exceptions silently
- Include contextual information in exceptions and logs
- Distinguish between expected runtime conditions and true failures
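To show what translating a failure with context (rather than swallowing it) can look like, here is a small sketch. `TTSProviderError` and `synthesize_with_context` are hypothetical names, and the provider is modeled as a plain callable for brevity.

```python
from collections.abc import Callable


class TTSProviderError(RuntimeError):
    """Domain-level error raised when a TTS provider call fails (hypothetical)."""


def synthesize_with_context(
    provider: Callable[[str], bytes],
    text: str,
    *,
    session_id: str,
    timeout_seconds: float,
) -> bytes:
    """Call the provider and translate timeouts into a contextual domain error."""
    try:
        return provider(text)
    except TimeoutError as exc:
        # `from exc` preserves the original traceback for debugging.
        raise TTSProviderError(
            f"TTS provider timed out after {timeout_seconds:g}s "
            f"for session {session_id}"
        ) from exc
```

Only the expected failure type is caught; anything else propagates unchanged so that genuine bugs are not disguised as provider problems.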
- All new public functions, methods, and classes must be typed
- Internal functions should also be typed unless truly trivial
- Prefer concrete types over `Any`
- Use `Protocol` for interface-like behavior where useful
- Use `TypedDict` only when interop with dict-shaped payloads is necessary
- Prefer dataclasses or typed models for internal structured data
- Keep generics understandable; avoid type complexity that hurts maintainability
Example expectations:
- Good: `def create_session(config: SessionConfig) -> AgentSession:`
- Good: `async def synthesize(request: TTSRequest) -> AsyncIterator[AudioFrame]:`
- Avoid: `def run(config):`
- Avoid: `def process(data: dict[str, Any]) -> Any:`
- Public classes and functions should have concise docstrings
- Docstrings should explain behavior, important parameters, return values, and edge cases
- Keep docstrings practical and contributor-friendly
- Update README or package docs when changing public behavior
- Examples should reflect real package usage patterns and be runnable with minimal setup
Use a consistent style across the repo. Prefer concise Google-style or simple imperative prose.
Example:

    def start(self) -> None:
        """Start the agent session and begin processing events."""

- All non-trivial changes should include tests
- Prefer `pytest`
- Test public behavior, not private implementation details
- Keep tests deterministic and readable
- Avoid brittle timing-based tests unless unavoidable
Cover these when relevant:
- Session lifecycle
- Event ordering
- Reconnection / interruption paths
- Config validation
- Error propagation
- Async cancellation behavior
- Fallback logic
- Serialization and deserialization
- Public API behavior
- Prefer fixtures over repetitive setup
- Mock LiveKit/network boundaries cleanly
- Do not over-mock pure logic
- Add regression tests for bug fixes
- For async code, ensure tasks are awaited and cleaned up properly
- Add focused tests for buffering, queue growth, timeout handling, and cleanup when relevant
- Be cautious with sleeps in tests; prefer synchronization primitives or explicit hooks
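As a sketch of replacing sleeps with an explicit hook, the test below waits on an `asyncio.Event` that the component sets when its work is done. `Recorder` is a hypothetical component; the test is written as a plain function so that pytest can collect it without an async plugin.

```python
import asyncio


class Recorder:
    """Hypothetical component that handles one item in the background."""

    def __init__(self) -> None:
        self.processed = asyncio.Event()  # explicit hook that tests can wait on
        self.items: list[str] = []

    async def handle(self, item: str) -> None:
        await asyncio.sleep(0)  # simulate yielding to the event loop
        self.items.append(item)
        self.processed.set()


def test_handle_records_item() -> None:
    async def scenario() -> list[str]:
        recorder = Recorder()
        task = asyncio.create_task(recorder.handle("utterance-1"))
        # Wait for the hook with an upper bound instead of guessing a sleep.
        await asyncio.wait_for(recorder.processed.wait(), timeout=1.0)
        await task  # make sure the task has finished and is not leaked
        return recorder.items

    assert asyncio.run(scenario()) == ["utterance-1"]
```

The timeout here is only a safety net against hangs; the test proceeds as soon as the event is set rather than after a fixed delay.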
- Use structured, meaningful logging
- Logs should help debug real-time failures without flooding output
- Avoid excessive debug logs in hot paths
- Include identifiers like session ID, participant ID, track ID, or request ID when helpful
- Never log secrets, tokens, or sensitive user content unless explicitly intended and documented
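One possible way to attach identifiers to every log record is the standard library's `logging.LoggerAdapter`, sketched below. The helper name `session_logger` and the field names `session_id` and `participant_id` are assumptions for illustration.

```python
import logging


def session_logger(
    base: logging.Logger, *, session_id: str, participant_id: str
) -> logging.LoggerAdapter:
    """Return a logger that stamps identifiers onto each record it emits."""
    return logging.LoggerAdapter(
        base, {"session_id": session_id, "participant_id": participant_id}
    )


class ListHandler(logging.Handler):
    """Collects records in memory; handy for asserting on logs in tests."""

    def __init__(self) -> None:
        super().__init__()
        self.records: list[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)
```

A formatter or JSON handler can then read `record.session_id` and `record.participant_id` without every call site repeating them.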
Error messages should state what failed and what the user can do next.
Prefer:
"TTS provider timed out after 10s for session {session_id}"
Avoid:
"Something went wrong"
- Prefer explicit config objects for non-trivial components
- Validate config at boundaries
- Keep defaults safe and unsurprising
- Avoid hidden behavior controlled by many environment variables
- Document all supported environment variables and config fields
- Keep dependencies minimal
- Prefer standard library where practical
- New dependencies must have clear value
- Avoid pulling large frameworks into core paths unless justified
- Favor libraries with good maintenance and permissive licenses
- Do not introduce dependencies just for small helper functionality
- Be careful with task spawning; every background task should have a clear lifecycle
- Clean up tasks, streams, and resources explicitly
- Avoid unbounded queues and silent buffering
- Prefer bounded concurrency when processing streams/events
- Watch for memory leaks in long-lived sessions
- Avoid blocking file or network I/O in the event loop
When changing concurrency code:
- Reason about cancellation
- Reason about shutdown behavior
- Reason about partial failure
- Reason about ordering guarantees
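The points about bounded queues and task lifecycles can be sketched with a single consumer task that is always awaited or cancelled before the function returns. The names `consume_frames` and `run_bounded`, the queue size, and the `None` sentinel are choices made for this example only.

```python
import asyncio
from typing import Optional


async def consume_frames(
    queue: "asyncio.Queue[Optional[bytes]]", sizes: list[int]
) -> None:
    """Consume frames in order until the None sentinel arrives."""
    while True:
        frame = await queue.get()
        if frame is None:  # sentinel: orderly shutdown
            return
        sizes.append(len(frame))


async def run_bounded(frames: list[bytes]) -> list[int]:
    # maxsize bounds memory: put() waits when the consumer falls behind.
    queue: "asyncio.Queue[Optional[bytes]]" = asyncio.Queue(maxsize=4)
    sizes: list[int] = []
    task = asyncio.create_task(consume_frames(queue, sizes))
    try:
        for frame in frames:
            await queue.put(frame)
        await queue.put(None)
        await task
    finally:
        # On error or cancellation, do not leave the consumer running.
        if not task.done():
            task.cancel()
            await asyncio.gather(task, return_exceptions=True)
    return sizes
```

A single consumer reading from a FIFO queue preserves ordering; adding more consumers would trade that guarantee for throughput and should be a deliberate decision.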
This is an open-source package. Preserve compatibility unless the change explicitly calls for a breaking release.
Before making a breaking change:
- Confirm it is necessary
- Minimize blast radius
- Update docs and migration guidance
- Update tests to reflect the new contract
If unsure, prefer additive changes over breaking changes.
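An additive change often means introducing the new entry point while keeping the old one as a thin, deprecated alias. The sketch below assumes hypothetical functions `create_session` (new) and `start_session` (old); neither is claimed to exist in this package.

```python
import warnings
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionInfo:
    room_name: str
    language: str


def create_session(room_name: str, *, language: str = "en") -> SessionInfo:
    """New, preferred entry point."""
    return SessionInfo(room_name=room_name, language=language)


def start_session(room: str) -> SessionInfo:
    """Deprecated alias kept so existing callers continue to work."""
    warnings.warn(
        "start_session() is deprecated; use create_session() instead.",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not this function
    )
    return create_session(room)
```

Existing users get a warning and a migration path instead of an import error, and the alias can be removed in a later breaking release.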
- Keep the repo approachable for new contributors
- Prefer obvious file placement and predictable naming
- Add or update examples for meaningful new capabilities
- Avoid hidden conventions
- Leave code cleaner than you found it
When adding a new module:
- Ensure the name is discoverable
- Ensure its responsibility is narrow
- Add tests
- Expose it publicly only if needed
- Typed dataclasses for configuration and event payloads
- Small adapter classes around provider-specific integrations
- Explicit state transitions for session/agent lifecycle
- Dependency injection for providers like STT, TTS, VAD, and LLM backends
- Narrow interfaces for pluggable components
- Helper functions for repeated validation and normalization
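Explicit state transitions for a session lifecycle might be modeled with an enum and a table of allowed moves, as in the sketch below. The states, the transition table, and the names `SessionState`, `InvalidTransitionError`, and `transition` are assumptions for illustration.

```python
from enum import Enum


class SessionState(Enum):
    IDLE = "idle"
    CONNECTING = "connecting"
    ACTIVE = "active"
    CLOSED = "closed"


class InvalidTransitionError(RuntimeError):
    """Raised when a lifecycle move is not allowed (hypothetical name)."""


# Every legal move is listed in one place, so reviewers can audit the lifecycle.
ALLOWED_TRANSITIONS: dict[SessionState, frozenset[SessionState]] = {
    SessionState.IDLE: frozenset({SessionState.CONNECTING, SessionState.CLOSED}),
    SessionState.CONNECTING: frozenset({SessionState.ACTIVE, SessionState.CLOSED}),
    SessionState.ACTIVE: frozenset({SessionState.CLOSED}),
    SessionState.CLOSED: frozenset(),
}


def transition(current: SessionState, target: SessionState) -> SessionState:
    """Return the new state, or raise if the move is not permitted."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise InvalidTransitionError(
            f"Cannot move session from {current.name} to {target.name}"
        )
    return target
```

Rejecting illegal moves loudly helps surface race conditions, such as a late connect event arriving after the session has already closed.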
- Giant manager classes with many responsibilities
- Hidden global registries
- Untyped dicts passed across layers
- Broad `except Exception` without re-raising or translating meaningfully
- Adding retries without timeout and observability
- Mixing demo/example code into core package modules
- Adding synchronous blocking calls inside async execution paths
When making changes:
- Keep diffs focused
- Update tests
- Update docs if public behavior changes
- Preserve compatibility unless explicitly told otherwise
- Add clear notes in code comments only where they improve maintainability
Assume these conventions unless the existing repo clearly uses something else:
- Formatting with Black-compatible style
- Linting with Ruff
- Tests with Pytest
- Package metadata in `pyproject.toml`
- Type checking with mypy or pyright
- Examples in `examples/`
- Docs in `docs/`
If the repo already has established tooling, follow the repo instead of inventing a parallel standard.
When implementing a feature or fix:
- First understand the existing public API and architecture
- Make the smallest clean change that solves the problem
- Reuse existing abstractions before adding new ones
- Add or update tests near the changed behavior
- Avoid speculative refactors unless they are necessary to complete the task safely
When editing code:
- Preserve existing style and patterns where they are already good
- Improve weak areas incrementally, not by rewriting unrelated modules
- Do not introduce unrelated renames or formatting-only churn
- Examples should demonstrate realistic voice AI workflows
- Favor examples that are minimal but idiomatic
- Use names that reflect voice-agent concepts
- Ensure examples match the current public API
- Avoid pseudo-code in user-facing docs when real code is practical
If there is a conflict, follow this priority order:
- Correctness and safety in real-time behavior
- Public API stability
- Consistency with existing repository patterns
- Simplicity and readability
- Performance optimization
When in doubt:
- Choose the simpler design
- Choose the more explicit API
- Choose the more testable implementation
- Choose the option that is friendlier to open-source contributors
OpenRTC is a single Python package (`src/openrtc/`) with no runtime services required for development. A normal `uv sync` installs the real `livekit-agents` wheel, so tests run against the actual SDK. A fallback shim in `tests/conftest.py` only applies when `livekit.agents` cannot be imported. No LiveKit server, API keys, or external providers are needed to run the test suite.
Python: `requires-python` is 3.11+ (see `pyproject.toml`); 3.10 is not supported.
All commands are documented in CONTRIBUTING.md. Quick reference:
- Install deps: `uv sync --group dev`
- Tests: `uv run pytest` (self-contained; no LiveKit server required)
- Lint: `uv run ruff check .`
- Format check: `uv run ruff format --check .`
- Type check: `uv run mypy src/` (must pass clean; also runs in `.github/workflows/lint.yml`)
- CLI demo: `uv run openrtc list ./examples/agents --default-stt "deepgram/nova-3:multi" --default-llm "openai/gpt-4.1-mini" --default-tts "cartesia/sonic-3"` (same as `--agents-dir ./examples/agents`)
- The `tests/conftest.py` shim targets the `livekit-agents` pin in `pyproject.toml` (~1.4.x today) and only implements APIs OpenRTC uses. When upgrading LiveKit or adding new `livekit.agents` usage, extend the shim or confirm tests pass with the real SDK (`uv sync` + `uv run pytest`). If imports behave oddly, check whether the shim path is active vs. the real package.
- Version is derived from git tags via `hatch-vcs`. In a dev checkout the version will be something like `0.0.9.dev0+g<hash>`.
- `mypy` is enforced in CI alongside Ruff; run `uv run mypy src/` before pushing type-sensitive changes.
- Running `openrtc start` or `openrtc dev` requires a running LiveKit server and provider API keys. For development validation, use `openrtc list`, which exercises discovery and routing without network dependencies. The optional sidecar metrics TUI (`openrtc tui`, requires `openrtc[tui]` / dev deps) tails `./openrtc-metrics.jsonl` by default (same path as `--metrics-jsonl` on the worker; override with `--watch`).
- `pytest-cov` is in the dev dependency group; CI uses `--cov-fail-under=80`; run `uv run pytest --cov=openrtc --cov-report=xml --cov-fail-under=80` to match.