External voice runtime for the Genie ecosystem.
genie-voice-runtime owns the audio pipeline:
- wake word and push-to-talk activation
- voice activity detection
- capture and playback device handling
- speech-to-text
- text-to-speech
- acoustic echo control and denoise stages
- streaming voice session events
It does not own the agent.
genie-claw remains the agent layer: prompt policy, memory, tools, smart-home
intent, safety gates, audit, and conversation state. The voice runtime turns
audio into text and text back into audio; GenieClaw decides what the user meant
and what actions are allowed.
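The split described above can be sketched as a trait boundary. This is a minimal illustration, not the crate's actual API: `VoiceRuntime`, `VoiceError`, `FakeRuntime`, and `handle_turn` are hypothetical names, and raw audio is assumed to be 16-bit PCM samples.

```rust
/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq)]
pub enum VoiceError {
    EmptyAudio,
}

/// Hypothetical boundary: the agent layer only ever sees text.
pub trait VoiceRuntime {
    /// Audio in, text out (speech-to-text).
    fn transcribe(&mut self, pcm: &[i16]) -> Result<String, VoiceError>;
    /// Text in, audio out (text-to-speech).
    fn synthesize(&mut self, text: &str) -> Result<Vec<i16>, VoiceError>;
}

/// Test double that returns a canned transcript and silent audio.
pub struct FakeRuntime {
    pub canned_transcript: String,
}

impl VoiceRuntime for FakeRuntime {
    fn transcribe(&mut self, pcm: &[i16]) -> Result<String, VoiceError> {
        if pcm.is_empty() {
            return Err(VoiceError::EmptyAudio);
        }
        Ok(self.canned_transcript.clone())
    }

    fn synthesize(&mut self, text: &str) -> Result<Vec<i16>, VoiceError> {
        // One silent sample per byte of text, just to produce some output.
        Ok(vec![0; text.len()])
    }
}

/// One conversational turn. `decide` stands in for the agent layer:
/// it maps a transcript to a reply and knows nothing about audio.
pub fn handle_turn<R: VoiceRuntime>(
    runtime: &mut R,
    pcm: &[i16],
    decide: impl Fn(&str) -> String,
) -> Result<Vec<i16>, VoiceError> {
    let transcript = runtime.transcribe(pcm)?;
    let reply = decide(&transcript);
    runtime.synthesize(&reply)
}
```

In this sketch the `decide` closure is the only place where meaning or policy could live, which mirrors the rule that the voice runtime never decides what the user meant.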
    microphone / speaker
            |
            v
    genie-voice-runtime
      wake, VAD, STT, TTS, audio streaming
            |
            | transcript / spoken reply events
            v
    genie-claw
      agent policy, memory, tools, smart-home actions
            |
            v
    genie-home-runtime / Home Assistant
The protocol types in this crate are the stable contract between GenieClaw and the voice runtime. The runtime implementation can evolve without forcing GenieClaw to own audio details.
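As a rough illustration of what such a contract could look like, the sketch below models transcript and spoken-reply messages as two enums. The type names, variants, and fields (`VoiceEvent`, `VoiceCommand`, `session_id`, and so on) are assumptions made for this example and are not taken from the crate.

```rust
/// Hypothetical events flowing from the voice runtime to the agent layer.
#[derive(Debug, Clone, PartialEq)]
pub enum VoiceEvent {
    WakeDetected { session_id: u64 },
    PartialTranscript { session_id: u64, text: String },
    FinalTranscript { session_id: u64, text: String },
    SessionEnded { session_id: u64 },
}

/// Hypothetical commands flowing from the agent layer back to the voice runtime.
#[derive(Debug, Clone, PartialEq)]
pub enum VoiceCommand {
    Speak { session_id: u64, text: String },
    CancelPlayback { session_id: u64 },
}

/// Stand-in for the agent side: reply only to non-empty final transcripts.
/// A real agent would apply prompt policy, memory, and safety gates here.
pub fn agent_reply(event: &VoiceEvent) -> Option<VoiceCommand> {
    match event {
        VoiceEvent::FinalTranscript { session_id, text } if !text.trim().is_empty() => {
            Some(VoiceCommand::Speak {
                session_id: *session_id,
                text: format!("You said: {}", text.trim()),
            })
        }
        // Wake, partial, and end-of-session events need no spoken reply in this sketch.
        _ => None,
    }
}
```

Because both sides depend only on these message types, the capture, VAD, or STT internals could change without touching the agent code that matches on `VoiceEvent`.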
- No agent prompts, memory policy, tool routing, or smart-home authorization in this repo.
- No Home Assistant or genie-home-runtime device logic in this repo.
- No LLM provider logic in this repo.
- Voice hardware support should stay portable across NVIDIA Jetson Orin 8GB, Raspberry Pi, other SBCs, Linux laptops, and development machines where possible.
- NVIDIA Jetson Orin 8GB remains the flagship tested deployment.
Public datasets can be used, but only for the layer they actually measure:
- genie-voice-runtime owns audio, wake/VAD, STT/TTS, transcript quality, and noisy utterance robustness.
- genie-claw owns BFCL tool-call scoring, family memory retrieval, smart-home intent routing, and deterministic device-state questions.
See docs/evaluation-data.md for the allowed data
sources, license notes, and where each dataset belongs.
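Transcript quality on the voice-runtime side is commonly scored with word error rate (WER). Nothing above says which metric this project uses, so the function below is only an illustrative sketch of that standard metric; `word_error_rate` is a hypothetical name.

```rust
/// Word error rate: word-level edit distance divided by the number of
/// reference words. Tokenization here is plain whitespace splitting.
pub fn word_error_rate(reference: &str, hypothesis: &str) -> f64 {
    let r: Vec<&str> = reference.split_whitespace().collect();
    let h: Vec<&str> = hypothesis.split_whitespace().collect();

    // WER is undefined for an empty reference; by convention in this sketch,
    // return 0.0 if the hypothesis is also empty and 1.0 otherwise.
    if r.is_empty() {
        return if h.is_empty() { 0.0 } else { 1.0 };
    }

    // Row-by-row dynamic programming over the edit-distance table.
    // prev[j] = distance between the first i reference words and first j hypothesis words.
    let mut prev: Vec<usize> = (0..=h.len()).collect();
    for (i, rw) in r.iter().enumerate() {
        let mut cur = vec![i + 1; h.len() + 1];
        for (j, hw) in h.iter().enumerate() {
            let substitution = prev[j] + if rw == hw { 0 } else { 1 };
            let deletion = prev[j + 1] + 1;
            let insertion = cur[j] + 1;
            cur[j + 1] = substitution.min(deletion).min(insertion);
        }
        prev = cur;
    }

    prev[h.len()] as f64 / r.len() as f64
}
```

A real evaluation would usually normalize case and punctuation before scoring, which this sketch leaves out.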
Initial boundary scaffold. The current production voice code still lives in
genie-claw and will move here in slices:
- protocol and config contract
- STT/TTS process wrappers
- streaming session events
- wake/VAD/audio-device handling
- GenieClaw external runtime client
- removal of the internal GenieClaw voice pipeline