Talk to an LLM. Hear it talk back.
SandVoice is a Python voice assistant that turns microphone input into an AI conversation — transcribing speech, routing requests to the right plugin, and playing the response through your speakers.
Run it on your laptop, or plug a Raspberry Pi into a USB mic and speaker and turn it into a private, always-on home assistant — your own voice-activated AI, without handing your home to a big tech device.
| Mode | How to start | Best for |
|---|---|---|
| Default | ./sandvoice.py | Voice conversation that records until ESC is pressed |
| CLI | ./sandvoice.py --cli | Text input, voice or text output |
| Wake word | ./sandvoice.py --wake-word | Hands-free, always listening |
git clone https://github.com/spideyz0r/sandvoice
cd sandvoice
python3 -m venv env
source env/bin/activate
pip install -r requirements.txt

Set your API key:
export OPENAI_API_KEY=sk-...

Run:
./sandvoice.py

SandVoice reads its config from ~/.sandvoice/config.yaml. On first run it will work with defaults, but you'll want to review the config section below.
Wake word mode keeps SandVoice listening in the background. Say the wake phrase, speak your request, and hear the response — no key press needed.
./sandvoice.py --wake-word

Required config in ~/.sandvoice/config.yaml:
bot_voice: enabled
stream_responses: enabled
stream_tts: enabled
vad_enabled: enabled

The default wake phrase is "sand voice", using the model at models/sand_voice.onnx (included in the repo). You can use any built-in openWakeWord model (e.g. hey_jarvis, alexa) or train your own; see docs/CUSTOM_WAKE_WORDS.md.
Wake word mode running on a Raspberry Pi 3B is SandVoice's most compelling use case: a standalone, always-on device you can place anywhere in the house. Say the wake phrase, ask anything, and hear the response — no screen, no keyboard, no subscription.
Traditional smart speakers like Alexa and Google Home are built around fixed command sets and scripted answers. SandVoice is backed by a large language model — it can hold a real conversation, answer complex questions, explain things, help you think through a problem, and handle follow-up questions naturally. It also knows about the weather, latest news, and anything else a plugin can reach.
Unlike those devices, SandVoice runs on hardware you own. Your voice is transcribed and
answered via OpenAI, but the device, the config, and the wake word model are entirely
yours. You can use the default "sand voice" phrase, swap in a built-in model like
hey_jarvis, or train a custom wake word for any phrase you want.
What you need: Raspberry Pi 3B (or newer), a USB microphone, and any speaker (3.5mm or USB). Full setup guide: docs/raspberry-pi-setup.md
| Key | Required | Used by |
|---|---|---|
| OPENAI_API_KEY | Always | All AI features |
| OPENWEATHERMAP_API_KEY | Weather plugin only | weather plugin |
Config lives at ~/.sandvoice/config.yaml. All keys have defaults — you only need to set what you want to override.
Note: tmp_files_path and error_log_path do not expand ~ — use absolute paths. scheduler_db_path and tasks_file_path do expand ~.
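The override and tilde rules above can be illustrated with a small sketch. This is not SandVoice's actual loader: `DEFAULTS`, `EXPANDED_KEYS`, and `resolve_config` are hypothetical names, and the defaults shown are only a subset chosen for illustration.

```python
import os

# Hypothetical subset of defaults, for illustration only.
DEFAULTS = {
    "botname": "SandVoice",
    "tmp_files_path": "/home/user/.sandvoice/tmp/",
    "tasks_file_path": "~/.sandvoice/tasks.yaml",
}

# Keys that the note above says are tilde-expanded.
EXPANDED_KEYS = ("scheduler_db_path", "tasks_file_path")


def resolve_config(overrides):
    """Overlay user overrides on the defaults, expanding ~ only where documented."""
    merged = {**DEFAULTS, **overrides}
    for key in EXPANDED_KEYS:
        if key in merged:
            merged[key] = os.path.expanduser(merged[key])
    return merged


cfg = resolve_config({"botname": "Jarvis", "tmp_files_path": "~/tmp/"})
print(cfg["tmp_files_path"])   # still "~/tmp/": this key is not expanded
print(cfg["tasks_file_path"])  # ~ replaced by the home directory
```

Under these assumptions, a literal ~ in tmp_files_path would be passed through unchanged, which is why absolute paths are needed for that key.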
# Audio recording
channels: 2 # 1 or 2; null = auto-detect
bitrate: 128 # MP3 bitrate (32-320)
rate: 44100 # sample rate in Hz
chunk: 1024 # frames per buffer
tmp_files_path: /home/user/.sandvoice/tmp/ # must end with /
# Identity and locale
botname: SandVoice
timezone: America/Toronto # IANA timezone name (e.g. America/New_York, Europe/London)
location: Toronto, ON, CA # used by weather and routing
unit: metric # metric or imperial
language: English
verbosity: brief # brief, normal, or detailed
# system_prompt_extra: | # optional: append standing instructions to every system prompt
#   Always respond in a formal tone.
#   You are an expert in Brazilian cuisine.
# Logging
log_level: warning # warning, info, or debug
enable_error_logging: disabled
error_log_path: /home/user/.sandvoice/error.log
# Models
llm_summary_model: gpt-5-mini
llm_route_model: gpt-4.1-nano
llm_response_model: gpt-5-mini
speech_to_text_model: whisper-1
speech_to_text_task: translate # translate (→English) or transcribe
speech_to_text_language: "" # ISO-639-1 hint, e.g. "pt"; empty = auto
speech_to_text_translate_provider: whisper # whisper or gpt
speech_to_text_translate_model: gpt-5-mini
text_to_speech_model: tts-1
bot_voice_model: nova
# Response streaming
stream_responses: disabled # enabled = stream LLM deltas (required in --wake-word)
stream_tts: disabled # enabled = play TTS while response generates (required in --wake-word)
stream_tts_boundary: sentence # sentence or paragraph
stream_tts_first_chunk_target_s: 6 # seconds of audio to buffer before first play
# I/O toggles
bot_voice: enabled # enabled = speak responses (required in --wake-word)
cli_input: disabled # enabled = text input without --cli flag
push_to_talk: disabled # enabled = wait for keypress between turns
# API reliability
api_timeout: 10
api_retry_attempts: 3
# Plugin settings
summary_words: "100"
search_sources: "4"
rss_news: https://feeds.bbci.co.uk/news/rss.xml
rss_news_max_items: "5"
# Wake word (--wake-word mode)
wake_word_enabled: enabled # global toggle; --wake-word flag is still required to start the mode
wake_phrase: sand voice
wake_word_sensitivity: 0.25 # 0.0-1.0; higher = stricter (fewer false positives), lower = more sensitive
openwakeword_model: models/sand_voice.onnx # path to .onnx model; or a built-in e.g. hey_jarvis (no file needed)
# Voice activity detection (required in --wake-word mode)
vad_enabled: enabled
vad_aggressiveness: 3 # 0-3; 3 = most aggressive noise filtering
vad_silence_duration: 1.5 # seconds of silence to end recording
vad_frame_duration: 30 # 10, 20, or 30 ms
vad_timeout: 30 # max seconds waiting for speech
# Wake UX
wake_confirmation_beep: enabled
wake_confirmation_beep_freq: 800
wake_confirmation_beep_duration: 0.1
visual_state_indicator: enabled # ANSI state display in terminal
voice_ack_earcon: disabled # short tone after recording ends
voice_ack_earcon_freq: 600
voice_ack_earcon_duration: 0.06
voice_filler_delay_ms: 800 # ms before playing a filler phrase during slow plugin calls
voice_filler_phrases: # set to [] to disable
- "One sec."
- "Got it, checking now."
- "Okay, give me a moment."
- "Let me check that."
- "Sure, one moment."
# greeting_extra: | # optional: append custom instructions to the greeting prompt
#   End the greeting with a short, relevant proverb.
# Scheduler (see Scheduled Tasks section)
scheduler_enabled: disabled
scheduler_poll_interval: 30
scheduler_db_path: /home/user/.sandvoice/sandvoice.db
tasks_file_path: /home/user/.sandvoice/tasks.yaml
# Background cache (see Background Cache section)
cache_enabled: disabled
cache_weather_ttl_s: 10800 # 3 hours
cache_weather_max_stale_s: 21600 # 6 hours
# cache_auto_refresh: [] # list of plugins to warm on startup and refresh periodically

Each plugin handles a specific type of request. When you speak, SandVoice routes your input to the right plugin based on what you said.
Plugins live under plugins/ as either a standalone .py file or a folder containing plugin.py and plugin.yaml:
plugins/
  echo.py          # simple single-file plugin
  weather/
    plugin.py      # plugin logic
    plugin.yaml    # route description and metadata
plugin.yaml tells SandVoice how to route requests to that plugin:
name: weather
version: 1.0.0
route_description: >
  The user is asking about the weather. Include location and unit keys.
route_extra_keys:
  - location
  - unit
env_vars:
  - OPENWEATHERMAP_API_KEY
config_defaults:
  location: "Toronto,ON,CA"
  unit: metric
dependencies:
  - requests

Single-file plugins (no YAML) are loaded automatically and use the routes.yaml routing table.
Every plugin must implement a top-level process function, or a class named Plugin with a process method:
def process(user_input, route, s):
    # s.ai: AI instance (transcription, routing, TTS, responses)
    # s.config: Config instance
    # s.plugins: dict of all loaded plugins
    # s.cache: VoiceCache instance (or None if disabled)
    return "response string"  # should return a string; exceptions abort the current run

| Plugin | What it does | Extra env var needed |
|---|---|---|
| weather | Current conditions and forecast | OPENWEATHERMAP_API_KEY |
| hacker-news | Top stories from Hacker News | none |
| news | Headlines from an RSS feed | none |
| realtime_websearch | Live web search via Responses API | none |
| technical | Technical/code questions | none |
| greeting | Greetings and small talk | none |
| realtime | Web search (scraping-based, legacy) | none |
Create a folder under plugins/ with a plugin.yaml and a plugin.py:
- Write plugin.yaml with name, route_description, and any env_vars or config_defaults
- Implement process(user_input, route, s) in plugin.py and return a string
- List any PyPI dependencies under dependencies in the YAML
Constraints:
- The folder name must match the name in plugin.yaml after normalizing hyphens to underscores (e.g. name: my-plugin → folder my_plugin). Both sides are normalized before comparison, so my-plugin and my_plugin are treated as equivalent. Plugins that fail this check are skipped with a warning.
- dependencies is informational only. SandVoice does not install packages automatically; run pip install for any listed dependency before starting.
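The folder-name rule can be sketched as follows. This is an illustration of the comparison described above, not SandVoice's loader code; `normalize` and `folder_matches_manifest` are hypothetical names.

```python
def normalize(name):
    """Treat hyphens and underscores as equivalent."""
    return name.replace("-", "_")


def folder_matches_manifest(folder_name, manifest_name):
    """Return True when the folder and the plugin.yaml name agree after normalization."""
    return normalize(folder_name) == normalize(manifest_name)


print(folder_matches_manifest("my_plugin", "my-plugin"))  # True
print(folder_matches_manifest("weather", "news"))         # False
```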
No changes to routes.yaml or any other file are needed.
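As a rough starting point, a plugin.py might look like the sketch below. The plugin itself is hypothetical, and reading botname as an attribute of s.config is an assumption about the Config object rather than a documented interface. The stub at the bottom only exists so the function can be tried without starting SandVoice.

```python
from types import SimpleNamespace


def process(user_input, route, s):
    """Hypothetical plugin: echo the request back, prefixed with the bot name."""
    # Assumption: config values are readable as attributes on s.config.
    botname = getattr(s.config, "botname", "SandVoice")
    return f"{botname} heard: {user_input.strip()}"


# Local smoke test with a stand-in for the object SandVoice passes as `s`.
stub = SimpleNamespace(
    ai=None,
    config=SimpleNamespace(botname="Jarvis"),
    plugins={},
    cache=None,
)
reply = process("  hello  ", {}, stub)
print(reply)  # Jarvis heard: hello
```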
SandVoice can speak reminders or trigger plugins on a schedule — without any user interaction.
In ~/.sandvoice/config.yaml:
scheduler_enabled: enabled
tasks_file_path: /home/user/.sandvoice/tasks.yaml

In ~/.sandvoice/tasks.yaml:
# Speak a reminder every weekday at 9 AM
- name: morning-reminder
  schedule_type: cron
  schedule_value: "0 9 * * 1-5"
  action_type: speak
  action_payload:
    text: "Good morning! Don't forget to check your calendar."
# Refresh weather cache every hour (silently)
- name: hourly-weather
  schedule_type: interval
  schedule_value: "3600"
  action_type: plugin
  action_payload:
    plugin: weather
    query: weather
    refresh_only: true
# One-time reminder
- name: meeting-reminder
  schedule_type: once
  schedule_value: "2026-06-01T09:00:00-05:00"
  action_type: speak
  action_payload:
    text: "Your meeting starts in 15 minutes."

| schedule_type | schedule_value | Runs |
|---|---|---|
| interval | seconds, e.g. "3600" | repeatedly, every N seconds |
| cron | cron expression, e.g. "0 9 * * *" | at specific times (like crontab) |
| once | ISO 8601 timestamp | exactly once |
┌─ minute (0-59)
│ ┌─ hour (0-23)
│ │ ┌─ day of month (1-31)
│ │ │ ┌─ month (1-12)
│ │ │ │ ┌─ day of week (0-6, Sun=0)
0 9 * * 1-5 weekdays at 09:00
*/15 * * * * every 15 minutes
0 8,18 * * * 08:00 and 18:00 daily
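To make the five fields concrete, here is a simplified matcher that checks whether a given moment satisfies an expression. It is a sketch for understanding the syntax, not the scheduler SandVoice uses: it supports only *, */n, ranges, and comma lists, and it requires every field to match (standard cron treats day-of-month and day-of-week as alternatives when both are restricted). The function names are hypothetical.

```python
from datetime import datetime


def field_matches(field, value, low):
    """Check one cron field; `low` is the field's minimum, used as the base for */n."""
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            if (value - low) % int(part[2:]) == 0:
                return True
        elif "-" in part:
            start, end = map(int, part.split("-"))
            if start <= value <= end:
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr, when):
    """Return True when `when` satisfies all five fields of `expr`."""
    minute, hour, dom, month, dow = expr.split()
    return (
        field_matches(minute, when.minute, 0)
        and field_matches(hour, when.hour, 0)
        and field_matches(dom, when.day, 1)
        and field_matches(month, when.month, 1)
        # isoweekday() is Mon=1..Sun=7; modulo 7 maps it to Sun=0..Sat=6.
        and field_matches(dow, when.isoweekday() % 7, 0)
    )


monday_9am = datetime(2026, 6, 1, 9, 0)
print(cron_matches("0 9 * * 1-5", monday_9am))   # True: a weekday at 09:00
print(cron_matches("0 8,18 * * *", monday_9am))  # False: 09:00 is neither 08:00 nor 18:00
```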
- Tasks are keyed by name. Rename a task in the YAML to force re-registration; changing only the schedule or payload of an existing name has no effect on the running DB row.
- An empty tasks.yaml ([]) removes all scheduled tasks from the DB on the next startup.
- If tasks.yaml does not exist, startup succeeds and existing DB tasks continue running.
- once tasks are retried on transient errors (e.g. network timeout); config errors mark them completed immediately.
- All timestamps are stored in UTC; log output uses your timezone.
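The name-keyed behaviour in the first three rules can be modelled in a few lines. This is a reading of the rules above, not the actual sync code. In particular, dropping any name that is absent from the file is an assumption suggested by the empty-file rule but not stated outright, and `sync_tasks` is a hypothetical name.

```python
def sync_tasks(db_tasks, file_tasks):
    """Model the startup sync.

    db_tasks:   dict mapping task name -> stored task
    file_tasks: list of task dicts from tasks.yaml, or None if the file is missing
    """
    if file_tasks is None:
        # Missing file: existing DB tasks continue unchanged.
        return dict(db_tasks)
    synced = {}
    for task in file_tasks:
        name = task["name"]
        # An existing name keeps its stored row; only new names are registered.
        synced[name] = db_tasks.get(name, task)
    # Assumption: names no longer in the file are dropped, so [] clears everything.
    return synced


db = {"hourly-weather": {"name": "hourly-weather", "schedule_value": "3600"}}
edited = [{"name": "hourly-weather", "schedule_value": "1800"}]
print(sync_tasks(db, edited)["hourly-weather"]["schedule_value"])  # 3600
```

Under this model the edited schedule is ignored because the name already exists; renaming the task would register it as a new row with the new schedule.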
The cache stores plugin responses in SQLite so repeated queries are answered instantly. The weather, hacker-news, and news plugins all support caching; other plugins can use s.cache directly.
cache_enabled: enabled

For weather queries made outside of cache_auto_refresh, set the TTL via the dedicated config keys:
cache_weather_ttl_s: 10800 # fresh for 3 hours (default)
cache_weather_max_stale_s: 21600 # serve stale for up to 6 hours (default)

| Age | Result |
|---|---|
| ≤ TTL | Returned immediately, no API call |
| TTL < age ≤ max_stale | Returned from cache (stale) |
| > max_stale | Live API call, cache updated |
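The three rows of the table map onto a simple age check. The sketch below uses the documented weather defaults as parameter defaults; the function name and the labels it returns are illustrative, not part of SandVoice.

```python
def classify(age_s, ttl_s=10800, max_stale_s=21600):
    """Classify a cache entry by its age in seconds."""
    if age_s <= ttl_s:
        return "fresh"    # returned immediately, no API call
    if age_s <= max_stale_s:
        return "stale"    # still returned from cache
    return "expired"      # live API call, cache updated


for age in (3600, 14400, 25200):
    print(age, classify(age))
```

With the defaults, an entry that is 1 hour old is fresh, one that is 4 hours old is served stale, and one that is 7 hours old triggers a live call.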
Add cache_auto_refresh to warm a plugin's cache on startup and refresh it silently in the background:
cache_auto_refresh:
  - plugin: hacker-news
    query: "hacker news"
    interval_s: 28800 # refresh every 8 hours
  - plugin: news
    query: "latest news"
    rss_url: "https://feeds.bbci.co.uk/news/rss.xml" # optional; overrides rss_news config
    interval_s: 7200 # refresh every 2 hours
  - plugin: weather
    query: "weather"
    interval_s: 10800 # refresh every 3 hours

On startup SandVoice fetches each listed plugin immediately in parallel background threads and waits for all of them to finish (up to cache_warmup_timeout_s seconds) before becoming ready. A startup message is printed while waiting and "Ready." is printed when done. Any thread that does not finish in time keeps running in the background. When the scheduler is also enabled, a background task named cache_refresh:<cache_key> is auto-registered to repeat the refresh every interval_s seconds, also silently.
ttl_s and max_stale_s are supported by all three caching plugins (weather, hacker-news, news) and control the freshness of the entry written during the startup warmup. If omitted, they default to interval_s and int(interval_s * 1.5) respectively. Periodic scheduler-driven refreshes use the plugin's built-in defaults instead, because the scheduler dispatch does not forward these two keys.
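The default rule for a cache_auto_refresh entry works out as in this sketch. `freshness_for` is a hypothetical helper; the entry is the dict form of one list item from the config.

```python
def freshness_for(entry):
    """Return (ttl_s, max_stale_s) for one cache_auto_refresh entry."""
    interval_s = entry["interval_s"]
    ttl_s = entry.get("ttl_s", interval_s)
    max_stale_s = entry.get("max_stale_s", int(interval_s * 1.5))
    return ttl_s, max_stale_s


print(freshness_for({"plugin": "hacker-news", "interval_s": 28800}))       # (28800, 43200)
print(freshness_for({"plugin": "news", "interval_s": 7200, "ttl_s": 600}))  # (600, 10800)
```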
For the news plugin, rss_url overrides the rss_news config value and is used as the cache key discriminator — two entries with different rss_url values are cached independently. Entries with rss_url, location, or unit overrides only run the startup warmup; no periodic scheduler task is registered for them (since the scheduler cannot forward these fields to the plugin).
| Key | Default | Description |
|---|---|---|
| cache_warmup_timeout_s | 15 | Max seconds to wait for all warmup threads. Set to 0 for fire-and-forget (old behaviour). |
| cache_warmup_retries | 3 | Max attempts per plugin before giving up on warmup. |
| cache_warmup_retry_delay_s | 2 | Seconds between retry attempts. |
Note: cache_auto_refresh requires cache_enabled: enabled. If the scheduler is disabled, the startup warmup still runs but no periodic tasks are registered.
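The warmup behaviour described in this section (parallel threads, a shared timeout, per-plugin retries) can be approximated with the standard library. This is a sketch of the pattern, not SandVoice's implementation: `warm_one` and `warm_all` are hypothetical names, each fetcher stands in for one plugin call, and the use of daemon threads is an assumption.

```python
import threading
import time


def warm_one(fetch, retries=3, retry_delay_s=2.0):
    """Call fetch() until it succeeds or the attempts run out."""
    for attempt in range(1, retries + 1):
        try:
            fetch()
            return True
        except Exception:
            if attempt < retries:
                time.sleep(retry_delay_s)
    return False


def warm_all(fetchers, timeout_s=15.0, retries=3, retry_delay_s=2.0):
    """Start one thread per fetcher and wait up to timeout_s in total.

    Returns the number of threads that finished within the timeout.
    """
    threads = [
        threading.Thread(target=warm_one, args=(f, retries, retry_delay_s), daemon=True)
        for f in fetchers
    ]
    for t in threads:
        t.start()
    deadline = time.monotonic() + timeout_s
    for t in threads:
        # A timeout of 0 returns at once, which gives fire-and-forget behaviour.
        t.join(max(0.0, deadline - time.monotonic()))
    # Threads that are still alive simply keep running in the background.
    return sum(1 for t in threads if not t.is_alive())


calls = []
attempts = {"n": 0}


def ok():
    calls.append("ok")


def flaky():
    # Fails twice, then succeeds on the third attempt.
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient failure")


finished = warm_all([ok, flaky], timeout_s=5, retries=3, retry_delay_s=0)
print(finished, attempts["n"])
```

Waiting against a single shared deadline, rather than giving each thread its own timeout, keeps the total startup delay bounded by timeout_s regardless of how many plugins are listed.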