Hi! Cool project running voice AI on ESP32!
Would you consider FunASR or SenseVoice as a server-side STT option?
Why?
- Ultra-low latency: SenseVoice-Small takes about 70 ms to transcribe 10 s of audio
- WebSocket server: FunASR provides a production-ready WebSocket server that the ESP32 can connect to
- Streaming: Paraformer-streaming processes audio frame by frame
- Self-hosted: no API costs, and the server can run fully offline
- 50+ languages
- Apache 2.0 license
WebSocket server for embedded clients
# Start server that ESP32 can connect to
funasr-wss-server --model paraformer-zh-streaming --port 10095
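To make the client side concrete, here is a rough sketch of the message sequence a device might send for one utterance: a JSON start message, raw PCM frames, then a JSON end message. The field names (`mode`, `wav_name`, `wav_format`, `audio_fs`, `is_speaking`) reflect my understanding of FunASR's WebSocket protocol and should be treated as assumptions to check against the FunASR docs. The `build_stream_messages` helper and the 60 ms chunk size are hypothetical, for illustration only. It is written in Python for readability; the firmware would do the same in C/C++.

```python
import json


def build_stream_messages(pcm: bytes, sample_rate: int = 16000, chunk_ms: int = 60) -> list:
    """Return the ordered messages a client would send for one utterance.

    Assumes 16-bit mono PCM. The JSON field names are assumptions about the
    FunASR WebSocket protocol, not a verified specification.
    """
    # 2 bytes per sample (16-bit), one channel
    bytes_per_chunk = sample_rate * 2 * chunk_ms // 1000

    # Text frame announcing the stream (assumed field names)
    start = json.dumps({
        "mode": "online",
        "wav_name": "esp32",
        "wav_format": "pcm",
        "audio_fs": sample_rate,
        "is_speaking": True,
    })

    # Binary frames: the audio split into fixed-size chunks (last one may be shorter)
    frames = [pcm[i:i + bytes_per_chunk] for i in range(0, len(pcm), bytes_per_chunk)]

    # Text frame signalling end of speech (assumed field name)
    end = json.dumps({"is_speaking": False})

    return [start, *frames, end]


if __name__ == "__main__":
    one_second_of_silence = bytes(32000)  # 16 kHz * 2 bytes * 1 s
    messages = build_stream_messages(one_second_of_silence)
    print(len(messages), "messages to send")
```

Each string would go out as a WebSocket text frame and each `bytes` object as a binary frame, so the ESP32 only needs a small fixed-size audio buffer per frame.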