Feature: Add SenseVoice support via Sherpa-ONNX for on-device ASR #30

@LauraGPT

Motivation

ElatoAI runs real-time voice AI on ESP32. SenseVoice (8K+ stars) is available through Sherpa-ONNX, which already supports embedded/IoT platforms, including ESP32-compatible inference.

Why SenseVoice for ElatoAI

  • Non-autoregressive: Single forward pass, minimal compute per chunk
  • SenseVoice-Small (234M): 50+ languages with auto detection
  • ONNX format: Runs via Sherpa-ONNX on embedded devices
  • Built-in VAD: No separate voice activity detection needed
  • Emotion detection: Detect user emotions from speech — useful for companion AI

Integration via Sherpa-ONNX

Sherpa-ONNX provides a C/C++ API suitable for embedded use:

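// C API for embedded devices (declared in sherpa-onnx/c-api/c-api.h)
// `config` is a SherpaOnnxOfflineRecognizerConfig filled in for SenseVoice
const SherpaOnnxOfflineRecognizer *recognizer =
    SherpaOnnxCreateOfflineRecognizer(&config);
// Process audio frames → get text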
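
The "process audio frames" step needs one piece of glue on the device side. As far as I can tell, the Sherpa-ONNX C API takes float samples normalized to [-1, 1], while an ESP32 I2S microphone usually delivers signed 16-bit PCM. A minimal sketch of that conversion follows; `pcm16_to_float` is a hypothetical helper name, not part of Sherpa-ONNX or ElatoAI:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a Sherpa-ONNX function): convert signed 16-bit
 * PCM samples, e.g. read from an I2S microphone, into floats in [-1, 1).
 * Returns the number of samples written, or 0 if a pointer is NULL. */
size_t pcm16_to_float(const int16_t *in, size_t n, float *out) {
    if (in == NULL || out == NULL) {
        return 0;
    }
    for (size_t i = 0; i < n; ++i) {
        /* Dividing by 32768 maps INT16_MIN to exactly -1.0f. */
        out[i] = (float)in[i] / 32768.0f;
    }
    return n;
}
```

The resulting float buffer is what would be handed to the recognizer's stream, together with the sample rate (assumed here to be 16 kHz, which is what SenseVoice expects).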

For server-side processing (when ESP32 sends audio to a server):

pip install funasr vllm
funasr-server --device cuda  # OpenAI-compatible at :8000
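
On the device side of this path, the ESP32 would likely need to wrap its raw PCM in a container before uploading, since file-based transcription endpoints generally expect an audio file rather than bare samples. The sketch below fills in a canonical 44-byte WAV header for 16-bit PCM. It assumes the server accepts WAV uploads, which I have not verified for `funasr-server`; `wav_write_header` and `WAV_HEADER_SIZE` are hypothetical names:

```c
#include <stdint.h>
#include <string.h>

#define WAV_HEADER_SIZE 44

/* Store integers little-endian regardless of host byte order. */
static void put_le16(uint8_t *p, uint16_t v) {
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Hypothetical helper: write a canonical RIFF/WAVE header describing
 * `data_bytes` of 16-bit PCM audio into `out`. */
void wav_write_header(uint8_t out[WAV_HEADER_SIZE], uint32_t sample_rate,
                      uint16_t channels, uint32_t data_bytes) {
    const uint16_t bits = 16;
    const uint16_t block_align = (uint16_t)(channels * (bits / 8));

    memcpy(out, "RIFF", 4);
    put_le32(out + 4, 36 + data_bytes);            /* file size minus 8 */
    memcpy(out + 8, "WAVE", 4);
    memcpy(out + 12, "fmt ", 4);
    put_le32(out + 16, 16);                        /* fmt chunk size */
    put_le16(out + 20, 1);                         /* format tag 1 = PCM */
    put_le16(out + 22, channels);
    put_le32(out + 24, sample_rate);
    put_le32(out + 28, sample_rate * block_align); /* bytes per second */
    put_le16(out + 32, block_align);
    put_le16(out + 34, bits);
    memcpy(out + 36, "data", 4);
    put_le32(out + 40, data_bytes);
}
```

For one second of 16 kHz mono audio, the call would be `wav_write_header(hdr, 16000, 1, 32000)`, followed by sending `hdr` and then the PCM bytes in the request body.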
