Feature: FunASR/SenseVoice as STT option — fast, multilingual, small model

## Feature Request

ElatoAI supports 100+ models for AI devices. Adding **FunASR/SenseVoice** as an STT option would give users a fast, compact alternative for voice recognition:

- **SenseVoice-Small**: 234M params, 50+ languages, non-autoregressive (one forward pass per utterance, so decoding cost does not grow token by token)
- **Up to 25x faster** than Whisper-large, which matters for real-time device interactions
- **Built-in extras**: Emotion detection + audio event classification at no extra cost
- **OpenAI-compatible API**: `funasr-server` exposes `/v1/audio/transcriptions`
- **Edge options**: Available via [Sherpa-ONNX](https://github.com/k2-fsa/sherpa-onnx) for on-device inference (iOS, Android, Raspberry Pi)

The Sherpa-ONNX path could be especially relevant for ESP32-connected devices where a lightweight ONNX model runs on the edge or a nearby server.
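
To make the OpenAI-compatible route concrete, here is a minimal client sketch using only the Python standard library. It assumes the server accepts the same multipart form (`file` and `model` fields) and returns the same JSON shape (`{"text": ...}`) as OpenAI's transcription endpoint. The host, port, and model name are hypothetical placeholders, not values confirmed by FunASR.

```python
import io
import json
import urllib.request
import uuid
import wave

# Assumptions (not confirmed by FunASR docs): adjust to your deployment.
BASE_URL = "http://localhost:8000"  # hypothetical host and port
MODEL_NAME = "sensevoice-small"     # hypothetical model identifier


def make_silent_wav(seconds: float = 0.5, rate: int = 16000) -> bytes:
    """Return a mono 16-bit PCM WAV of silence, handy as a smoke-test payload."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


def build_multipart(fields: dict, filename: str, audio: bytes):
    """Encode text fields plus one audio file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode()
        )
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: audio/wav\r\n\r\n'.encode()
    )
    parts.append(audio)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(audio: bytes, base_url: str = BASE_URL, model: str = MODEL_NAME) -> str:
    """POST audio to /v1/audio/transcriptions and return the transcript text."""
    body, content_type = build_multipart({"model": model}, "audio.wav", audio)
    req = urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        # Assumes an OpenAI-style JSON response with a "text" field.
        return json.loads(resp.read())["text"]
```

With a server running, `transcribe(make_silent_wav())` would serve as a quick connectivity check. Because the request shape matches the OpenAI endpoint, an existing Whisper-style integration might only need its base URL and model name changed, though that depends on how closely the server follows the OpenAI contract.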

**References:**
- FunASR: https://github.com/modelscope/FunASR (16K+ stars)
- Sherpa-ONNX: https://github.com/k2-fsa/sherpa-onnx (edge deployment)
