
Feature: FunASR/SenseVoice as STT option — fast, multilingual, small model #29

@LauraGPT

Feature Request

ElatoAI supports 100+ models for AI devices. Adding FunASR/SenseVoice as an STT option would give users a fast, compact alternative for speech recognition:

  • SenseVoice-Small: 234M params, 50+ languages, non-autoregressive (the whole utterance is decoded in a single pass rather than token by token)
  • Up to 25x faster than Whisper-large, which matters for real-time device interactions
  • Built-in extras: emotion detection and audio event classification come from the same model, with no separate inference step
  • OpenAI-compatible API: funasr-server exposes /v1/audio/transcriptions
  • Edge options: available via Sherpa-ONNX for on-device inference (iOS, Android, Raspberry Pi)
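
To make the OpenAI-compatible route concrete, here is a minimal sketch of a client posting audio to /v1/audio/transcriptions using only the Python standard library. The base URL, port, the `model` value, and the `text` field in the JSON response are assumptions based on the usual OpenAI transcription format, not confirmed funasr-server behavior; `build_multipart` and `transcribe` are hypothetical helper names.

```python
import json
import urllib.request
import uuid


def build_multipart(fields, file_name, file_bytes, boundary=None):
    """Encode form fields plus one audio file as multipart/form-data."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode()
        )
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(audio_path, base_url="http://localhost:8000", model="SenseVoiceSmall"):
    """POST an audio file to the transcription endpoint and return the text.

    base_url and model are placeholders (assumptions); adjust to the deployment.
    """
    with open(audio_path, "rb") as f:
        audio = f.read()
    body, content_type = build_multipart({"model": model}, "audio.wav", audio)
    req = urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        # Assumes an OpenAI-style response body: {"text": "..."}
        return json.loads(resp.read())["text"]
```

If the endpoint really is OpenAI-compatible, an existing Whisper API integration could probably be pointed at it by changing only the base URL and model name.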

The Sherpa-ONNX path could be especially relevant for ESP32-connected devices where a lightweight ONNX model runs on the edge or a nearby server.
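
For the emotion and audio event extras, a rough sketch of how the labels might be pulled out of a raw SenseVoice transcript. It assumes the raw output starts with four tags in the order language, emotion, event, ITN flag (for example `<|en|><|HAPPY|><|Speech|><|woitn|>`); whether a given integration exposes the raw tags or already-separated fields would need checking. `parse_sensevoice` is a hypothetical helper name.

```python
import re

# Matches SenseVoice-style tags such as <|en|> or <|HAPPY|>
TAG = re.compile(r"<\|([^|]*)\|>")


def parse_sensevoice(raw):
    """Split a raw transcript into its leading tags and the plain text.

    Assumption: the first four tags are language, emotion, event, ITN flag.
    Any further tags are stripped from the text but otherwise ignored.
    """
    tags = TAG.findall(raw)
    info = dict(zip(("language", "emotion", "event", "itn"), tags))
    info["text"] = TAG.sub("", raw).strip()
    return info
```

A device could then, for instance, adjust its reply tone based on `info["emotion"]` without running a second model.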
