Add Sarvam AI integration for AI-171 mobile STT & TTS evaluation #91

Open
gyanLM10 wants to merge 1 commit into openMF:dev from gyanLM10:feature/sarvam-integration-only

gyanLM10 commented Mar 5, 2026

Add Sarvam AI (native Indic STT & TTS) as the foundation for evaluating multilingual STT/TTS support on mobile devices.


Jira Ticket: https://mifosforge.jira.com/browse/AI-171 (Evaluate multilingual support for Sarvam AI)


Implementation of Sarvam Saaras & Bulbul APIs:

  • Mobile focus: Fast API endpoints free mobile clients (iOS/Android) from local storage and VRAM bottlenecks.
  • Performance: Resolves Out-Of-Memory (OOM) crashes on 8 GB machines and provides the sub-second inference latency that edge devices need.
  • Tooling: A dedicated run_multilingual_eval.py script measures end-to-end pipeline inference speed and accuracy.
  • Edge optimization: A starting point for offloading heavy native TTS/STT generation for Indic languages to optimized cloud inference.
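
To make the latency claim checkable, a timing harness along these lines could wrap each request. This is only a sketch: `measure_latency` and `fake_stt_request` are hypothetical names, the placeholder sleeps instead of calling Sarvam, and nothing here is taken from run_multilingual_eval.py.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latency(call: Callable[[], object], runs: int = 5) -> Dict[str, float]:
    """Time `call` over several runs and summarize wall-clock latency in seconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        call()  # assumption: one complete STT or TTS round trip
        samples.append(time.perf_counter() - start)
    return {
        "runs": float(runs),
        "min_s": min(samples),
        "median_s": statistics.median(samples),
        "max_s": max(samples),
    }


def fake_stt_request() -> str:
    """Hypothetical placeholder for a real network request; it only sleeps."""
    time.sleep(0.01)
    return "transcript"


if __name__ == "__main__":
    print(measure_latency(fake_stt_request, runs=3))
```

Reporting the median alongside min and max keeps a single slow network round trip from dominating the result.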

What's included:

  • Sarvam_Integration/: Directory isolating the integration experiments and LangChain tools.
  • Sarvam_Integration/VoiceBanking_Results/: Submodule containing the Sarvam STT & TTS implementation scripts.
  • Sarvam_Integration/VoiceBanking_Results/run_multilingual_eval.py: Minimal wrapper evaluation script.
  • Sarvam_Integration/VoiceBanking_Results/RESULTS.md: Documentation hub for findings & reproduction.
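
The description does not say which accuracy metric the evaluation script reports. Word error rate (WER) is a common choice for STT, so the sketch below assumes it; `word_error_rate` is a hypothetical helper, not a function from this PR.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref: List[str] = reference.split()
    hyp: List[str] = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of four reference words.
    print(word_error_rate("mera balance kitna hai", "mera balance kitna"))  # 0.25
```

Splitting on whitespace also works for space-delimited Indic scripts such as Devanagari, though any text normalization applied before scoring would need to be decided separately.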

⚙️ Development & Code Accountability

  • AI Model Used: Gemini 2.5 Pro (via Google DeepMind)


Full Accountability:
I take full responsibility for all code in this PR.


How to Reproduce:

git clone https://github.com/gyanLM10/community-ai.git
cd community-ai
git checkout feature/sarvam-integration-only
cd Sarvam_Integration/
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cd VoiceBanking_Results
echo "SARVAM_API_KEY=your_key" > .env
python run_multilingual_eval.py
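
The steps above write SARVAM_API_KEY to a .env file. How the script loads it is not shown here, so the following is a standard-library sketch under that assumption; `read_env_file` and `get_api_key` are hypothetical names, and the real script may use a library such as python-dotenv instead.

```python
import os
from pathlib import Path
from typing import Dict


def read_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path: str = ".env") -> str:
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("SARVAM_API_KEY")
    if not key and Path(path).is_file():
        key = read_env_file(path).get("SARVAM_API_KEY")
    # Reject the placeholder value from the reproduction steps.
    if not key or key == "your_key":
        raise RuntimeError("Set SARVAM_API_KEY to a real key before running the evaluation")
    return key
```

Failing early on a missing or placeholder key gives a clearer error than an authentication failure partway through the evaluation.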

gyanLM10 marked this pull request as ready for review March 5, 2026 10:14
gyanLM10 requested a review from a team March 5, 2026 10:14