feat: integrate TTS voice responses and improve response loading UX by Afdaan · Pull Request #44 · Afdaan/Alya-Bot-Telegram

Afdaan · 2026-03-15T08:20:28Z

Summary

This Pull Request introduces two major improvements to the bot:

Voice response support using the Alya-TTS microservice
Enhanced user experience through animated loading placeholders

These changes make the bot feel more responsive and improve interaction quality, especially for longer AI responses and voice generation workflows, because users see feedback immediately while processing continues.

The update also includes internal refactoring to simplify loading animations, improve maintainability, and support asynchronous processing more reliably.

Key Improvements

Voice Response (TTS Integration)

The bot now supports voice responses through an external Alya-TTS microservice.

Key capabilities include:

Text responses can be converted into voice messages
Voice generation is processed through a FastAPI-based microservice
The bot sends requests asynchronously and receives generated voice audio for delivery to Telegram

Additional improvements:

Language-aware voice support
User voice preferences stored in the database
Removal of bundled local RVC models from the main bot repository
Clear separation between the bot service and the TTS service
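
As a rough illustration of keeping voice generation off the handler's critical path, the sketch below queues TTS jobs and lets a background worker process them. It assumes an in-process asyncio.Queue; the job fields and the names tts_worker, fake_synthesize, and fake_deliver are hypothetical, and the real utils/tts_queue.py and the HTTP call to Alya-TTS may work differently.

```python
import asyncio


async def tts_worker(queue, synthesize, deliver):
    """Process queued TTS jobs one at a time until a None sentinel arrives."""
    while True:
        job = await queue.get()
        if job is None:
            queue.task_done()
            break
        try:
            # In the real bot this step would be a request to the TTS service.
            audio = await synthesize(job["text"], job["language"])
            await deliver(job["chat_id"], audio)
        finally:
            queue.task_done()


async def demo():
    delivered = []

    async def fake_synthesize(text, language):
        # Hypothetical stand-in for the microservice call.
        return f"<audio:{language}:{text}>".encode()

    async def fake_deliver(chat_id, audio):
        delivered.append((chat_id, audio))

    queue = asyncio.Queue()
    worker = asyncio.create_task(tts_worker(queue, fake_synthesize, fake_deliver))
    await queue.put({"chat_id": 7, "text": "hello", "language": "en"})
    await queue.put(None)  # sentinel: stop the worker
    await queue.join()
    await worker
    return delivered
```

The handler only enqueues a job and returns, so slow audio work never blocks message handling.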

Animated Loading UX

The bot now sends an initial placeholder message while waiting for AI responses instead of relying only on Telegram’s default typing indicator.

This loading message updates dynamically while the response is being generated, providing users with immediate feedback that processing is in progress.

This behavior applies to multiple interaction types:

Standard conversations
Media and document analysis
Roast commands
Voice generation
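
The placeholder flow described above might look roughly like this. FakeChat, respond_with_placeholder, and the placeholder text are hypothetical stand-ins used for illustration only; they are not the bot's handlers or a real Telegram client.

```python
import asyncio


class FakeChat:
    """Hypothetical stand-in for a Telegram chat; records what the user would see."""

    def __init__(self):
        self.messages = {}
        self._next_id = 1

    async def send(self, text):
        message_id = self._next_id
        self._next_id += 1
        self.messages[message_id] = text
        return message_id

    async def edit(self, message_id, text):
        self.messages[message_id] = text


async def respond_with_placeholder(chat, generate):
    """Send a placeholder right away, then replace it with the generated reply."""
    placeholder_id = await chat.send("Thinking...")
    reply = await generate()  # the slow AI call
    await chat.edit(placeholder_id, reply)
    return placeholder_id


async def demo():
    chat = FakeChat()

    async def slow_generate():
        await asyncio.sleep(0.01)
        return "Here is the answer."

    message_id = await respond_with_placeholder(chat, slow_generate)
    return chat.messages[message_id]
```

Editing the placeholder in place means the user ends up with a single message rather than a loading message followed by a separate reply.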

Technical Improvements

Centralized Loading Animation Helper

All loading animation logic has been consolidated into a shared helper function:

start_loading_animation()

Location:

utils/telegram_helpers.py

This removes duplicated animation loops across handlers and ensures consistent behavior across the application.
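
A minimal sketch of what such a shared helper could look like follows. The signature (an edit callback, a frame tuple, and an interval) and the frame texts are assumptions made for illustration; the actual start_loading_animation() in utils/telegram_helpers.py may take different arguments.

```python
import asyncio
import itertools

FRAMES = ("Thinking.", "Thinking..", "Thinking...")  # assumed frame texts


def start_loading_animation(edit, frames=FRAMES, interval=1.0):
    """Start a background task that cycles `frames` through `edit` until cancelled.

    The parameters are assumptions; the real helper's signature may differ.
    """

    async def animate():
        for frame in itertools.cycle(frames):
            await edit(frame)
            await asyncio.sleep(interval)

    return asyncio.create_task(animate())


async def demo():
    seen = []

    async def edit(text):
        seen.append(text)

    task = start_loading_animation(edit, interval=0.01)
    await asyncio.sleep(0.05)  # pretend the AI response is being generated
    task.cancel()  # stop the animation once the response is ready
    try:
        await task
    except asyncio.CancelledError:
        pass
    return seen
```

Each handler would then start the animation, await its own work, and cancel the returned task, instead of carrying its own loop.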

Async Task Lifecycle Protection

A global set holds strong references to active animation tasks. Because the asyncio event loop keeps only weak references to tasks, a background animation that nothing else references could otherwise be garbage-collected before it finishes.

_active_animations
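
The pattern can be sketched as follows. The set name comes from this PR; the track() helper is a hypothetical name introduced here, and the real code may register tasks inline instead.

```python
import asyncio

_active_animations = set()  # strong references to running animation tasks


def track(task):
    """Hold a reference to `task` until it finishes, then drop it."""
    _active_animations.add(task)
    task.add_done_callback(_active_animations.discard)
    return task


async def demo():
    async def animate():
        await asyncio.sleep(0.01)

    task = track(asyncio.create_task(animate()))
    tracked_while_running = task in _active_animations
    await task
    await asyncio.sleep(0)  # give the done callback a chance to run
    return tracked_while_running, len(_active_animations)
```

The done callback removes each task once it completes, so the set does not grow without bound.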

Telegram Rate Limit Handling

Animation updates now handle Telegram flood limits: when the API rate limit is reached, the bot waits for the retry_after interval that Telegram returns before sending the next update.
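
One way to express this handling is sketched below. RetryAfterError and edit_with_backoff are hypothetical names; the sketch assumes the Telegram client raises an exception that carries the wait time in seconds, which is how the bot's actual library may or may not surface it.

```python
import asyncio


class RetryAfterError(Exception):
    """Hypothetical flood-limit error carrying the wait time in seconds."""

    def __init__(self, retry_after):
        super().__init__(f"flood limit, retry in {retry_after}s")
        self.retry_after = retry_after


async def edit_with_backoff(edit, text, max_attempts=3):
    """Call `edit(text)`, sleeping for the requested interval on a flood limit."""
    for _ in range(max_attempts):
        try:
            await edit(text)
            return True
        except RetryAfterError as exc:
            await asyncio.sleep(exc.retry_after)
    return False  # give up; a skipped animation frame is harmless


async def demo():
    calls = []

    async def flaky_edit(text):
        calls.append(text)
        if len(calls) == 1:
            raise RetryAfterError(0.01)  # first attempt hits the limit

    ok = await edit_with_backoff(flaky_edit, "Thinking...")
    return ok, len(calls)
```

Since animation frames are cosmetic, giving up after a few attempts is a reasonable trade-off here; the final response edit would likely warrant a more persistent retry.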

TTS Microservice Payload Support

Voice generation requests now include:

loading_message_id

This allows the Alya-TTS microservice to delete the loading placeholder before sending the final voice message to the user.
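
For illustration, a request body carrying this field might be built as shown below. Only loading_message_id is named in this PR; the other fields and the names TTSRequest and build_payload are assumptions, not the service's actual schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TTSRequest:
    # Only loading_message_id is named in this PR; the other fields are assumed.
    chat_id: int
    text: str
    language: str
    loading_message_id: Optional[int] = None


def build_payload(request):
    """Serialize the request as a JSON body for the TTS service."""
    return json.dumps(asdict(request))
```

Making the field optional lets requests without a placeholder (for example, from older callers) remain valid in this sketch.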

Refactoring

Several components were refactored to improve modularity and maintainability:

Conversation handlers
Voice handlers
Context management
Persona prompt formatting
Loading animation utilities

Files Affected

Major updates were made to the following components:

handlers/conversation.py
handlers/voice.py
utils/analyze.py
utils/roast.py
utils/telegram_helpers.py
utils/tts_queue.py

Deployment Notes

The Alya-TTS microservice must be restarted after deployment to apply the updated request payload schema.

Introduce voice-language and RVC-based voice features across the app. Adds a new DB migration (migrate_add_voice_language.py) and adds voice_language column to the User model (database/models.py); also updates the existing migration for voice_enabled. DatabaseManager gained getters/setters for user voice language and refactors for user creation/message saving. Core changes: settings reorganized, NLPEngine made resilient when transformers/torch are unavailable, Application now initializes a VoiceProcessor (when enabled) and registers VoiceHandler. Commands and responses updated to expose /voicelang and interactive language selection (callbacks), and new libs/rvc_python plus utility helpers were added. Run the new migration script from project root to update the database before enabling voice features.
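
The kind of change such a migration makes can be sketched as below. This assumes a SQLite database and a TEXT column; the actual migrate_add_voice_language.py, database engine, and column type may differ, and add_voice_language_column is a hypothetical name.

```python
import sqlite3


def add_voice_language_column(conn):
    """Add users.voice_language if it is missing; safe to run more than once."""
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(users)")}
    if "voice_language" in columns:
        return False
    conn.execute("ALTER TABLE users ADD COLUMN voice_language TEXT")
    conn.commit()
    return True
```

Checking for the column first makes the script idempotent, so rerunning it after a partial deployment does not fail.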

Move voice generation to an external TTS microservice and remove the bundled rvc_python library. Key changes:

- .env.example: add TTS_SERVICE_URL, reorganize voice/NLP/DB settings and defaults.
- README: document new Alya-TTS microservice-based voice setup and update voice feature text.
- Removed libs/rvc_python package and related API/CLI/config files to stop bundling RVC internals.
- Added utils/tts_queue.py and integrated dispatching to queue TTS jobs rather than inlining heavy audio work.
- handlers/voice.py: refactored to translate and dispatch TTS via the queue, use DEFAULT_LANGUAGE, and avoid direct file-based TTS handling.
- handlers/conversation.py & core/bot.py: removed inline voice processing from conversation flow and updated handler wiring.
- database/database_manager.py: get_user_voice_language now falls back to user's language_code when voice_language is unset.
- handlers/admin.py: removed deployment manager/status command and cleaned up registration/logging.
- handlers/response/lang.py & handlers/commands.py: trimmed supported languages and updated voicelang labels.
- core/nlp.py: clarified torch <2.6 warning message.

Overall, this refactor isolates audio workloads to a microservice, simplifies the bot code, and removes the bundled RVC implementation in favor of a separate TTS service and queued processing.
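
The language fallback mentioned for get_user_voice_language can be sketched as follows. The User class is a minimal stand-in for the real model, the value of DEFAULT_LANGUAGE is assumed, and the final fallback to DEFAULT_LANGUAGE is an assumption not stated in the commit message.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_LANGUAGE = "en"  # assumed value; the real setting may differ


@dataclass
class User:
    # Minimal stand-in for the real User model in database/models.py.
    language_code: Optional[str] = None
    voice_language: Optional[str] = None


def get_user_voice_language(user):
    """Prefer the explicit voice language, then the user's language_code."""
    # The last fallback to DEFAULT_LANGUAGE is an assumption for this sketch.
    return user.voice_language or user.language_code or DEFAULT_LANGUAGE
```

With this ordering, a user who never ran /voicelang still gets voice output in their general language preference.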

Afdaan and others added 27 commits December 10, 2025 03:06

Merge pull request #38 from Afdaan/fix/reply-quote-formatting

ae3d70a

fix: resolve formatting issues and NLP sentiment detection crash

Merge pull request #40 from Afdaan/development

e7ee411

fix: resolve language preference not applied in conversation responses

Merge pull request #42 from Afdaan/development

f14f0e2

Fix ConversationHandler & Expand Emotion Triggers

feat: Enhance voice model setup and processing

6e6d91d

feat: Refactor requirements.txt for improved organization and clarity

1949d22

fix: lower torchaudio minimum to 2.6.0

685d334

fix: downgrade faiss-cpu to 1.7.3

469da08

fix: Pin pip <24.1 and uninstall omegaconf

a0f7903

fix: precompute persona traits & relationship text

5567152

fix: add __init__.py for rvc_python packages

d852989

fix: rvc module

25700a3

fix: force reload rvc modules to bypass site-packages

2b8930c

fix: rvc module reload and absolute paths

40c1c93

fix: unignore internal lib folder and force add rvc files

448b0a5

fix: Refactor DB imports, add RVC fix and DB migration

bf7ca4f

fix: Extend users table with mood and voice columns

6522f79

feat: add ChatActionSender and voice/mood DB support

c0923aa

fix: Format context & use ContextManager in voice handler

c4e582a

feat: switch Japanese lang key to 'jp' and prefer RVC

4d91f7e

fix: Remove RVC and local voice model settings

7ad7713

Remove test_vp_init.py

89afc0c

fix: Language-aware persona prompts and handler refactor

24d5fe0

feat: Add animated loading UI and edit responses

e858c7d

feat: centralize loading animations in helpers

102aac8

Merge pull request #43 from Afdaan/feat/ui

84b752e

feat: add animated loading placeholder for AI response generation

Afdaan merged commit 2282272 into development Mar 15, 2026
1 check passed

Afdaan merged 27 commits into development from feat/voice
