feat: integrate TTS voice responses and improve response loading UX #44

Merged
Afdaan merged 27 commits into development from feat/voice on Mar 15, 2026

Afdaan (Owner) commented Mar 15, 2026

Summary

This Pull Request introduces two major improvements to the bot:

  1. Voice response support using the Alya-TTS microservice
  2. Enhanced user experience through animated loading placeholders

Together, these changes make the bot feel more responsive, especially during longer AI responses and voice generation, because users see feedback while processing is still under way.

The update also includes internal refactoring to simplify loading animations, improve maintainability, and support asynchronous processing more reliably.

Key Improvements

Voice Response (TTS Integration)

The bot now supports voice responses through an external Alya-TTS microservice.

Key capabilities include:

  • Text responses can be converted into voice messages
  • Voice generation is processed through a FastAPI-based microservice
  • The bot sends requests asynchronously and receives generated voice audio for delivery to Telegram
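
The asynchronous dispatch described above could be sketched roughly as a queue with a background worker. This is only an illustration, not the code in utils/tts_queue.py: the job shape, `tts_worker`, and the `send` callable (a stand-in for the HTTP request to the TTS service) are all assumptions.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical job shape: a plain dict with the target chat and the text to speak.
TTSJob = dict


async def tts_worker(
    queue: "asyncio.Queue[TTSJob]",
    send: Callable[[TTSJob], Awaitable[None]],
) -> None:
    """Pull jobs off the queue and hand each one to `send`."""
    while True:
        job = await queue.get()
        try:
            await send(job)
        except Exception as exc:
            # One failed job should not stop the worker.
            print(f"TTS job failed: {exc!r}")
        finally:
            queue.task_done()


async def demo() -> list:
    sent = []

    async def fake_send(job: TTSJob) -> None:
        # Stand-in for the real request to the TTS microservice.
        await asyncio.sleep(0)
        sent.append(job["text"])

    queue: "asyncio.Queue[TTSJob]" = asyncio.Queue()
    worker = asyncio.create_task(tts_worker(queue, fake_send))
    for text in ("hello", "world"):
        queue.put_nowait({"chat_id": 1, "text": text})

    await queue.join()  # wait until every queued job has been processed
    worker.cancel()
    try:
        await worker
    except asyncio.CancelledError:
        pass
    return sent
```

Handlers only enqueue a job and return, so slow audio generation does not block the conversation flow.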

Additional improvements:

  • Language-aware voice support
  • User voice preferences stored in the database
  • Removal of bundled local RVC models from the main bot repository
  • Clear separation between the bot service and the TTS service

Animated Loading UX

The bot now sends an initial placeholder message while waiting for AI responses instead of relying only on Telegram’s default typing indicator.

This loading message updates dynamically while the response is being generated, providing users with immediate feedback that processing is in progress.

This behavior applies to multiple interaction types:

  • Standard conversations
  • Media and document analysis
  • Roast commands
  • Voice generation

Technical Improvements

Centralized Loading Animation Helper

All loading animation logic has been consolidated into a shared helper function:

start_loading_animation()

Location:

utils/telegram_helpers.py

This removes duplicated animation loops across handlers and ensures consistent behavior across the application.
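
The PR does not show the signature of start_loading_animation(), so the following is only a sketch of what a shared animation loop might look like. The name `animate_placeholder`, the frame strings, and the `edit` callable (a stand-in for editing the Telegram placeholder message) are assumptions.

```python
import asyncio
import itertools
from typing import Awaitable, Callable, Sequence


async def animate_placeholder(
    edit: Callable[[str], Awaitable[None]],
    frames: Sequence[str] = ("Thinking.", "Thinking..", "Thinking..."),
    interval: float = 1.0,
) -> None:
    """Cycle through `frames`, calling `edit` with each one until cancelled."""
    for frame in itertools.cycle(frames):
        await edit(frame)
        await asyncio.sleep(interval)


async def demo() -> list:
    seen = []

    async def fake_edit(text: str) -> None:
        # Stand-in for editing the placeholder message in the chat.
        seen.append(text)

    task = asyncio.create_task(animate_placeholder(fake_edit, interval=0.01))
    await asyncio.sleep(0.05)  # pretend the AI response takes this long
    task.cancel()              # the response is ready: stop the animation
    try:
        await task
    except asyncio.CancelledError:
        pass
    return seen
```

Passing the edit operation in as a callable lets every handler reuse the same loop regardless of which message it animates.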

Async Task Lifecycle Protection

A global set is used to maintain strong references to active animation tasks. The asyncio event loop keeps only weak references to tasks, so a background animation with no other reference could be garbage-collected before it finishes.

_active_animations
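
A minimal sketch of this pattern follows. Only the name `_active_animations` comes from the PR; the `track` helper and the demo are hypothetical.

```python
import asyncio

# Strong references to running animation tasks (name taken from the PR).
_active_animations: set = set()


def track(task: asyncio.Task) -> asyncio.Task:
    """Hold a reference to `task` until it completes (hypothetical helper)."""
    _active_animations.add(task)
    # Drop the reference once the task is done so the set does not grow forever.
    task.add_done_callback(_active_animations.discard)
    return task


async def demo() -> tuple:
    async def tick() -> None:
        await asyncio.sleep(0.01)

    track(asyncio.create_task(tick()))
    during = len(_active_animations)  # task is still registered
    await asyncio.sleep(0.05)         # let the task finish and the callback run
    after = len(_active_animations)   # reference has been released
    return during, after
```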

Telegram Rate Limit Handling

Animation updates now properly handle Telegram flood limits: when the API rate limit is reached (HTTP 429), the bot waits for the interval Telegram returns in retry_after before trying again.
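
One way to express this handling is sketched below. `FloodLimit` is a hypothetical stand-in for whatever exception the Telegram client library raises on a flood limit (python-telegram-bot, for example, raises `telegram.error.RetryAfter` with a `retry_after` attribute), and `edit_with_backoff` is an illustrative name, not a function from this PR.

```python
import asyncio
from typing import Awaitable, Callable


class FloodLimit(Exception):
    """Hypothetical stand-in for a client library's flood-limit error."""

    def __init__(self, retry_after: float) -> None:
        super().__init__(f"retry after {retry_after}s")
        self.retry_after = retry_after


async def edit_with_backoff(
    edit: Callable[[str], Awaitable[None]],
    text: str,
    max_attempts: int = 3,
) -> bool:
    """Try to apply an edit, sleeping for the server-provided delay on flood limits."""
    for _ in range(max_attempts):
        try:
            await edit(text)
            return True
        except FloodLimit as exc:
            # Wait exactly as long as the server asked before retrying.
            await asyncio.sleep(exc.retry_after)
    return False


async def demo() -> tuple:
    calls = []

    async def flaky_edit(text: str) -> None:
        calls.append(text)
        if len(calls) == 1:
            raise FloodLimit(retry_after=0.01)  # first attempt is rate-limited

    ok = await edit_with_backoff(flaky_edit, "Thinking...")
    return ok, len(calls)
```

For a cosmetic animation it may be equally reasonable to skip the frame instead of retrying; the sketch retries only to show where the delay is applied.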

TTS Microservice Payload Support

Voice generation requests now include:

loading_message_id

This allows the Alya-TTS microservice to delete the loading placeholder before sending the final voice message to the user.
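
As an illustration, a request payload carrying this field might be assembled as follows. Only `loading_message_id` is named in the PR; the other keys and the `build_tts_payload` helper are assumptions, and the real schema is defined by the Alya-TTS service.

```python
from typing import Optional


def build_tts_payload(
    chat_id: int,
    text: str,
    language: str,
    loading_message_id: Optional[int] = None,
) -> dict:
    """Build a TTS request body (hypothetical schema apart from loading_message_id)."""
    payload = {"chat_id": chat_id, "text": text, "language": language}
    if loading_message_id is not None:
        # Lets the TTS service delete the placeholder before sending the voice message.
        payload["loading_message_id"] = loading_message_id
    return payload
```

Leaving the field out when no placeholder exists keeps the payload valid for callers that never showed a loading message.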

Refactoring

Several components were refactored to improve modularity and maintainability:

  • Conversation handlers
  • Voice handlers
  • Context management
  • Persona prompt formatting
  • Loading animation utilities

Files Affected

Major updates were made to the following components:

  • handlers/conversation.py
  • handlers/voice.py
  • utils/analyze.py
  • utils/roast.py
  • utils/telegram_helpers.py
  • utils/tts_queue.py

Deployment Notes

The Alya-TTS microservice must be restarted after deployment to apply the updated request payload schema.

Afdaan and others added 27 commits December 10, 2025 03:06
fix: resolve formatting issues and NLP sentiment detection crash
fix: resolve language preference not applied in conversation responses
Fix ConversationHandler & Expand Emotion Triggers
Introduce voice-language and RVC-based voice features across the app. Key changes:

- Database: add a new DB migration (migrate_add_voice_language.py) and a voice_language column on the User model (database/models.py); update the existing migration for voice_enabled.
- DatabaseManager: add getters/setters for user voice language and refactor user creation/message saving.
- Core: reorganize settings, make NLPEngine resilient when transformers/torch are unavailable, and have Application initialize a VoiceProcessor (when enabled) and register VoiceHandler.
- Commands and responses: expose /voicelang and interactive language selection (callbacks).
- Add libs/rvc_python plus utility helpers.

Run the new migration script from the project root to update the database before enabling voice features.
Move voice generation to an external TTS microservice and remove the bundled rvc_python library. Key changes:

- .env.example: add TTS_SERVICE_URL, reorganize voice/NLP/DB settings and defaults.
- README: document new Alya-TTS microservice-based voice setup and update voice feature text.
- Removed libs/rvc_python package and related API/CLI/config files to stop bundling RVC internals.
- Added utils/tts_queue.py and integrated dispatching to queue TTS jobs rather than inlining heavy audio work.
- handlers/voice.py: refactored to translate and dispatch TTS via the queue, use DEFAULT_LANGUAGE, and avoid direct file-based TTS handling.
- handlers/conversation.py & core/bot.py: removed inline voice processing from conversation flow and updated handler wiring.
- database/database_manager.py: get_user_voice_language now falls back to user's language_code when voice_language is unset.
- handlers/admin.py: removed deployment manager/status command and cleaned up registration/logging.
- handlers/response/lang.py & handlers/commands.py: trimmed supported languages and updated voicelang labels.
- core/nlp.py: clarified torch <2.6 warning message.

Overall this refactor isolates audio workloads to a microservice, simplifies the bot code, and removes the bundled RVC implementation in favor of a separate TTS service and queued processing.
feat: add animated loading placeholder for AI response generation
@Afdaan Afdaan merged commit 2282272 into development Mar 15, 2026
1 check passed