feat: Automated Multilingual Translation Layer for Emergency Reports #203

Open
Gopisokk wants to merge 1 commit into fireform-core:main from Gopisokk:feat/multilingual-translation

Gopisokk commented Mar 8, 2026

Resolves #107

📝 Description

This PR introduces a dedicated translation module to FireForm to support global responders submitting voice notes in their native languages.

The language of incoming text is detected automatically (e.g., French, Arabic, Spanish), and the text is translated into standardized English before being processed by the LLM. This keeps the Master Schema consistent across all international missions and ensures that the final generated PDFs are always in English.

🛠️ Changes Made

  • New Translator Module: Added src/translator.py which uses langdetect for automatic language recognition and deep-translator for translating to English.
  • Pipeline Integration: Intercepted the input in src/file_manipulator.py to route text through the Translator before building the LLM.
  • LLM Prompt Update: Adjusted the system prompt in src/llm.py to notify the model that the input has been pre-translated.
  • Data Persistence: Updated api/db/models.py and api/schemas/forms.py to store and return the detected_language.
  • Dependencies: Added deep-translator and langdetect to requirements.txt.
  • Documentation: Added the "🌍 Multilingual Support" section to README.md.
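
The detect-then-translate flow described in the list above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the names `translate_to_english`, `TranslationResult`, `detect_fn`, and `translate_fn` are hypothetical. The only real library calls are `langdetect.detect` and `deep_translator.GoogleTranslator`, and choosing `GoogleTranslator` as the backend is itself an assumption, since the description only names the deep-translator package. Both are imported lazily so the function can also be exercised with stubs.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TranslationResult:
    """Hypothetical container; the PR's real return type is not shown."""
    text: str               # text handed to the LLM (English when translation succeeds)
    detected_language: str  # language code reported by the detector, e.g. "fr"


def _default_detect(text: str) -> str:
    # Real langdetect call: returns a language code such as "fr" or "en".
    from langdetect import detect
    return detect(text)


def _default_translate(text: str) -> str:
    # Real deep-translator call; the Google backend is an assumed choice.
    from deep_translator import GoogleTranslator
    return GoogleTranslator(source="auto", target="en").translate(text)


def translate_to_english(
    text: str,
    detect_fn: Optional[Callable[[str], str]] = None,
    translate_fn: Optional[Callable[[str], str]] = None,
) -> TranslationResult:
    """Detect the language of `text` and translate it to English if needed."""
    detect_fn = detect_fn or _default_detect
    translate_fn = translate_fn or _default_translate

    language = detect_fn(text)
    if language == "en":
        # English input skips the translation call entirely.
        return TranslationResult(text=text, detected_language=language)
    return TranslationResult(text=translate_fn(text), detected_language=language)
```

Passing the detector and translator in as parameters is a design choice of this sketch; it lets tests substitute stubs without network access.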

🛡️ Fallback & Error Handling

To ensure the pipeline is never blocked:

  • If translation or language detection fails (e.g., due to network issues), the app logs the failure and gracefully passes the original text to the LLM.
  • English text is passed through untouched, which avoids the latency of a translation call.
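
The fallback behavior above might be structured along these lines. This is a sketch under stated assumptions: `translate_with_fallback`, `detect_fn`, and `translate_fn` are hypothetical names, and returning `None` as the language when detection fails is an assumed convention rather than something the PR specifies.

```python
import logging
from typing import Callable, Optional, Tuple

logger = logging.getLogger("translator_sketch")  # hypothetical logger name


def translate_with_fallback(
    text: str,
    detect_fn: Callable[[str], str],
    translate_fn: Callable[[str], str],
) -> Tuple[str, Optional[str]]:
    """Return (text_for_llm, detected_language) without ever raising."""
    try:
        language = detect_fn(text)
    except Exception:
        # Detection failed: log it and hand the original text to the LLM.
        logger.exception("Language detection failed; using original text")
        return text, None  # None for "unknown" is an assumption of this sketch

    if language == "en":
        # English is passed through untouched; no translation call is made.
        return text, language

    try:
        return translate_fn(text), language
    except Exception:
        # Translation failed (e.g., network issue): fall back to the original text.
        logger.exception("Translation failed; using original text")
        return text, language
```

Catching `Exception` broadly is deliberate here, since the stated goal is that the pipeline is never blocked; a real implementation may prefer to catch narrower exception types.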

🧪 Testing

  • Added comprehensive unit tests in tests/test_translation.py covering multiple languages (French, Arabic, Spanish, English) and mocked failure states.
  • Added integration tests to tests/test_forms.py to verify the new endpoints save and return the detected_language correctly.
  • All tests pass locally.
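
For readers unfamiliar with mocked failure states, a test of that kind could look roughly like this. It is an illustrative sketch only, not the contents of tests/test_translation.py: `to_english` is a hypothetical stand-in for the translator, and the sample sentences are made up.

```python
import unittest
from unittest.mock import Mock


def to_english(text, detect_fn, translate_fn):
    """Hypothetical stand-in for the translator under test."""
    try:
        if detect_fn(text) == "en":
            return text
        return translate_fn(text)
    except Exception:
        # Any failure falls back to the original text.
        return text


class TranslationFallbackTest(unittest.TestCase):
    def test_translates_non_english(self):
        translate = Mock(return_value="Fire on the second floor")
        result = to_english("Incendio en el segundo piso", Mock(return_value="es"), translate)
        self.assertEqual(result, "Fire on the second floor")
        translate.assert_called_once_with("Incendio en el segundo piso")

    def test_english_skips_translation(self):
        translate = Mock()
        result = to_english("Fire on the second floor", Mock(return_value="en"), translate)
        self.assertEqual(result, "Fire on the second floor")
        translate.assert_not_called()

    def test_network_failure_returns_original(self):
        # side_effect makes the mock raise, simulating an unreachable translation service.
        translate = Mock(side_effect=ConnectionError("offline"))
        result = to_english("Il y a un incendie", Mock(return_value="fr"), translate)
        self.assertEqual(result, "Il y a un incendie")
```

Saved to a file, these tests can be run with `python -m unittest`.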

Gopisokk commented Mar 8, 2026

@marcvergees I've integrated langdetect and deep-translator into the data extraction pipeline (src/file_manipulator.py), so the language of any incoming text is detected automatically and the text is translated to English before it reaches the LLM.

Please review the PR and its changes.

Development

Successfully merging this pull request may close these issues.

[FEAT]: Automated Multilingual Translation Layer for Emergency Reports
