Winner – 1st Place at the IITM Azure Learning Series Hackathon
EduBotAI+ is an AI-powered smart learning assistant that transforms how students engage with study material. It supports PDFs, images, handwritten notes, and slides, using Azure services to extract content. Leveraging NLP techniques, it performs text extraction, grammar correction, summarization, translation, and flashcard generation. With added voice-based input and output, EduBotAI+ simplifies complex concepts into interactive, easy-to-understand formats, making learning faster and more personalized.
Users can upload PDFs, images, or audio, and choose from:
- 📝 Text Converter – Extracts text from documents and images using OCR.
- 🧠 Summary Creator – Cleans and summarizes large texts for quick understanding.
- 🎙️ Voice Translator – Converts speech to text, translates it, and generates a subtitled video or audio output.
- Uses: Azure Speech-to-Text
- Supports: Direct microphone input or uploaded .wav/.mp3 files
- Output: Recognized transcript saved as voice_transcript.txt
- Converts summarized content or notes into Q&A flashcards
- Extracts key concepts using:
- Keyword analysis
- Sentence similarity
- Bloom’s taxonomy (optional)
- Output format: Readable list and downloadable .txt or .json
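As a rough illustration of the keyword-analysis route to flashcards, the sketch below turns each sentence into a fill-in-the-blank card by hiding its most frequent keyword. It is a minimal sketch using only the standard library, not the project's actual generator; the stopword list, the function names, and the `flashcards.json` filename are assumptions made for this example.

```python
import json
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not the project's list).
STOPWORDS = {"the", "and", "that", "this", "with", "from", "into", "have", "been"}


def split_sentences(text):
    """Split text into sentences on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def keyword_counts(text):
    """Count words longer than three letters that are not stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if len(w) > 3 and w not in STOPWORDS)


def make_flashcards(text):
    """Build one fill-in-the-blank card per sentence from its top keyword."""
    counts = keyword_counts(text)
    cards = []
    for sentence in split_sentences(text):
        candidates = [w for w in re.findall(r"[a-z]+", sentence.lower()) if w in counts]
        if not candidates:
            continue
        # Pick the keyword that is most frequent across the whole text.
        keyword = max(candidates, key=lambda w: counts[w])
        question = re.sub(rf"\b{keyword}\b", "_____", sentence, flags=re.IGNORECASE)
        cards.append({"question": question, "answer": keyword})
    return cards


notes = (
    "Photosynthesis converts light into chemical energy. "
    "Chlorophyll absorbs light inside the chloroplast. "
    "Photosynthesis releases oxygen as a by-product."
)
cards = make_flashcards(notes)

# Serialize for the downloadable .json export (filename is hypothetical).
with open("flashcards.json", "w", encoding="utf-8") as fh:
    json.dump(cards, fh, indent=2)
```

Sentence similarity or Bloom's taxonomy levels could replace the frequency heuristic without changing the card format.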
- Input: User-typed or extracted text
- Language Options:
- English 🇬🇧
- Tamil 🇮🇳
- Hindi 🇮🇳
- Telugu 🇮🇳
- Output:
- 🎧 Audio (.mp3)
- 🎞️ MP4 video with:
- Subtitles in selected language
- Custom or default background image
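One way the subtitle track for such a video could be prepared is as a SubRip (.srt) file. The sketch below is an assumption-laden example rather than the project's implementation: it presumes SRT as the subtitle format and estimates cue timing from a fixed speaking rate (`words_per_second`), whereas a real pipeline would more likely take timings from the speech synthesis step. All function and parameter names are hypothetical.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def build_srt(text, words_per_cue=6, words_per_second=2.0):
    """Split text into fixed-size cues with estimated start and end times."""
    words = text.split()
    cues = []
    start = 0.0
    for i in range(0, len(words), words_per_cue):
        chunk = words[i:i + words_per_cue]
        end = start + len(chunk) / words_per_second  # assumed speaking rate
        index = len(cues) + 1
        cues.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{' '.join(chunk)}\n"
        )
        start = end
    # A blank line separates consecutive cues in SRT.
    return "\n".join(cues)


srt = build_srt("one two three four five six seven")
print(srt)
```

Because the cue text is passed through unchanged, the same function works for translated text in any of the supported languages.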
- Python
- Flask
- Lightweight Python web framework used to handle routing, templating, and RESTful API services
- Azure Services:
- Azure Speech Service (STT + TTS) – Free Tier
- Azure Computer Vision OCR – Free Tier
- Azure Blob Storage – For storing processed files
- Azure Cosmos DB – For storing metadata and logs
- Azure Translator – For translating text between languages – Free Tier
- HTML/CSS/JavaScript
- Responsive UI for uploads, operations, and results display
- Users speak or upload an audio file
- Azure STT converts speech to text
- Saved as transcript for next steps
- PDFs, scanned images, and handwritten notes are processed
- Azure OCR extracts raw text from uploaded files
- Removes noise (special characters, formatting issues)
- Outputs clean paragraphs
- Summarization done using OpenAI or rule-based logic
- Smart Q&A generated from clean text
- Exportable for revision or quiz practice
- User text is converted to speech (TTS)
- Subtitle video generated with selected language and image
- Stored in Azure Blob Storage for later access/download
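The cleaning and rule-based summarization steps above can be sketched with the standard library alone. This is one possible reading of "rule-based logic" (a word-frequency extractive summarizer), not necessarily what the project uses, and the function names are hypothetical. The character whitelist is an assumption suited to English text; Indic scripts such as Tamil or Hindi would need a different rule, because their combining vowel signs are not matched by `\w`.

```python
import re
from collections import Counter


def clean_text(raw):
    """Remove stray symbols from OCR output while keeping paragraph breaks."""
    paragraphs = re.split(r"\n\s*\n", raw.replace("\r\n", "\n"))
    cleaned = []
    for p in paragraphs:
        p = re.sub(r"-\n(?=\w)", "", p)               # rejoin words hyphenated across lines
        p = re.sub(r"[^\w\s.,;:!?'\"()%-]", " ", p)   # drop symbols outside a basic whitelist
        p = re.sub(r"\s+", " ", p).strip()            # collapse runs of whitespace
        if p:
            cleaned.append(p)
    return "\n\n".join(cleaned)


def summarize(text, max_sentences=2):
    """Keep the sentences whose words are most frequent in the whole text."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    freq = Counter(w for w in re.findall(r"[a-z']+", text.lower()) if len(w) > 3)

    def score(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        total = sum(freq[w] for w in words if len(w) > 3)
        return total / (len(words) or 1)  # normalize so long sentences do not always win

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)


raw = "Hel-\nlo   world ## test.\n\n\n  Second * para."
print(clean_text(raw))
print(summarize("Cats sleep often. Cats hunt mice at night. Dogs bark.", max_sentences=1))
```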
All outputs including:
- Cleaned text
- Summaries
- Flashcards (.txt/.json)
- Transcripts
- MP4 videos with subtitles
...are automatically uploaded and organized in Azure Blob Storage, with metadata tracked in Azure Cosmos DB for easy access and audit.
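To make the storage layout concrete, the sketch below builds a blob name and a matching metadata document for one output file. The path scheme (`user/kind/date/filename`) and every field name except `id` are assumptions invented for this example; Cosmos DB does require each item to carry an `id`, but the project's real schema is not documented here. The actual upload and insert calls are left out.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_output_record(user_id, kind, filename, content, created_at=None):
    """Return a metadata document describing one stored output file."""
    created_at = created_at or datetime.now(timezone.utc)
    digest = hashlib.sha256(content).hexdigest()
    # Hypothetical layout: group blobs by user, output type, and day.
    blob_name = f"{user_id}/{kind}/{created_at:%Y-%m-%d}/{filename}"
    return {
        "id": digest[:16],          # item id derived from the content hash
        "userId": user_id,
        "kind": kind,               # e.g. "summary", "flashcards", "transcript"
        "blobName": blob_name,
        "sizeBytes": len(content),
        "sha256": digest,
        "createdAt": created_at.isoformat(),
    }


record = build_output_record(
    "student42",
    "flashcards",
    "ch1.json",
    b"[]",
    created_at=datetime(2024, 1, 5, tzinfo=timezone.utc),
)
print(json.dumps(record, indent=2))
```

Storing the content hash alongside the blob name gives a simple way to detect duplicate uploads during an audit.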
A student uploads a PDF chapter and selects “Summary Creator” → EduBotAI+ extracts and cleans the text, summarizes it, and generates flashcards. They can also listen to the summary in Hindi or generate a video with Tamil subtitles for revision.
🔎 1. Semantic Search on Summarized/Corrected Content
Enable intelligent search using vector embeddings (e.g., SentenceTransformers):
- 🔍 Search similar summaries based on meaning, not just keywords
- ❓ Ask questions across multiple documents
- 🧠 Smart retrieval: “Find notes where cybersecurity attacks were mentioned”
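The ranking step behind such a search can be sketched as cosine similarity between a query vector and document vectors. In the example below, `embed` is a toy bag-of-words stand-in so the code runs without extra dependencies; it matches on shared words only, so it does not capture meaning. A real version would swap in vectors from an embedding model such as SentenceTransformers while keeping the ranking logic the same. The document names and texts are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query, documents, top_k=2):
    """Rank documents by similarity to the query, highest first."""
    q = embed(query)
    scored = [(cosine(q, embed(text)), name) for name, text in documents.items()]
    scored.sort(reverse=True)
    return [(name, round(score, 3)) for score, name in scored[:top_k]]


summaries = {
    "net-sec": "Notes on cybersecurity attacks such as phishing and malware",
    "biology": "Photosynthesis converts light into chemical energy",
    "os": "Operating systems schedule processes and manage memory",
}
results = search("cybersecurity attacks", summaries)
print(results)
```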
🧠 2. LLM-Powered Summary QA (Question-Answering)
Integrate with LLMs like OpenAI GPT or Anthropic Claude for enhanced interaction:
- 📝 "Summarize this. Now give 3 bullet points and a call to action."
- ❓ Ask clarifying questions on the summary
- ✅ Improve summary coherence and factual accuracy
🧪 3. Multimodal Input Handling
Extend input types beyond PDF/Image/PPTX:
- 🎧 Audio Support (MP3, WAV):
- Transcribe using Whisper / Google STT
- Clean and summarize the transcript
- 📹 YouTube or Video Link Support:
- Frame-by-frame OCR for visual content
- Summarize lecture visuals/texts
- Clone the repository
- Create a virtual environment and install the dependencies
- Train the model and run the Flask application
| Name | GitHub Username |
|---|---|
| Shobhana S | @Shobhanashankar |
| Gautham R | @gautham-here |
| Sri Ranjana C | @sriranjanac |