EvaLectureAR is an advanced AI-powered educational platform designed to assess and improve Arabic reading skills for primary school students (grades 1-6) in the Moroccan educational system. The system integrates cutting-edge technologies including Automatic Speech Recognition (ASR), Natural Language Processing (NLP), Retrieval-Augmented Generation (RAG), and real-time audio processing to provide comprehensive reading assessment and personalized feedback.
The platform addresses critical challenges in Arabic language education by providing automated, objective, and detailed assessment of pronunciation, fluency, accuracy, and comprehension. It leverages cloud-based AI services, machine learning models, and advanced text processing techniques to deliver real-time feedback and generate adaptive learning recommendations.
Traditional Arabic reading assessment methods in primary education face several limitations:
- Subjectivity: Manual assessment by teachers introduces inconsistency
- Scalability: Individual assessment is time-consuming and resource-intensive
- Real-time Feedback: Delayed feedback reduces learning effectiveness
- Detailed Analysis: Limited granular analysis of specific pronunciation errors
- Adaptive Learning: Lack of personalized learning paths based on individual progress
The primary objectives of EvaLectureAR are:
- Automated Assessment: Provide objective, consistent evaluation of Arabic reading skills
- Real-time Processing: Deliver immediate feedback during reading sessions
- Comprehensive Analysis: Evaluate pronunciation, fluency, accuracy, and comprehension
- Personalized Learning: Generate adaptive recommendations based on individual performance
- Knowledge Integration: Utilize RAG technology to enhance educational content generation
- Progress Tracking: Monitor student development over time with detailed analytics
┌─────────────────────────────────────────────────────────────────────────────────┐
│ EvaLectureAR System Architecture │
└─────────────────────────────────────────────────────────────────────────────────┘
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Web Frontend │ │ Mobile Client │ │ Admin Panel │
│ (HTML/JS) │ │ (Optional) │ │ (Teachers) │
└─────────┬───────┘ └─────────┬───────┘ └─────────┬───────┘
│ │ │
└──────────────────────┼──────────────────────┘
│
┌──────────────────────────────────────────────────────────────────────────────────┐
│ API Gateway Layer │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Flask Routes │ │ WebSocket API │ │ REST API │ │
│ │ (HTTP/HTTPS) │ │ (Real-time) │ │ (CRUD Ops) │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
└──────────────────────────┬───────────────────────────────────────────────────────┘
│
┌──────────────────────────────────────────────────────────────────────────────────┐
│ Core Service Layer │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │Speech Recognition│ │ AI Assessment │ │Feedback Generator│ │
│ │ - Wav2Vec2 │ │ - Gemini AI │ │ - Azure TTS │ │
│ │ - Azure STT │ │ - NLP Engine │ │ - Audio Synth │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
│ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │Learning Mgmt Sys│ │ RAG Pipeline │ │Realtime Processor│ │
│ │- Progress Track │ │ - Vector Store │ │- WebSocket Mgmt │ │
│ │- Recommendations│ │ - Doc Embeddings│ │- Stream Process │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
└──────────────────────────┬───────────────────────────────────────────────────────┘
│
┌──────────────────────────────────────────────────────────────────────────────────┐
│ Data Layer │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ PostgreSQL DB │ │ Redis Cache │ │ File Storage │ │
│ │ - User Data │ │ - Sessions │ │ - Audio Files │ │
│ │ - Assessments │ │ - Temp Data │ │ - Documents │ │
│ │ - Progress │ │ - Real-time │ │ - Models │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
│ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Vector Store │ │ ML Models │ │
│ │ - FAISS Index │ │ - Embeddings │ │
│ │ - Embeddings │ │ - Checkpoints │ │
│ └─────────────────┘ └─────────────────┘ │
└──────────────────────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────────────────┐
│ External Services Integration │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Google AI API │ │Azure Cognitive │ │ Hugging Face │ │
│ │ - Gemini LLM │ │ Services │ │ - Models │ │
│ │ - Embeddings │ │ - Speech STT │ │ - Transformers│ │
│ │ │ │ - TTS │ │ │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
└──────────────────────────────────────────────────────────────────────────────────┘
- Web Interface: Responsive HTML5/CSS3/JavaScript interface with RTL support for Arabic
- Real-time Communication: WebSocket implementation for live audio streaming
- Audio Recording: Web Audio API integration for high-quality audio capture
- User Interface: Modern, child-friendly design with visual feedback mechanisms
- Flask Framework: RESTful API endpoints for all system operations
- Socket.IO: Real-time bidirectional communication for live assessment
- CORS Support: Cross-origin resource sharing for web client integration
- Rate Limiting: Request throttling and abuse prevention mechanisms
- Speech Recognition Service: Multi-model ASR with Wav2Vec2 and Azure Speech Services
- AI Assessment Engine: Comprehensive evaluation using Google Gemini AI
- Feedback Generation Service: Automated feedback creation with Azure TTS
- Learning Management System: Progress tracking and adaptive learning algorithms
- RAG Pipeline: Knowledge retrieval and content generation system
┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Audio │───▶│ Speech │───▶│ Text │───▶│ Assessment │
│ Recording │ │Recognition │ │Transcription│ │ Analysis │
└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
│
┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ Feedback │◀───│ Report │◀───│ Score │◀──────────┘
│ Generation │ │Generation │ │Calculation │
└─────────────┘ └─────────────┘ └─────────────┘
│
▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Audio │ │ Database │ │ Progress │
│ Synthesis │ │ Storage │ │ Tracking │
└─────────────┘ └─────────────┘ └─────────────┘
The RAG system enhances the educational platform by providing intelligent content generation and question-answering capabilities based on educational materials.
┌──────────────────────────────────────────────────────────────────────┐
│ RAG Pipeline Architecture │
└──────────────────────────────────────────────────────────────────────┘
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ PDF Documents │───▶│ Document │───▶│ Text │
│ (Educational │ │ Processing │ │ Extraction │
│ Materials) │ │ (PyMuPDF) │ │ & Cleaning │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│
┌─────────────────┐ ┌─────────────────┐ │
│ Chunking │◀───│ Text │◀──────────┘
│ Strategy │ │ Preprocessing │
│ (Semantic Split)│ │ (Arabic NLP) │
└─────────────────┘ └─────────────────┘
│
▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Embedding │───▶│ Vector │───▶│ FAISS │
│ Generation │ │ Representation │ │ Index │
│ (Sentence-BERT) │ │ (768-dim) │ │ (Similarity) │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│
┌─────────────────┐ │
│ User Query │ │
│ (Arabic Text) │ │
└─────────────────┘ │
│ │
▼ ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Response │◀───│ LLM │◀───│ Retrieval │
│ Generation │ │ Generation │ │ Process │
│ (Gemini AI) │ │ (Context + │ │ (Top-K Docs) │
└─────────────────┘ │ Query) │ └─────────────────┘
└─────────────────┘
- PDF Extraction: Utilizes PyMuPDF for accurate Arabic text extraction
- Text Cleaning: Removes formatting artifacts and normalizes Arabic text
- Chunking Strategy: Semantic chunking preserving context and meaning
- Metadata Enrichment: Adds document source, page numbers, and content type
- Model: Sentence-Transformers with Arabic language support
- Dimension: 768-dimensional dense vectors for semantic representation
- Storage: Efficient embedding storage with metadata mapping
- Indexing: FAISS-based similarity search with configurable distance metrics
- Backend: FAISS (Facebook AI Similarity Search)
- Index Types: Support for flat, IVF, and HNSW indices
- Scalability: Designed for growing document collections
- Persistence: Serialized storage for quick system initialization
- Similarity Search: Cosine similarity for semantic matching
- Ranking: Relevance scoring with confidence thresholds
- Context Window: Configurable context size for LLM input
- Filtering: Grade-level and difficulty-based content filtering
- LLM: Google Gemini 1.5 Flash for fast, accurate responses
- Prompt Engineering: Optimized prompts for educational content
- Context Integration: Seamless integration of retrieved context
- Output Formatting: Structured responses with source attribution
- Educational Content Generation: Create grade-appropriate reading materials
- Question Answering: Answer student queries about lesson content
- Assessment Question Creation: Generate comprehension questions
- Adaptive Learning: Personalize content based on student performance
- Teacher Resources: Provide additional explanations and materials
The system implements a dual-model approach for robust Arabic speech recognition:
Primary Model: Wav2Vec2
- Model: Facebook's wav2vec2-large-xlsr-53
- Language Support: Multilingual model with Arabic fine-tuning
- Processing: Local inference for privacy and speed
- Accuracy: Optimized for children's voice patterns
Secondary Model: Azure Speech Services
- Service: Microsoft Azure Cognitive Services
- Language: Arabic (Saudi Arabia) - ar-SA
- Features: Real-time streaming, noise reduction
- Fallback: Primary backup when Wav2Vec2 unavailable
# Audio Preprocessing Steps
1. Format Conversion (to WAV 16kHz)
2. Noise Reduction (Spectral Subtraction)
3. Normalization (Dynamic Range Compression)
4. Silence Removal (Adaptive Thresholding)
5. Feature Extraction (MFCC, Spectral Features)The system evaluates four key aspects of reading:
Pronunciation Assessment
- Phonetic accuracy analysis
- Diacritic mark detection
- Common mispronunciation patterns
- Voice quality evaluation
Fluency Assessment
- Reading speed calculation (WPM)
- Pause pattern analysis
- Rhythm and intonation evaluation
- Stuttering detection
Accuracy Assessment
- Word-level accuracy scoring
- Error type classification
- Omission and substitution detection
- Context-aware corrections
Comprehension Assessment
- Content understanding evaluation
- Question-answer generation
- Semantic similarity analysis
- Knowledge retention testing
- Primary AI: Google Gemini 1.5 Flash
- Capabilities: Natural language understanding, multilingual support
- Tasks: Error analysis, feedback generation, comprehension assessment
- Configuration: Temperature: 0.3, Top-p: 0.8, Max tokens: 1024
-- Core Tables Structure
CREATE TABLE students (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(120) UNIQUE NOT NULL,
grade_level INTEGER DEFAULT 1,
difficulty_level VARCHAR(20) DEFAULT 'easy',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE texts (
id SERIAL PRIMARY KEY,
title VARCHAR(200) NOT NULL,
content TEXT NOT NULL,
content_with_diacritics TEXT,
grade_level INTEGER DEFAULT 1,
difficulty_level VARCHAR(20) DEFAULT 'easy',
category VARCHAR(50),
word_count INTEGER,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE assessments (
id SERIAL PRIMARY KEY,
student_id INTEGER REFERENCES students(id),
text_id INTEGER REFERENCES texts(id),
transcribed_text TEXT,
overall_score FLOAT,
pronunciation_score FLOAT,
fluency_score FLOAT,
accuracy_score FLOAT,
comprehension_score FLOAT,
errors_detected JSONB,
feedback_text TEXT,
recommendations JSONB,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);POST /api/assessments
Content-Type: multipart/form-data
Parameters:
- audio: Audio file (WAV, MP3, M4A, OGG, FLAC)
- student_id: Student identifier
- text_id: Text identifier for assessment
Response:
{
"success": true,
"assessment_id": 123,
"message": "Assessment created successfully"
}GET /api/assessments/{assessment_id}/results
Response:
{
"success": true,
"assessment": {
"overall_score": 85.5,
"pronunciation_score": 88.0,
"fluency_score": 82.0,
"accuracy_score": 90.0,
"comprehension_score": 83.0,
"errors_detected": [...],
"feedback_text": "...",
"recommendations": [...]
}
}POST /api/rag/upload-pdf
Content-Type: multipart/form-data
Parameters:
- pdf: PDF document for knowledge base
Response:
{
"success": true,
"message": "Successfully processed document",
"stats": {
"total_documents": 150,
"status": "ready"
}
}POST /api/rag/generate-text
Content-Type: application/json
Body:
{
"grade_level": 3,
"difficulty_level": "medium",
"topic": "القراءة",
"prompt": "أنشئ نصاً تعليمياً عن أهمية القراءة"
}
Response:
{
"success": true,
"generated_text": "...",
"sources": [...],
"confidence": 0.92
}// Client Events
socket.emit('start_recording', {
student_id: 123,
text_id: 456
});
socket.emit('audio_chunk', {
audio_data: base64_encoded_audio,
chunk_index: 1
});
// Server Events
socket.on('assessment_complete', (data) => {
console.log('Assessment results:', data.results);
});
socket.on('realtime_feedback', (data) => {
console.log('Live feedback:', data.feedback);
});System Requirements:
- Python 3.10 or higher
- PostgreSQL 13+
- Redis 6.0+
- FFmpeg (for audio processing)
- Git
API Keys Required:
- Google AI API Key (Gemini)
- Azure Speech Services Key
- Azure Speech Region
git clone https://github.com/your-username/EvaLectureAR.git
cd EvaLectureARpython -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activatepip install -r requirements.txtCreate .env file in project root:
# Database Configuration
DATABASE_URL=postgresql://username:password@localhost:5432/arabic_assessment
# API Keys
GOOGLE_API_KEY=your_google_ai_api_key
AZURE_SPEECH_KEY=your_azure_speech_key
AZURE_SPEECH_REGION=your_azure_region
# Flask Configuration
SECRET_KEY=your_secret_key_here
FLASK_ENV=development
# Redis Configuration
REDIS_URL=redis://localhost:6379/0# Create database
createdb arabic_assessment
# Initialize database
flask db init
flask db migrate -m "Initial migration"
flask db upgrade# Place your PDF documents in rag_documents/pdfs/
python setup_rag.pypython app.pyThe application will be available at http://localhost:5000
# Start all services
docker-compose up -d
# View logs
docker-compose logs -f app
# Stop services
docker-compose down- Web Interface: http://localhost:5000
- Database: localhost:5433
- Redis: localhost:6380
FLASK_ENV=production
DATABASE_URL=postgresql://user:pass@prod-db:5432/arabic_assessment
REDIS_URL=redis://prod-redis:6379/0
SECRET_KEY=production_secret_key- Use HTTPS in production
- Configure proper firewall rules
- Enable database encryption
- Implement rate limiting
- Set up monitoring and logging
- Navigate to the main page
- Fill in student information:
- Full Name (in Arabic)
- Email Address
- Grade Level (1-6)
- Difficulty Preference
- System recommends appropriate texts based on grade level
- Teachers can upload custom texts
- Texts include diacritical marks for pronunciation guidance
- Click "Start Recording" button
- Read the displayed text aloud
- Real-time visual feedback during recording
- Click "Stop Recording" when finished
- Automatic speech recognition
- AI-powered error analysis
- Comprehensive scoring calculation
- Feedback generation
- Detailed score breakdown
- Specific error identification
- Personalized improvement recommendations
- Audio feedback for corrections
- Individual student progress tracking
- Class-wide performance analytics
- Trend analysis over time
- Exportable reports
- Upload custom reading texts
- Manage difficulty levels
- Create assessment templates
- RAG-powered content generation