Mapicx/Beacon

🎯 BEACON - Government Policy Intelligence Platform

An AI-powered platform that lets the Ministry of Education (MoE) and higher-education institutions retrieve, understand, compare, explain, and audit government policies.


📚 Documentation Structure

This project uses a phase-based documentation system for better organization:

Core Documentation

  • README.md (this file) - Quick start and overview
  • PROJECT_DESCRIPTION.md - Comprehensive technical documentation

Phase Documentation

  1. PHASE_1_SETUP_AND_AUTHENTICATION.md (7 documents)

    • Email verification system
    • Two-step registration
    • University email domain validation
    • Authentication setup guides
  2. PHASE_2_DOCUMENT_MANAGEMENT.md (15 documents)

    • Document approval workflows
    • Draft and review processes
    • Access control and security
    • Status visibility and badges
    • Search and sorting features
  3. PHASE_3_INSTITUTION_AND_ROLE_MANAGEMENT.md (22 documents)

    • Institution hierarchy management
    • Ministry and university relationships
    • Role-based permissions
    • Institution deletion workflows
    • User management strategies
  4. PHASE_4_ADVANCED_FEATURES_AND_OPTIMIZATIONS.md (61 documents)

    • Chat system and voice queries
    • Notification system
    • RAG and vector store optimizations
    • Performance improvements (Redis, caching, indexing)
    • External data sources
    • Analytics and insights
    • UI/UX fixes and enhancements
    • Security audits and fixes

✨ Key Features

Document Management

  • 📄 Multi-format Support: PDF, DOCX, PPTX, Images (with OCR)
  • 🔍 Smart Search: Hybrid retrieval (semantic + keyword)
  • ⚡ Lazy RAG: Instant uploads, on-demand embedding
  • 📚 Citation Tracking: All answers include source documents
  • 🔐 Role-Based Access: Hierarchical document visibility
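
Hybrid retrieval merges a semantic (vector) ranking with a keyword ranking. This README does not say how BEACON combines the two lists, so the sketch below uses reciprocal rank fusion (RRF), a common technique, purely as an illustration. The function name, the constant `k=60`, and the document IDs are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, not a BEACON setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


semantic_hits = ["doc_a", "doc_b", "doc_c"]  # hypothetical vector-search order
keyword_hits = ["doc_b", "doc_d", "doc_a"]   # hypothetical keyword-search order
fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
```

Documents that rank well in both lists (here `doc_b` and `doc_a`) rise to the top, which is the behaviour a hybrid search is after.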

AI-Powered Intelligence

  • 🤖 AI Chat Assistant: Natural language queries with cited sources
  • 🎤 Voice Queries: Ask questions via audio (98+ languages)
  • 🌍 Multilingual: 100+ languages including Hindi, Tamil, Telugu, Bengali
  • 📊 Policy Analysis: Compare documents, detect conflicts, check compliance

User & Institution Management

  • 👥 Role Hierarchy: Developer → Ministry Admin → University Admin → Document Officer → Student
  • 🏛️ Institution Types: Universities, Hospitals, Research Centers, Defense Academies
  • ✅ Approval Workflows: Multi-level document and user approval system
  • 📧 Email Verification: Secure two-step registration process

Advanced Features

  • 🔔 Real-time Notifications: Hierarchical notification routing
  • 📈 Analytics Dashboard: System health, activity tracking, user insights
  • 🔗 External Data Sync: Connect to ministry databases
  • 🎨 Theme Support: Light/dark mode with persistent preferences

🚀 Quick Start

Prerequisites

  • Python 3.11+
  • PostgreSQL 15+ with pgvector extension
  • Node.js 18+
  • Supabase account (or S3-compatible storage)
  • Google API key (Gemini)

1. Clone Repository

git clone <repository-url>
cd Beacon__V1

2. Backend Setup

# Create virtual environment
python -m venv venv

# Activate (Windows)
venv\Scripts\activate

# Activate (Linux/Mac)
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

3. Configure Environment

Create a .env file in the root directory:

# Database
DATABASE_HOSTNAME=your-db-host
DATABASE_PORT=5432
DATABASE_NAME=postgres
DATABASE_USERNAME=your-username
DATABASE_PASSWORD=your-password

# Supabase Storage
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_KEY=your-supabase-key
SUPABASE_BUCKET_NAME=Docs

# AI Service
GOOGLE_API_KEY=your-google-api-key

# JWT Authentication
JWT_SECRET_KEY=your-secret-key
JWT_ALGORITHM=HS256
JWT_EXPIRATION_MINUTES=1440

# Email (Optional - for verification)
SMTP_HOST=smtp.gmail.com
SMTP_PORT=587
SMTP_USER=your-email@gmail.com
SMTP_PASSWORD=your-app-password
FROM_EMAIL=your-email@gmail.com
FROM_NAME=BEACON System
FRONTEND_URL=http://localhost:5173

# Redis (Optional - for caching)
REDIS_URL=redis://localhost:6379
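
How the backend reads these values is not shown in this README. As a rough sketch, the `DATABASE_*` variables could be assembled into a connection URL like this; the function name `build_database_url` and the `postgresql://` URL form are assumptions, not BEACON's actual code.

```python
import os
from urllib.parse import quote_plus


def build_database_url(env=os.environ):
    """Assemble a PostgreSQL connection URL from the DATABASE_* variables."""
    user = quote_plus(env["DATABASE_USERNAME"])
    # Escape characters such as '@' or '/' that would otherwise break the URL.
    password = quote_plus(env["DATABASE_PASSWORD"])
    host = env["DATABASE_HOSTNAME"]
    port = env.get("DATABASE_PORT", "5432")
    name = env["DATABASE_NAME"]
    return f"postgresql://{user}:{password}@{host}:{port}/{name}"


# Hypothetical values mirroring the .env template above.
example_env = {
    "DATABASE_HOSTNAME": "your-db-host",
    "DATABASE_PORT": "5432",
    "DATABASE_NAME": "postgres",
    "DATABASE_USERNAME": "your-username",
    "DATABASE_PASSWORD": "p@ss/word",
}
url = build_database_url(example_env)
```

Escaping the credentials matters mainly when a password contains reserved URL characters; a missing required variable raises `KeyError` early instead of failing later at connection time.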

4. Database Setup

# Enable pgvector extension
python scripts/enable_pgvector.py

# Run migrations
alembic upgrade head

# Initialize developer account (optional)
python backend/init_developer.py

5. Start Backend

uvicorn backend.main:app --reload --host 127.0.0.1 --port 8000

Backend will be available at: http://localhost:8000

6. Frontend Setup

cd frontend

# Install dependencies
npm install

# Create .env file
echo "VITE_API_BASE_URL=http://localhost:8000/api" > .env

# Start development server
npm run dev

Frontend will be available at: http://localhost:5173


🏗️ System Architecture

Technology Stack

Backend:

  • FastAPI (Python 3.11+)
  • PostgreSQL with pgvector extension
  • SQLAlchemy ORM
  • Alembic migrations
  • JWT authentication

Frontend:

  • React 18 with Vite
  • TailwindCSS + shadcn/ui components
  • Zustand state management
  • React Router v6
  • Axios for API calls

AI/ML:

  • Google Gemini 2.0 Flash (LLM)
  • BGE-M3 embeddings (multilingual, 1024-dim)
  • OpenAI Whisper (voice transcription)
  • EasyOCR (image text extraction)
  • pgvector (vector similarity search)
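
To make "vector similarity search" concrete, here is a minimal pure-Python sketch of cosine distance, the quantity pgvector's `<=>` operator computes. Which distance metric BEACON actually queries with is not stated here, and the two-dimensional toy vectors stand in for the 1024-dimensional BGE-M3 embeddings.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, vectors, top_k=2):
    """Return the IDs of the top_k stored vectors closest to the query."""
    ranked = sorted(vectors, key=lambda doc_id: cosine_distance(query, vectors[doc_id]))
    return ranked[:top_k]


# Hypothetical stored embeddings keyed by document ID.
stored = {"doc_a": [1.0, 0.0], "doc_b": [0.0, 1.0], "doc_c": [0.7, 0.7]}
closest = nearest([1.0, 0.1], stored)
```

In the database the same ordering is done by the index rather than by a full sort, but the ranking criterion is the same.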

Storage:

  • Supabase Storage (S3-compatible document storage)
  • PostgreSQL (metadata + embeddings)

RAG Architecture

Upload → Process → Extract Metadata → Store
                                        ↓
Query → Search Metadata → Rerank → Embed (if needed) → Search → Answer + Citations

Lazy Embedding Strategy:

  • Documents uploaded instantly (no waiting for embedding)
  • Embeddings generated on first query
  • Subsequent queries use cached embeddings
  • Multi-machine support via PostgreSQL storage
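
The strategy above can be sketched as a small cache-on-first-use store. This is an illustration only: the class name, the in-memory dictionaries (standing in for the PostgreSQL/pgvector tables), and the toy embedding function are assumptions, not BEACON's implementation.

```python
import hashlib


class LazyEmbeddingStore:
    """Embed a document the first time a query needs it, then reuse the result."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn  # callable: text -> list of floats
        self._documents = {}       # doc_id -> raw text, saved at upload time
        self._embeddings = {}      # doc_id -> vector; stand-in for a pgvector table
        self.embed_calls = 0       # counts how often the model actually ran

    def upload(self, doc_id, text):
        # Upload only stores the text, so it returns without running the model.
        self._documents[doc_id] = text

    def get_embedding(self, doc_id):
        if doc_id not in self._embeddings:  # first query: embed now
            self._embeddings[doc_id] = self._embed_fn(self._documents[doc_id])
            self.embed_calls += 1
        return self._embeddings[doc_id]     # later queries: cached vector


def toy_embed(text):
    """Deterministic placeholder for a real embedding model such as BGE-M3."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:4]]


store = LazyEmbeddingStore(toy_embed)
store.upload("policy-1", "Admission guidelines for 2025")
first = store.get_embedding("policy-1")   # triggers embedding
second = store.get_embedding("policy-1")  # served from the cache
```

Keeping the cache in a shared database rather than in process memory is what would let several machines reuse the same embeddings.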

👥 User Roles & Hierarchy

Developer (Super Admin)
    ↓
Ministry Admin (MoE Officials)
    ↓
University Admin (Institution Heads)
    ↓
Document Officer (Upload/Manage Docs)
    ↓
Student (Read-Only Access)
    ↓
Public Viewer (Limited Access)

Role Permissions

| Feature | Developer | Ministry Admin | University Admin | Document Officer | Student |
|---|---|---|---|---|---|
| View all documents | ✅ | ✅ (restricted) | ✅ (institution) | ✅ (institution) | ✅ (public) |
| Upload documents | ✅ | ✅ (auto-approved) | ✅ (needs approval) | ✅ (needs approval) | ❌ |
| Approve documents | ✅ | ✅ | ✅ (institution) | ❌ | ❌ |
| Manage users | ✅ | ✅ (limited) | ✅ (institution) | ❌ | ❌ |
| System health | ✅ | ❌ | ❌ | ❌ | ❌ |
| Analytics | ✅ | ✅ | ✅ (institution) | ❌ | ❌ |
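
One way to encode the hierarchy and the permissions table is a minimum-role lookup, sketched below. The enum, the action names, and `is_allowed` are hypothetical, and scope qualifiers such as "(institution)" or "(needs approval)" are deliberately left out.

```python
from enum import IntEnum


class Role(IntEnum):
    """Roles ordered by the hierarchy; a higher value means more authority."""
    PUBLIC_VIEWER = 0
    STUDENT = 1
    DOCUMENT_OFFICER = 2
    UNIVERSITY_ADMIN = 3
    MINISTRY_ADMIN = 4
    DEVELOPER = 5


# Lowest role that may perform each action, read off the permissions table.
MINIMUM_ROLE = {
    "upload_documents": Role.DOCUMENT_OFFICER,
    "approve_documents": Role.UNIVERSITY_ADMIN,
    "manage_users": Role.UNIVERSITY_ADMIN,
    "view_analytics": Role.UNIVERSITY_ADMIN,
    "view_system_health": Role.DEVELOPER,
}


def is_allowed(role, action):
    """True if the role sits at or above the minimum role for the action."""
    return role >= MINIMUM_ROLE[action]


can_upload = is_allowed(Role.DOCUMENT_OFFICER, "upload_documents")
can_see_health = is_allowed(Role.MINISTRY_ADMIN, "view_system_health")
```

A single threshold per action works here because every row of the table grants access to a contiguous top slice of the hierarchy; institution scoping would need an extra check on top.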

📡 API Endpoints

Authentication

  • POST /api/auth/register - User registration
  • POST /api/auth/login - User login
  • POST /api/auth/verify-email/{token} - Email verification
  • GET /api/auth/me - Get current user

Documents

  • POST /api/documents/upload - Upload document
  • GET /api/documents/list - List documents (role-filtered)
  • GET /api/documents/{id} - Get document details
  • GET /api/documents/{id}/download - Download document
  • DELETE /api/documents/{id} - Delete document

Approvals

  • GET /api/approvals/pending - Get pending documents
  • POST /api/approvals/{id}/approve - Approve document
  • POST /api/approvals/{id}/reject - Reject document

Chat & AI

  • POST /api/chat/query - Ask AI question
  • POST /api/voice/query - Voice query (audio upload)
  • GET /api/chat/sessions - Get chat history

Institutions

  • GET /api/institutions/list - List institutions
  • POST /api/institutions/create - Create institution
  • DELETE /api/institutions/{id} - Delete institution

Notifications

  • GET /api/notifications/list - List notifications
  • GET /api/notifications/unread-count - Unread count
  • POST /api/notifications/{id}/mark-read - Mark as read

Analytics

  • GET /api/analytics/stats - System statistics
  • GET /api/analytics/activity - Activity feed
  • GET /api/audit/logs - Audit logs

Full API Documentation: http://localhost:8000/docs
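
As a hedged illustration of calling one of the endpoints listed above, the sketch below builds (but does not send) a request using only the standard library. The `Bearer` authorization scheme is an assumption based on the JWT setup; request and response schemas are not described here, so check the interactive docs before relying on it.

```python
import urllib.request

API_BASE_URL = "http://localhost:8000/api"  # same base as VITE_API_BASE_URL


def build_request(method, path, token=None):
    """Create a request object for an endpoint path such as '/auth/me'."""
    request = urllib.request.Request(API_BASE_URL + path, method=method)
    request.add_header("Accept", "application/json")
    if token is not None:
        # Assumption: the JWT is sent as a Bearer token.
        request.add_header("Authorization", f"Bearer {token}")
    return request


req = build_request("GET", "/notifications/unread-count", token="<your-jwt>")
# To send it against a running backend: urllib.request.urlopen(req)
```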


🧪 Testing

# Run all tests
python tests/run_all_tests.py

# Individual tests
python tests/test_embeddings.py
python tests/test_voice_query.py
python tests/test_multilingual_embeddings.py
python tests/test_compliance_api.py
python tests/test_conflict_detection_api.py

📊 Performance Metrics

| Operation | Time | Notes |
|---|---|---|
| Document Upload | 3-7s | Instant response |
| Query (embedded) | 4-7s | Fast |
| Query (first time) | 12-19s | Includes embedding |
| Voice transcription | 5-10s | 1 min audio |
| User Login | <1s | JWT generation |

🔐 Security Features

  • ✅ JWT-based authentication
  • ✅ Email verification required
  • ✅ Role-based access control (RBAC)
  • ✅ Document-level permissions
  • ✅ Audit logging for all actions
  • ✅ SQL injection prevention (SQLAlchemy ORM)
  • ✅ XSS protection (React escaping)
  • ✅ Soft deletes (preserve audit trail)
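
To show what the `JWT_SECRET_KEY`, `JWT_ALGORITHM=HS256`, and `JWT_EXPIRATION_MINUTES=1440` settings control, here is a standard-library sketch of HS256 signing and verification. It is not BEACON's code; a real backend would normally use a maintained JWT library, and the function names and the `sub` claim usage are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def create_token(subject, secret, expires_minutes=1440, now=None):
    """Build an HS256-signed JWT carrying 'sub' and 'exp' claims."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "exp": issued + expires_minutes * 60}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token, secret, now=None):
    """Return the payload if the signature is valid and unexpired, else None."""
    try:
        signing_input, signature = token.rsplit(".", 1)
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(_sign(signing_input, secret), _b64url_decode(signature)):
            return None
        payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    except (ValueError, IndexError):
        return None
    current = time.time() if now is None else now
    return payload if payload["exp"] > current else None


token = create_token("user@example.edu", "your-secret-key")
claims = verify_token(token, "your-secret-key")
```

With the default of 1440 minutes, a token stays valid for 24 hours; a shorter lifetime limits the damage if a token leaks.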

📁 Project Structure

Beacon__V1/
├── Agent/                      # AI/ML Components
│   ├── embeddings/            # BGE-M3 embeddings
│   ├── voice/                 # Whisper transcription
│   ├── rag_agent/             # ReAct agent
│   ├── retrieval/             # Hybrid search
│   ├── lazy_rag/              # On-demand embedding
│   ├── vector_store/          # pgvector integration
│   └── tools/                 # Search tools
│
├── backend/                    # FastAPI Backend
│   ├── routers/               # API endpoints
│   ├── utils/                 # Helper functions
│   ├── database.py            # SQLAlchemy models
│   └── main.py                # FastAPI app
│
├── frontend/                   # React Frontend
│   ├── src/
│   │   ├── components/        # Reusable components
│   │   ├── pages/             # Route pages
│   │   ├── services/          # API calls
│   │   └── stores/            # Zustand stores
│   └── package.json
│
├── alembic/                    # Database migrations
├── scripts/                    # Utility scripts
├── tests/                      # Test suite
├── .env                        # Environment variables
├── requirements.txt            # Python dependencies
├── README.md                   # This file
└── PROJECT_DESCRIPTION.md      # Detailed documentation

🐛 Troubleshooting

Database Connection Issues

# Check PostgreSQL is running
psql -h HOST -U USER -d DATABASE

# Verify .env file has correct credentials
# Test connection: python test_redis_connection.py

GPU Not Detected

# Install PyTorch with CUDA support
pip install torch --index-url https://download.pytorch.org/whl/cu118

Voice Not Working

# Install FFmpeg
# Windows: Download from https://ffmpeg.org/download.html
# Linux: sudo apt install ffmpeg
# Mac: brew install ffmpeg
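
Before reinstalling, it can help to check whether FFmpeg is already visible to Python. The snippet below is a small sketch using only the standard library; the helper name is hypothetical.

```python
import shutil


def ffmpeg_path():
    """Return the full path of the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


path = ffmpeg_path()
print(f"FFmpeg found at {path}" if path else "FFmpeg not found on PATH")
```

If FFmpeg is installed but not found, the usual cause is that its directory is missing from `PATH` in the shell that launched the backend.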

Email Verification Not Sending

# For Gmail:
# 1. Enable 2-Factor Authentication
# 2. Generate App Password: https://myaccount.google.com/apppasswords
# 3. Use App Password as SMTP_PASSWORD in .env

🔄 Recent Updates

Version 2.0.0 (December 2025)

  • ✅ Migrated from FAISS to pgvector for multi-machine support
  • ✅ Implemented lazy RAG for instant document uploads
  • ✅ Added email verification system
  • ✅ Enhanced notification system with hierarchical routing
  • ✅ Improved analytics dashboard with system health monitoring
  • ✅ Optimized performance with Redis caching
  • ✅ Added voice query support (98+ languages)
  • ✅ Implemented document approval workflows
  • ✅ Enhanced role-based access control

📞 Support

  • Documentation: See phase documentation files for detailed guides
  • API Docs: http://localhost:8000/docs
  • Logs: Agent/agent_logs/
  • Tests: python tests/run_all_tests.py

🎯 Key Achievements

✅ Multi-format document processing
✅ Multilingual embeddings (100+ languages)
✅ Voice query system (98+ languages)
✅ Lazy RAG (instant uploads)
✅ Hybrid retrieval (semantic + keyword)
✅ External data ingestion
✅ Citation tracking
✅ Production-ready


Built with ❤️ for Government Policy Intelligence

Version: 2.0.0 | Status: ✅ Production Ready | Last Updated: December 5, 2025

About

SIH25254 - Beacon is an AI-powered knowledge retrieval system that enables ministries and educational institutions to instantly search, analyze, and derive insights from policies, regulations, and official documents using Retrieval-Augmented Generation (RAG).
