An advanced RAG-powered medical assistant that grounds its answers in medical literature, with consultation, symptom-analysis, and triage guidance capabilities
- Document-Based Responses: AI answers backed by medical literature and documents
- Vector Search: FAISS-powered similarity search across medical knowledge base
- Intelligent Fallback: Automatic fallback to standard AI when RAG unavailable
- Source Attribution: Every response includes relevant source documents
- Medical Consultation: Evidence-based medical guidance and information
- Symptom Analysis: Comprehensive symptom assessment and recommendations
- Emergency Detection: Automatic triage and emergency protocol guidance
- Drug Information: Medication guidance and interaction checking
- FastAPI: Modern Python web framework with lifespan events
- LangChain: RAG pipeline framework for AI applications
- Google Gemini 2.5 Flash: Latest AI model with 100K token capacity
- FAISS: Vector similarity search with AVX2 optimization
- HuggingFace Embeddings: sentence-transformers/all-MiniLM-L6-v2
- Vector Store: FAISS index with 60K+ medical document embeddings
- Document Processing: Automatic PDF processing and intelligent chunking
- Intelligent Retrieval: MMR-based document retrieval with relevance scoring (see the sketch after this list)
- Fallback System: Multi-tier response system so the assistant keeps answering even when RAG is unavailable
- No Database: Pure file-based system; no external database required
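As a rough illustration of the retrieval step above, here is a minimal sketch of chunking PDFs into a FAISS index and querying it with MMR via LangChain. This is not the project's actual pipeline (that lives in chains/rag_pipeline.py and utils/vector_store.py); package names assume current langchain-community / langchain-huggingface releases, and the chunk sizes are arbitrary.
# Minimal retrieval sketch -- illustrative only, not the project's actual code
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load PDFs and split them into overlapping chunks (sizes are arbitrary here)
docs = PyPDFDirectoryLoader("data/pdfs").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

# Embed with the model listed above and index the chunks in FAISS
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
store = FAISS.from_documents(chunks, embeddings)

# MMR retrieval: rank chunks by relevance while penalizing near-duplicates
retriever = store.as_retriever(search_type="mmr", search_kwargs={"k": 5})
print(retriever.invoke("What are the symptoms of diabetes?"))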
- Python 3.9+
- Google API Key (for Gemini)
- 4GB+ RAM (for embeddings)
- Medical PDF documents (optional)
git clone https://github.com/royxlead/cura-python.git
cd cura-python
python -m venv myenv
myenv\Scripts\activate # Windows
# source myenv/bin/activate # Linux/Mac
pip install -r requirements.txt
# Create .env file with your Google API key
echo "GOOGLE_API_KEY=your_google_api_key_here" > .env
python run_server.py
Note: The project includes a pre-built vector store with 60K+ medical documents ready to use!
Visit http://localhost:8000 for the web interface!
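If you would rather verify from Python than a browser, a quick check against the health endpoint (listed under the API endpoints below) might look like this, assuming the requests package is installed and the endpoint returns JSON:
import requests

# Quick sanity check against the running server
resp = requests.get("http://localhost:8000/api/health", timeout=10)
print(resp.status_code, resp.json())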
# AI Configuration
GOOGLE_API_KEY=your_google_api_key_here
LLM_MODEL=gemini-2.5-flash
LLM_MAX_TOKENS=100000
LLM_TEMPERATURE=0.7
# Vector Store Configuration
DEVICE=cpu
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
VECTOR_STORE_PATH=faiss_index
PDF_DATA_PATH=data/pdfs
# Server Configuration
HOST=0.0.0.0
PORT=8000
DEBUG=True
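A minimal sketch of how these variables might be read at startup, assuming python-dotenv (the actual loading logic in run_server.py may differ):
import os
from dotenv import load_dotenv  # assumption: python-dotenv is installed

# Pull the .env file created during setup into the process environment
load_dotenv()

GOOGLE_API_KEY = os.environ["GOOGLE_API_KEY"]            # required
LLM_MODEL = os.getenv("LLM_MODEL", "gemini-2.5-flash")   # optional, with defaults
LLM_TEMPERATURE = float(os.getenv("LLM_TEMPERATURE", "0.7"))
VECTOR_STORE_PATH = os.getenv("VECTOR_STORE_PATH", "faiss_index")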
cura-python/
├── app/
│   └── services/                 # Business logic services
│       ├── simple_ai_service.py  # Enhanced RAG-enabled AI service
│       └── __init__.py
├── chains/                       # RAG pipeline
│   └── rag_pipeline.py           # Medical RAG implementation
├── utils/                        # Utility functions
│   ├── vector_store.py           # Vector store management & CLI
│   └── prompts.py                # Medical prompt templates
├── data/                         # Medical documents
│   └── pdfs/                     # PDF documents for RAG (60K+ docs)
├── faiss_index/                  # Pre-built vector store
│   ├── index.faiss               # FAISS vector index
│   └── index.pkl                 # Document metadata
├── frontend/                     # Progressive Web App
│   ├── index.html                # Main interface
│   ├── app.js                    # Client logic
│   └── app.css                   # Styling
├── myenv/                        # Virtual environment
├── run_server.py                 # Main FastAPI server
└── requirements.txt              # Python dependencies
- GET / - Web interface
- GET /api/health - Health check
- GET /api/status - Service status including RAG
- POST /api/chat - RAG-enabled chat
  { "message": "What are the symptoms of diabetes?", "use_rag": true }
- POST /api/search - Direct document search
  { "query": "diabetes symptoms", "k": 5 }
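The same two POST requests from Python, using the request bodies documented above (assumes the requests package):
import requests

BASE = "http://localhost:8000"

# RAG-enabled chat
chat = requests.post(f"{BASE}/api/chat",
                     json={"message": "What are the symptoms of diabetes?", "use_rag": True})
print(chat.json())

# Direct document search
search = requests.post(f"{BASE}/api/search", json={"query": "diabetes symptoms", "k": 5})
print(search.json())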
- Modern FastAPI: Uses lifespan events instead of deprecated startup events
- Zero Database Dependencies: Pure file-based system with no external database requirements
- Pre-built Vector Store: Ready-to-use with 60K+ medical documents indexed
- Enhanced Error Handling: Comprehensive error responses and logging
- Source Attribution: Every RAG response includes relevant source documents
- Intelligent Fallback: Automatic detection and graceful degradation when RAG unavailable
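The fallback follows a familiar pattern; here is a hypothetical sketch of its general shape. The real implementation is in app/services/simple_ai_service.py, and fallback_answer below is an illustrative stand-in, not the project's API:
from chains.rag_pipeline import rag_pipeline

def fallback_answer(message: str) -> str:
    # hypothetical stand-in for the plain (non-RAG) Gemini call
    return "..."

def answer(message: str, use_rag: bool = True) -> dict:
    """Illustrative multi-tier fallback: try RAG first, degrade gracefully."""
    if use_rag:
        try:
            result = rag_pipeline.get_rag_response(message)
            return {"answer": result["answer"], "rag": True}
        except Exception:
            pass  # e.g. vector store missing or not initialized; fall through
    return {"answer": fallback_answer(message), "rag": False}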
# Test chat endpoint
curl -X POST http://localhost:8000/api/chat \
-H "Content-Type: application/json" \
-d '{"message": "What are diabetes symptoms?", "use_rag": true}'
# Test document search
curl -X POST http://localhost:8000/api/search \
-H "Content-Type: application/json" \
-d '{"query": "diabetes", "k": 3}'
# Check service status
curl http://localhost:8000/api/status

from utils.vector_store import build_vector_store, load_documents_from_directory
# Load custom documents
documents = load_documents_from_directory("path/to/your/pdfs")
vector_store = build_vector_store(documents)

from chains.rag_pipeline import rag_pipeline
# Initialize and use RAG
rag_pipeline.initialize()
response = rag_pipeline.get_rag_response("What is hypertension?")
print(response["answer"])

- RAG Response Time: ~2-5 seconds for document-backed responses
- Fallback Time: ~1-2 seconds for standard AI responses
- Vector Search: ~100ms for similarity search across 60K+ documents
- Memory Usage: ~2-4GB with embeddings loaded (sentence-transformers)
- Index Size: 60,698 medical documents pre-indexed
- Startup Time: ~10-15 seconds for full system initialization
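These figures are indicative; to get comparable numbers on your own hardware you can time the endpoints yourself. A rough sketch (it measures full HTTP round-trips, so network and model latency are included, and it assumes the requests package):
import time
import requests

# Time a direct vector search and a full RAG chat round-trip
for path, body in [("/api/search", {"query": "diabetes", "k": 5}),
                   ("/api/chat", {"message": "What is diabetes?", "use_rag": True})]:
    start = time.perf_counter()
    requests.post(f"http://localhost:8000{path}", json=body, timeout=120)
    print(f"{path}: {time.perf_counter() - start:.2f}s")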
IMPORTANT: This AI assistant provides general medical information only and should never replace professional medical advice, diagnosis, or treatment. Always consult qualified healthcare professionals for medical concerns.
- API Rate Limiting: Built-in request throttling
- Input Validation: Comprehensive request validation
- CORS Protection: Configurable cross-origin policies (see the sketch after this list)
- Error Handling: Secure error responses
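For the CORS item above, FastAPI configuration typically looks like the following. This is a sketch of the standard CORSMiddleware setup, not a copy of the project's run_server.py; the origins, methods, and headers are placeholder values:
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Restrict cross-origin access to known frontends; "*" would disable protection
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:8000"],  # adjust for your deployment
    allow_methods=["GET", "POST"],
    allow_headers=["Content-Type"],
)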
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
# Start development server (with auto-reload)
python run_server.py
# Build new vector store (if adding documents)
python -m utils.vector_store
# Check system status
curl http://localhost:8000/api/status
# Test RAG functionality
curl -X POST http://localhost:8000/api/chat \
-H "Content-Type: application/json" \
-d '{"message": "What is diabetes?", "use_rag": true}'
# Search medical documents directly
curl -X POST http://localhost:8000/api/search \
-H "Content-Type: application/json" \
-d '{"query": "diabetes symptoms", "k": 5}'Built with β€οΈ for better healthcare accessibility