A Python application that provides an easy-to-use interface for Google's MedGemma model, a specialized AI model for medical question answering and healthcare assistance.
- 🏥 Medical question answering using Google's MedGemma model
- 💻 Interactive command-line interface
- 🔧 Single question mode for automation
- 🚀 GPU acceleration support (CUDA)
- 🔐 Secure Hugging Face authentication
- Python 3.8+
- Hugging Face account with access to MedGemma models (see below)
- Sufficient RAM/VRAM (at least 16GB recommended for 4B model)
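
As a rough sanity check on the memory recommendation, weight storage can be estimated as parameter count times bytes per parameter. The sketch below is illustrative only: `estimate_weight_gb` is a hypothetical helper (not part of this project), the default of 2 bytes per parameter assumes 16-bit weights, and the estimate ignores activations, the KV cache, and framework overhead, which is why headroom beyond the raw weight size is recommended.

```python
def estimate_weight_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Assumes every parameter is stored at the same precision;
    2 bytes corresponds to float16/bfloat16, 4 bytes to float32.
    """
    return num_params * bytes_per_param / 1e9


# Weights only, for a 4B-parameter model at two precisions.
sixteen_bit_gb = estimate_weight_gb(4e9)       # 16-bit weights
thirty_two_bit_gb = estimate_weight_gb(4e9, 4)  # 32-bit weights
```

Under these assumptions the 16-bit weights alone take about 8 GB, which leaves room within a 16 GB budget for everything else the model needs at runtime.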
MedGemma models are gated and require approval from Google. To get access:
- Visit the MedGemma model page
- Click "Request access" and fill out the form
- Wait for approval (may take a few days)
- Once approved, your Hugging Face token will work with the model
If you don't have access to MedGemma yet, you can use alternative open medical models:
- `medicalai/ClinicalBERT` - Clinical text understanding
- `Cannae-AI/MedicalLlama3.2-vision-11B-IT` - Medical vision model
- `Intelligent-Internet/II-Medical-8B` - General medical assistant

Use these with: `python medgemma_assistant.py --model medicalai/ClinicalBERT`
- Clone this repository:

      git clone <your-repo-url>
      cd medgemma-project

- Create and activate a virtual environment:

      python -m venv venv
      # On Windows:
      venv\Scripts\activate
      # On macOS/Linux:
      source venv/bin/activate

- Install dependencies:

      pip install -r requirements.txt

- Set up your Hugging Face token:
  - Copy `venv/.env.local` to `.env` in the project root
  - Or set the `HF_TOKEN` environment variable
  - Or modify the `.env.local` file with your token

- Test the setup: `python demo.py`
- Get MedGemma access (see prerequisites)
- Run the assistant: `python medgemma_assistant.py`
- Try examples: `python example.py`
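
The token setup step offers two sources: the `HF_TOKEN` environment variable or an env file. A minimal sketch of how such a lookup might work is shown below. It is not the project's actual loader: `load_hf_token` is a hypothetical helper, and both the lookup order (environment variable first, then file) and the `HF_TOKEN=...` key inside the file are assumptions.

```python
import os
from pathlib import Path
from typing import Optional


def load_hf_token(env_file: str = ".env") -> Optional[str]:
    """Return the Hugging Face token, or None if none is configured.

    Lookup order (an assumption for this sketch):
    1. the HF_TOKEN environment variable
    2. an HF_TOKEN=... line in the given env file
    """
    token = os.environ.get("HF_TOKEN")
    if token:
        return token.strip()

    path = Path(env_file)
    if not path.is_file():
        return None

    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without a key=value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "HF_TOKEN":
            # Drop optional surrounding quotes; treat an empty value as unset.
            return value.strip().strip("'\"") or None
    return None
```

Returning `None` instead of raising lets the caller decide whether a missing token is fatal, for example when running against a model that is not gated.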
Verify the setup:

    python demo.py

This will verify that your environment is properly configured.

Interactive mode:

    python medgemma_assistant.py

Single question mode:

    python medgemma_assistant.py --question "What are the symptoms of diabetes?"

Other options:

    # Use CPU instead of GPU
    python medgemma_assistant.py --device cpu
    # Use a different model (if available)
    python medgemma_assistant.py --model google/medgemma-4b-it

- Model: MedGemma-4B-IT (Instruction Tuned)
- Provider: Google
- Purpose: Medical question answering and healthcare assistance
- Size: ~8GB (4B parameter model)
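
The usage examples rely on three flags: `--question`, `--device`, and `--model`. A sketch of how a parser for them could be declared with `argparse` follows. This is an illustration, not the script's actual code: `build_parser`, the `cuda`/`cpu` choices, the `None` defaults, and the default model ID are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three flags shown in the usage examples."""
    parser = argparse.ArgumentParser(
        prog="medgemma_assistant.py",
        description="Ask medical questions via a Hugging Face model.",
    )
    parser.add_argument(
        "--question",
        default=None,
        help="Answer one question and exit; omit for interactive mode.",
    )
    parser.add_argument(
        "--device",
        choices=("cuda", "cpu"),  # assumed set of values
        default=None,             # assumed: None means pick automatically
        help="Device to run the model on.",
    )
    parser.add_argument(
        "--model",
        default="google/medgemma-4b-it",  # assumed default
        help="Hugging Face model ID to load.",
    )
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--device", "cpu"])
```

With this layout, an absent `--question` (left as `None`) would be the signal to start the interactive loop.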
🔒 Privacy: Be careful with sensitive medical information. The model's responses should not be considered medical advice.
- "Authentication failed"
  - Ensure your Hugging Face token is valid and has access to MedGemma
  - Check that the token is properly set in `.env.local`
- "CUDA out of memory"
  - Use `--device cpu` to run on CPU
  - Or use a machine with more VRAM
- "Model download failed"
  - Check your internet connection
  - Ensure you have sufficient disk space (~8GB for the 4B model)

- Use GPU acceleration for faster responses
- For CPU-only machines, consider using smaller quantized models
- Close other applications to free up memory
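
Choosing between GPU and CPU can be automated so that the assistant still starts on machines without CUDA. The sketch below shows one way to do that. `pick_device` and the `"auto"` value are hypothetical; the only library call used is PyTorch's `torch.cuda.is_available()`, and PyTorch is treated as optional here so the function degrades to CPU when it is not installed. It does not recover from an out-of-memory error at runtime; for that, rerun with `--device cpu`.

```python
def pick_device(requested: str = "auto") -> str:
    """Return "cuda" or "cpu".

    An explicit request is returned unchanged; any other value
    (such as "auto") selects CUDA only when PyTorch is installed
    and reports a usable GPU.
    """
    if requested in ("cuda", "cpu"):
        return requested
    try:
        import torch  # optional dependency in this sketch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device("auto")
```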
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
- Google for the MedGemma model
- Hugging Face for the transformers library
- The open-source AI community
This repo includes a demo-friendly fallback so you can present without gated HF access.
- Use the included `run_demo.ps1` to run the server in mock/demo mode (Windows PowerShell):

      # create venv if needed
      python -m venv venv
      .\venv\Scripts\Activate.ps1
      pip install -r requirements.txt
      # build frontend (if you edit it)
      cd frontend\ai_diagnostic_assistant
      npm ci --legacy-peer-deps
      npm run build
      cd ..\..
      # run demo (mock responses)
      .\run_demo.ps1

- To use real Hugging Face inference (if you have access to the gated model): set `HUGGINGFACEHUB_API_TOKEN` in your environment before running. The server will try local -> HF -> mock automatically.
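
The local -> HF -> mock order describes a fallback chain: each backend is tried in turn, and the first one that succeeds produces the answer. A minimal sketch of that pattern is shown below. It does not reflect the server's real code: `answer_with_fallback` and the three stand-in backends are hypothetical, and the two failing backends simply raise to simulate a missing local model and missing gated access.

```python
from typing import Callable, List, Sequence, Tuple

Backend = Tuple[str, Callable[[str], str]]


def answer_with_fallback(question: str, backends: Sequence[Backend]) -> Tuple[str, str]:
    """Try each backend in order; return (backend_name, answer) from the first success."""
    errors: List[str] = []
    for name, backend in backends:
        try:
            return name, backend(question)
        except Exception as exc:  # sketch: any failure moves on to the next backend
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


# Stand-in backends for illustration only.
def local_backend(question: str) -> str:
    raise RuntimeError("local model not loaded")


def hf_backend(question: str) -> str:
    raise PermissionError("no access to gated model")


def mock_backend(question: str) -> str:
    return f"[mock] No model available to answer: {question}"


used, answer = answer_with_fallback(
    "What are the symptoms of diabetes?",
    [("local", local_backend), ("hf", hf_backend), ("mock", mock_backend)],
)
```

Returning the backend name alongside the answer makes it easy to show in a demo which path actually served the response.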
Supported upload types for demo: txt, png, jpg, jpeg, gif, dcm, pdf.
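
A check against the supported upload types might look like the sketch below. `is_supported_upload` is a hypothetical helper, and matching on the file extension alone (case-insensitively, without inspecting file contents) is an assumption about how the demo validates uploads.

```python
from pathlib import Path

# Extensions listed as supported for the demo.
SUPPORTED_UPLOAD_EXTENSIONS = {"txt", "png", "jpg", "jpeg", "gif", "dcm", "pdf"}


def is_supported_upload(filename: str) -> bool:
    """Return True if the filename's final extension is a supported type."""
    suffix = Path(filename).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_UPLOAD_EXTENSIONS
```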