This project uses machine learning to predict the likelihood of diabetes from patient health data. It trains a Random Forest classifier on patient features and classifies each patient as diabetic or non-diabetic.
The model is trained on the Diabetes dataset from the UCI Machine Learning Repository, which includes features such as glucose level, blood pressure, BMI, and age.
Features:
- Data preprocessing and feature scaling
- Model training using a Random Forest classifier
- Model evaluation with accuracy, precision, recall, and F1-score
- Prediction on new patient data

Requirements:
- Python 3.x
- pandas
- scikit-learn
Install dependencies with:
pip install pandas scikit-learn
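
With the dependencies installed, the workflow described in this README (scaling, Random Forest training, evaluation, and prediction on a new patient) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, and the column names (`glucose`, `blood_pressure`, `bmi`, `age`, `outcome`), the labelling rule, and the hyperparameters are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataset (assumption: column names are hypothetical).
rng = np.random.default_rng(0)
n = 400
data = pd.DataFrame({
    "glucose": rng.normal(120, 30, n),
    "blood_pressure": rng.normal(70, 12, n),
    "bmi": rng.normal(32, 7, n),
    "age": rng.integers(21, 80, n),
})
# Hypothetical labelling rule so the example has a learnable signal.
risk = data["glucose"] + 2 * data["bmi"] + rng.normal(0, 15, n)
data["outcome"] = (risk > 185).astype(int)

feature_cols = ["glucose", "blood_pressure", "bmi", "age"]
X = data[feature_cols]
y = data["outcome"]

# Hold out 20% of the rows for evaluation, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scale the features, then fit a Random Forest classifier.
model = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(X_train, y_train)

# Evaluate with the four metrics listed in this README.
y_pred = model.predict(X_test)
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Predict for one new (made-up) patient; columns must match the training data.
new_patient = pd.DataFrame(
    [{"glucose": 150.0, "blood_pressure": 80.0, "bmi": 35.0, "age": 50}]
)[feature_cols]
prediction = int(model.predict(new_patient)[0])
probability = float(model.predict_proba(new_patient)[0, 1])
print(f"predicted class: {prediction}, estimated probability: {probability:.2f}")
```

Wrapping the scaler and classifier in one pipeline means the scaler is fitted on the training split only and is reapplied automatically at prediction time. Tree-based models such as Random Forest are largely insensitive to feature scaling, so the scaling step mainly matters if the classifier is later swapped for a scale-sensitive one.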