Identifying the clinical drivers of hospital readmissions among diabetic patients, using Python and Tableau, to improve patient outcomes.
This project analyzes a dataset representing 10 years (1999-2008) of clinical care at 130 US hospitals. The goal is to identify factors that lead to high 30-day readmission rates among diabetic patients, providing actionable insights for hospital administrators to improve transition-of-care protocols.
- Language: Python 3.x
- Libraries: Pandas (Data Wrangling), Seaborn/Matplotlib (Visualization), NumPy
- Business Intelligence: Tableau (Interactive Dashboard)
- Insight 1: [e.g., Patients aged 70+ show a 15% higher readmission rate]
- Insight 2: [e.g., Specific medication changes correlate with lower bounce-back rates]
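A comparison like the age example above could be computed along the following lines. This is a minimal sketch, not the notebook's actual code: the column names `age` and `readmitted`, the `"<30"` label for a readmission within 30 days, and the helper `readmission_rate_by` are all assumptions, and the toy rows are synthetic and carry no clinical meaning.

```python
import pandas as pd


def readmission_rate_by(df, factor):
    """Share of encounters readmitted within 30 days, per level of `factor`.

    Assumes a `readmitted` column where "<30" marks a 30-day readmission
    (hypothetical schema; adjust to the real column names and labels).
    """
    readmitted_30d = df["readmitted"].eq("<30")
    # Mean of a boolean Series is the proportion of True values per group.
    return readmitted_30d.groupby(df[factor]).mean().sort_values(ascending=False)


# Synthetic rows for illustration only.
toy = pd.DataFrame({
    "age": ["[70-80)", "[70-80)", "[40-50)", "[40-50)", "[40-50)", "[70-80)"],
    "readmitted": ["<30", "NO", "NO", ">30", "NO", "<30"],
})

rates = readmission_rate_by(toy, "age")
print(rates)
```

Passing a different column name as `factor` (for example a medication-change flag, if the data has one) would give the same per-group rate for that variable.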
To ensure high-quality analysis, the following steps were taken:
- Handling Missing Values: Replaced `?` placeholders with `NaN` and assessed column integrity.
- De-duplication: Filtered for the first encounter per patient to prevent data leakage.
- Feature Categorization: Grouped ICD-9 diagnosis codes into clinical categories (e.g., Circulatory, Respiratory).
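The three cleaning steps above can be sketched in pandas roughly as follows. The column names (`encounter_id`, `patient_nbr`, `diag_1`), the assumption that a lower `encounter_id` means an earlier encounter, and the helpers `categorize_icd9` and `clean_encounters` are hypothetical and not taken from the notebook. The mapping covers only a few categories (ICD-9 chapters 390-459 for circulatory, 460-519 for respiratory, and 250.xx for diabetes) and sends everything else to "Other".

```python
import numpy as np
import pandas as pd


def categorize_icd9(code):
    """Map a raw ICD-9 code to a coarse clinical category (simplified)."""
    if pd.isna(code):
        return "Missing"
    code = str(code)
    # V and E codes are supplementary classifications, not numeric chapters.
    if code.startswith(("V", "E")):
        return "Other"
    try:
        value = float(code)
    except ValueError:
        return "Other"
    if 390 <= value < 460:
        return "Circulatory"
    if 460 <= value < 520:
        return "Respiratory"
    if 250 <= value < 251:
        return "Diabetes"
    return "Other"


def clean_encounters(df):
    """Apply the three cleaning steps to a raw encounters table."""
    # 1. Handling missing values: treat "?" placeholders as NaN.
    df = df.replace("?", np.nan)
    # 2. De-duplication: keep each patient's first encounter
    #    (assumes lower encounter_id means earlier encounter).
    df = df.sort_values("encounter_id").drop_duplicates("patient_nbr", keep="first")
    # 3. Feature categorization: group the primary diagnosis code.
    df = df.assign(diag_1_group=df["diag_1"].map(categorize_icd9))
    return df.reset_index(drop=True)


# Synthetic rows for illustration only.
raw = pd.DataFrame({
    "encounter_id": [3, 1, 2, 4],
    "patient_nbr": [10, 10, 20, 30],
    "diag_1": ["428", "250.01", "?", "486"],
})

cleaned = clean_encounters(raw)
print(cleaned)
# Column integrity check: fraction of missing values per column.
print(cleaned.isna().mean())
```

Keeping one encounter per patient means no patient contributes several correlated rows, which is the leakage concern the de-duplication step addresses.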
- Clone the repository.
- Install dependencies: `pip install pandas matplotlib seaborn`.
- Open `analysis.ipynb` to view the step-by-step EDA.
This project uses an anonymized, public dataset. In a real-world setting, this analysis would be performed in compliance with HIPAA regulations to ensure patient privacy.
This project was developed with the assistance of Gemini (Google AI), which served as a technical collaborator for:
- Architecting the repository structure and documentation.
- Refining data cleaning strategies for healthcare-specific datasets.
- Debugging and optimizing Python analysis workflows.