This project classifies fetal health status from cardiotocography (CTG) features. Using a dataset of fetal heart rate and uterine contraction measurements, it trains a machine learning model to predict fetal health as 'Normal', 'Suspect', or 'Pathological'. This is a critical task for obstetric monitoring and clinical decision-making.
- Dataset: Kaggle - Fetal Health Classification
- Size: 2126 entries, 22 columns.
- Key Features:
- CTG Metrics: `baseline value`, `accelerations`, `fetal_movement`, `uterine_contractions`, and a set of histogram-based features.
- Approach:
- Data Cleaning: The dataset was clean, with no missing values. The code drops duplicate rows.
- Exploratory Data Analysis: A correlation heatmap, box plots, and pair plots were used to visualize data distributions and relationships between features and the target class. A count plot showed the class distribution, which is imbalanced.
- Multi-class Classification: The target variable `fetal_health` has three categories: 'Normal' (1), 'Suspect' (2), and 'Pathological' (3). The code adjusts these to be zero-indexed (0, 1, 2) for model training.
- Data Scaling: `MinMaxScaler` was applied to the features (min-max normalization).
- Model Used:
- An XGBoost Classifier model was trained.
- Best Accuracy:
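- The XGBoost model achieved an accuracy of 96.0% on the test set. Because the classes are imbalanced, this overall figure is weighted toward the majority 'Normal' class and does not by itself show how well 'Suspect' and 'Pathological' cases are detected.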
- Obstetric Monitoring: Provide a tool for clinicians to quickly and accurately assess fetal well-being from CTG data.
- Early Warning System: Assist in identifying at-risk fetuses, enabling timely medical intervention.
- Clinical Decision Support: Support data-driven decision-making in labor and delivery.
- Research: Serve as a foundational model for studying the relationship between CTG signals and fetal health outcomes.
Clone the repository and download the dataset.
Install the necessary libraries:

    pip install pandas numpy seaborn matplotlib scikit-learn xgboost

We welcome contributions to improve the project. You can help by:
- Performing comprehensive hyperparameter tuning and cross-validation for the XGBoost model to ensure robustness.
- Exploring more advanced strategies for handling the class imbalance.
- Investigating the impact of different feature selection or transformation techniques on model performance.
- Adding explainability (e.g., SHAP or LIME) to understand which CTG parameters are the most critical for predicting fetal health status.
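As a possible starting point for the class-imbalance item, the sketch below computes "balanced" class weights, where each class gets weight `n_samples / (n_classes * class_count)`, and expands them to per-sample weights. The helper names are hypothetical, and this is one simple baseline rather than a recommended strategy for this project.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight}, with weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def balanced_sample_weights(labels):
    """Expand the class weights to one weight per sample, in label order."""
    weights = balanced_class_weights(labels)
    return [weights[label] for label in labels]
```

The per-sample weights could then be passed to the classifier at training time, for example through the `sample_weight` argument of `XGBClassifier.fit`, so that errors on rare classes count for more in the loss.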