Machine Learning Heart Patient Classification

Summary

In the Real-world Heart Patient Data Project, I applied data analytics and machine learning to a dataset of heart patients. The dataset contains a range of predictor variables, and the goal was to classify whether or not an individual has heart disease.

Project Details

Data Exploration

Initial Exploration: Conducted an in-depth exploration of the dataset, examining variable distributions, identifying outliers, and assessing data quality.
Preprocessing: Loaded necessary libraries, checked dataset dimensions, and handled any missing data.
Exploratory Data Analysis (EDA): Focused on the predictor variables most relevant to the target variable, examining their relationships with one another and with the target. This step gave initial insight into the data, particularly into which variables are correlated.
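
As an illustration of the correlation check described above, here is a minimal Python sketch, not the project's actual code. It computes the Pearson correlation between a continuous predictor and a 0/1 target (this is the point-biserial correlation). The names `pearson`, `age`, and `has_disease` and the toy values are assumptions made for the example and are not taken from heart.csv.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values, not taken from heart.csv.
age = [40, 45, 50, 55, 60, 65]
has_disease = [0, 0, 0, 1, 1, 1]
r = pearson(age, has_disease)
print(f"correlation between age and target: {r:.3f}")
```

A value near +1 or -1 suggests a strong linear association with the target, while a value near 0 suggests little linear association. It says nothing about nonlinear relationships.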

Data Splitting

Training and Testing Split: Split the data into training and testing sets to facilitate model validation.
Recipes: Developed and applied data preprocessing recipes.
Stratified Sampling: Used stratified sampling so that the training and testing sets have a balanced, comparable representation of the data.
Correlation Analysis: Explored correlations between continuous variables and the target variable to inform feature selection.
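
The stratified train/test split described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the project's actual code: it assumes the split is stratified on a binary outcome label, and the function name `stratified_split`, the 25% test fraction, and the toy labels are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx), sampling each class separately."""
    rng = random.Random(seed)
    # Group row indices by class label.
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction of every class for the test set.
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical labels: 60 patients without disease, 40 with.
labels = [0] * 60 + [1] * 40
train_idx, test_idx = stratified_split(labels, test_fraction=0.25, seed=1)
print(len(train_idx), len(test_idx))
```

Because each class is sampled separately, the test set keeps the same 60/40 class ratio as the full data, which a purely random split does not guarantee on small datasets.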

Model Building

Machine Learning Models: Developed multiple predictive models including:
- Logistic Regression
- Decision Trees
- Random Forests
- K-Nearest Neighbors (KNN)
- Linear Discriminant Analysis (LDA)
- Quadratic Discriminant Analysis (QDA)

Model Fitting and Tuning

Model Tuning: Tuned model hyperparameters to improve predictive performance.
Cross-Validation: Used cross-validation, fitting models (including random forests with the ranger engine and boosted trees with the XGBoost engine) to the resampled folds for a more robust evaluation.
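
To make the fold-based evaluation concrete, here is a minimal Python sketch of how k-fold cross-validation partitions the rows. It is an illustration only, not the project's actual code. The function name `k_fold_indices`, the choice of `k=5`, and the row count are assumptions for the example.

```python
import random


def k_fold_indices(n_rows, k=5, seed=0):
    """Yield (train_idx, validation_idx) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and n_rows")
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled rows into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        validation_idx = folds[held_out]
        train_idx = [
            i for j, fold in enumerate(folds) if j != held_out for i in fold
        ]
        yield train_idx, validation_idx


# Hypothetical example: 10 rows, 5 folds.
splits = list(k_fold_indices(n_rows=10, k=5, seed=1))
for train_idx, validation_idx in splits:
    print(sorted(validation_idx), "held out;", len(train_idx), "rows for fitting")
```

Each row is held out exactly once, so every candidate hyperparameter setting can be scored on data it was not fitted to, and the scores averaged across folds.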

Model Selection and Performance Evaluation

Performance Metrics: Evaluated model performance based on metrics such as the area under the ROC curve (AUC) for both training and testing datasets.
Heatmaps: Used heatmaps to visualize model performance and understand predictor relationships.
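
For readers unfamiliar with the AUC metric mentioned above, the following Python sketch computes it directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. This is an illustration with made-up scores, not the project's code or results. The name `roc_auc` is an assumption for the example.

```python
def roc_auc(y_true, y_score):
    """AUC as the chance that a random positive outranks a random negative."""
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Made-up labels and predicted probabilities, for illustration only.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, y_score)
print(auc)  # 0.75: three of the four positive/negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. Comparing AUC on the training and testing sets, as described above, helps reveal overfitting when the training value is much higher.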

Insights and Recommendations

Key Predictors: Identified significant predictors of heart disease from the model outcomes.
Actionable Recommendations: Provided recommendations based on the insights generated from the analysis.
Conclusion: Summarized the findings and insights, offering potential real-world applications of the model for heart disease prediction.

Conclusion

This project showcases my proficiency in machine learning, model evaluation, and predictive analytics in the healthcare domain. It demonstrates my ability to work with complex datasets and derive actionable insights, with potential application to the prediction of heart disease.

