Comparative Analysis of Machine Learning Algorithms for Heart Disease Prediction Using CDC and BRFSS Data: A Focus on Oversampling and Ensemble Techniques
Date
2024-11-25
Authors
Amokun R.A.
Arowolo O.T.
Eke J.I.
Publisher
MIRG
Abstract
This work studies machine learning methods for predicting heart disease using data obtained from the CDC through the Behavioral Risk Factor Surveillance System (BRFSS). It compares several models, including Logistic Regression and Random Forest, which can be further tuned to support better outcomes in heart disease treatment and prevention. To handle the class imbalance problem in heart disease classification, the SMOTE oversampling technique was applied, and model performance was evaluated on metrics such as Accuracy and Precision, among others. On the F-score and the area under the Receiver Operating Characteristic (ROC) curve, the XGBoost model stood out, with an F-score of 0.80 and an ROC area of 0.92. Applying SMOTE also improved the identification of minority-class cases, which supports more balanced and reliable predictions. The study shows how machine learning and oversampling techniques can improve the diagnosis of heart disease, equipping health professionals with tools for early diagnosis and timely treatment. The combination of multiple algorithms and ensemble techniques provides a strong basis for predictive modeling in the healthcare sector.
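To make the two ideas named in the abstract concrete, the following is a minimal sketch of SMOTE-style oversampling and of the F-score, written with the Python standard library only. It is not the authors' implementation: the function names `smote_sketch` and `f1_score`, the toy data, and the parameter defaults are all assumptions made for illustration, and a real study would more likely rely on an established library.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Simplified SMOTE: build n_new synthetic minority samples.

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    (Hypothetical helper for illustration only.)
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def f1_score(y_true, y_pred):
    """F-score (F1): harmonic mean of precision and recall for class 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy two-feature minority class (made-up values, not BRFSS data).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6)

# Toy labels and predictions (made-up values).
score = f1_score([1, 0, 1, 1, 0], [1, 1, 0, 1, 0])
```

Because every synthetic point is an interpolation between two real minority samples, the new points stay inside the region the minority class already occupies. Oversampling of this kind is normally applied to the training split only, so that evaluation metrics such as the F-score are computed on untouched data.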
Description
Scholarly article
Citation
Amokun R.A., Arowolo O.T. & Eke J.I. (2024). "Comparative Analysis of Machine Learning Algorithms for Heart Disease Prediction Using CDC and BRFSS Data: A Focus on Oversampling and Ensemble Techniques". In Proceedings of the International Conference on Artificial Intelligence and Robotics (MIRG-ICAIR 2024), pp. 107-117, MIRG.