Use of Robust Machine Learning Approach in Prediction of Stroke
DOI: https://doi.org/10.48047/
Keywords: Machine learning, stroke prediction, health care, risk assessment, accuracy, intervention.
Abstract
Background: Machine learning is transforming healthcare by analyzing varied data to predict conditions such as stroke, which aids prevention. Stroke, a major cause of disability and death, results from an abrupt interruption of blood flow to the brain, underscoring the need for early risk detection. This study aimed to explore the association of socio-demographic, clinical, and lifestyle factors with stroke, and to apply machine learning techniques to predict stroke occurrence from the assessed risk factors.
Methods: A case-control study of 1360 individuals (50% stroke patients) collected patient data from medical records. Statistical methods (Chi-square, correlation, t-test, Wilcoxon rank) were used to explore associations with stroke. Four machine learning algorithms (Decision Tree, Naïve Bayes, Random Forest, and Logistic Regression) were applied to build a predictive stroke model, and performance was evaluated with measures such as sensitivity, specificity, and F1 score.
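As an illustration of the evaluation measures named in the Methods, the following minimal sketch computes accuracy, precision, sensitivity, specificity, and F1 score from the four cells of a binary confusion matrix. The function name `classification_metrics`, the helper `_ratio`, and the example counts are hypothetical and are not taken from the study; the formulas themselves are the standard definitions.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp: stroke cases predicted as stroke (true positives)
    fp: non-stroke cases predicted as stroke (false positives)
    tn: non-stroke cases predicted as non-stroke (true negatives)
    fn: stroke cases predicted as non-stroke (false negatives)
    """
    precision = _ratio(tp, tp + fp)
    sensitivity = _ratio(tp, tp + fn)  # also called recall
    specificity = _ratio(tn, tn + fp)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        # F1 is the harmonic mean of precision and sensitivity.
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical counts for illustration only; NOT the study's confusion matrix.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In practice, counts like these would come from comparing a fitted model's predictions with the true labels on held-out data.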
Results: The study identified marked differences between the stroke and non-stroke groups in several health indicators: BMI, fasting blood glucose, triglycerides, total cholesterol, LDL Chol/HDL ratio (6.00 vs. 3.00), and LDL/HDL ratio (3.34 vs. 1.56). Conversely, HDL and VLDL levels were notably lower in stroke cases (43.28 vs. 59.93 and 8.50 vs. 42.68, respectively), and no significant differences were observed in age or HbA1C. Among the machine learning algorithms employed, Random Forest performed best when all attributes were considered, achieving accuracy, precision, sensitivity, specificity, F1-score, and area under the curve of 92.09%, 92.11%, 90.21%, 93.64%, 91.15%, and 91.93%, respectively.
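The reported Random Forest figures can be cross-checked against each other, since F1 is by definition the harmonic mean of precision and sensitivity. The short sketch below recomputes F1 from the reported precision and sensitivity, and also computes the mean of sensitivity and specificity (balanced accuracy) for comparison with the reported area under the curve. The variable names are illustrative, and the comparison with the area under the curve is only an observation: the abstract does not state how that value was computed.

```python
# Values reported in the Results for the Random Forest model, as fractions.
precision = 0.9211
sensitivity = 0.9021
specificity = 0.9364

# F1 is the harmonic mean of precision and sensitivity.
f1 = 2 * precision * sensitivity / (precision + sensitivity)

# Balanced accuracy is the mean of sensitivity and specificity.
# This is a comparison only; it is an assumption, not a statement of
# how the reported area under the curve was obtained.
balanced_accuracy = (sensitivity + specificity) / 2

print(f"F1 recomputed from precision and sensitivity: {f1:.4f}")
print(f"Mean of sensitivity and specificity: {balanced_accuracy:.4f}")
```

The recomputed F1 agrees with the reported 91.15% to two decimal places, and the mean of sensitivity and specificity lies within 0.01 percentage points of the reported 91.93% area under the curve.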
Conclusion: This study indicates that machine learning models can predict stroke occurrence from patient data, including sociodemographic, medical history, and lifestyle factors, potentially enabling future estimates of stroke probability based on risk factors and medical consultations.