Majority voting ensemble approach for predicting diabetes mellitus in female patients from unbalanced dataset
Date
2023
Authors
Muntasir, F.
Anower, M.S.
Nahiduzzaman, M.
Type
Conference paper
Citation
3rd International Conference on Electrical, Computer and Communication Engineering, ECCE 2023, 2023, pp.1-6
Conference Name
3rd International Conference on Electrical, Computer and Communication Engineering, ECCE 2023 (23 Feb 2023 - 25 Feb 2023 : Chittagong)
Abstract
Diabetes is a common yet deadly disease among women. In 2017, about one in every nine women in the USA alone was found to be affected by diabetes. Predicting the disease early, before it does further harm to the body, is therefore important. In this paper, we used machine learning algorithms to predict the possibility of diabetes in women based on behavioral and medical diagnosis data related to diabetes. The dataset used was the Pima Indians Diabetes Dataset from the National Institute of Diabetes and Digestive and Kidney Diseases. The class imbalance in the dataset was addressed using the synthetic minority oversampling technique (SMOTE). Various machine learning algorithms were chosen as base learners for the prediction. The research work focused on applying the hard voting (majority voting) ensemble technique to various combinations of those base algorithms. The combination of XGBoost, KNN, and Random Forest gave the best performance. According to the classification reports, the precision, recall, and F1 scores compared favorably with most recent and earlier works. The model could predict diabetes with an accuracy of 86%. SHAP analysis was then used to determine the impact of the features on the model's performance.
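The two techniques named in the abstract, SMOTE and hard voting, can be illustrated with a minimal standard-library sketch. The abstract does not describe the authors' implementation, so this is not their code: the function names `smote_oversample` and `hard_vote` are hypothetical, the sample points and the three prediction lists are made-up placeholders rather than outputs of XGBoost, KNN, or Random Forest, and the sketch assumes at least two minority samples and an odd number of binary classifiers so that no vote is tied.

```python
import random
from collections import Counter


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a randomly chosen sample and one of its k nearest minority neighbours.
    Assumes `minority` holds at least two feature vectors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def hard_vote(predictions):
    """Majority vote over per-model label lists (one list per model)."""
    voted = []
    for labels in zip(*predictions):
        # The most frequent label among the models wins for this sample.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Placeholder minority-class points (two features each), for illustration only.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_oversample(minority, n_new=3, k=2)

# Placeholder predictions standing in for three trained base classifiers.
model_a = [1, 0, 1, 1]
model_b = [1, 1, 0, 1]
model_c = [0, 0, 1, 1]
print(hard_vote([model_a, model_b, model_c]))  # [1, 0, 1, 1]
```

Each synthetic point lies on the line segment between two existing minority samples, which is how SMOTE enlarges the minority class without duplicating rows. In practice, oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the evaluation data.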
Rights
Copyright 2023 IEEE