Majority voting ensemble approach for predicting diabetes mellitus in female patients from unbalanced dataset

Date

2023

Authors

Muntasir, F.
Anower, M.S.
Nahiduzzaman, M.

Type

Conference paper

Citation

3rd International Conference on Electrical, Computer and Communication Engineering, ECCE 2023, 2023, pp. 1-6

Conference Name

3rd International Conference on Electrical, Computer and Communication Engineering, ECCE 2023 (23 Feb 2023 - 25 Feb 2023 : Chittagong)

Abstract

Diabetes is a common yet deadly disease among women. In 2017, about one in every nine women in the USA alone was found to be exposed to diabetes. Predicting the disease early, before it becomes more dangerous to the body, is therefore important. In this paper, we used machine learning algorithms to predict the possibility of diabetes in women based on behavioral and medical diagnosis data related to diabetes. The dataset used was the Pima Indians Diabetes Dataset from the National Institute of Diabetes and Digestive and Kidney Diseases. The class imbalance in the dataset was handled using the synthetic minority oversampling technique (SMOTE). Various machine learning algorithms were chosen as base classifiers for the prediction. The research work focused on applying the hard voting, or majority voting, ensemble technique to various combinations of those base algorithms. The aggregation of XGBoost, KNN and Random Forest gave the best performance. According to the classification reports, the precision, recall, and F1 scores were very good compared with most of the recent and earlier works. The model could predict diabetes with an accuracy of 86%. SHAP analysis was then used to determine the impact of the features on the model performance.
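
The two techniques named in the abstract, SMOTE oversampling and hard (majority) voting, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation, which the record does not describe; the function names `smote_sketch` and `hard_vote`, the parameter `k`, and the toy data are all hypothetical and chosen only for illustration. The first function creates synthetic minority-class points by interpolating between a minority sample and one of its nearest minority neighbours; the second returns, for each sample, the label predicted by most base models.

```python
import math
import random
from collections import Counter


def smote_sketch(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each placed on the segment between a
    random minority sample and one of its k nearest minority neighbours.
    Assumes at least two minority samples (hypothetical helper)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, excluding base itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def hard_vote(predictions):
    """Majority vote over base models.
    predictions is a list with one list of labels per model."""
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


# Toy minority-class points (two features each); values are made up.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(minority, n_new=4)

# Toy predictions from three base models on four samples; values are made up.
model_a = [1, 0, 1, 0]
model_b = [1, 1, 0, 0]
model_c = [0, 1, 1, 0]
print(hard_vote([model_a, model_b, model_c]))  # [1, 1, 1, 0]
```

With three binary base classifiers, as in the best-performing combination reported in the abstract, a majority always exists, so no tie-breaking rule is needed. In practice, library implementations of both techniques would normally be used instead of hand-written ones.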

Rights

Copyright 2023 IEEE