Gestational Diabetes Prediction in Pregnancy: A Machine Learning and Data Preprocessing Approach

Tanver Hasan Riyed, Tasnia Nabi, Aishwariya Dutta, Md. Kamrul Hasan, Ferdous Wahid Anik, Akid Ornob

2023 26th International Conference on Computer and Information Technology (ICCIT) (2023)

Abstract
Gestational diabetes mellitus (GDM) is characterized by glucose intolerance during pregnancy, resulting in elevated blood glucose levels and both short-term and long-term health burdens. Early screening would therefore help reduce the complications associated with GDM and adverse pregnancy outcomes. Machine learning (ML) algorithms are a promising alternative to manual early-stage GDM assessment. In this article, we propose an ML pipeline that employs five distinct classifiers: decision trees (DT), linear discriminant analysis (LDA), logistic regression (LR), XGBoost (XGB), and Gaussian naive Bayes (GNB). Our framework incorporates the essential preprocessing stages, such as imputing missing values, selecting important features, and tuning hyperparameters, and applies stratified K-fold cross-validation to improve the model’s robustness and precision. In a comprehensive comparison of three missing-data imputation techniques, K-Nearest Neighbors (KNN) imputation outperforms the other two within the proposed framework. In addition, a feature selection procedure retains eight of the fifteen features. Finally, combining the XGB classifier with the presented preprocessing improves performance by significant margins, yielding the highest accuracy of 0.9719 and an area under the ROC curve of 0.9982. These promising results make our pipeline useful for GDM prediction in the earliest stages.
Keywords
Diabetes prediction framework, Missing value imputation, Feature selection, ML classifiers, GDM dataset
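
Two of the preprocessing steps named in the abstract, KNN imputation of missing values and stratified K-fold splitting, can be illustrated with a minimal standard-library sketch. The abstract does not describe the authors' implementation, so everything here is an assumption made for illustration: the function names `knn_impute` and `stratified_k_fold`, the distance rescaling, the values of `k` and `n_splits`, and the toy data. In practice, library implementations such as scikit-learn's `KNNImputer` and `StratifiedKFold` would be the more likely choice.

```python
import math
import random
from collections import defaultdict


def knn_impute(rows, k=3):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Distance is Euclidean over the features both rows have observed,
    rescaled by the share of features actually compared (an assumption
    of this sketch, not a detail taken from the paper).
    """
    n_features = len(rows[0])

    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        sq = sum((x - y) ** 2 for x, y in shared)
        return math.sqrt(sq * n_features / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows where feature j is observed.
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d < math.inf]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


def stratified_k_fold(labels, n_splits=5, seed=0):
    """Yield (train_idx, test_idx) pairs that keep class proportions per fold."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a near-equal share.
        for pos, idx in enumerate(indices):
            folds[pos % n_splits].append(idx)
    for f in range(n_splits):
        test = sorted(folds[f])
        train = sorted(i for g in range(n_splits) if g != f for i in folds[g])
        yield train, test


# Toy data (hypothetical, not from the GDM dataset).
rows = [[1.0, 2.0], [1.1, None], [5.0, 6.0], [5.2, 6.2]]
filled = knn_impute(rows, k=1)

labels = [0] * 8 + [1] * 4
splits = list(stratified_k_fold(labels, n_splits=4))
```

With the toy data, the missing value in the second row is taken from its single nearest neighbor (the first row), and each of the four test folds holds two samples of class 0 and one of class 1, mirroring the 2:1 class ratio of the full label list.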