The Jeopardy of Learning from Over-Sampled Class-Imbalanced Medical Datasets.

Ahmad B. A. Hassanat, Ghada Awad Altarawneh, Ibraheem M. Alkhawaldeh, Yasmeen Jamal Alabdallat, Amir F. Atiya, Ahmad Abujaber, Ahmad S. Tarawneh

ISCC (2023)

Abstract

This paper examines the usefulness of oversampling for class-imbalanced structured medical datasets. Specifically, we examine the prevailing assumption behind oversampling that synthesized instances belong to the minority class. We tested this assumption with an off-the-shelf oversampling validation system. According to its experimental results, for at least one of the three medical datasets used, the validated oversampling methods generated samples that did not belong to the minority class. The error rate varied with the dataset and the oversampling method tested. We therefore claim that synthesizing new instances without first confirming that they align with the minority class is risky, especially in medical fields, where misdiagnosis can have serious repercussions. As alternatives to oversampling, we advise ensemble, data-partitioning, and method-level approaches, since they do not rest on this assumption.
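To make the assumption concrete, the following is a minimal sketch and not the validation system used in the paper. It assumes a SMOTE-style interpolation between minority points and a simple k-nearest-neighbour vote over the real data as a stand-in check; the function names `smote_like` and `suspect_fraction`, the toy data, and the choice of k are all hypothetical.

```python
import math
import random

def smote_like(minority, k, n_new, rng):
    """Create n_new points by interpolating between a random minority point
    and one of its k nearest minority neighbours (SMOTE-style, assumed here)."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + t * (o - b) for b, o in zip(base, other)))
    return synthetic

def suspect_fraction(synthetic, majority, minority, k=3):
    """Fraction of synthetic points whose k nearest REAL neighbours are
    mostly majority-class, i.e. points that may not belong to the minority."""
    labelled = [(p, 0) for p in majority] + [(p, 1) for p in minority]
    suspects = 0
    for s in synthetic:
        nearest = sorted(labelled, key=lambda item: math.dist(s, item[0]))[:k]
        minority_votes = sum(label for _, label in nearest)
        if minority_votes * 2 < k:  # majority class wins the vote
            suspects += 1
    return suspects / len(synthetic)

# Toy data (hypothetical): two minority clusters on either side of a majority cluster.
majority = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1)]
minority = [(-2.0, 0.0), (-2.1, 0.0), (-1.9, 0.0), (2.0, 0.0), (2.1, 0.0), (1.9, 0.0)]

new_points = smote_like(minority, k=5, n_new=20, rng=random.Random(0))
print("suspect fraction:", suspect_fraction(new_points, majority, minority))
```

With k=5 in `smote_like`, a minority point's neighbours can lie in the opposite cluster, so an interpolated point may land inside the majority region; the vote in `suspect_fraction` is one simple way to flag such points before they are added to a training set.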

Keywords

machine learning, class imbalance, medical apps, Easy ensemble, overfitting, misdiagnosis
