Undersampling of approaching the classification boundary for imbalance problem.

Concurr. Comput. Pract. Exp. (2023)

Abstract
Using imbalanced data in classification degrades accuracy. If a classifier is trained directly on imbalanced data, its results can deviate substantially. A common approach to dealing with imbalanced data is to restructure the raw dataset by undersampling. Undersampling methods usually trim the majority class with random or clustering-based selection, since some majority-class samples do not contribute to the classification model. In this paper, a revised undersampling approach is proposed. First, we compress the space in the direction perpendicular to the separating hyperplane. Then, a hybrid ensemble learning method based on weighted random sampling is applied so that the sampled objects spread more widely near the separating hyperplane. Experiments with 7 undersampling methods on 21 imbalanced datasets show that our method achieves good results.
Keywords
classification, imbalanced data, separation hyperplane, undersampling
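
The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract gives no formulas, so the hyperplane parameters `w` and `b`, the shrink factor `alpha`, the exponential weighting with `scale`, and all function names are assumptions made for illustration. The sketch shrinks each point's offset along the hyperplane normal, then draws majority-class samples without replacement using weights that favour points near the hyperplane, and repeats the draw to build several balanced training sets for an ensemble.

```python
import math
import random


def signed_distance(x, w, b):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def compress_toward_hyperplane(x, w, b, alpha):
    """Scale the normal component of x by alpha (assumed form of 'space compression')."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    s = signed_distance(x, w, b)
    # Move x along the unit normal so its new signed distance is alpha * s.
    return tuple(xi - (1.0 - alpha) * s * wi / norm for xi, wi in zip(x, w))


def boundary_weighted_undersample(majority, w, b, k, scale=1.0, rng=None):
    """Draw k majority samples without replacement, favouring points near the hyperplane.

    The weight exp(-distance / scale) is an illustrative assumption.
    Sampling uses Efraimidis-Spirakis keys u ** (1 / weight); the k largest keys win.
    """
    if rng is None:
        rng = random.Random()
    keyed = []
    for x in majority:
        dist = abs(signed_distance(x, w, b))
        weight = max(math.exp(-dist / scale), 1e-12)  # floor avoids division by zero
        keyed.append((rng.random() ** (1.0 / weight), x))
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [x for _, x in keyed[:k]]


def balanced_subsets(majority, minority, w, b, n_subsets, seed=0):
    """Build n_subsets training sets, each pairing a fresh majority draw with all minority points."""
    rng = random.Random(seed)
    return [
        boundary_weighted_undersample(majority, w, b, len(minority), rng=rng) + list(minority)
        for _ in range(n_subsets)
    ]
```

Each subset returned by `balanced_subsets` could be used to train one base classifier, with the predictions combined by voting. How the paper actually obtains the hyperplane and combines its learners is not stated in the abstract.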