Fuzzy clustering decomposition of genetic algorithm-based instance selection for regression problems

Information Sciences (2022)

Abstract
Data selection, which includes feature and instance selection, is often an important step in building prediction systems. Genetic algorithms (GAs) often find better solutions than classical methods in many areas, and this also holds for the instance selection task. The main challenges in GA-based instance selection are high computational complexity and performance that degrades as the dataset grows. Both stem from the encoding: each instance occupies one chromosome position, so larger datasets result in longer chromosomes. The main contribution of this paper addresses these problems with a three-step process. In the first step, the dataset is divided into several consistent regions by fuzzy clustering. Then GA-based instance selection is performed independently within each cluster. Finally, ensemble voting aggregates the partial results from the overlapping clusters. This improves dataset exploitation through a more localized search and also takes advantage of ensemble methods. The method significantly improves predictive model performance and data reduction in comparison to instance selection performed on the whole training dataset.
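The three-step process described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a plain fuzzy c-means loop, a membership threshold (`thresh`) to form overlapping clusters, a toy GA whose fitness is the 1-NN regression error plus a size penalty (`alpha`), and a simple majority vote. All function names, parameters, and the synthetic data are hypothetical choices made for illustration.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, iters=50, seed=0):
    """Return an (n, c) membership matrix from a plain fuzzy c-means loop."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U

def fitness(mask, X, y, alpha=0.05):
    """Lower is better: 1-NN error using kept instances, plus a size penalty."""
    if mask.sum() < 2:
        return np.inf
    kept = np.flatnonzero(mask)
    d = np.linalg.norm(X[:, None, :] - X[kept][None, :, :], axis=2)
    d[kept, np.arange(len(kept))] = np.inf  # a kept point may not predict itself
    pred = y[kept[d.argmin(axis=1)]]
    return np.mean((pred - y) ** 2) + alpha * mask.mean()

def ga_select(X, y, pop=20, gens=30, p_mut=0.05, seed=0):
    """Toy GA: one chromosome bit per instance (True means keep)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    P = rng.random((pop, n)) < 0.5
    for _ in range(gens):
        f = np.array([fitness(ind, X, y) for ind in P])
        elite = P[np.argsort(f)][: pop // 2]          # truncation selection
        n_child = pop - len(elite)
        a = elite[rng.integers(0, len(elite), n_child)]
        b = elite[rng.integers(0, len(elite), n_child)]
        children = np.where(rng.random((n_child, n)) < 0.5, a, b)  # uniform crossover
        children ^= rng.random(children.shape) < p_mut             # bit-flip mutation
        P = np.vstack([elite, children])
    f = np.array([fitness(ind, X, y) for ind in P])
    return P[np.argmin(f)]

def clustered_selection(X, y, c=3, thresh=0.2, seed=0):
    """Step 1: fuzzy clusters. Step 2: GA per cluster. Step 3: vote."""
    U = fuzzy_c_means(X, c, seed=seed)
    votes = np.zeros(len(X))
    counts = np.zeros(len(X))
    for j in range(c):
        idx = np.flatnonzero(U[:, j] >= thresh)  # overlapping membership (assumed rule)
        if len(idx) < 4:
            continue
        mask = ga_select(X[idx], y[idx], seed=seed + j)
        votes[idx] += mask
        counts[idx] += 1
    # Keep an instance if at least half of its clusters voted to keep it;
    # instances that fell into no cluster are kept by default.
    return np.where(counts > 0, votes >= 0.5 * counts, True)

# Synthetic regression data, purely for demonstration.
rng = np.random.default_rng(0)
X = rng.normal(size=(90, 2))
y = X[:, 0] ** 2 + 0.1 * rng.normal(size=90)
keep = clustered_selection(X, y)
print(keep.sum(), "of", len(X), "instances kept")
```

Each GA run here works on a chromosome as long as one cluster rather than the whole dataset, which is the point of the decomposition. The threshold-based overlap and the majority vote are stand-ins for whatever aggregation scheme the paper actually uses.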
Keywords
Instance selection, Genetic algorithms, Clustering, Ensembles