Self-adaptive parameter and strategy based particle swarm optimization for large-scale feature selection problems with multiple classifiers

Applied Soft Computing (2020)

Cited 99 | Viewed 56
Abstract
Feature selection has been widely used in classification to improve classification accuracy and reduce computational complexity. Recently, evolutionary computation (EC) has become an important approach for solving feature selection problems. However, as the datasets processed by classifiers grow larger and more complex, they may contain more irrelevant and redundant features, and the large-scale feature space may contain more local optima. Traditional EC algorithms that rely on a single candidate solution generation strategy (CSGS) with fixed parameter values may therefore perform poorly when searching for optimal feature subsets in large-scale feature selection problems. In addition, many existing studies evaluate feature subsets with only one classifier; to demonstrate the effectiveness of evolutionary algorithms for feature selection, more classifiers should be tested. Thus, to solve large-scale feature selection problems efficiently and to examine whether EC-based feature selection is effective for a wider range of classifiers, this paper proposes a self-adaptive parameter and strategy based particle swarm optimization (SPS-PSO) algorithm that works with multiple classifiers. SPS-PSO uses a solution representation scheme and five CSGSs. To automatically adjust the CSGSs and their parameter values during the evolutionary process, a strategy self-adaptive mechanism and a parameter self-adaptive mechanism are embedded in the particle swarm optimization (PSO) framework. With these self-adaptive mechanisms, SPS-PSO can adjust both the CSGSs and their parameter values when solving different large-scale feature selection problems, which gives it good global and local search ability on such problems. Moreover, four classifiers, i.e., k-nearest neighbor (KNN), linear discriminant analysis (LDA), extreme learning machine (ELM), and support vector machine (SVM), are individually used as evaluation functions to test the effectiveness of the feature subsets generated by SPS-PSO. Nine datasets from the UCI Machine Learning Repository and the Causality Workbench are used in the experiments; all nine have more than 600 dimensions, and two have more than 5,000 dimensions. The experimental results show that the strategy and parameter self-adaptive mechanisms improve the performance of the evolutionary algorithms, and that SPS-PSO achieves higher classification accuracy and more concise solutions than the other algorithms on the selected large-scale feature selection problems. In addition, feature selection improves classification accuracy and reduces computational time for the various classifiers, and KNN proves to be a better surrogate model than the other classifiers used in these experiments.
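To make the general idea concrete, the sketch below shows a minimal PSO-based wrapper feature selection loop with a simple strategy self-adaptation mechanism in the spirit of the abstract: particles encode feature subsets, a KNN classifier's cross-validated accuracy serves as the fitness, and the update strategy for each particle is chosen by roulette-wheel selection over recent success counts. This is an illustrative approximation only, not the authors' SPS-PSO implementation: the three example update strategies, the 0.6 selection threshold, the synthetic dataset, and all coefficient values are assumptions for demonstration, the paper's five CSGSs and its parameter self-adaptive mechanism are not reproduced here.

```python
# Illustrative sketch (not the authors' code): binary wrapper feature selection
# with PSO and roulette-wheel strategy self-adaptation. KNN is the surrogate
# evaluator, as in the paper; everything else is a simplifying assumption.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for a high-dimensional dataset (the paper uses UCI and
# Causality Workbench datasets with 600+ dimensions).
X, y = make_classification(n_samples=300, n_features=200, n_informative=20,
                           n_redundant=30, random_state=0)

def fitness(mask):
    """Wrapper evaluation: cross-validated KNN accuracy on the selected subset."""
    if mask.sum() == 0:                              # empty subset is infeasible
        return 0.0
    clf = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(clf, X[:, mask.astype(bool)], y, cv=3).mean()

n_particles, n_features, n_iter = 20, X.shape[1], 30
pos = rng.random((n_particles, n_features))          # continuous positions in [0, 1]
vel = np.zeros_like(pos)
masks = (pos > 0.6).astype(int)                      # assumed selection threshold
fits = np.array([fitness(m) for m in masks])
pbest_pos, pbest_fit = pos.copy(), fits.copy()
g = int(np.argmax(fits)); gbest_pos, gbest_fit = pos[g].copy(), fits[g]

# Three example candidate solution generation strategies (the paper uses five).
def csgs_standard(i):
    r1, r2 = rng.random(n_features), rng.random(n_features)
    return 0.7 * vel[i] + 1.5 * r1 * (pbest_pos[i] - pos[i]) + 1.5 * r2 * (gbest_pos - pos[i])

def csgs_cognitive(i):
    return 0.7 * vel[i] + 2.0 * rng.random(n_features) * (pbest_pos[i] - pos[i])

def csgs_social(i):
    return 0.7 * vel[i] + 2.0 * rng.random(n_features) * (gbest_pos - pos[i])

strategies = [csgs_standard, csgs_cognitive, csgs_social]
success = np.ones(len(strategies))                   # credit counters for roulette selection

for _ in range(n_iter):
    probs = success / success.sum()                  # strategy self-adaptation: pick a CSGS
    for i in range(n_particles):                     # proportionally to its recent success
        s = rng.choice(len(strategies), p=probs)
        vel[i] = strategies[s](i)
        pos[i] = np.clip(pos[i] + vel[i], 0.0, 1.0)
        f = fitness((pos[i] > 0.6).astype(int))
        if f > pbest_fit[i]:                         # reward the strategy that improved pbest
            pbest_fit[i], pbest_pos[i] = f, pos[i].copy()
            success[s] += 1
            if f > gbest_fit:
                gbest_fit, gbest_pos = f, pos[i].copy()

selected = np.flatnonzero(gbest_pos > 0.6)
print(f"best CV accuracy {gbest_fit:.3f} with {selected.size} of {n_features} features")
```

The wrapper design mirrors the abstract's setup: the classifier only scores candidate subsets, so KNN can be swapped for LDA, ELM, or SVM without changing the search loop, at the cost of re-running cross-validation for every evaluated subset.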
Keywords
00-01,99-00