Feature selection for classification tasks: Expert knowledge or traditional methods?

Journal of Intelligent & Fuzzy Systems (2018)

Abstract
Recently, available data has grown explosively in both number of samples and dimensionality. Large volumes of high-dimensional data bring noisy, redundant and irrelevant dimensions, which can increase the time and computational cost of learning and even degrade the performance of learning tasks. One way to reduce dimensionality is Feature Selection (FS). The aim of this paper is to study feature selection based on expert knowledge and on traditional methods (filter, wrapper and embedded) and to analyze their performance in classification tasks. Three datasets from the human cancer domain were used for feature selection: Breast Cancer (BC), Primary Tumor (PT) and Central Nervous System (CNS). C4.5, K-Nearest Neighbors, Support Vector Machine and Multi-Layer Perceptron classifiers were trained with the best subset of features for each cancer dataset. The subset of features selected by the wrapper method achieves the best average accuracy on the BC and PT datasets, while the subset selected by the embedded method reaches the highest average accuracy on the CNS dataset.
Keywords
Feature selection, expert knowledge, traditional methods, filter, wrapper, embedded
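
The difference between the filter and wrapper families named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's setup: the data is synthetic (one informative feature plus two noise features), the filter score is absolute Pearson correlation with the label, and the wrapper uses greedy forward selection scored by leave-one-out accuracy of a 1-nearest-neighbour classifier. Embedded and expert-knowledge selection are not shown.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0


def filter_select(X, y, k):
    """Filter: score each feature on its own, independent of any classifier."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: -scores[j])
    return sorted(ranked[:k])


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of 1-NN restricted to the given features."""
    hits = 0
    for i, row in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((row[f] - X[j][f]) ** 2 for f in features),
        )
        hits += y[nearest] == y[i]
    return hits / len(X)


def wrapper_select(X, y, k):
    """Wrapper: greedy forward selection, scored by the classifier itself."""
    selected, remaining = [], list(range(len(X[0])))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: loo_accuracy(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return sorted(selected)


def make_toy_data(n=60, seed=0):
    """Hypothetical data: feature 0 separates the classes, 1 and 2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([label * 4.0 + rng.gauss(0, 0.5),
                  rng.gauss(0, 1.0),
                  rng.gauss(0, 1.0)])
        y.append(label)
    return X, y


X, y = make_toy_data()
filter_subset = filter_select(X, y, k=1)
wrapper_subset = wrapper_select(X, y, k=1)
print("filter:", filter_subset, "wrapper:", wrapper_subset)
```

The filter scores features without training a model, so it is cheap but blind to how a classifier will use them. The wrapper retrains and evaluates the classifier for each candidate subset, which costs more but ties the selection directly to classification accuracy.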