The Impact of Feature Selection Techniques on Software Defect Identification Models

2021 IEEE 12th International Conference on Software Engineering and Service Science (ICSESS) (2021)

Abstract

Defect identification is an important task for ensuring software quality. Recently, researchers have begun to use artificial intelligence techniques to improve the usability of static analysis (SA) tools by automatically identifying true defects among the reported SA alarms. Existing methods mainly use static code features to represent the defective code. However, irrelevant and redundant features threaten the performance of these machine learning methods. Feature selection techniques can be applied to alleviate this problem. Since many feature selection methods have been proposed, this paper conducts a rigorous experimental evaluation of the impact of feature selection techniques on defect identification and explores whether there is a smallest ratio of selected features that still yields defect identification models with acceptable performance. Additionally, this paper proposes an effective feature selection approach based on the idea of majority voting, combining the output results of different feature selection techniques. The experimental results for five open-source projects show that there is a best ratio (20%) for feature selection, which achieves satisfactory performance with far fewer features used for defect identification. This finding can serve as a practical guideline for software defect identification.
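
The abstract does not spell out how the voting is carried out, so the following is only a minimal sketch of one plausible reading: each feature selection technique scores every feature, keeps its top 20%, and a feature is retained when more than half of the techniques kept it. The function names `top_ratio` and `majority_vote`, the strict-majority threshold, and the toy scores are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from math import ceil


def top_ratio(scores, ratio=0.2):
    """Return the indices of the top `ratio` share of features by score."""
    k = max(1, ceil(len(scores) * ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def majority_vote(score_lists, ratio=0.2):
    """Keep the features chosen by more than half of the selectors.

    `score_lists` holds one list of per-feature scores per technique
    (hypothetical inputs; any scoring method could produce them).
    """
    votes = Counter()
    for scores in score_lists:
        votes.update(top_ratio(scores, ratio))
    threshold = len(score_lists) / 2  # assumed: strict majority
    return sorted(i for i, v in votes.items() if v > threshold)


# Toy scores for 10 features from three hypothetical techniques.
scores_a = [0.9, 0.1, 0.8, 0.2, 0.3, 0.0, 0.05, 0.15, 0.25, 0.35]
scores_b = [0.7, 0.6, 0.1, 0.2, 0.3, 0.0, 0.05, 0.15, 0.25, 0.35]
scores_c = [0.1, 0.2, 0.9, 0.8, 0.3, 0.0, 0.05, 0.15, 0.25, 0.35]

print(majority_vote([scores_a, scores_b, scores_c], ratio=0.2))  # [0, 2]
```

With a strict majority the final set can be smaller than 20% of the features, because a feature picked by only one technique is dropped. A variant that ranks features by vote count and keeps exactly the top 20% would be an equally reasonable interpretation.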

Keywords

Software Defect Identification, Feature Selection, Static Analysis, Machine Learning, Model Evaluation
