Code smell detection using feature selection and stacking ensemble: An empirical investigation

Information and Software Technology(2021)

引用 23|浏览26
暂无评分
摘要
Context: Code smell detection is the process of identifying code pieces that are poorly designed and implemented. Recently more research has been directed towards machine learning-based approaches for code smells detection. Many classifiers have been explored in the literature, yet, finding an effective model to detect different code smells types has not yet been achieved. Objective: The main objective of this paper is to empirically investigate the capabilities of stacking heterogeneous ensemble model in code smell detection. Methods: Gain feature selection technique was applied to select relevant features in code smell detection. Detection performance of 14 individual classifiers was investigated in the context of two class-level and four method-level code smells. Then, three stacking ensembles were built using all individual classifiers as base classifiers, and three different meta-classifiers (LR, SVM and DT). Results: GP, MLP, DT and SVM(Lin) classifiers were among the best performing classifiers in detecting most of the code smells. On the other hand, SVM(Sig), NB(B), NB(M), and SGD were among the least accurate classifiers for most smell types. The stacking ensemble with LR and SVM meta-classifiers achieved a consistent high detection performance in class-level and method-level code smells compared to all individual models. Conclusion: This paper concludes that the detection performance of the majority of individual classifiers varied from one code smell type to another. However, the detection performance of the stacking ensemble with LR and SVM meta-classifiers was consistently superior over all individual classifiers in detecting different code smell types.
更多
查看译文
关键词
Code smell detection,Classification,Ensemble learning,Stacking,Feature selection
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要