Revisiting Code Smell Severity Prioritization using learning to rank techniques

EXPERT SYSTEMS WITH APPLICATIONS(2024)

引用 0|浏览9
暂无评分
摘要
Code Smell Severity Prioritization (CSSP) is crucial in helping software developers minimize software maintenance costs and enhance software quality, particularly when faced with limited refactoring resources. Traditional code smell prioritization methods rely heavily on manual and semi -automatic approaches based on developer experience, often demanding considerable time and effort from experienced experts. Leveraging automated machine learning techniques can effectively overcome these limitations. However, most existing machine learning -based CSSP works have only considered limited pointwise Learning To Rank (LTR) algorithms and have used inappropriate metrics (e.g., Accuracy, Spearman, and MAE) to assess the performance of models. To address these limitations, we make a comprehensive comparison of 41 pointwise, 4 pairwise, and 4 listwise LTR algorithms for CSSP on four code smell severity datasets. Furthermore, we propose the adoption of Severity@20% and Cumulative Lift Chart (CLC) as the primary evaluation metrics to assess CSSP models more effectively. The results show that: (1) The ordinal Bagging (O -Bagging) algorithm demonstrates the highest performance for CSSP, achieving superior results in Severity@20% and CLC. (2) The ordinal classification method can help the top -performing base classification algorithms Bagging and XGBoost achieve better performance for CSSP tasks. (3) A higher (lower) Accuracy, higher (lower) Spearman, and lower (higher) MAE do not reliably indicate better (worse) performance for CSSP. This further underscores that Accuracy, Spearman, and MAE are unsuitable metrics for evaluating CSSP models' effectiveness. To summarize, our study suggest that developers employ the O -Bagging algorithm for CSSP, with Severity@20% and CLC serving as the primary evaluation metrics.
更多
查看译文
关键词
Code smell,Severity prioritization,Learning to rank,Empirical study
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要