Understanding and Detecting Harmful Code.

Rodrigo Lima,Jairo Souza,Baldoino Fonseca,Leopoldo Teixeira,Rohit Gheyi,Márcio Ribeiro,Alessandro F. Garcia,Rafael Maiani de Mello

SBES（2020）

引用 2|浏览77

暂无评分

摘要

Code smells typically indicate poor design implementation and choices that may degrade software quality. Hence, they need to be carefully detected to avoid such poor design. In this context, some studies try to understand the impact of code smells on the software quality, while others propose rules or machine learning-based techniques to detect code smells. However, none of those studies or techniques focus on analyzing code snippets that are really harmful to software quality. This paper presents a study to understand and classify code harmfulness. We analyze harmfulness in terms of CLEAN, SMELLY, BUGGY, and HARMFUL code. By HARMFUL CODE, we define a SMELLY code element having one or more bugs reported. These bugs may have been fixed or not. Thus, the incidence of HARMFUL CODE may represent a increased risk of introducing new defects and/or design problems during its fixing. We perform our study with 22 smell types, 803 versions of 13 open-source projects, 40,340 bugs and 132,219 code smells. The results show that even though we have a high number of code smells, only 0.07% of those smells are harmful. The Abstract Function Call From Constructor is the smell type more related to HARMFUL CODE. To cross-validate our results, we also perform a survey with 60 developers. Most of them (98%) consider code smells harmful to the software, and 85% of those developers believe that code smells detection tools are important. But, those developers are not concerned about selecting tools that are able to detect HARMFUL CODE. We also evaluate machine learning techniques to classify code harmfulness: they reach the effectiveness of at least 97% to classify HARMFUL CODE. While the Random Forest is effective in classifying both SMELLY and HARMFUL CODE, the Gaussian Naive Bayes is the less effective technique. Our results also suggest that both software and developers' metrics are important to classify HARMFUL CODE.

查看译文

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要