On the Influence of Biases in Bug Localization: Evaluation and Benchmark

2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 2022

Abstract
Bug localization is the task of identifying the parts of the source code that need to be changed to resolve a bug report. As this task is difficult, automatic bug localization tools have been proposed. The development and evaluation of these tools rely on the availability of high-quality bug report datasets. In 2014, Kochhar et al. identified three biases in datasets used to evaluate bug localization techniques: (1) misclassified bug reports, (2) already localized bug reports, and (3) incorrect ground truth files in bug reports. They reported that already localized bug reports statistically significantly and substantially impact bug localization results, and thus should be removed. However, their evaluation is limited, as they investigated only 3 projects written in Java. In this study, we replicate the study of Kochhar et al. on the effect of biases in bug report datasets for bug localization. Further investigation of this topic is necessary, as new and larger bug report datasets have been proposed without being checked for these biases. We conduct our analysis on a collection of 2,913 bug reports from the recently released Bugzbook dataset whose fixes modify Python files. To investigate the prevalence of the biases, we check the bias distributions. For each bias, we select and label a set of bug reports that may contain the bias and compute the proportion of bug reports in the set that exhibit the bias. We find that 5%, 23%, and 30% of the bug reports that we investigated are affected by biases 1, 2, and 3, respectively. Then, we investigate the effect of the three biases on bug localization by measuring the performance of IncBL, a recent bug localization tool, and the classical Vector Space Model (VSM) based bug localization tool, which was used in the Kochhar et al. study. Our experimental results show that bias 2 significantly impacts the bug localization results, while biases 1 and 3 do not have a significant impact. We also find that the effect size of bias 2 differs between IncBL and VSM, with IncBL showing a higher effect size than VSM. Our findings corroborate the result reported by Kochhar et al. and demonstrate that bias 2 affects not only the 3 Java projects investigated in their study, but also projects in another programming language (i.e., Python). This highlights the need to eliminate bias 2 from the evaluation of future bug localization tools. As a by-product of our replication study, we have released a benchmark dataset, which we refer to as CAPTURED, that has been cleaned of the three biases. CAPTURED contains Python programs and therefore augments the cleaned dataset released by Kochhar et al., which contains only Java programs.
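To make the two central ideas concrete, the following is a minimal sketch, not the authors' implementation, of (a) a classical VSM ranker that scores source files by TF-IDF cosine similarity to a bug report, and (b) a crude check for bias 2 (already localized reports). The function names, the `files` dictionary shape, and the substring heuristic are all assumptions made for illustration; the abstract describes labeling reports, and this heuristic is only a rough automated stand-in for that step.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def rank_files(report, files):
    """Rank files by TF-IDF cosine similarity to a bug report.

    `files` maps a file path to its contents (hypothetical input shape).
    Returns the file paths, most similar first.
    """
    docs = {path: Counter(tokenize(src)) for path, src in files.items()}
    n_docs = len(docs)
    # Document frequency: in how many files each term occurs.
    df = Counter(term for counts in docs.values() for term in counts)
    idf = {term: math.log(n_docs / d) + 1.0 for term, d in df.items()}

    def weigh(counts):
        # Terms unseen in the corpus carry no weight.
        return {t: c * idf[t] for t, c in counts.items() if t in idf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query = weigh(Counter(tokenize(report)))
    scores = {path: cosine(query, weigh(c)) for path, c in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)


def is_already_localized(report, fixed_files):
    """Assumed heuristic for bias 2: the report names a fixed file."""
    text = report.lower()
    return any(p.rsplit("/", 1)[-1].lower() in text for p in fixed_files)


# Toy data, invented for this example only.
files = {
    "pkg/parser.py": "def parse_token(stream): return stream.next_token()",
    "pkg/render.py": "def render_page(template): return template.format()",
}
report = "Crash in parser.py when parse_token reads an empty stream"
ranking = rank_files(report, files)
localized = is_already_localized(report, ["pkg/parser.py"])
```

In this toy case the report mentions `parser.py` directly, so it would be flagged as already localized. Reports like this make the localization task artificially easy, which is why the study recommends excluding them from evaluation.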
Keywords
Bug Report,Bug Localization,Bias,Python