Information and Software Technology(2022)

引用 0|浏览2
Learning-based automatic program repair techniques are showing promise to provide quality fix suggestions for detected bugs in the source code of the software. These tools mostly exploit historical data of buggy and fixed code changes and are heavily dependent on bug localizers while applying to a new piece of code. With the increasing popularity of code review, dependency on bug localizers can be reduced. Besides, the code review-based bug localization is more trustworthy since reviewers’ expertise and experience are reflected in these suggestions. The natural language instructions scripted on the review comments are enormous sources of information about the bug’s nature and expected solutions. However, none of the learning-based tools has utilized the review comments to fix programming bugs to the best of our knowledge. In this study, we investigate the performance improvement of repair techniques using code review comments. We train a sequence-to-sequence model on 55,060 code reviews and associated code changes. We also introduce new tokenization and preprocessing approaches that help to achieve significant improvement over state-of-the-art learning-based repair techniques. We boost the top-1 accuracy by 20.33% and top-10 accuracy by 34.82%. We could provide a suggestion for stylistics and non-code errors unaddressed by prior techniques. We believe that the automatic fix suggestions along with code review generated by our approach would help developers address the review comment quickly and correctly and thus save their time and effort.
Automatic program repair,Deep learning,Code review
AI 理解论文
Chat Paper