Assessing and Improving Dataset and Evaluation Methodology in Deep Learning for Code Clone Detection

2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE)

Abstract
Code clone detection is the task of identifying whether two code snippets are semantically identical. In recent years, deep learning models have shown high performance in detecting Type-3 and Type-4 code clones and have received increasing attention from the research community. However, compared with the attention researchers devote to model design, there is little work on the quality of the datasets and on the evaluation methodology (the way a dataset is divided into training and test sets), which undermines the credibility of deep learning models. In this paper, we conduct experiments to evaluate the performance of existing state-of-the-art models from multiple perspectives. We also release two new datasets for code clone detection, ConBigCloneBench and GoogleCodeJam2, built on the existing BigCloneBench and GoogleCodeJam datasets, respectively. Our experiments show that the performance of the same model drops by up to 0.5 in F1 score (from 0.9 to 0.4) across different evaluation perspectives and datasets, and that some models perform only on par with a simple MLP. We further analyze the reasons for this performance decline and provide suggestions for improving deep learning models from multiple perspectives in future research.
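The evaluation methodology the abstract refers to hinges on how clone pairs are assigned to the training and test sets. As a minimal illustrative sketch (not taken from the paper), the snippet below contrasts a random pair-level split with a group-aware split that holds out whole functionality groups; `pairs`, `labels`, and `func_ids` are hypothetical placeholders for pair indices, clone labels, and the functionality group each pair is drawn from.

```python
# Sketch (assumptions, not the paper's code): two ways of splitting a
# clone-detection dataset into train/test, since the split itself can
# change the reported F1 score.
import numpy as np
from sklearn.model_selection import train_test_split, GroupShuffleSplit

rng = np.random.default_rng(0)
pairs = np.arange(1000)                    # indices of code-snippet pairs
labels = rng.integers(0, 2, size=1000)     # 1 = clone, 0 = non-clone
func_ids = rng.integers(0, 10, size=1000)  # functionality group per pair

# Perspective 1: random pair-level split. Pairs from the same
# functionality may land in both train and test, which can inflate F1.
train_pairs, test_pairs = train_test_split(
    pairs, test_size=0.2, random_state=0)

# Perspective 2: group-aware split. Entire functionality groups are
# held out, so the model is evaluated on unseen functionalities.
gss = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(gss.split(pairs, labels, groups=func_ids))
assert set(func_ids[train_idx]).isdisjoint(func_ids[test_idx])
```

Under the group-aware split the model never sees the held-out functionalities during training, which is one evaluation perspective under which an inflated F1 score can drop.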
Keywords
code clone detection, datasets, evaluation method, neural networks