Early Experience with Transformer-Based Similarity Analysis for DataRaceBench

2022 IEEE/ACM Sixth International Workshop on Software Correctness for HPC Applications (Correctness), 2022

Abstract
DataRaceBench (DRB) is a dedicated benchmark suite for evaluating tools that find data race bugs in OpenMP programs. Using microbenchmarks with and without data races, DRB can generate standard quality metrics and provide systematic, quantitative assessments of data race detection tools. However, as the number of microbenchmarks grows, it becomes challenging to manually identify similar code patterns in DRB, whether to detect duplicated kernels or to guide the addition of new ones. In this paper, we experiment with a transformer-based, deep learning approach to similarity analysis. A state-of-the-art transformer model, CodeBERT, is adapted to find similar OpenMP code regions. We explore the challenges, and their solutions, that arise when applying transformer-based similarity analysis to source code unseen by pre-trained transformers. Through comparative experiments with different variants of similarity analysis, we comment on the strengths and limitations of the transformer-based approach and point out future research directions.
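The abstract does not spell out the authors' pipeline. As a rough illustration only, the sketch below shows one common way to use CodeBERT for code-region similarity: embed each snippet with the publicly available microsoft/codebert-base checkpoint, mean-pool the token embeddings, and compare the results with cosine similarity. The checkpoint name, the mean-pooling step, and the example OpenMP kernels are assumptions for this sketch, not details taken from the paper.

# Hedged sketch: CodeBERT embeddings + cosine similarity for two OpenMP kernels.
# Requires the `transformers` and `torch` packages.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")
model.eval()

def embed(code: str) -> torch.Tensor:
    """Return a mean-pooled CodeBERT embedding for one code snippet."""
    inputs = tokenizer(code, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    # Mean-pool token embeddings, ignoring padding positions via the attention mask.
    mask = inputs["attention_mask"].unsqueeze(-1)
    summed = (outputs.last_hidden_state * mask).sum(dim=1)
    return summed / mask.sum(dim=1)

kernel_a = """
#pragma omp parallel for
for (int i = 0; i < n; i++) a[i] = a[i] + 1;
"""
kernel_b = """
#pragma omp parallel for
for (int j = 0; j < n; j++) b[j] += 1;
"""

score = torch.nn.functional.cosine_similarity(embed(kernel_a), embed(kernel_b))
print(f"cosine similarity: {score.item():.3f}")

A higher cosine score suggests the two regions are candidates for duplicate kernels; in practice a threshold or nearest-neighbor search over all DRB kernels would be applied on top of such pairwise scores.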
Keywords
Benchmarks, OpenMP, Data Races, Tools, Deep Learning, Transformers