Plagiarism Detection of Multi-Threaded Programs via Siamese Neural Networks

IEEE ACCESS(2020)

引用 5|浏览29
暂无评分
摘要
Widespread intentional or unintentional software plagiarisms have posed serious threats to the healthy development of software industry. In order to detect such evolving software plagiarism, software dynamic birthmark techniques of better anti-obfuscation ability serve as one of the most promising methods. However, due to the perturbation caused by non-deterministic thread scheduling in multi-threaded programs, existing dynamic approaches optimized for sequential programs may suffer from the randomness in multi-threaded program plagiarism detection. Some thread-aware birthmarking methods have been then proposed to address this issue, which nevertheless largely rely on manual feature engineering and empirical observations without any ground-truth training, and thus require domain knowledge, making them inflexible to be deployed in the wild. Inspired by the success of self-guided optimization using deep neural networks and their superior feature learning ability, in this article, we transform multiple execution traces for each multi-threaded program under a specified input to the plain feature matrix, and feed it to the deep learning framework to learn latent representation as thread-aware birthmark that enjoys better semantic richness and perturbation resistance; instead of empirically determining the plagiarism over direct birthmark similarity metric, we further build up sophisticated siamese neural networks to supervise birthmark construction, similarity measurement, and decision making. Integrating our proposed method, a system called NeurMPD is developed to perform Neural network-based Multi-threaded program Plagiarism Detection. The experimental results based on a public software plagiarism sample set demonstrate that NeurMPD copes better with multi-threaded plagiarism detection than alternative approaches.
更多
查看译文
关键词
Plagiarism,Instruction sets,Neural networks,Dynamic scheduling,Feature extraction,Semantics,Software plagiarism detection,multi-threaded programs,dynamic birthmark,semantic behaviors,deep learning,siamese neural network
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要