Stealthy Backdoors as Compression Artifacts

IEEE Transactions on Information Forensics and Security (2022)

Abstract
Model compression is a widely-used approach for reducing the size of deep learning models without much accuracy loss, enabling resource-hungry models to be compressed for use on resource-constrained devices. In this paper, we study the risk that model compression could provide an opportunity for adversaries to inject stealthy backdoors. In a backdoor attack on a machine learning model, an adversary produces a model that performs well on normal inputs but outputs targeted misclassifications on inputs containing a small trigger pattern. We design stealthy backdoor attacks such that the full-sized model released by adversaries appears to be free from backdoors (even when tested using state-of-the-art techniques), but when the model is compressed it exhibits a highly effective backdoor. We show this can be done for two common model compression techniques—model pruning and model quantization—even in settings where the adversary has limited knowledge of how the particular compression will be done. Our findings demonstrate the importance of performing security tests on the models that will actually be deployed, not on their pre-compressed versions. Our implementation is available at https://github.com/yulongtzzz/Stealthy-Backdoors-as-Compression-Artifacts.
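The abstract's core recommendation, testing the artifact that will actually be deployed, can be illustrated concretely. Below is a minimal sketch (not from the paper) of such an audit, assuming a PyTorch classifier and a held-out loader of trigger-stamped inputs; `attack_success_rate`, `triggered_loader`, and `target_label` are hypothetical names, and dynamic int8 quantization stands in for whatever compression scheme the deployment pipeline actually applies.

```python
# Sketch: audit a model for backdoors both before and after compression.
# Assumes a PyTorch classifier and a DataLoader of trigger-stamped inputs.
import copy
import torch
import torch.nn as nn

def attack_success_rate(model, triggered_loader, target_label):
    """Fraction of trigger-stamped inputs classified as the attacker's target."""
    model.eval()
    hits, total = 0, 0
    with torch.no_grad():
        for x, _ in triggered_loader:
            preds = model(x).argmax(dim=1)
            hits += (preds == target_label).sum().item()
            total += x.size(0)
    return hits / max(total, 1)

def audit_pre_and_post_compression(model, triggered_loader, target_label):
    # Full-precision model: the artifact an adversary would release.
    asr_full = attack_success_rate(model, triggered_loader, target_label)

    # Dynamic int8 quantization of linear layers -- one common compression
    # path; the scheme used in a real deployment may differ.
    quantized = torch.quantization.quantize_dynamic(
        copy.deepcopy(model), {nn.Linear}, dtype=torch.qint8
    )
    asr_quant = attack_success_rate(quantized, triggered_loader, target_label)

    return asr_full, asr_quant
```

A near-zero success rate on the full-precision model alongside a high rate on the compressed one is exactly the compression-artifact backdoor the paper describes; the same check can be repeated for pruning (e.g., via torch.nn.utils.prune) or any other scheme the deployment uses.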
Keywords
Deep learning, neural network compression, backdoor attack