Reliability Evaluation of Compressed Deep Learning Models

2020 IEEE 11th Latin American Symposium on Circuits & Systems (LASCAS)

Cited by 12
Abstract
Neural networks are becoming deeper and more complex, making it harder to store and run such applications on systems with limited resources. Model pruning and data quantization are two effective ways to simplify the required hardware: pruning compresses the network down to only its relevant nodes, while quantization reduces the required data precision. Such optimizations, however, may come at the cost of reliability, since critical nodes become more exposed to faults and the network becomes more sensitive to small changes. In this work, we present an extensive empirical investigation of transient faults on compressed deep convolutional neural networks (CNNs). We evaluate the impact of a single bit flip on three CNN models with different sparsity configurations and integer-only quantizations. We show that pruning can increase the resilience of the system by 9× when compared to the dense model. Quantization can outperform the 32-bit floating-point baseline by adding 27.4× more resilience to the overall network, and up to 108.7× when combined with pruning. This makes model compression an effective way to provide resilience to deep learning workloads during inference, mitigating the need for explicit error-correction hardware.
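The kind of transient fault the abstract refers to can be pictured with a minimal sketch. The snippet below is not the authors' code: it is a NumPy illustration, with a hypothetical helper flip_single_bit, of injecting one single-bit flip into an int8-quantized weight tensor, the basic fault model evaluated in the paper.

```python
import numpy as np

def flip_single_bit(weights, index, bit):
    """Return a copy of an np.int8 weight tensor with one bit of one element flipped."""
    faulty = weights.copy()
    # Reinterpret the int8 storage as uint8 so the XOR bit mask is well defined.
    faulty.view(np.uint8)[index] ^= np.uint8(1 << bit)
    return faulty

# Example: inject one random transient fault into a toy quantized weight tensor.
rng = np.random.default_rng(0)
weights = rng.integers(-128, 128, size=1024, dtype=np.int8)
idx = int(rng.integers(weights.size))
bit = int(rng.integers(8))
faulty = flip_single_bit(weights, idx, bit)
print(f"weight[{idx}]: {weights[idx]} -> {faulty[idx]} (bit {bit} flipped)")
```

Re-running inference with the faulty tensor and comparing the output against the fault-free prediction is how one would measure whether a given bit flip is masked or causes a misclassification; the paper repeats this kind of experiment across sparsity and quantization configurations.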
Keywords
Resilience, Soft Error, Transient Fault, Neural Network, Deep Learning