Drop-Connect as a Fault-Tolerance Approach for RRAM-based Deep Neural Network Accelerators

Mingyuan Xiang, Xuhan Xie, Pedro Savarese, Xin Yuan, Michael Maire, Yanjing Li

2024 IEEE 42nd VLSI Test Symposium (VTS), 2024

Resistive random-access memory (RRAM) is widely recognized as a promising emerging hardware platform for deep neural networks (DNNs). Yet, due to manufacturing limitations, current RRAM devices are highly susceptible to hardware defects, which poses a significant challenge to their practical applicability. In this paper, we present a machine learning technique that enables the deployment of defect-prone RRAM accelerators for DNN applications, without requiring hardware modifications, retraining of the neural network, or additional detection circuitry/logic. The key idea is to incorporate a drop-connect-inspired approach during the training phase of a DNN, where random subsets of weights are selected to emulate fault effects (e.g., set to zero to mimic stuck-at faults), thereby equipping the DNN with the ability to learn and adapt to RRAM defects at the corresponding fault rates. Our results demonstrate the viability of the drop-connect approach, coupled with various algorithm- and system-level design and trade-off considerations. We show that, even in the presence of high defect rates (e.g., up to 30%), DNN accuracy remains comparable to that of the fault-free version, while incurring minimal system-level runtime/energy costs.
Keywords: fault tolerance, neural network, machine learning, RRAM
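The training-time fault emulation described in the abstract can be sketched as a drop-connect-style mask applied to the weight tensor. This is a minimal illustrative sketch, not the paper's implementation; the function name `inject_faults`, the NumPy representation, and the zeroing of masked weights (mimicking one stuck-at polarity) are assumptions for illustration.

```python
import numpy as np

def inject_faults(weights, fault_rate, rng):
    """Emulate RRAM defects during training, drop-connect style:
    a random subset of weight cells (fraction `fault_rate`) is treated
    as defective and its weights are zeroed, mimicking stuck-at faults.
    A fresh mask would be drawn for each training step."""
    healthy = rng.random(weights.shape) >= fault_rate  # True = healthy cell
    return weights * healthy

# Usage sketch: apply the mask to a layer's weights in the forward pass.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4))
w_faulty = inject_faults(w, fault_rate=0.3, rng=rng)
```

In training, redrawing the mask every step exposes the network to many defect patterns at the target fault rate, so the learned weights tolerate whichever pattern the physical device actually exhibits at deployment.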