Monitor Placement for Fault Localization in Deep Neural Network Accelerators
CoRR (2023)
Abstract
Systolic arrays are a prominent choice for deep neural network (DNN)
accelerators because they offer parallelism and efficient data reuse. Improving
the reliability of DNN accelerators is crucial as hardware faults can degrade
the accuracy of DNN inferencing. Systolic arrays make use of a large number of
processing elements (PEs) for parallel processing, but when one PE is faulty,
the error propagates and affects the outcomes of downstream PEs. Because of the
large number of PEs, hardware-based runtime monitoring of every single PE is
prohibitively expensive. We present a solution to
optimize the placement of hardware monitors within systolic arrays. We first
prove that 2N-1 monitors are needed to localize a single faulty PE and we
also derive the monitor placement. We show that a second placement optimization
problem, which minimizes the set of candidate faulty PEs for a given number of
monitors, is NP-hard. Therefore, we propose a heuristic approach to balance the
reliability and hardware resource utilization in DNN accelerators when the number
of monitors is limited. Experimental evaluation shows that to localize a single
faulty PE, an area overhead of only 0.33
systolic array.