DiagNNose: Toward Error Localization in Deep Learning Hardware-Based on VTA-TVM Stack

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS (2024)

Abstract
Low-level hardware faults that manifest in a deep learning (DL) accelerator cause graceless degradation of high-level classification accuracy, which can lead to catastrophic outcomes. This violates the functional safety (FuSa) of the DL accelerator, which must be maintained in high-assurance applications. Conventional error localization techniques incur high test effort and do not account for the unique challenges posed by DL systems. To address this, we propose DiagNNose, a two-tier machine learning-based error localization framework for on-line fault management in DL accelerators. We develop a novel diagnostic pattern selection algorithm to obtain a minimal subset of functional test patterns that are executed in the accelerator in mission mode. By extracting and analyzing dataflow-based features from the intermediate computations of the general matrix multiply (GEMM) core, a lightweight multilayer perceptron accomplishes bit-level error localization in 8-bit, 16-bit, and 32-bit datapath units with high fidelity. We limit our evaluation of the proposed DiagNNose framework to a single accelerator design, the versatile tensor accelerator (VTA) architecture. When executing state-of-the-art deep neural networks trained on ImageNet, error localization using only 30 diagnostic functional test patterns achieves up to 98.4% diagnosability, an improvement of 54.63% over a random test pattern set, with as low as 4.95% overhead in the DL accelerator in mission mode.
Keywords
Circuit faults, Location awareness, Life estimation, Computational modeling, Tensors, Feature extraction, Deep learning (DL), Fault diagnosis, Functional safety (FuSa), Versatile tensor accelerator (VTA)