A Dynamic Timing Enhanced DNN Accelerator With Compute-Adaptive Elastic Clock Chain Technique

IEEE Journal of Solid-State Circuits (2021)

Abstract

This article presents a deep neural network (DNN) accelerator that uses an adaptive clocking technique, the elastic clock chain, to exploit the dynamic timing margin of a 2-D processing element (PE) array. Exploiting dynamic timing margin in modern deep learning accelerators faces two major challenges: the margin diminishes on a large array, and timing depends strongly on runtime operands. To address them, we propose an elastic clock chain scheme that provides flexible multi-domain clock management for in situ compute adaptability. Specifically, a total of 16 clock domains are created for the 2-D PE array, with clock periods dynamically adjusted based on both runtime instructions and operands. The multi-domain clock sources are generated by a multi-phase delay-locked loop (DLL) and delivered over a global clock bus. The clock offsets between neighboring domains are deliberately managed to maintain synchronization among the clock domains. A 16 × 8 PE array that supports different DNN dataflows and bit precisions was fabricated in a 65-nm CMOS process. Measurements on the MNIST and CIFAR-10 data sets show that enabling the proposed elastic clock chain improves the effective operating frequency by up to 19% for a single instruction, multiple data (SIMD) dataflow. This performance improvement translates into up to 34% energy saving. Compared with the SIMD dataflow, the systolic dataflow shows a smaller performance improvement of up to 11%, because all in-flight operand values must be considered.
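
The idea of adjusting a clock period per cycle from runtime operands can be illustrated with a toy model. The sketch below is not the paper's circuit or controller: the delay model, the worst-case period, the number of DLL phases, and every function name are hypothetical, chosen only to show how selecting the shortest sufficient period from a set of discrete phase steps raises the effective frequency over a fixed worst-case clock. The numbers are made up and are not meant to reproduce the measured results.

```python
# Toy model of operand-dependent clock period selection.
# Every constant and function here is an illustrative assumption,
# not a value or interface taken from the paper.

T_WORST_PS = 10_000                      # hypothetical static worst-case period (ps)
DLL_PHASES = 8                           # hypothetical number of DLL phases
PHASE_STEP_PS = T_WORST_PS // DLL_PHASES # granularity of period adjustment (ps)
OPERAND_BITS = 8                         # assumed operand width


def required_delay_ps(operand: int) -> int:
    """Assumed delay model: path delay grows linearly from 65% to 100%
    of the worst-case period with the operand's significant bit count."""
    significant = max(operand.bit_length(), 1)
    return T_WORST_PS * (60 * OPERAND_BITS + 40 * significant) // (100 * OPERAND_BITS)


def quantized_period_ps(delay_ps: int) -> int:
    """Round the required delay up to the next selectable phase step."""
    steps = -(-delay_ps // PHASE_STEP_PS)  # ceiling division on integers
    return min(steps * PHASE_STEP_PS, T_WORST_PS)


def cycle_periods_ps(cycles):
    """For each cycle, pick the period covering the slowest operand
    that the clock domain has to account for in that cycle."""
    return [
        quantized_period_ps(max(required_delay_ps(x) for x in operands))
        for operands in cycles
    ]


def frequency_gain(periods):
    """Effective frequency gain relative to always using T_WORST_PS."""
    return T_WORST_PS * len(periods) / sum(periods) - 1.0


# Four cycles of one clock domain, each with the operands it must cover.
example_cycles = [[3, 1], [200, 5], [15, 2], [0, 0]]
example_periods = cycle_periods_ps(example_cycles)
print(example_periods, f"{frequency_gain(example_periods):.1%}")
```

In this model, passing a larger set of operands per cycle makes the selected period more pessimistic, since the maximum is taken over more values. That is loosely analogous to the abstract's remark that the systolic dataflow gains less because all in-flight operand values have to be considered.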

Keywords

Adaptive clocking, deep neural network (DNN) accelerator, dynamic timing margin, multiple clock domains, processing element (PE), systolic array
