Identifying and Exploiting Ineffectual Computations to Enable Hardware Acceleration of Deep Learning

2018 16th IEEE International New Circuits and Systems Conference (NEWCAS)(2018)

Abstract
This article summarizes some of our work on hardware accelerators for inference with Deep Learning Neural Networks (DNNs). Early successes in hardware acceleration for DNNs exploited their computation structure and the significant reuse in their memory access stream. Our approach to further boosting these benefits has been to first identify properties in the value stream of DNNs that we can exploit at the hardware level to improve execution time and to reduce off- and on-chip communication and storage, yielding higher energy efficiency and shorter execution time. We have focused on properties that are difficult or impossible to discern in advance: values that are zero or near zero and thus ineffectual, values with reduced precision needs, and even bit-level content that leads to ineffectual computations. The presented designs cover a spectrum of choices in area cost, energy efficiency, and relative execution time performance, and target hardware devices ranging from embedded systems to server-class machines. A key characteristic of these designs is that they reward, but do not require, advances in model design that increase the aforementioned properties (such as reduced precision or sparsity), and thus provide a safe path to innovation.
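To make the two value properties above concrete, here is a minimal software sketch, not the authors' hardware design: the first function skips multiply-accumulates whose activation is zero (the "ineffectual computation" idea), and the second decomposes a multiplication into shift-and-add steps driven only by the set bits of one operand (the "bit-level content" idea). All function names are illustrative assumptions.

```python
def sparse_dot(weights, activations):
    """Dot product that skips ineffectual (zero-activation) MACs.

    Returns the result and the number of MACs actually performed,
    so the saved work relative to len(weights) is visible.
    """
    total = 0
    macs = 0
    for w, a in zip(weights, activations):
        if a != 0:          # skip the ineffectual multiply-accumulate
            total += w * a
            macs += 1
    return total, macs


def bit_serial_mul(x, w):
    """Multiply x * w using only the effectual (one) bits of x.

    Each set bit of x contributes one shifted copy of w; zero bits
    contribute nothing, so the step count tracks popcount(x).
    """
    total = 0
    for i in range(x.bit_length()):
        if (x >> i) & 1:    # only one-bits produce a partial product
            total += w << i
    return total
```

For example, with activations `[0, 3, 0, 2]` and unit weights, `sparse_dot` performs only 2 of the 4 MACs; and `bit_serial_mul(5, 3)` takes two shift-add steps because 5 has two set bits.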
Keywords
hardware level,on-chip communication,storage,execution time reduction,bit-level content,hardware devices,hardware accelerators,DNNs,energy efficiency,hardware acceleration,deep learning neural networks,off-chip communication,area cost,embedded systems,server class machines