End-to-End DNN Training with Block Floating Point Arithmetic.
arXiv (Cornell University), 2018
Abstract
DNNs are ubiquitous datacenter workloads, requiring orders of magnitude more
computing power from servers than traditional workloads. As such, datacenter
operators are forced to adopt domain-specific accelerators that employ
half-precision floating-point (FP) numeric representations to improve
arithmetic density. Unfortunately, even these representations are not dense
enough, and are, therefore, sub-optimal for DNNs. We propose a hybrid approach
that employs dense block floating-point (BFP) arithmetic on dot product
computations and FP arithmetic elsewhere. While using BFP improves the
performance of dot product operations, which constitute most DNN computations,
allowing values to float freely between dot product operations leads to a
better choice of tensor exponents when converting values back to BFP. We show
that models trained with hybrid BFP-FP arithmetic either match or outperform
their FP32 counterparts, leading to more compact models and denser arithmetic
in computing platforms.
Keywords
training, end-to-end