Serving Recurrent Neural Networks Efficiently with a Spatial Accelerator

Zhao Tian
Zhang Yaqi

MLSys, 2019.


Abstract:

Recurrent Neural Network (RNN) applications form a major class of AI-powered, low-latency data center workloads. Most execution models for RNN acceleration break computation graphs into BLAS kernels, which lead to significant inter-kernel data movement and resource underutilization. We show that by supporting more general loop construct...
