POSTER: Pairing Up CNNs for High Throughput Deep Learning

2019 28th International Conference on Parallel Architectures and Compilation Techniques (PACT) (2019)

To facilitate the efficient execution of convolutional neural networks (CNNs) on cloud servers, this paper proposes Yin Yang (YY), an input-driven synergistic deep learning system that dynamically distributes CNN computation between a complex CNN (Yang) and a simple CNN (Yin). YY runs most inferences on Yin and invokes Yang only when Yin has low confidence. On average, compared to the traditional CNN-as-a-service approach, YY improves datacenter throughput by 1.8× and reduces inference latency by 31% on an NVIDIA TITAN X GPU without any accuracy loss across 21 CNNs.
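The Yin/Yang dispatch described above can be illustrated with a minimal sketch. The abstract does not say how Yin's confidence is measured or what threshold triggers Yang, so this sketch assumes the maximum softmax probability as the confidence score and an arbitrary threshold of 0.9. The names `cascade_predict`, `toy_yin`, and `toy_yang` are hypothetical stand-ins, not part of YY.

```python
import math
from typing import Callable, List, Sequence, Tuple

# A "model" here is any callable that maps an input to a list of class logits.
Model = Callable[[float], Sequence[float]]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def cascade_predict(x: float, yin: Model, yang: Model,
                    threshold: float = 0.9) -> Tuple[int, str]:
    """Run the simple model first; fall back to the complex one if unsure.

    Assumption: confidence is the top softmax probability, and the
    threshold value is illustrative only.
    """
    yin_probs = softmax(yin(x))
    yin_label = argmax(yin_probs)
    if yin_probs[yin_label] >= threshold:
        return yin_label, "yin"      # cheap path: Yin is confident enough
    yang_probs = softmax(yang(x))    # expensive path: invoke Yang
    return argmax(yang_probs), "yang"


# Hypothetical toy models standing in for real CNNs.
def toy_yin(x: float) -> List[float]:
    return [x, 0.0]                  # confident only when x is large


def toy_yang(x: float) -> List[float]:
    return [0.0, 3.0]                # always predicts class 1


if __name__ == "__main__":
    print(cascade_predict(5.0, toy_yin, toy_yang))
    print(cascade_predict(0.5, toy_yin, toy_yang))
```

With these toy models, an input of 5.0 is answered by Yin alone, while an input of 0.5 leaves Yin below the threshold and is forwarded to Yang. In a real deployment the threshold would trade throughput against how often Yang is invoked.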
Keywords
efficient neural network, inference, cloud servers