Analyzing Machine Learning Workloads Using a Detailed GPU Simulator.

2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

Abstract
Machine learning (ML) has recently emerged as an important application driving future architecture design. Traditionally, architecture research has used detailed simulators to model and measure the impact of proposed changes. However, current open-source, publicly available simulators lack support for running a full ML stack like PyTorch. High-confidence, cycle-accurate simulations are crucial for architecture research, and without them it is difficult to rapidly prototype new ideas. In this paper, we describe changes we made to GPGPU-Sim, a popular, widely used GPU simulator, to run ML applications that use cuDNN and PyTorch, two widely used frameworks for running Deep Neural Networks (DNNs). This work has the potential to enable significant microarchitectural research into GPUs for DNNs. Our results show that the modified simulator, which has been made publicly available with this paper (source code available at https://github.com/gpgpu-sim/gpgpu-sim_distribution, dev branch), provides execution time results within 18% of real hardware. We further use it to study other ML workloads and demonstrate how the simulator identifies opportunities for architectural optimization that prior tools are unable to provide.
Keywords
GPGPU-Sim, Machine Learning, cuDNN, PyTorch