Benchmarking the Performance of Accelerators on National Cyberinfrastructure Resources for Artificial Intelligence / Machine Learning Workloads

Practice and Experience in Advanced Research Computing (2022)

Abstract
Upcoming regional and National Science Foundation (NSF)-funded Cyberinfrastructure (CI) resources will give researchers opportunities to run their artificial intelligence / machine learning (AI/ML) workflows on accelerators. To effectively leverage this burgeoning CI-rich landscape, researchers need extensive benchmark data to maximize performance gains and map their workflows to appropriate architectures. Such data will also help CI administrators, NSF program officers, and CI allocation reviewers make informed decisions about CI-resource allocations. Here, we compare the performance of two very different architectures, the commonly used Graphics Processing Units (GPUs) and the new generation of Intelligence Processing Units (IPUs), by running training benchmarks of common AI/ML models. Leveraging the maturity of the software stacks and the ease of migration between these platforms, we find that performance and scaling are similar for both architectures. Exploring training parameters such as batch size, however, shows that owing to their memory and processing structures, IPUs run efficiently with smaller batch sizes, while GPUs benefit from large batch sizes to extract sufficient parallelism in neural network training and inference. Each behavior brings different advantages and disadvantages, as discussed in this paper; considerations of inference latency, inherent parallelism, and model accuracy will therefore play a role in researchers' selection of these architectures. The impact of these choices on a representative image compression model system is also discussed.
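To make the batch-size exploration mentioned above concrete, the sketch below measures ResNet-50 training throughput at several batch sizes on a GPU. It is a minimal illustration only, not the paper's benchmark code: the use of PyTorch/torchvision, the synthetic ImageNet-sized inputs, the chosen batch sizes, and the measure_throughput helper are all assumptions for demonstration purposes.

```python
# Illustrative sketch (not the paper's benchmark): measure ResNet-50 training
# throughput on a GPU as batch size varies. Assumes torch, torchvision, and a
# CUDA-capable device are available.
import time

import torch
import torchvision


def measure_throughput(batch_size, steps=20, warmup=5):
    device = torch.device("cuda")
    model = torchvision.models.resnet50().to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    criterion = torch.nn.CrossEntropyLoss()

    # Synthetic ImageNet-sized inputs so the measurement isolates compute,
    # not data loading.
    images = torch.randn(batch_size, 3, 224, 224, device=device)
    labels = torch.randint(0, 1000, (batch_size,), device=device)

    model.train()
    for step in range(steps + warmup):
        if step == warmup:
            torch.cuda.synchronize()
            start = time.time()
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    torch.cuda.synchronize()
    elapsed = time.time() - start
    return batch_size * steps / elapsed  # images per second


if __name__ == "__main__":
    for bs in (16, 32, 64, 128, 256):
        print(f"batch size {bs:4d}: {measure_throughput(bs):8.1f} images/s")
```

Plotting images-per-second against batch size in this way is one simple approach to reproducing the kind of GPU batch-size sensitivity the abstract describes; an analogous measurement on an IPU would use Graphcore's own software stack.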
Keywords
ResNet50, ACES (Accelerating Computing for Emerging Sciences), Expanse, Graphics Processing Unit, Intelligence Processing Unit, PopVision, Classification, Convolution Neural Network, Optimization, Frontera, LoneStar6