Embedded GPU Cluster Computing Framework for Inference of Convolutional Neural Networks

Evan Kain, Diego Wildenstein, Andrew C. Pineda

2019 IEEE High Performance Extreme Computing Conference (HPEC), 2019

Abstract
The growing need for on-board image processing on space vehicles requires computing solutions that are both low-power and high-performance. Parallel computation using low-power embedded Graphics Processing Units (GPUs) satisfies both requirements. Our experiment uses OpenMPI domain decomposition of an image processing algorithm based on a pre-trained convolutional neural network (CNN) developed by the U.S. Air Force Research Laboratory (AFRL). Our testbed consists of six NVIDIA Jetson TX2 development boards operating in parallel. This parallel framework achieves a speedup of $4.3 \times$ on six processing nodes, with a linear decay in parallel efficiency as more processing nodes are added to the network. By replicating the data across processors in addition to distributing it, we also characterize the best-case impact of adding triple modular redundancy (TMR) to our application.
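The decomposition and scaling figures above can be illustrated with a short sketch. This is not the authors' code: the row-wise split, image size, and function names are hypothetical, and only the reported numbers (4.3x speedup on six nodes) come from the abstract.

```python
def split_rows(n_rows, n_nodes):
    """Assign contiguous row ranges of an image to each node,
    as in a simple MPI-style domain decomposition."""
    base, extra = divmod(n_rows, n_nodes)
    ranges, start = [], 0
    for rank in range(n_nodes):
        stop = start + base + (1 if rank < extra else 0)  # spread remainder
        ranges.append((start, stop))
        start = stop
    return ranges

def parallel_efficiency(speedup, n_nodes):
    """Parallel efficiency = speedup / node count."""
    return speedup / n_nodes

# Hypothetical 1024-row image split across the six Jetson TX2 nodes.
chunks = split_rows(1024, 6)

# Reported result: 4.3x speedup on six nodes -> roughly 72% efficiency.
eff = parallel_efficiency(4.3, 6)
```

The efficiency figure makes the abstract's "linear decay" claim concrete: perfect scaling on six nodes would be a 6x speedup, so 4.3x corresponds to about 0.72 efficiency.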
Keywords
NVIDIA Jetson TX2 development boards, GPU cluster computing framework, inference, convolutional neural networks, on-board image processing, space vehicles, parallel computation, graphics processing units, OpenMPI domain decomposition, Air Force Research Laboratory, triple modular redundancy
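Among the keywords, triple modular redundancy (TMR) is the fault-tolerance scheme whose best-case cost the paper characterizes: three replicas compute the same result and a majority vote masks a single faulty output. The sketch below is a generic illustration of that vote, not the paper's implementation; all names are hypothetical.

```python
from collections import Counter

def tmr_vote(a, b, c):
    """Return the majority of three replica outputs; a single
    corrupted replica is outvoted by the other two."""
    value, n = Counter([a, b, c]).most_common(1)[0]
    if n >= 2:
        return value
    raise ValueError("no majority: all three replicas disagree")

# One faulty replica ("dog") is masked by the two agreeing ones.
result = tmr_vote("cat", "cat", "dog")  # -> "cat"
```

Replicating the input data across nodes, as the abstract describes, is what makes such a vote possible, at the cost of the redundant computation the paper measures.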