On The Feasibility Of Using Reduced-Precision Tensor Core Operations For Graph Analytics

2020 IEEE High Performance Extreme Computing Conference (HPEC) (2020)

Abstract
Today's data-driven analytics and machine learning workloads have been largely driven by General-Purpose Graphics Processing Units (GPGPUs). To accelerate dense matrix multiplication on GPUs, Tensor Core Units (TCUs) have been introduced in recent years. In this paper, we study linear-algebra-based and vertex-centric algorithms for various graph kernels on GPUs, with the objective of applying this new hardware feature to graph applications. We identify the stages in these graph kernels that can potentially be executed on the TCUs. In particular, we leverage the reformulation of the reduction and scan operations in terms of matrix multiplication [1] on the TCUs. We demonstrate that executing these operations, which appear inside different graph kernels, on the TCUs can help establish an end-to-end pipeline on GPGPUs that does not depend on hand-tuned external libraries and still delivers comparable performance for various graph analytics.
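The reformulation of reduction and scan as matrix multiplication mentioned above can be illustrated with a minimal sketch. This is not the paper's implementation. It is plain Python running on the CPU, with a small helper `matmul` standing in for a tensor-core matrix multiply. The names `matmul`, `reduce_via_matmul`, `scan_via_matmul`, and the tile width `TILE` are hypothetical and chosen only for illustration. The sketch uses exact Python integers, so it does not show the rounding effects of reduced-precision arithmetic.

```python
TILE = 4  # hypothetical tile width for illustration; real hardware tiles differ


def matmul(a, b):
    """Dense matrix product of two lists of rows (stand-in for a TCU multiply)."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def reduce_via_matmul(values):
    """Sum a list by multiplying with all-ones matrices."""
    if not values:
        return 0
    # Pad with zeros so the data fills whole rows of width TILE.
    padded = values + [0] * (-len(values) % TILE)
    rows = [padded[i:i + TILE] for i in range(0, len(padded), TILE)]
    # (r x TILE) @ (TILE x 1): each output entry is the sum of one row.
    ones_col = [[1] for _ in range(TILE)]
    row_sums = matmul(rows, ones_col)
    # (1 x r) @ (r x 1): combine the row sums into a single total.
    ones_row = [[1] * len(rows)]
    return matmul(ones_row, row_sums)[0][0]


def scan_via_matmul(values):
    """Inclusive prefix sum as a row vector times an upper-triangular ones matrix."""
    n = len(values)
    if n == 0:
        return []
    # upper[i][j] = 1 when i <= j, so output j accumulates inputs 0..j.
    upper = [[1 if i <= j else 0 for j in range(n)] for i in range(n)]
    return matmul([values], upper)[0]


print(reduce_via_matmul([1, 2, 3, 4, 5]))
print(scan_via_matmul([1, 2, 3, 4]))
```

The scan here builds a full n-by-n triangular matrix for clarity. A tiled variant would scan each fixed-size block with a small triangular matrix and then propagate block totals, which is closer to how fixed-size hardware tiles would be used.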
Keywords
TCUs, different graph kernels, GPGPUs, graph analytics, reduced-precision tensor core operations, data-driven analytics, machine learning workload, General-Purpose Graphics Processing Units, dense matrix multiplications, GPUs, Tensor Core Units, vertex-centric algorithms, scan operations, matrix multiplication