Static detection of uncoalesced accesses in GPU programs

Formal Methods in System Design(2021)

引用 2|浏览27
暂无评分
摘要
GPU programming has become popular due to the high computational capabilities of GPUs. Obtaining significant performance gains with GPU is however challenging and the programmer needs to be aware of various subtleties of the GPU architecture. One such subtlety lies in accessing GPU memory, where certain access patterns can lead to poor performance. Such access patterns are referred to as uncoalesced global memory accesses . This work presents a light-weight compile-time static analysis to identify such accesses in GPU programs. The analysis relies on a novel abstraction which tracks the access pattern across multiple threads. The abstraction enables quick prediction while providing correctness guarantees. We have implemented the analysis in LLVM and compare it against a dynamic analysis implementation. The static analysis identifies 95 pre-existing uncoalesced accesses in Rodinia, a popular benchmark suite of GPU programs, and finishes within seconds for most programs, in comparison to the dynamic analysis which finds 69 accesses and takes orders of magnitude longer to finish.
更多
查看译文
关键词
GPU performance, Static analysis, Abstract execution
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要