High Precision Integer Multiplication with a GPU

Parallel and Distributed Processing Workshops and PhD Forum (2011)

Abstract
We have improved our prior implementation of Strassen's algorithm for high performance multiplication of very large integers on a general purpose graphics processor (GPU). A combination of algorithmic and implementation optimizations results in a factor of 2.3 speed improvement over our previous work, running on an NVIDIA 295. We have also reoptimized the implementation for an NVIDIA 480, from which we obtain a speedup of up to a factor of 10 in comparison with a Core i7 processor of the same technology generation. This paper discusses how we adapted the algorithm to operate within the limitations of the GPU and how we dealt with other issues encountered in the implementation process, and it reports performance results for multiplications ranging from 255K bits to 24.512M bits in size.
Keywords
computer graphic equipment, coprocessors, fast Fourier transforms, general purpose computers, matrix multiplication, multiprocessing systems, optimisation, Core i7 processor, GPU, NVIDIA 295, NVIDIA 480, Strassen algorithm, general purpose graphics processor, high precision integer multiplication, optimization, speed improvement, technology generation, very large integer