ddRingAllreduce: a high-precision RingAllreduce algorithm

CCF Trans. High Perform. Comput. (2023)

Abstract
For complex problems in scientific computing, parallel computing is almost the only way to solve them, and global reduction is one of its most frequently used operations. Because of floating-point rounding errors, existing global reduction algorithms may produce results that are inaccurate or that differ between two runs, which makes it difficult to meet the needs of complex applications. Since the communication cost of RingAllreduce is a constant, independent of the number of processes, it is an effective algorithm when a large amount of data needs to be communicated. However, it faces the same problem as general global reduction operations, so a high-precision RingAllreduce algorithm is needed. In this paper, by combining double-double arithmetic with the RingAllreduce algorithm, we propose a high-precision RingAllreduce algorithm called ddRingAllreduce. The theoretical error of the proposed algorithm is analyzed and compact error bounds are derived. We have carried out a large number of parallel numerical experiments and obtained numerical results consistent with the theoretical analysis; ddRingAllreduce is accurate in cases where RingAllreduce is inaccurate or miscalculates. We also analyze, through experiments, the relationship between the problem size and the cost of using double-double arithmetic: at a small scale, the ddRingAllreduce algorithm achieves higher accuracy with relatively little time overhead.
Keywords
RingAllreduce, ddRingAllreduce, Collective communication, Double-double arithmetic, High precision
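
The abstract describes combining double-double arithmetic with a ring allreduce. As a rough illustration of that idea only, the sketch below simulates the ring on a single machine in Python. It is not the paper's implementation: the helper names (`two_sum`, `dd_add`, `ring_allreduce`), the use of Knuth's TwoSum as the building block, the simplified double-double addition, and the one-element-per-chunk layout are all assumptions made for this example, and no real MPI communication takes place.

```python
def two_sum(a, b):
    """Knuth's TwoSum: return (s, e) with s = fl(a + b) and s + e == a + b exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def dd_add(x, y):
    """Add two double-double numbers given as (hi, lo) pairs (simplified variant)."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]          # fold in the low-order parts
    return two_sum(s, e)      # renormalize into a new (hi, lo) pair


def ring_allreduce(data, add):
    """Simulate a ring allreduce over p ranks.

    data[r][c] is rank r's value for chunk c; for simplicity there are
    exactly p chunks of one element each. `add` combines two elements.
    """
    p = len(data)
    buf = [list(row) for row in data]

    # Phase 1, reduce-scatter: after p - 1 steps, rank r holds the full
    # reduction of chunk (r + 1) % p.
    for step in range(p - 1):
        msgs = [((r + 1) % p, (r - step) % p, buf[r][(r - step) % p])
                for r in range(p)]          # snapshot: all sends happen at once
        for dst, c, val in msgs:
            buf[dst][c] = add(buf[dst][c], val)

    # Phase 2, allgather: completed chunks travel once around the ring.
    for step in range(p - 1):
        msgs = [((r + 1) % p, (r + 1 - step) % p, buf[r][(r + 1 - step) % p])
                for r in range(p)]
        for dst, c, val in msgs:
            buf[dst][c] = val
    return buf


# Each of the 4 simulated ranks contributes the same value to every chunk.
# The exact sum per chunk is 2.0, but 1e16 + 1.0 is not representable in
# double precision, so plain summation loses the small terms.
values = [1e16, 1.0, -1e16, 1.0]
p = len(values)

plain_out = ring_allreduce([[v] * p for v in values], lambda a, b: a + b)
dd_out = ring_allreduce([[(v, 0.0)] * p for v in values], dd_add)
dd_hi = [[hi for hi, _lo in row] for row in dd_out]

print("plain double :", plain_out[0])
print("double-double:", dd_hi[0])
```

In this toy input the plain double-precision ring loses the small terms to cancellation, and because each chunk is accumulated starting from a different rank, the chunks need not even agree with one another. The double-double version carries the rounding error along in the low-order word and recovers the exact sum here. The cost is that every element becomes a pair of doubles, so each message is twice as large and each addition takes several floating-point operations.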