
A CUDA-MPI Hybrid Bitonic Sorting Algorithm for GPU Clusters

Parallel Processing Workshops (2012)

Citations: 22 | Views: 2
Abstract
We present a hybrid CUDA-MPI sorting algorithm that uses GPU clusters to sort large data sets. Our algorithm has two phases. In the first phase, each node sorts a portion of the data on its GPU using a parallel bitonic sort. In the second phase, the sorted subsequences are merged in parallel using a reduction sorting network implemented in MPI across the cluster nodes. Performance results comparing our sorting algorithm to sequential quicksort yield speed-up values of up to 9.8 for sorting 4 GB of data on a 32-node GPU cluster. We anticipate even better speed-up values using our algorithm on larger data sets and larger clusters.
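
The two phases described in the abstract can be illustrated with a small sequential sketch in Python. This is not the authors' CUDA-MPI implementation: the function names (`bitonic_sort`, `tree_merge`, `hybrid_sort`) are hypothetical, the "nodes" are simulated as list slices, and the second phase is assumed to be a binary-tree pairwise merge, since the abstract does not spell out the structure of the reduction sorting network.

```python
import heapq
import random


def bitonic_sort(values):
    """Phase 1 (simulated): sort a power-of-two-length list with a bitonic network."""
    a = list(values)
    n = len(a)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    k = 2
    while k <= n:                 # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:             # compare-exchange distance within a stage
            for i in range(n):    # on a GPU, each index i would map to one thread
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    # Swap when the pair is out of order for its direction.
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a


def tree_merge(sorted_chunks):
    """Phase 2 (assumed shape): merge sorted chunks pairwise, halving the count each round."""
    chunks = [list(c) for c in sorted_chunks]
    while len(chunks) > 1:
        merged = [
            list(heapq.merge(chunks[i], chunks[i + 1]))
            for i in range(0, len(chunks) - 1, 2)
        ]
        if len(chunks) % 2:       # an unpaired chunk moves on to the next round
            merged.append(chunks[-1])
        chunks = merged
    return chunks[0] if chunks else []


def hybrid_sort(data, num_nodes):
    """Split data evenly across simulated nodes, sort each part, then merge."""
    if num_nodes < 1 or len(data) % num_nodes:
        raise ValueError("data must divide evenly across nodes")
    size = len(data) // num_nodes
    parts = [data[r * size:(r + 1) * size] for r in range(num_nodes)]
    return tree_merge(bitonic_sort(p) for p in parts)


if __name__ == "__main__":
    random.seed(0)
    sample = [random.randint(0, 999) for _ in range(64)]
    print(hybrid_sort(sample, 4) == sorted(sample))
```

In a real cluster each call to `bitonic_sort` would run as a CUDA kernel on one node's GPU, and each round of `tree_merge` would correspond to pairs of MPI ranks exchanging and merging their sorted subsequences.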
Keywords
application program interfaces, data reduction, graphics processing units, parallel architectures, pattern clustering, sorting, CUDA, GPU cluster, MPI, cluster node, data set, hybrid bitonic sorting algorithm, parallel bitonic sort, reduction sorting network, GPU clusters, hybrid CUDA-MPI, parallel sorting algorithm