Work-Stealing Prefix Scan: Addressing Load Imbalance in Large-Scale Image Registration

IEEE Transactions on Parallel and Distributed Systems(2022)

引用 4|浏览49
暂无评分
摘要
Parallelism patterns (e.g., map or reduce) have proven to be effective tools for parallelizing high-performance applications. In this article, we study the recursive registration of a series of electron microscopy images - a time consuming and imbalanced computation necessary for nano-scale microscopy analysis. We show that by translating the image registration into a specific instance of the prefix scan, we can convert this seemingly sequential problem into a parallel computation that scales to over thousand of cores. We analyze a variety of scan algorithms that behave similarly for common low-compute operators and propose a novel work-stealing procedure for a hierarchical prefix scan. Our evaluation shows that by identifying a suitable and well-optimized prefix scan algorithm, we reduce time-to-solution on a series of 4,096 images spanning ten seconds of microscopy acquisition from over 10 hours to less than 3 minutes (using 1024 Intel Haswell cores), enabling derivation of material properties at nanoscale for long microscopy image series.
更多
查看译文
关键词
Prefix sum,parallel algorithms,work stealing,load balancing,image registration
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要