SegAlign: A Scalable GPU-Based Whole Genome Aligner

SC(2020)

Abstract
Pairwise Whole Genome Alignment (WGA) is a crucial first step to understanding evolution at the DNA sequence level. Pairwise WGA of the thousands of currently available species genomes could help make biological discoveries; however, computing them for even a fraction of the millions of possible pairs is prohibitive: WGA of a single pair of vertebrate genomes (human-mouse) takes 11 hours on a 96-core Amazon Web Services (AWS) instance (c5.24xlarge). This paper presents SegAlign, a scalable, GPU-accelerated system for computing pairwise WGA. SegAlign is based on the standard seed-filter-extend heuristic, in which the filtering stage dominates the runtime (e.g., 98% for human-mouse WGA) and is accelerated using GPU(s). Using three vertebrate genome pairs, we show that SegAlign provides a speedup of up to 14× on an 8-GPU, 64-core AWS instance (p3.16xlarge) for WGA and a nearly 2.3× reduction in dollar cost. SegAlign also allows parallelization over multiple GPU nodes and scales efficiently.
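For readers unfamiliar with the seed-filter-extend heuristic named in the abstract, the following is a minimal CPU-only sketch in Python. It is not SegAlign's implementation: the function names, the exact k-mer seeds, the +1/-1 scoring, the X-drop cutoff, the score threshold, and the flank width are all illustrative assumptions. It only shows how the three stages relate, and why the filter stage (ungapped extension of every seed hit) can dominate the work.

```python
from collections import defaultdict


def build_seed_index(target, k):
    """Seed stage: index every k-mer of the target by its start position."""
    index = defaultdict(list)
    for i in range(len(target) - k + 1):
        index[target[i:i + k]].append(i)
    return index


def ungapped_extend(target, query, t_pos, q_pos, k, xdrop=3):
    """Filter stage: extend an exact seed hit in both directions without gaps.

    Scoring (+1 match, -1 mismatch) and the X-drop cutoff are assumptions.
    Returns (score, t_start, q_start, length) of the best ungapped segment.
    """
    def walk(t, q, step):
        score = best = best_len = steps = 0
        while 0 <= t < len(target) and 0 <= q < len(query):
            score += 1 if target[t] == query[q] else -1
            steps += 1
            if score > best:
                best, best_len = score, steps
            elif best - score > xdrop:
                break  # score fell too far below the best seen so far
            t += step
            q += step
        return best, best_len

    right, r_len = walk(t_pos + k, q_pos + k, 1)
    left, l_len = walk(t_pos - 1, q_pos - 1, -1)
    return k + left + right, t_pos - l_len, q_pos - l_len, k + l_len + r_len


def local_align_score(a, b, gap=-2):
    """Extend stage: Smith-Waterman local alignment score with linear gaps."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (1 if a[i - 1] == b[j - 1] else -1)
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


def seed_filter_extend(target, query, k=4, threshold=8, flank=5):
    """Run all three stages; return (t_start, q_start, gapped_score) tuples."""
    index = build_seed_index(target, k)
    seen = set()
    alignments = []
    for q_pos in range(len(query) - k + 1):
        for t_pos in index.get(query[q_pos:q_pos + k], ()):
            score, t0, q0, length = ungapped_extend(target, query, t_pos, q_pos, k)
            # Drop weak hits and segments already reached from another seed.
            if score < threshold or (t0, q0, length) in seen:
                continue
            seen.add((t0, q0, length))
            # Gapped extension is only run on windows around surviving segments.
            t_win = target[max(0, t0 - flank):t0 + length + flank]
            q_win = query[max(0, q0 - flank):q0 + length + flank]
            alignments.append((t0, q0, local_align_score(t_win, q_win)))
    return alignments
```

In this sketch every seed hit passes through `ungapped_extend`, while only the few segments that clear the threshold reach the more expensive gapped step. Each ungapped extension is independent of the others, which is the kind of workload that maps naturally onto a GPU.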
Keywords
Whole Genome Alignment, Graphics Processing Unit (GPU), Comparative Genomics, Apache Spark