An FPGA-based hardware accelerator supporting sensitive sequence homology filtering with profile hidden Markov models

bioRxiv (Cold Spring Harbor Laboratory) (2023)

Abstract
Background Sequence alignment lies at the heart of genome sequence annotation. While the BLAST suite of alignment tools has long held an important role in alignment-based sequence database search, greater sensitivity is achieved through the use of profile hidden Markov models (pHMMs). The Forward algorithm that provides much of pHMMs’ sensitivity is relatively slow, motivating extensive efforts to increase speed. Numerous researchers have devised methods to improve pHMM alignment speed using hardware accelerators such as graphics processing units (GPUs) and field programmable gate arrays (FPGAs). Here, we describe an FPGA hardware accelerator for a key bottleneck step in the analysis pipeline employed by the popular pHMM alignment tool, HMMER. HMMER accelerates pHMM Forward alignment by screening most sequences with a series of filters that rapidly approximate the result of computing full Forward alignment. The first of these filters, the Single Segment ungapped Viterbi (SSV) algorithm, is designed to filter out 98% of non-related inputs and accounts for 70% of the overall runtime of the DNA search tool nhmmer in common use cases. SSV is an ideal target for hardware acceleration due to its limited data dependency structure.

Results We present Hardware Accelerated single segment Viterbi Additional Coprocessor (HAVAC), an FPGA-based hardware accelerator for the SSV algorithm. The core HAVAC kernel calculates the SSV matrix at 1739 GCUPS on a Xilinx Alveo U50 FPGA accelerator card, ∼227x faster than the optimized SSV implementation in nhmmer. Accounting for PCI-e data transfer and data processing, HAVAC is 65x faster than nhmmer’s SSV with one thread and 35x faster than nhmmer with four threads, and uses ∼31% of the energy of a traditional high-end Intel CPU. Because these computations are performed on a co-processor, the host CPU remains free to simultaneously compute downstream pHMM alignment and later post-processing.
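To illustrate why an ungapped single-segment filter has a limited data dependency structure, the following is a minimal Python sketch of an ungapped local-alignment recurrence in which each cell depends only on its diagonal predecessor. It is a simplified illustration, not HMMER's or HAVAC's implementation: the function names `ssv_score` and `consensus_profile`, the match/mismatch scores, and the consensus-derived profile are all assumptions made for the example, whereas a real pHMM supplies position-specific emission scores.

```python
from typing import Dict, List


def consensus_profile(consensus: str, match: int = 2, mismatch: int = -3,
                      alphabet: str = "ACGT") -> List[Dict[str, int]]:
    """Build a toy position-specific score table from a consensus string.

    Hypothetical stand-in for pHMM match-state emission scores.
    """
    return [{c: (match if c == base else mismatch) for c in alphabet}
            for base in consensus]


def ssv_score(seq: str, profile: List[Dict[str, int]]) -> int:
    """Return the best score of any single ungapped segment.

    Cell (i, k) depends only on cell (i-1, k-1), so all cells in a row
    can be computed independently of one another.
    """
    best = 0
    prev = [0] * (len(profile) + 1)  # row i-1; index 0 is a boundary cell
    for residue in seq:
        curr = [0] * (len(profile) + 1)
        for k, column in enumerate(profile, start=1):
            # Extend the diagonal segment, or restart at zero if it goes negative.
            curr[k] = max(0, prev[k - 1] + column[residue])
            best = max(best, curr[k])
        prev = curr
    return best


if __name__ == "__main__":
    profile = consensus_profile("ACGT")
    print(ssv_score("TTACGTTT", profile))
```

Because no cell reads from its left or upper neighbor, an entire row can in principle be updated in parallel, which is the property that makes this kind of filter a natural fit for wide hardware parallelism.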
Author summary Sequence alignment lies at the heart of genome sequence annotation, and must be both fast and accurate. Signals of relationships between sequences are obscured over time by mutational forces, so that alignment and annotation of the full diversity of life demands highly sensitive tools. Profile hidden Markov models (pHMMs) provide the greatest sensitivity in the face of diversity, but are relatively slow. Here, we describe an approach to improving the speed of pHMM search that leverages field programmable gate arrays (FPGAs), hardware devices that can be configured to implement arbitrary digital circuits, to achieve impressive parallelism and energy efficiency. Our tool, HAVAC, accelerates one key bottleneck step in the analysis pipeline employed by the popular pHMM alignment tool, HMMER. HAVAC produces a ∼60x speedup over the analogous stage in HMMER. HAVAC can be implemented as a part of a larger sequence homology search tool for faster search times and reduced energy usage. Interested users can download HAVAC on GitHub at .

Competing Interest Statement The authors have declared no competing interest.
Keywords
sensitive sequence homology, hardware accelerator, Markov models, FPGA-based