RawHash2: Accurate and Fast Mapping of Raw Nanopore Signals using a Hash-based Seeding Mechanism

arXiv (Cornell University)(2023)

引用 0|浏览6
暂无评分
摘要
Raw nanopore signals can be analyzed while they are being generated, a process known as real-time analysis. Real-time analysis of raw signals is essential to utilize the unique features that nanopore sequencing provides, enabling the early stopping of the sequencing of a read or the entire sequencing run based on the analysis. The state-of-the-art mechanism, RawHash, offers the first hash-based efficient and accurate similarity identification between raw signals and a reference genome by quickly matching their hash values. In this work, we introduce RawHash2, which provides major improvements over RawHash, including a more sensitive chaining implementation, weighted mapping decisions, frequency filters to reduce ambiguous seed hits, minimizers for hash-based sketching, and support for the R10.4 flow cell version and various data formats such as POD5. Compared to RawHash, RawHash2 provides the best F1 accuracy in all datasets and provides better average throughput by 2.5x.
更多
查看译文
关键词
rawhash2 nanopore signals,fast mapping,hash-based
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要