RawHash2: Mapping Raw Nanopore Signals Using Hash-Based Seeding and Adaptive Quantization
arXiv (2023)
Abstract
Summary: Raw nanopore signals can be analyzed while they are being generated,
a process known as real-time analysis. Real-time analysis of raw signals is
essential to utilize the unique features that nanopore sequencing provides,
enabling the early stopping of the sequencing of a read or the entire
sequencing run based on the analysis. The state-of-the-art mechanism, RawHash,
offers the first hash-based efficient and accurate similarity identification
between raw signals and a reference genome by quickly matching their hash
values. In this work, we introduce RawHash2, which provides major improvements
over RawHash, including a more sensitive quantization and chaining
implementation, weighted mapping decisions, frequency filters to reduce
ambiguous seed hits, minimizers for hash-based sketching, and support for the
R10.4 flow cell version and various data formats such as POD5 and SLOW5.
Compared to RawHash, RawHash2 provides better F1 accuracy (on average by 10.57%
and up to 20.25%). Availability and Implementation: RawHash2 is available at
https://github.com/CMU-SAFARI/RawHash. We also provide the scripts to fully
reproduce our results on our GitHub page.
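
Illustrative sketch: the abstract names several building blocks (quantization, hash-based seeding, minimizer sketching, and a frequency filter on ambiguous seeds) without showing how they fit together. The Python below is a minimal toy version of that general pipeline and is not taken from the RawHash2 code base. All function names, the fixed-width bucketing, the parameters (`n_buckets`, `k`, `w`, `max_freq`) and the simple integer packing used as a "hash" are assumptions chosen for readability; RawHash2's adaptive quantization, chaining, and weighted mapping decisions are not modeled here.

```python
from collections import defaultdict


def quantize(events, n_buckets=16, lo=-3.0, hi=3.0):
    """Map each normalized event value to an integer bucket in [0, n_buckets).

    Assumption: uniform bucket widths over [lo, hi]; values outside are clipped.
    """
    width = (hi - lo) / n_buckets
    buckets = []
    for x in events:
        clipped = min(max(x, lo), hi)
        buckets.append(min(int((clipped - lo) / width), n_buckets - 1))
    return buckets


def seed_hashes(buckets, k=4, n_buckets=16):
    """Pack every run of k consecutive buckets into one integer seed value.

    Returns (seed_value, start_position) pairs.
    """
    seeds = []
    for i in range(len(buckets) - k + 1):
        h = 0
        for b in buckets[i:i + k]:
            h = h * n_buckets + b
        seeds.append((h, i))
    return seeds


def minimizers(seeds, w=3):
    """Keep the smallest seed in each window of w consecutive seeds."""
    picked = []
    for i in range(len(seeds) - w + 1):
        best = min(seeds[i:i + w])
        if not picked or picked[-1] != best:
            picked.append(best)
    return picked


def build_index(seeds, max_freq=2):
    """Index seed value -> positions, dropping seeds seen more than max_freq times."""
    table = defaultdict(list)
    for h, pos in seeds:
        table[h].append(pos)
    return {h: ps for h, ps in table.items() if len(ps) <= max_freq}


def seed_hits(query_seeds, index):
    """Return (query_position, reference_position) pairs for matching seeds."""
    return [(qpos, rpos) for h, qpos in query_seeds for rpos in index.get(h, [])]
```

In this sketch, a reference would be converted to expected event values, passed through `quantize`, `seed_hashes`, and `minimizers`, and stored with `build_index`; a raw read would go through the same first three steps before `seed_hits` looks up matches. Quantization is what lets noisy signal values that are close to each other produce identical seeds, and the frequency filter discards seeds that occur too often in the reference to be informative.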