WENGAN: Efficient and high quality hybrid de novo assembly of human genomes

bioRxiv (2019)

Abstract
The continuous improvement of long-read sequencing technologies, along with the development of ad hoc algorithms, has launched a new assembly era that promises high-quality genomes. However, it has proven difficult to use only long reads to generate accurate genome assemblies of large, repeat-rich human genomes. To date, most of the human genomes assembled from long error-prone reads add accurate short reads to further polish the consensus quality. Here, we report the development of a novel algorithm for hybrid assembly, WENGAN, and the assembly of four human genomes using a combination of sequencing data generated on ONT PromethION, PacBio Sequel, Illumina and MGI technology. WENGAN implements efficient algorithms that exploit the sequence information of short and long reads to tackle assembly contiguity as well as consensus quality. The resulting genome assemblies have high contiguity (contig NG50: 16.67-62.06 Mb), few assembly errors (contig NGA50: 10.9-45.91 Mb), good consensus quality (QV: 27.79-33.61), and high gene completeness (BUSCO complete: 94.6-95.1%), while consuming low computational resources (CPU hours: 153-1027). In particular, the WENGAN assembly of the haploid CHM13 sample achieved a contig NG50 of 62.06 Mb (NGA50: 45.91 Mb), which surpasses the contiguity of the current human reference genome (contig NG50: 57.88 Mb). Providing the highest quality at low computational cost, WENGAN is an important step towards the democratization of the assembly of human genomes. The WENGAN assembler is available at