The Human Pangenome’s sequence conservation reveals a landscape of polymorphic structural variations

bioRxiv (2022)

Abstract
The Human Pangenome is a new reference build that addresses many of the limitations of the current reference. The first release is based on 94 high-quality haploid assemblies from individuals with diverse backgrounds. To facilitate the Pangenome's adoption in the wider research community, we conducted a multiple-genome comparative analysis against the current GRCh38 reference. We applied a k-mer indexing strategy to identify highly conserved sequences that are present in all Pangenome assemblies, in GRCh38, and in CHM13, a reference assembly with telomere-to-telomere chromosomes. These pan-conserved tag segments provided an informative set of universally conserved sequences. Examining the intervals between pairs of these segments distinguished highly conserved segments of the genome from those that carry structural polymorphisms. We identified a Pangenome landscape of 60,764 polymorphic intervals with unique and geo-ethnic features. Overall, this study of the Pangenome revealed its conserved versus divergent features, including the landscape of polymorphic structural variants.

Competing Interest Statement: The authors have declared no competing interest.
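
The tag-and-interval idea described in the abstract can be illustrated with a toy sketch. This is not the authors' pipeline: the function names, the choice of single-copy k-mers as "tags", the use of one assembly as the ordering reference, and the length-difference test for polymorphism are all assumptions made for illustration. It also ignores reverse complements and rearrangements that change tag order.

```python
from collections import Counter

def unique_kmer_positions(seq, k):
    """Map each k-mer that occurs exactly once in seq to its start position."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    return {km: i for i, km in enumerate(kmers) if counts[km] == 1}

def polymorphic_intervals(assemblies, k, ref):
    """Hypothetical helper: find intervals between adjacent shared tags
    whose length differs between assemblies.

    assemblies: dict of assembly name -> sequence string
    ref: name of the assembly used only to order the tags
    """
    # Per-assembly index of single-copy k-mers
    index = {name: unique_kmer_positions(seq, k) for name, seq in assemblies.items()}
    # Tags are k-mers that are single-copy in every assembly
    shared = set.intersection(*(set(pos) for pos in index.values()))
    # Order tags by their position in the chosen reference assembly
    tags = sorted(shared, key=lambda km: index[ref][km])
    results = []
    for left, right in zip(tags, tags[1:]):
        # Distance between the two tags in each assembly
        spans = {name: pos[right] - pos[left] for name, pos in index.items()}
        if len(set(spans.values())) > 1:  # lengths disagree: candidate polymorphism
            results.append((left, right, spans))
    return results

# Toy data: "alt" replaces one base of "ref" with three bases between the flanks
toy = {"ref": "ACGTTGCA", "alt": "ACGTCCTGCA"}
print(polymorphic_intervals(toy, 3, "ref"))
# [('CGT', 'TGC', {'ref': 3, 'alt': 5})]
```

In this toy case the flanking 3-mers are shared and single-copy in both sequences, so only the interval spanning the changed bases is reported. Real assemblies would need much longer k-mers and a disk-backed index.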