The GIAB genomic stratifications resource for human reference genomes

bioRxiv (2023)

Abstract
Stratifying the genome into distinct genomic contexts is useful when developing bioinformatics software such as variant callers, because it allows performance to be assessed in difficult regions of the human genome. Here we describe a set of genomic stratifications for the human reference genomes GRCh37, GRCh38, and T2T-CHM13v2.0. Generating stratifications for the new, complete CHM13 reference genome is critical to understanding improvements in variant caller performance when using this reference. The GIAB stratifications can be used when benchmarking variant calls to analyze difficult regions of the human genome in a standardized way. We present stratifications in the CHM13 genome in comparison to GRCh37 and GRCh38, highlighting expansions in the hard-to-map and GC-rich stratifications, which provide useful insight into the accuracy of variants in these newly added regions. To evaluate the reliability and utility of the new stratifications, we used the stratifications of the three references to assess the accuracy of variant calls in diverse, challenging genomic regions. The means to generate these stratifications are available as a Snakemake pipeline at .

Competing Interest Statement
FJS receives research support from Genentech, Illumina, ONT and PacBio. BB is a full-time employee of DNAnexus.
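
The stratified benchmarking idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Snakemake pipeline or any benchmarking tool's implementation. It assumes a stratification is given as BED intervals (0-based, half-open, non-overlapping) and that each benchmarked call already carries a TP/FP/FN label. The function names, the BED text, and the calls are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def parse_bed(text):
    """Parse BED text into {chrom: sorted [(start, end), ...]} (0-based, half-open)."""
    regions = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions[chrom].append((int(start), int(end)))
    for chrom in regions:
        regions[chrom].sort()
    return dict(regions)


def in_regions(regions, chrom, pos0):
    """Return True if 0-based position pos0 lies inside an interval.

    Assumes the intervals on each chromosome do not overlap.
    """
    intervals = regions.get(chrom, [])
    # Index of the last interval whose start is <= pos0.
    i = bisect_right(intervals, (pos0, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos0 < intervals[i][1]


def stratified_metrics(calls, regions):
    """Compute precision and recall over calls that fall inside the stratification.

    calls: iterable of (chrom, pos0, label), label in {"TP", "FP", "FN"}.
    """
    counts = {"TP": 0, "FP": 0, "FN": 0}
    for chrom, pos0, label in calls:
        if in_regions(regions, chrom, pos0):
            counts[label] += 1
    tp, fp, fn = counts["TP"], counts["FP"], counts["FN"]
    precision = tp / (tp + fp) if tp + fp else None
    recall = tp / (tp + fn) if tp + fn else None
    return {"precision": precision, "recall": recall, **counts}


# Hypothetical stratification and labeled calls, for illustration only.
example_bed = "chr1\t100\t200\nchr1\t500\t600\n"
example_calls = [
    ("chr1", 150, "TP"),
    ("chr1", 199, "FP"),
    ("chr1", 200, "TP"),  # outside: the interval end is exclusive
    ("chr1", 510, "TP"),
    ("chr1", 550, "FN"),
    ("chr2", 150, "TP"),  # outside: chromosome not in the stratification
]
example_result = stratified_metrics(example_calls, parse_bed(example_bed))
```

Running the same function once per stratification (for example, a hard-to-map set and a GC-rich set) and once per reference gives per-context precision and recall that can be compared side by side.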