Improved Centromere Assemblies for RefGen_v4

bioRxiv (2022)

Abstract
Genome assemblies based on long-read sequencing technology have revolutionized the assembly of repeat-rich centromere regions. However, because maize centromeres are highly enriched for the tandem repeat CentC and centromeric retrotransposons (CR), automated genome assembly left gaps even in the excellent B73 RefGen_v4 reference genome constructed from long-read data. Manual editing of >140 Mb spanning the ten centromeres of maize inbred B73 resulted in the closure of 127 sequence gaps and the addition of >8.4 Mb of previously unanchored sequence (unitigs and reads) containing 24 genes, 2 Mb of CR repeat and 887 kb of CentC, without including any additional sequence data. The functional centromeres of five maize chromosomes were closed completely, including a 7 Mb region spanning the extremely CR2-rich CEN2. This improved assembly, B73 RefGen_v4CEN, was completed in February 2019 and has been available at , both as pseudomolecules and as centromere assemblies alone. Thus, the manual editing of existing sequence data significantly improved the centromere regions of the B73 RefGen_v4 reference genome. These data were used for centromere analyses until the release of RefGen_v5.

Competing Interest Statement
The authors have declared no competing interest.
Keywords
improved centromere assemblies