New targets acquired: improving locus recovery from the Angiosperms353 probe set

bioRxiv (Cold Spring Harbor Laboratory)(2020)

Abstract
Universal target enrichment kits maximise utility across wide evolutionary breadth while minimising the number of baits required to create a cost-efficient kit. Locus assembly requires a target reference, but the taxonomic breadth of the kit means that target reference files can be phylogenetically sparse. The Angiosperms353 kit has been successfully used to capture loci throughout angiosperms, but its target file includes sequence information from only 6–18 taxa per locus. Consequently, reads sequenced from on-target DNA molecules may fail to map to the references, leaving fewer on-target reads for assembly and reducing locus recovery. We expanded the Angiosperms353 target file, incorporating sequences from 566 transcriptomes to produce a ‘mega353’ target file, with each gene represented by 17–373 taxa. This mega353 file is a drop-in replacement for the original Angiosperms353 file in HybPiper analyses. We provide tools to subsample the file based on user-selected taxon groups, and to incorporate other transcriptome or protein-coding gene datasets. Compared to the default Angiosperms353 file, the mega353 file increased the percentage of on-target reads by an average of 31%, increased locus recovery at 75% length by 61.9%, and increased the total length of the concatenated loci by 30%. The mega353 file and associated scripts are available at: https://github.com/chrisjackson-pellicle/NewTargets
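The idea of subsampling a target file by taxon can be illustrated with a minimal Python sketch. This is not the authors' script from the NewTargets repository; it only assumes that each FASTA header follows the HybPiper target-file convention `>Taxon-Gene`, with the gene identifier after the last hyphen. The function names and the taxon labels `TAXA` and `TAXB` are hypothetical and chosen for illustration.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def subsample_targets(text, keep_taxa):
    """Keep only records whose taxon label is in keep_taxa.

    Assumption: headers look like "<taxon>-<gene>", so the taxon is
    everything before the last hyphen.
    """
    keep = set(keep_taxa)
    kept = []
    for header, seq in parse_fasta(text):
        taxon, _, _gene = header.rpartition("-")
        if taxon in keep:
            kept.append((header, seq))
    return kept


def genes_covered(records):
    """Return the set of gene identifiers still represented after filtering."""
    return {header.rpartition("-")[2] for header, _ in records}


# Hypothetical three-record target file (sequences are placeholders).
example = ">TAXA-4471\nATGC\nGGA\n>TAXB-4471\nATGA\n>TAXB-4527\nTTGA\n"
kept = subsample_targets(example, ["TAXB"])
```

Checking `genes_covered(kept)` after filtering is a quick way to confirm that a chosen taxon subset still represents every gene of interest, since a gene with no remaining reference sequence cannot be recovered.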
Keywords
locus recovery,probe,new targets