Exploring neighborhoods in large metagenome assembly graphs reveals hidden sequence diversity

bioRxiv (2019)

Abstract
Genomes computationally inferred from large metagenomic data sets are often incomplete and may be missing functionally important content and strain variation. We introduce an information retrieval system for large metagenomic data sets that exploits the sparsity of DNA assembly graphs to efficiently extract subgraphs surrounding an inferred genome. We apply this system to recover missing content from genome bins and show that substantial genomic sequence variation is present in a real metagenome. Our software implementation is available at https://github.com/spacegraphcats/spacegraphcats under the 3-Clause BSD License.
Keywords
metagenomics, sequence assembly, strain variation, bounded expansion, dominating set