KPop: Accurate, assembly-free, and scalable comparative analysis of microbial genomes

bioRxiv (2024)

Abstract
The recent explosion in the amount of available sequencing data challenges existing analysis techniques. Here we introduce KPop, a novel, versatile method based on full k-mer spectra and dataset-specific transformations, through which thousands of assembled or unassembled microbial genomes can be quickly compared. Unlike minimizer-based methods, which produce distances and have lower resolution, KPop is able to accurately map sequences onto a low-dimensional space. Extensive validation on simulated and real-life viral and bacterial datasets shows that KPop can correctly separate sequences at both species and sub-species levels, even when the overall genomic diversity is low. KPop also rapidly identifies related sequences and systematically outperforms minimizer-based methods. The KPop open-source code is available on GitHub at https://github.com/PaoloRibeca/KPop.

Competing Interest Statement: The authors have declared no competing interest.
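To make the term "full k-mer spectrum" in the abstract concrete, the following is a minimal Python sketch that counts every overlapping k-mer in a sequence, with no minimizer-style subsampling. It is an illustration only, not KPop's implementation: the function names `kmer_spectrum` and `normalise`, the choice to skip windows containing ambiguous bases, and the toy sequence are all assumptions made here. The sketch does not merge reverse-complement k-mers, and it does not attempt the dataset-specific transformations, which the abstract does not describe.

```python
from collections import Counter


def kmer_spectrum(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer in a sequence (hypothetical helper)."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Assumption: windows containing ambiguous bases (e.g. N) are skipped.
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def normalise(counts: Counter) -> dict:
    """Turn raw counts into relative frequencies (hypothetical helper)."""
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


# Toy input, not real data.
spectrum = kmer_spectrum("ACGTACGTNACG", k=3)
frequencies = normalise(spectrum)
```

Normalising the counts makes spectra from samples of different size comparable as vectors, which is the kind of representation a later dimensionality-reducing transformation could operate on.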