A fast machine-learning-guided primer design pipeline for selective whole genome amplification

bioRxiv (2022)

Abstract
Addressing many of the major outstanding questions in the fields of microbial evolution and pathogenesis will require analyses of populations of microbial genomes. Population genomic studies provide the analytical resolution to investigate evolutionary and mechanistic processes at fine spatial and temporal scales, precisely the scales at which these processes occur. However, microbial population genomic research is currently hindered by the practicalities of obtaining sufficient quantities of the relatively pure microbial genomic DNA necessary for next-generation sequencing. Here we present swga2.0, an optimized and parallelized pipeline to design selective whole genome amplification (SWGA) primer sets. Unlike previous methods, swga2.0 incorporates active learning and machine learning methods to evaluate the amplification efficacy of individual primers and primer sets. Additionally, swga2.0 optimizes primer set search and evaluation strategies, including parallelization at each stage of the pipeline, to dramatically decrease program runtime from weeks to minutes. We describe the swga2.0 pipeline, including the empirical data used to identify primer and primer set characteristics that improve amplification performance. We also evaluated the swga2.0 pipeline by designing primer sets that successfully amplify Prevotella melaninogenica, an important component of the lung microbiome in cystic fibrosis patients, from samples dominated by human DNA.

Competing Interest Statement: The authors have declared no competing interest.
Keywords
genome amplification, primer design pipeline, machine-learning-guided