CAMAMED: a pipeline for composition-aware mapping-based analysis of metagenomic data

Mohammad H. Norouzi-Beirami, Sayed-Amir Marashi, Ali M. Banaei-Moghaddam, Kaveh Kavousi

NAR Genomics and Bioinformatics (2021)

Abstract

Metagenomics is the study of genomic DNA recovered from a microbial community. Both assembly-based and mapping-based methods have been used to analyze metagenomic data. When appropriate gene catalogs are available, mapping-based methods are preferred over assembly-based approaches, especially for analyzing the data at the functional level. In this study, we introduce CAMAMED, a composition-aware mapping-based metagenomic data analysis pipeline. The pipeline can analyze metagenomic samples at both the taxonomic and functional profiling levels. With it, metagenome sequences can be mapped to non-redundant gene catalogs to obtain the frequency of each gene in the samples. Because metagenomic data are highly compositional, the pipeline applies the cumulative sum scaling method at both the taxon and gene levels for compositional data analysis. Additionally, by mapping the genes to the KEGG database, annotations related to each gene can be extracted at different functional levels, such as KEGG ortholog groups, enzyme commission numbers and reactions. Furthermore, the pipeline enables the user to identify potential biomarkers in case-control metagenomic samples by investigating functional differences. The source code for this software is available from https://github.com/mhnb/camamed. Ready-to-use Docker images are also available at https://hub.docker.com.
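
To make the cumulative sum scaling step concrete, the following is a minimal sketch of the general idea for a single sample, not the CAMAMED implementation. It assumes a fixed quantile (`q=0.5`) and a scaling constant of 1000; both are illustrative choices, and published implementations of the method typically select the quantile from the data. The function names `quantile` and `css_normalize` are hypothetical.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted, non-empty list."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def css_normalize(sample_counts, q=0.5, scale=1000.0):
    """Scale one sample's counts by the sum of its counts at or below a quantile.

    Assumption: the quantile q is fixed by the caller rather than estimated
    from the data, and the quantile is taken over the non-zero counts only.
    """
    positive = sorted(c for c in sample_counts if c > 0)
    if not positive:
        # A sample with no reads mapped to any gene stays all zeros.
        return [0.0 for _ in sample_counts]
    threshold = quantile(positive, q)
    # Only counts up to the quantile contribute to the scaling factor, so a
    # few highly abundant genes do not dominate the normalization.
    scaling_sum = sum(c for c in positive if c <= threshold)
    return [c / scaling_sum * scale for c in sample_counts]
```

Calling `css_normalize` once per sample, on either gene counts or taxon counts, yields values that are comparable across samples with different sequencing depths.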

Keywords

composition-aware, mapping-based