Hecatomb: An End-to-End Research Platform for Viral Metagenomics
bioRxiv (Cold Spring Harbor Laboratory) (2022)
Abstract
Background: Analysis of viral diversity using modern sequencing technologies offers extraordinary opportunities for discovery. However, these analyses present a number of bioinformatic challenges due to viral genetic diversity and virome complexity. Because viruses lack conserved marker sequences, metagenomic detection of viral sequences requires a non-targeted, random (shotgun) approach. Annotation and enumeration of viral sequences rely on rigorous quality control and effective search strategies against appropriate reference databases. Virome analysis also benefits from analyzing both individual metagenomic sequences and assembled contigs. Combined, these analyses produce large amounts of data that require sophisticated visualization and statistical tools.
Results: Here we introduce Hecatomb, a bioinformatics platform enabling both read- and contig-based analysis. Hecatomb integrates query information from both amino acid and nucleotide reference sequence databases. It also integrates data collected throughout the workflow, enabling analyst-driven virome analysis and discovery. Hecatomb is available on GitHub.
Conclusions: Hecatomb provides a single, modular software solution to the complex tasks required for virome analysis. We demonstrate the value of the approach by applying Hecatomb to both a host-associated (enteric) and an environmental (marine) virome data set. Hecatomb provided the data needed to distinguish true-positive from false-positive viral sequences in both data sets and revealed complex virome structure at distinct marine reef sites.
### Competing Interest Statement
The authors have declared no competing interest.
### Abbreviations
* AIDS: acquired immunodeficiency syndrome
* SIV: simian immunodeficiency virus
* HPC: high-performance computing
* NCBI: National Center for Biotechnology Information
* RPKM: reads per kilobase million
* FPKM: fragments per kilobase million
* SPM: sequences per million
* LCA: lowest common ancestor
* ICTV: International Committee on Taxonomy of Viruses
* PERMANOVA: permutational multivariate analysis of variance
* PCoA: principal coordinate analysis
* ANOVA: analysis of variance
* SIMPER: similarity percentage
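
Two of the abbreviations above, SPM and RPKM, name read-count normalizations. The abstract does not say how Hecatomb computes them, so the following is only a minimal sketch using the commonly cited textbook definitions. The function names and the example numbers are hypothetical and are not taken from the Hecatomb code base.

```python
def sequences_per_million(count: int, library_size: int) -> float:
    """Scale a raw sequence count to a library of one million sequences.

    Assumed definition: SPM = count * 1e6 / total sequences in the sample.
    """
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return count * 1_000_000 / library_size


def rpkm(read_count: int, length_bp: int, library_size: int) -> float:
    """Reads per kilobase of reference per million reads in the library.

    Assumed definition: RPKM = reads * 1e9 / (reference length in bp * library size).
    """
    if length_bp <= 0 or library_size <= 0:
        raise ValueError("length_bp and library_size must be positive")
    return read_count * 1_000_000_000 / (length_bp * library_size)


if __name__ == "__main__":
    # Hypothetical sample: 50 viral hits in a library of 2,000,000 sequences.
    print(sequences_per_million(50, 2_000_000))
    # Hypothetical contig: 200 mapped reads, 2,000 bp long, 4,000,000 reads total.
    print(rpkm(200, 2_000, 4_000_000))
```

SPM corrects only for sequencing depth, which suits read-level counts. RPKM also divides by reference length, which matters when comparing contigs of different sizes.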
Keywords: end-to-end