A Phylogenetic Framework to Simulate Synthetic Inter-species RNA-Seq Data.

Molecular Biology and Evolution (2022)

Abstract
Inter-species RNA-Seq datasets are increasingly common and have the potential to answer new questions about the evolution of gene expression. Single-species differential expression analysis is now a well-studied problem that benefits from sound statistical methods. Extensive reviews on biological or synthetic datasets have given the community a clear picture of the relative performance of the available methods in various settings. However, tools for simulating synthetic datasets are still missing in the inter-species gene expression context. In this work, we develop and implement a new simulation framework. This tool builds on both the RNA-Seq and the phylogenetic comparative methods literature to generate realistic count datasets while taking into account the phylogenetic relationships between the samples. We illustrate the usefulness of this new framework through a targeted simulation study that reproduces the features of a recently published dataset containing gene expression data in adult eye tissue across blind and sighted freshwater crayfish species. Using our simulated datasets, we perform a fair comparison of several approaches used for differential expression analysis. This benchmark reveals some of the strengths and weaknesses of both the classical and phylogenetic approaches for inter-species differential expression analysis, and allows for a reanalysis of the crayfish dataset. The tool has been integrated into the R package compcodeR, freely available on Bioconductor.
Keywords
RNA-Seq, comparative transcriptomics, crayfish, differential gene expression, orthologous genes, phylogenetic comparative methods