Probabilistic models for the analysis of gene expression profiles

Probabilistic models for the analysis of gene expression profiles (2012)

Abstract
Gene expression profiles are among the most abundant sources of data about the cellular state of a collection of cells in an organism. Comparing the expression profiles of multiple samples allows biologists to find associations between observations at the molecular level and the phenotype of the samples. A key challenge is to distinguish variation in expression due to biological factors of interest from variation due to confounding factors that can arise for unrelated technical or biological reasons. This thesis presents models that explicitly adjust the comparison of expression profiles to account for specific types of confounding factors.

One such confounding factor arises when comparing tissue-specific expression profiles across multiple organisms to identify differences in expression that are indicative of changes in gene function. When the organisms are separated by long evolutionary distances, tissue functions may be redistributed, introducing expression changes unrelated to changes in gene function. We developed Brownian Factor Phylogenetic Analysis, a model that can account for such redistribution of function, and demonstrated that removing this confounding factor improves tasks such as predicting gene function.

Another confounding factor arises because current protocols for expression profiling require RNA extracts from multiple cells. Biological samples are often heterogeneous mixtures of multiple cell types, so the measured expression profile is an average of the RNA levels of the constituent cells. When the biological sample contains both cells of interest and nuisance cells, the confounding expression from the nuisance cells can mask the expression of the cells of interest. We developed ISOLATE and ISOpure, two models for addressing the heterogeneity of tumor samples. We demonstrated that modeling tumor heterogeneity leads to an improvement in two tasks: identifying the site of origin of metastatic tumors, and predicting the risk of death of lung cancer patients.
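
The averaging effect described in the abstract, where a measured profile blends the RNA levels of several cell types, can be illustrated with a toy linear mixture. This is only a sketch under simplifying assumptions, not the ISOLATE or ISOpure model: the function names, the three-gene profiles, and the mixing fractions are all hypothetical, and the nuisance profile and its fraction are assumed to be known exactly, whereas in practice they would have to be inferred.

```python
def mix_profiles(cell_profiles, fractions):
    """Return the fraction-weighted average of per-cell-type expression profiles."""
    if len(cell_profiles) != len(fractions):
        raise ValueError("one fraction per cell type is required")
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    n_genes = len(cell_profiles[0])
    return [
        sum(f * profile[g] for profile, f in zip(cell_profiles, fractions))
        for g in range(n_genes)
    ]


def unmix_known(measured, nuisance_profile, nuisance_fraction):
    """Recover the profile of interest, assuming the nuisance profile and
    its fraction are known exactly (an idealization for illustration)."""
    if not 0.0 <= nuisance_fraction < 1.0:
        raise ValueError("nuisance_fraction must be in [0, 1)")
    keep = 1.0 - nuisance_fraction
    return [
        (m - nuisance_fraction * n) / keep
        for m, n in zip(measured, nuisance_profile)
    ]


# Hypothetical expression levels for three genes (illustrative values only).
tumor = [10.0, 2.0, 0.0]    # cells of interest
normal = [2.0, 2.0, 8.0]    # nuisance cells

# A sample that is 25% tumor and 75% normal: the tumor signal is diluted.
measured = mix_profiles([tumor, normal], [0.25, 0.75])

# With the nuisance contribution known, the tumor profile can be recovered.
recovered = unmix_known(measured, normal, 0.75)
```

In this toy sample the first gene is five times higher in tumor than in normal cells, yet the measured value is only twice the normal level, which is the masking effect the abstract refers to.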
Keywords
expression change, probabilistic model, gene expression profile, expression profile, biological sample, measured expression profile, confounding expression, gene function, confounding factor, expression profiling, tissue-specific expression profile