Using Deep Learning To Extrapolate Protein Expression Measurements

M Barzine, K Freivalds, J C Wright, M Opmanis, D Rituma, F Z Ghavidel, A J Jarnuczak, E Celms, K Cerans, I Jonassen, L Lace, J A Vizcaíno, J Choudhary, A Brazma, J Viksna

PROTEOMICS (2020)


Abstract

Mass spectrometry (MS)-based quantitative proteomics experiments typically assay a subset of up to 60% of the approximately 20 000 human protein-coding genes. Computational methods for imputing the missing values using RNA expression data usually allow imputation only for proteins measured in at least some of the samples. In silico methods for comprehensively estimating abundances across all proteins are still missing. Here, a novel method is proposed that uses deep learning to extrapolate the observed protein expression values in label-free MS experiments to all proteins, leveraging gene functional annotations and RNA measurements as key predictive attributes. The method is tested on four datasets, including human cell lines and human and mouse tissues. It predicts the protein expression values with average R² scores between 0.46 and 0.54, which is significantly better than predictions based on correlations using the RNA expression data alone. Moreover, it is demonstrated that the derived models can be "transferred" across experiments and species. For instance, the model derived from human tissues gave an R² of 0.51 when applied to mouse tissue data. It is concluded that protein abundances generated in label-free MS experiments can be computationally predicted using functionally annotated attributes, and that the predictions can be used to highlight aberrant protein abundance values.
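
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an R² score can be computed for a set of predicted protein abundances. It assumes the standard coefficient of determination; the abstract does not state the exact definition the authors used, and the function name and sample values here are illustrative only, not taken from the paper.

```python
def r2_score(observed, predicted):
    """Standard coefficient of determination, 1 - SS_res / SS_tot.

    Assumption: this is the textbook definition; the paper's own
    evaluation procedure may differ in detail.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    mean_observed = sum(observed) / len(observed)

    # Residual sum of squares: error left over after prediction.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)

    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical abundance values (e.g. on a log scale), for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]
score = r2_score(observed, predicted)
```

Under this definition, a score of 1 means perfect prediction and a score of 0 means the model does no better than predicting the mean observed abundance for every protein.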

Keywords

deep learning networks, Gene Ontology, mass spectrometry, protein abundance prediction, UniProt keywords
