PICNIC identifies condensate-forming proteins across organisms

bioRxiv (2023)

Abstract
Biomolecular condensates are membrane-less organelles that selectively concentrate biomolecules to perform a multitude of fundamental biochemical functions. Despite their biological and therapeutic interest, experimental identification of condensate-forming proteins at the scale of an entire proteome remains challenging. To enable systematic detection of condensate proteins, we developed an algorithm to recognize proteins involved in in vivo biomolecular condensates regardless of the mechanism of condensate formation. Using a curated dataset of condensates, we trained a machine learning classifier based on sequence- and structure-based features to predict whether a protein is part of a condensate. Our model, PICNIC (Proteins Involved in CoNdensates In Cells), outperforms other prediction tools, and although it was trained on human data, it generalizes well to various organisms. Experimental validation of 24 proteins spanning a wide range of functions, structural content and disease relevance confirmed that 18 of them localize to condensates with high confidence, while 3 form condensates with low confidence. Thus, our experimental validation suggests an ∼87.5% success rate (75% with high confidence and 12.5% with low confidence) in identifying condensate-forming proteins. Proteome-wide predictions by PICNIC estimate that ∼40% of proteins partition into condensates across different organisms, from bacteria to humans, with no apparent correlation with organismal complexity or disordered protein content. Our model will shed light on the evolution of biomolecular condensates and will help identify potential protein targets to modulate biomolecular condensate formation.

Competing Interest Statement
A.A.H. is a founder and shareholder of Dewpoint Therapeutics. The remaining authors declare no competing interests.
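The validation success rates quoted in the abstract follow directly from the reported counts (24 proteins tested, 18 high-confidence hits, 3 low-confidence hits). The short Python sketch below only reproduces that arithmetic; the function name `validation_rates` is a hypothetical helper introduced for illustration and is not part of PICNIC.

```python
from fractions import Fraction


def validation_rates(total, high_conf, low_conf):
    """Return (high, low, combined) success rates as percentages.

    Hypothetical helper for illustration only; the counts passed in
    below are the ones reported in the abstract.
    """
    high = Fraction(high_conf, total) * 100
    low = Fraction(low_conf, total) * 100
    return float(high), float(low), float(high + low)


# Counts from the abstract: 24 tested, 18 high confidence, 3 low confidence.
high, low, combined = validation_rates(total=24, high_conf=18, low_conf=3)
print(f"high confidence: {high}%")
print(f"low confidence:  {low}%")
print(f"combined:        {combined}%")
```

The combined figure counts both high- and low-confidence localizations as successes, which is how the abstract arrives at its overall rate.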