A Sequence-Based Model for Identifying Proteins Undergoing Liquid-Liquid Phase Separation/forming Fibril Aggregates Via Machine Learning.

Protein Science (2024)

Abstract
Liquid-liquid phase separation (LLPS) and the formation of solid aggregates (also referred to as amyloid aggregates) by proteins have gained significant attention in recent years due to their associations with various physiological and pathological processes in living organisms. The differences and connections between proteins undergoing LLPS and those forming amyloid fibrils have not yet been systematically investigated at the sequence level. In this research, we aim to address this gap by comparing the two types of proteins across 36 features using currently available data. The statistical comparison indicates that 24 of the 36 selected features differ significantly between the two protein groups. An LLPS-Fibrils binary classification model built on these 24 features using random forest identifies the fraction of intrinsically disordered residues (FIDR) as the most crucial feature. In contrast, in a further three-class LLPS-Fibrils-Background classification model built on the same screened features, the compositions of cysteine and leucine contribute more than the other features. Through feature ablation analysis, we finally constructed a model, FLFB (Feature-based LLPS-Fibrils-Background protein predictor), using six refined features, with an average area under the receiver operating characteristic curve of 0.83. This work indicates that, using sequence features and a machine learning model, proteins undergoing LLPS or forming amyloid fibrils can be identified.
Keywords
aggregates, classification model, liquid-liquid phase separation, machine learning, protein, sequence charge decoration
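
The abstract names the composition of cysteine and of leucine as influential sequence features. As a minimal sketch of how such a composition feature could be computed from a raw sequence, the snippet below counts each standard amino acid and divides by the sequence length. The function name `amino_acid_composition`, the toy sequence, and the choice to ignore non-standard characters are assumptions made for illustration; the abstract does not give the authors' exact feature definitions.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> dict[str, float]:
    """Return the fraction of each standard amino acid in `sequence`.

    Assumption: characters outside the 20 standard residues are ignored,
    and fractions are taken over the standard residues only.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in STANDARD_AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in STANDARD_AMINO_ACIDS}


# Hypothetical toy sequence, not taken from the paper's dataset.
toy = "MCLLKCGA"
comp = amino_acid_composition(toy)
print(comp["C"], comp["L"])  # cysteine and leucine fractions: 0.25 0.25
```

A vector of such fractions, one row per protein, is the kind of input a random forest classifier could be trained on; the other features mentioned in the abstract, such as FIDR, would need a disorder predictor and are not sketched here.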