Refining Embedding-Based Binding Predictions by Leveraging AlphaFold2 Structures

bioRxiv (2022)

Abstract
Background: Identifying the residues in a protein that are involved in ligand binding is important for understanding its function. bindEmbed21DL is a machine learning method that predicts protein-ligand binding at the per-residue level using embeddings derived from the protein Language Model (pLM) ProtT5. The method relies solely on sequences, making it easily applicable to all proteins. However, highly reliable protein structures are now accessible through the AlphaFold Protein Structure Database (AFDB) or can be predicted using AlphaFold2 and ColabFold, which allows structural information to be incorporated into such sequence-based predictors.

Results: Here, we propose bindAdjust, which leverages predicted distance maps to adjust the binding probabilities of bindEmbed21DL and thereby boost performance. For small molecule binding, bindAdjust raises the recall of bindEmbed21DL from 47±2% to 53±2% at a precision of 50%. For binding to metal ions and nucleic acids, bindAdjust serves as a filter that identifies good predictions, focusing on the binding site rather than on isolated residues. Further investigation of two examples shows that bindAdjust is in fact able to add binding predictions for residues that are not close in sequence but close in structure, extending the binding residue predictions of bindEmbed21DL to larger binding stretches or binding sites.

Conclusion: Due to its simplicity and speed, the bindAdjust algorithm can easily refine binding predictions from tools other than bindEmbed21DL and, in fact, could be applied to any protein prediction task.

Competing Interest Statement: The authors have declared no competing interest.

Abbreviations: 3D: three-dimensional; AFDB: AlphaFold Protein Structure Database; FN: false negative(s); FP: false positive(s); MSA: Multiple Sequence Alignment; NLP: Natural Language Processing; pLM: protein Language Model; TN: true negative(s); TP: true positive(s).
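The abstract does not spell out how bindAdjust uses the distance map, so the following is only a minimal sketch of the general idea of adjusting per-residue binding probabilities with structural proximity. It is not the authors' algorithm. The function name `adjust_binding_probabilities`, the cutoffs (`seed_cutoff`, `dist_cutoff`), and the additive `boost` are all hypothetical choices made for illustration: residues that lie spatially close to a confidently predicted binding residue have their probability raised.

```python
from typing import List, Sequence


def adjust_binding_probabilities(
    probs: Sequence[float],
    dist: Sequence[Sequence[float]],
    seed_cutoff: float = 0.5,   # assumed: probability at which a residue counts as binding
    dist_cutoff: float = 8.0,   # assumed: distance (in Angstrom) that counts as "close"
    boost: float = 0.2,         # assumed: additive increase for residues near a seed
) -> List[float]:
    """Raise the probability of residues that are spatially close to a
    confidently predicted binding residue (a "seed").

    Hypothetical illustration only; not the published bindAdjust procedure.
    """
    n = len(probs)
    if len(dist) != n or any(len(row) != n for row in dist):
        raise ValueError("distance map must be an n x n matrix matching probs")

    # Seeds are residues the sequence-based predictor already calls binding.
    seeds = [i for i, p in enumerate(probs) if p >= seed_cutoff]
    seed_set = set(seeds)

    adjusted = list(probs)
    for i in range(n):
        if i in seed_set:
            continue  # seeds keep their original probability
        # Boost residue i if any seed lies within the distance cutoff.
        if any(dist[i][j] <= dist_cutoff for j in seeds):
            adjusted[i] = min(1.0, probs[i] + boost)
    return adjusted


# Toy example: residue 0 is a confident binder; residue 4 is far away in
# sequence but close in space (6.0), so it is boosted along with residue 1.
toy_probs = [0.9, 0.3, 0.1, 0.1, 0.4]
toy_dist = [
    [0.0, 3.8, 12.0, 15.0, 6.0],
    [3.8, 0.0, 3.8, 12.0, 9.0],
    [12.0, 3.8, 0.0, 3.8, 11.0],
    [15.0, 12.0, 3.8, 0.0, 3.8],
    [6.0, 9.0, 11.0, 3.8, 0.0],
]
toy_adjusted = adjust_binding_probabilities(toy_probs, toy_dist)
```

In this toy input, residues 2 and 3 stay unchanged because they are farther than the assumed cutoff from the only seed, which mirrors the abstract's point that added predictions can be distant in sequence yet close in structure.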
Keywords
binding predictions,embedding-based