A benchmark of protein solubility prediction methods on UDP-dependent glycosyltransferases

bioRxiv (2020)

Abstract
UDP-dependent glycosyltransferases (UGTs) are enzymes that glycosylate a wide variety of natural products, thereby modifying their physico-chemical properties, i.e. solubility, stability, reactivity, and function. To successfully leverage UGTs in biocatalytic processes, we need to be able to screen and characterise them, which requires efficient heterologous expression in amenable hosts, preferably Escherichia coli. However, many UGTs are insoluble when expressed under standard conditions and even after attempted optimisation, resulting in many unproductive and costly experiments. To overcome this limitation, we have investigated the performance of 11 existing solubility predictors on a dataset of 57 UGTs expressed in E. coli. We show that SoluProt outperforms the other methods in terms of both threshold-independent and threshold-dependent measures. Among the benchmarked methods, only SoluProt is significantly better than random predictors by both measures. Moreover, we show that the threshold SoluProt uses to separate soluble from insoluble proteins is optimal for our dataset. Hence, we conclude that using SoluProt to select UGT sequences for investigation will significantly increase the success rate of soluble expression, thereby minimising cost and enabling efficient characterisation efforts for biocatalysis research.
Keywords
Protein solubility, Benchmark, UDP-dependent glycosyltransferase, Heterologous expression, Escherichia coli, SoluProt