Within and cross species predictions of plant specialized metabolism genes using transfer learning

bioRxiv (2020)

Abstract
Plant specialized metabolites mediate interactions between plants and the environment and have significant agronomic and pharmaceutical value. Most genes involved in specialized metabolism (SM) are unknown because of the large number of specialized metabolites and the difficulty of differentiating SM genes from general metabolism (GM) genes. We employed transfer learning, a machine learning strategy in which information from one species with substantially more experimentally derived function data (Arabidopsis) is used to build a model to predict gene functions in another species (tomato). Using machine learning to integrate heterogeneous gene features, we built models distinguishing tomato SM and GM genes. Although SM/GM genes can be predicted based on tomato data alone (F-measure = 0.74, compared with 0.5 for random and 1.0 for perfect predictions), using information from Arabidopsis to filter likely misannotated genes significantly improves prediction (F-measure = 0.92). This study demonstrates that SM/GM genes can be better predicted by leveraging cross-species information.
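To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' pipeline. It computes the F-measure (the harmonic mean of precision and recall) for SM/GM labels and shows one way a source-species model could be used to filter likely misannotated target-species genes before training. The feature name `expr_breadth`, the threshold rule standing in for an Arabidopsis-trained model, and the toy gene records are all hypothetical.

```python
def f_measure(y_true, y_pred, positive="SM"):
    """F-measure: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def filter_by_source_model(genes, source_predict):
    """Keep only target-species genes whose annotation agrees with the
    prediction of a model trained on the source species."""
    return [g for g in genes if source_predict(g["features"]) == g["label"]]


# Hypothetical stand-in for a model trained on Arabidopsis data.
def toy_source_predict(features):
    return "SM" if features["expr_breadth"] < 0.5 else "GM"


# Hypothetical tomato genes with annotated labels (g3 disagrees with the source model).
toy_genes = [
    {"id": "g1", "features": {"expr_breadth": 0.2}, "label": "SM"},
    {"id": "g2", "features": {"expr_breadth": 0.8}, "label": "GM"},
    {"id": "g3", "features": {"expr_breadth": 0.9}, "label": "SM"},
    {"id": "g4", "features": {"expr_breadth": 0.7}, "label": "GM"},
]

kept_ids = [g["id"] for g in filter_by_source_model(toy_genes, toy_source_predict)]
half_right = f_measure(["SM", "SM", "GM", "GM"], ["SM", "GM", "SM", "GM"])
perfect = f_measure(["SM", "GM"], ["SM", "GM"])
```

In this toy setup, a target-species classifier would then be trained only on the retained genes; which classifier and which features to use are left open here.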
Keywords
Cross-species gene prediction, specialized metabolism, transfer learning