MantaID: a Machine Learning-Based Tool to Automate the Identification of Biological Database IDs.

Zhengpeng Zeng, Jiamin Hu, Miyuan Cao, Bingbing Li, Xiting Wang, Feng Yu, Longfei Mao

Database (2023)

Abstract
The number of biological databases is growing rapidly, but different databases use different identifiers (IDs) to refer to the same biological entity. This inconsistency in IDs impedes the integration of various types of biological data. To resolve the problem, we developed MantaID, a data-driven, machine learning-based approach that automates the identification of IDs on a large scale. The MantaID model achieved a prediction accuracy of 99%, and it correctly and effectively predicted 100,000 ID entries within 2 min. MantaID supports the discovery and exploitation of IDs from large numbers of databases (e.g. up to 542 biological databases). An easy-to-use, freely available, open-source R package, a user-friendly web application and application programming interfaces were also developed for MantaID to improve applicability. To our knowledge, MantaID is the first tool that enables automatic, quick, accurate and comprehensive identification of large quantities of IDs, and it can therefore be used as a starting point to facilitate the complex assimilation and aggregation of biological data across diverse databases.
Keywords
Biological Network Integration, Genomic Data Integration, Bioinformatics, Data-independent Acquisition, Protein Identification