BIOS: An Algorithmically Generated Biomedical Knowledge Graph

Shuang Yu, Yi Zheng, Jun Xia, Shengxuan Luo, Huaiyuan Ying, Sihang Zeng, Jingyi Ren, Hongbin Yuan, Zujin Zhao, Yucong Lin, Kening Lu, Jing Wang, Yibing Xie, Heung-Yeung Shum

arXiv (Cornell University) (2022)

Abstract
Biomedical knowledge graphs (BioMedKGs) are essential infrastructures for biomedical and healthcare big data and artificial intelligence (AI), facilitating natural language processing, model development, and data exchange. For decades, these knowledge graphs have been developed via expert curation; however, this method can no longer keep up with today's AI development, and a transition to algorithmically generated BioMedKGs is necessary. In this work, we introduce the Biomedical Informatics Ontology System (BIOS), the first large-scale publicly available BioMedKG generated completely by machine learning algorithms. BIOS currently contains 4.1 million concepts, 7.4 million terms in two languages, and 7.3 million relation triplets. We present the methodology for developing BIOS, including the curation of raw biomedical terms, computational identification of synonymous terms and aggregation of these terms to create concept nodes, semantic type classification of the concepts, relation identification, and biomedical machine translation. We provide statistics on the current BIOS content and perform preliminary assessments of term quality, synonym grouping, and relation extraction. The results suggest that machine learning-based BioMedKG development is a viable alternative to traditional expert curation.
Keywords
biomedical knowledge graph, algorithmically