
A New Machine-Learning Extracting Approach to Construct a Knowledge Base: A Case Study on Global Stromatolites over Geological Time

Journal of Earth Science (2023)

Abstract
Within any scientific discipline, a large amount of data is buried within various literature repositories and archives, making it difficult to manually extract useful information from these data swamps. Machine-learning extraction of data is therefore necessary for big-data-based studies. Here, we develop a new text-mining technique to reconstruct the global database of Precambrian to Recent stromatolites, providing a better understanding of secular changes in stromatolites through geological time. The step-by-step data extraction process is described below. First, PDF documents of the stromatolite-bearing literature were collected and converted into text format. Second, a glossary and tag-labeling system using NLP (Natural Language Processing) software was employed to search for all possible candidate pairs in each sentence of the collected papers. Third, each candidate pair and its features were represented as a factor graph model, using a series of heuristic procedures to score the weights of each pair feature. Occurrence data of stromatolites versus stratigraphical units (abbreviated as Strata), facies types, locations, and ages worldwide were extracted from the literature, respectively, and their extraction accuracies are 92
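The first two steps of the extraction process described in the abstract (text conversion followed by glossary-based tagging and candidate-pair search) can be illustrated with a minimal sketch. This is not the authors' implementation: the glossary entries, the formation names ("Alpha Formation", "Beta Formation"), the function names, and the character-distance feature are all hypothetical and chosen only for illustration. The sketch assumes the PDFs have already been converted to plain text, and it stops before the factor-graph scoring step, which it does not attempt to reproduce.

```python
import re
from itertools import product

# Hypothetical glossary mapping a tag label to surface terms.
# These entries are placeholders, not taken from the paper.
GLOSSARY = {
    "FOSSIL": ["stromatolite", "stromatolites"],
    "STRATA": ["Alpha Formation", "Beta Formation"],
}


def split_sentences(text):
    """Naively split text into sentences on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tag_sentence(sentence):
    """Return (label, matched_text, start_offset) for every glossary hit."""
    tags = []
    for label, terms in GLOSSARY.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            for m in re.finditer(pattern, sentence, flags=re.IGNORECASE):
                tags.append((label, m.group(0), m.start()))
    return tags


def candidate_pairs(text):
    """Pair every FOSSIL mention with every STRATA mention in the same sentence."""
    pairs = []
    for sentence in split_sentences(text):
        tags = tag_sentence(sentence)
        fossils = [t for t in tags if t[0] == "FOSSIL"]
        strata = [t for t in tags if t[0] == "STRATA"]
        for fossil, unit in product(fossils, strata):
            pairs.append({
                "fossil": fossil[1],
                "strata": unit[1],
                # Character distance between mentions: an assumed example
                # of a feature that a later scoring step could weight.
                "distance": abs(fossil[2] - unit[2]),
                "sentence": sentence,
            })
    return pairs


if __name__ == "__main__":
    sample = (
        "Stromatolites occur in the Alpha Formation. "
        "The Beta Formation is mostly shale."
    )
    for pair in candidate_pairs(sample):
        print(pair["fossil"], "<->", pair["strata"], "| distance:", pair["distance"])
```

Restricting candidates to mentions that co-occur in one sentence mirrors the sentence-level search described in the abstract. Each resulting pair would still need to be scored, which is the role the abstract assigns to the factor graph model.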
Keywords
machine learning,knowledge base construction,stromatolites,Precambrian,knowledge graph