Learning the ontological theory of an information extraction system in the multi-predicate ILP setting.

SAC '09: The 2009 ACM Symposium on Applied Computing, Honolulu, Hawaii, March 2009

Abstract

In recent years, much work has gone into designing Information Extraction (IE) systems that extract genic interaction networks from text. Extraction is usually performed by so-called extraction patterns, which are often limited to mapping textual fragments to a single semantic relation. Such poor representations do not take into account the complexity of the data processed by biologists. IE systems need sophisticated representations, encoded with ontologies, that allow the definition of multiple relations and of the (possibly recursive) dependencies between them. Up to now, the Machine Learning techniques used to acquire extraction patterns, i.e. binary or multi-class learners, have reflected those representation restrictions: they assume independence between target predicates and do not handle recursion. In this paper, we use Inductive Logic Programming (ILP) in a multi-predicate setting to learn extraction patterns fitted to an ontological context. Multi-predicate ILP is an important paradigm that allows recursive theories to be learned. We evaluated our framework on a Bacillus subtilis text corpus, on which we reach a global recall of 67.7% and a precision of 75.5% in ten-fold cross-validation.
