Data and text mining
Kernel approaches for genic interaction extraction

msra(2008)

Abstract
Motivation: Automatic knowledge discovery and efficient information access such as named entity recognition and relation extraction between entities have recently become critical issues in the biomedical literature. However, the inherent difficulty of the relation extraction task, mainly caused by the diversity of natural language, is further compounded in the biomedical domain because biomedical sentences are commonly long and complex. In addition, relation extraction often involves modeling long range dependencies, discontiguous word patterns and semantic relations for which the pattern-based methodology is not directly applicable.

Results: In this article, we shift the focus of biomedical relation extraction from the problem of pattern extraction to the problem of kernel construction. We suggest four kernels: predicate, walk, dependency and hybrid kernels to adequately encapsulate information required for a relation prediction based on the sentential structures involved in two entities. For this purpose, we view the dependency structure of a sentence as a graph, which allows the system to deal with an essential one from the complex syntactic structure by finding the shortest path between entities. The kernels we suggest are augmented gradually from the flat features descriptions to the structural descriptions of the shortest paths. As a result, we obtain a very promising result, a 77.5 F-score with the walk kernel on the Language Learning in Logic (LLL) 05 genic interaction shared task.

Availability: The used algorithms are free for use for academic research and are available from our Web site http://mllab.sogang.ac.kr/~shkim/LLL05.tar.gz.

Contact: shkim@lex.yonsei.ac.kr
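The idea of treating a dependency parse as a graph and keeping only the shortest path between two entities can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `shortest_dependency_path`, the edge list and the example sentence are all hypothetical, and the sketch assumes the dependency edges are treated as undirected and unlabeled.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return the shortest path between two tokens, or None if unreachable.

    `edges` is a list of (head, dependent) pairs. The graph is treated as
    undirected, which is an assumption of this sketch.
    """
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, set()).add(dependent)
        graph.setdefault(dependent, set()).add(head)
    if source not in graph or target not in graph:
        return None

    # Breadth-first search, remembering each node's predecessor.
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the predecessor links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical parse of "GerE stimulates cotD transcription in vitro".
edges = [
    ("stimulates", "GerE"),
    ("stimulates", "transcription"),
    ("transcription", "cotD"),
    ("stimulates", "in"),
    ("in", "vitro"),
]
path = shortest_dependency_path(edges, "GerE", "cotD")
print(path)
```

In this toy parse the tokens "in" and "vitro" fall off the path, which illustrates how the shortest path discards material that is not needed to relate the two entities. A kernel would then be computed over features of such paths; that step is not shown here.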