Examination of Effective Features for CRF-Based Bibliography Extraction from Reference Strings

2016 Eleventh International Conference on Digital Information Management (ICDIM 2016), 2016

Abstract
Metadata such as bibliographic information about documents is indispensable for the effective use of digital libraries. In particular, the reference fields of academic papers contain much bibliographic information, such as authors' names and document titles. We are therefore developing a method for automatically extracting bibliographic information from reference strings using a conditional random field (CRF). The features used by the CRF determine the accuracy of this method. We examine which features are effective for accurate extraction by experimentally varying the feature set. The experiments showed that lexical features were quite effective for accurate extraction and that properly augmenting the lexicons could lead to further improvements in accuracy.
Keywords
CRF-based bibliography extraction, reference strings, metadata, bibliographic information, digital libraries, document titles, author names, automatic bibliographic information extraction, conditional random field, lexical features
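
To make the idea of lexical features concrete, the following is a minimal sketch of how a reference string might be tokenized and turned into per-token feature dictionaries of the kind a CRF sequence labeler typically consumes. It is not the authors' feature set, which the abstract does not list. The lexicons `SURNAME_LEXICON` and `VENUE_LEXICON`, the function names, and the choice of orthographic and context features are all assumptions made for illustration.

```python
import re

# Hypothetical toy lexicons (assumption): the abstract does not describe
# the lexicons actually used, only that lexical features were effective.
SURNAME_LEXICON = {"smith", "tanaka", "lafferty"}
VENUE_LEXICON = {"proceedings", "journal", "conference", "transactions"}


def tokenize(reference):
    """Split a reference string into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", reference)


def token_features(tokens, i):
    """Build a feature dictionary for the token at position i."""
    tok = tokens[i]
    lower = tok.lower()
    return {
        # Orthographic features
        "lower": lower,
        "is_capitalized": tok[:1].isupper(),
        "is_digit": tok.isdigit(),
        "is_year": re.fullmatch(r"(19|20)\d{2}", tok) is not None,
        "is_punct": re.fullmatch(r"[^\w\s]", tok) is not None,
        # Lexical features: membership in a lexicon
        "in_surname_lexicon": lower in SURNAME_LEXICON,
        "in_venue_lexicon": lower in VENUE_LEXICON,
        # Context features: neighbouring tokens, with boundary markers
        "prev_lower": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_lower": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def reference_features(reference):
    """Return the token list and one feature dictionary per token."""
    tokens = tokenize(reference)
    return tokens, [token_features(tokens, i) for i in range(len(tokens))]


if __name__ == "__main__":
    toks, feats = reference_features("Smith, J. 2016. Journal of Examples.")
    for tok, f in zip(toks, feats):
        print(tok, f["in_surname_lexicon"], f["in_venue_lexicon"], f["is_year"])
```

In a setup like this, augmenting a lexicon amounts to adding entries to the relevant set, which changes the lexicon-membership features without touching the rest of the pipeline. Removing or adding keys in `token_features` is one simple way to vary the feature set between experiments.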