A Conditional Random Field for Discriminatively-trained Finite-state String Edit Distance

Uncertainty in Artificial Intelligence (2012)

Abstract
The need to measure sequence similarity arises in information extraction, object identity, data mining, biological sequence analysis, and other domains. This paper presents discriminative string-edit CRFs, a finite-state conditional random field model for edit sequences between strings. Conditional random fields have advantages over generative approaches to this problem, such as pair HMMs or the work of Ristad and Yianilos, because as conditionally-trained methods, they enable the use of complex, arbitrary actions and features of the input strings. As in generative models, the training data does not have to specify the edit sequences between the given string pairs. Unlike generative models, however, our model is trained on both positive and negative instances of string pairs. We present positive experimental results on several data sets.
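The idea of scoring a string pair without observing its edit sequence can be illustrated with a minimal sketch. The abstract does not give the paper's state topology or feature set, so everything below is an assumption made for illustration: a single-state edit model per class, four hypothetical edit-operation weights (`match`, `substitute`, `insert`, `delete`), and the function names `log_partition` and `match_probability`. The sketch sums over all latent alignments with a forward dynamic program, once under a "match" weight set and once under a "non-match" weight set, then normalizes the two totals into a conditional probability.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(x, y, w):
    """Log of the summed exponentiated scores of all edit sequences
    turning x into y. `w` maps each (hypothetical) operation name to a weight."""
    n, m = len(x), len(y)
    # alpha[i][j]: log-sum over all edit sequences aligning x[:i] with y[:j]
    alpha = [[-math.inf] * (m + 1) for _ in range(n + 1)]
    alpha[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            terms = []
            if i > 0:  # delete x[i-1]
                terms.append(alpha[i - 1][j] + w["delete"])
            if j > 0:  # insert y[j-1]
                terms.append(alpha[i][j - 1] + w["insert"])
            if i > 0 and j > 0:  # copy or substitute
                op = "match" if x[i - 1] == y[j - 1] else "substitute"
                terms.append(alpha[i - 1][j - 1] + w[op])
            alpha[i][j] = logsumexp(terms)
    return alpha[n][m]


def match_probability(x, y, w_match_model, w_nonmatch_model):
    """p(match | x, y) = Z_match / (Z_match + Z_nonmatch), edit sequences summed out."""
    z1 = log_partition(x, y, w_match_model)
    z0 = log_partition(x, y, w_nonmatch_model)
    return 1.0 / (1.0 + math.exp(z0 - z1))


# Illustrative weights only; in a trained model these would be learned
# from positive and negative string pairs.
ZERO = {"match": 0.0, "substitute": 0.0, "insert": 0.0, "delete": 0.0}
FAVOR_COPY = {"match": 2.0, "substitute": 0.0, "insert": 0.0, "delete": 0.0}

p_same = match_probability("ab", "ab", FAVOR_COPY, ZERO)
p_diff = match_probability("ab", "cd", FAVOR_COPY, ZERO)
```

With these toy weights, a pair that admits copy operations receives a higher match probability than a pair that shares no characters. Training such a model would mean adjusting both weight sets to maximize the conditional likelihood of the match and non-match labels, which requires only labeled pairs and no annotated edit sequences.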
Keywords
data mining, extraction, information extraction, data bases, mathematical models, finite element analysis, conditional random field, information retrieval, sequences, random variables