Named-Entity-Recognition (NER) for Tamil Language Using Margin-Infused Relaxed Algorithm (MIRA)

Computational Linguistics and Intelligent Text Processing (CICLing 2016), Part I (2018)

Abstract
Named-Entity-Recognition (NER) is widely used as a foundation for Natural Language Processing (NLP) applications. There have been few previous attempts to build generic NER systems for the Tamil language. These attempts were based on machine-learning approaches such as Hidden Markov Models (HMM), Maximum Entropy Markov Models (MEMM), Support Vector Machines (SVM) and Conditional Random Fields (CRF). Among them, CRF has been shown to give the best NER accuracy for Tamil. This paper presents a novel approach to building a Tamil NER system using the Margin-Infused Relaxed Algorithm (MIRA). We also compare the performance of the MIRA and CRF algorithms for Tamil NER. When gazetteer, POS-tag and orthographic features are used, MIRA attains an F1-measure of 81.38% on the Tamil BBC news data, whereas CRF attains an F1-measure of only 79.13% with the same set of features. Our NER system outperforms all previous NER systems for the Tamil language.
Keywords
NER, NLP, Tamil, NE, CRF, Margin-Infused Relaxed Algorithm (MIRA)