Automatic cognate identification based on a fuzzy combination of string similarity measures

Fuzzy Systems (2012)

Abstract
Cognates are words in different languages that have similar spelling and meaning. Identifying cognates is useful for many Natural Language Processing tasks and for second-language learning. This paper presents a new approach that classifies pairs of words as either cognates/false friends or unrelated. The proposed approach uses a fuzzy system to combine complementary string similarity measures in order to improve cognate identification. The underlying hypothesis is that combining different string measures through heuristic knowledge can outperform the same measures working separately. The results obtained by the proposed system confirm this hypothesis, and the system also outperforms other systems that combine string measures using a supervised approach. As an additional contribution, we have created a bilingual test data set, which includes pairs of cognates, false friends and unrelated words in Spanish and English and is freely available for research purposes.
Keywords
fuzzy systems, natural language processing, pattern classification, string matching, word processing, English words, Spanish words, automatic cognate identification, bilingual test data set, false friends, fuzzy combination, fuzzy system, heuristic knowledge, second language learning, string similarity measures, supervised approach