Automatic construction of a Japanese onomatopoeic dictionary using text data on the WWW

Natural Language Processing and Information Systems, Proceedings (2006)

Abstract
Because new onomatopoeic words are often coined at short notice, existing dictionaries tend to lack entries for them. Furthermore, onomatopoeic words seldom appear in collections of newspaper articles, which have been used as corpora in natural language processing. In this work, we present a method for automatically acquiring lexical knowledge of Japanese onomatopoeic words from the WWW. Using this method, we automatically constructed an onomatopoeic dictionary containing 5,130 entries. By manually evaluating 487 newly acquired words that were not in the existing dictionary, we found that 266 of them were new onomatopoeic words; if words in the existing dictionary are regarded as correct, the precision of our automatic acquisition was 83.6%.
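The two precision figures in the abstract are easy to conflate, so a small sketch of the arithmetic may help. It is only an illustration under stated assumptions: the function names are hypothetical, and the abstract does not say how many of the 5,130 entries were new, so the second function is shown with toy numbers rather than the paper's data.

```python
def sample_precision(correct: int, evaluated: int) -> float:
    """Fraction of manually evaluated words judged to be onomatopoeic."""
    if evaluated <= 0:
        raise ValueError("evaluated must be positive")
    return correct / evaluated


def overall_precision(total_entries: int, new_entries: int,
                      new_precision: float) -> float:
    """Dictionary-wide precision when every entry already present in the
    existing dictionary is counted as correct, and new entries are assumed
    correct at the rate measured on the evaluated sample."""
    if not 0 <= new_entries <= total_entries:
        raise ValueError("new_entries must be between 0 and total_entries")
    known = total_entries - new_entries
    return (known + new_entries * new_precision) / total_entries


# Figures stated in the abstract: 266 correct out of 487 evaluated new words.
p_new = sample_precision(266, 487)
print(f"precision on evaluated new words: {p_new:.1%}")  # 54.6%

# Toy numbers only (NOT from the paper): 1,000 entries, 200 of them new.
print(f"toy overall precision: {overall_precision(1000, 200, 0.5):.1%}")
```

Read this way, the 83.6% figure is a dictionary-wide estimate that credits known entries as correct, whereas 266 of 487 measures only the newly acquired words that were checked by hand.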
Keywords
insufficient number, automatic acquisition, Japanese onomatopoeic dictionary, onomatopoeic dictionary, lexical knowledge, automatic construction, natural language processing, Japanese onomatopoeic word, newspaper article, existing dictionary, text data, onomatopoeic word, new onomatopoeic word