Handling Unknown Words in Statistical Latent-Variable Parsing Models for Arabic, English and French.

SPMRL '10: Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages (2010)

Abstract
This paper presents a study of the impact of using simple and complex morphological clues to improve the classification of rare and unknown words for parsing. We compare this approach to a language-independent technique often used in parsers which is based solely on word frequencies. This study is applied to three languages that exhibit different levels of morphological expressiveness: Arabic, French and English. We integrate information about Arabic affixes and morphotactics into a PCFG-LA parser and obtain state-of-the-art accuracy. We also show that these morphological clues can be learnt automatically from an annotated corpus.
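To make the contrast in the abstract concrete, the following is a minimal sketch of the two general ideas: a frequency-only scheme that collapses every rare word into a single unknown token, and a scheme that assigns rare words to classes based on surface morphological clues. It is an illustration under stated assumptions, not the paper's implementation: the threshold, the English suffix list, and the function names `frequency_signature` and `morphological_signature` are all hypothetical.

```python
from collections import Counter

# Hypothetical cutoff: words seen fewer times than this count as rare.
RARE_THRESHOLD = 2

# Hypothetical English suffix clues, checked in order; a real system would
# use language-specific affix lists (or learn them from a treebank).
SUFFIXES = ("ing", "ed", "ly", "s")


def frequency_signature(word, counts, threshold=RARE_THRESHOLD):
    """Language-independent baseline: all rare words share one token."""
    return word if counts[word] >= threshold else "UNK"


def morphological_signature(word, counts, threshold=RARE_THRESHOLD):
    """Map a rare word to a class built from simple surface clues."""
    if counts[word] >= threshold:
        return word
    parts = ["UNK"]
    if word[:1].isupper():
        parts.append("CAP")  # capitalised first letter
    if any(ch.isdigit() for ch in word):
        parts.append("NUM")  # contains a digit
    lowered = word.lower()
    for suffix in SUFFIXES:
        # Require some stem before the suffix to avoid trivial matches.
        if lowered.endswith(suffix) and len(word) > len(suffix) + 1:
            parts.append(suffix)
            break
    return "-".join(parts)
```

With counts taken from a training corpus (for example `Counter(tokens)`), the first function gives every rare or unseen word the same label, while the second keeps distinctions such as a capitalised word or an "-ing" ending, which is the kind of information a parser can use when guessing a part-of-speech tag.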
Keywords
complex morphological clue, morphological clue, morphological expressiveness, Arabic affix, PCFG-LA parser, annotated corpus, exhibit different level, language-independent technique, state-of-the-art accuracy, unknown word, statistical latent-variable