What's in a Name?: Proper Names in Arabic Cross Language Information Retrieval

USENIX Technical Conference (2003)

Abstract
Proper names are problematic for cross language information retrieval. Standard bilingual dictionaries typically have poor coverage of proper names, yet IR tasks involving news corpora, such as TDT and TREC cross language IR, have proper names at their core. In this study, we demonstrate the importance of proper names in one such task, the TREC 2002 (Arabic-English) cross language track, by showing that performance degrades substantially when the bilingual lexicons do not contain proper names. We then examine several sources of proper name translations from English to Arabic, both static and generative (transliteration), and explore their effectiveness in the context of the TREC 2002 cross language IR task. Our results support the conclusion that combining static translation resources with transliteration provides a successful solution.
Keywords
General terms: algorithms, experimentation, performance. Keywords: cross language information retrieval, CLIR, Arabic, crosslingual, proper names, transliteration.