Contextual modeling for meeting translation using unsupervised word sense disambiguation

COLING (2010)

Abstract
In this paper we investigate the challenges of applying statistical machine translation to meeting conversations, focusing on how modeling contextual factors, such as the larger discourse context and topic/domain information, affects translation performance. We describe the collection of a small corpus of parallel meeting data and the development of a statistical machine translation system in the absence of genre-matched training data, and we present a quantitative analysis of translation errors resulting from the lack of contextual modeling inherent in standard statistical machine translation systems. Finally, we demonstrate how the largest source of translation errors (lack of topic/domain knowledge) can be addressed by applying document-level, unsupervised word sense disambiguation, resulting in performance improvements over the baseline system.
Keywords
translation error, standard statistical machine translation, statistical machine translation, statistical machine translation system, translation performance, baseline system, contextual factor, domain information, domain knowledge, genre-matched training data, contextual modeling, meeting translation, unsupervised word sense disambiguation