Improving Phrase-Based Machine Translation

msra(2005)

Abstract
1 Overview

Current state-of-the-art machine translation systems use a phrase-based scoring model to choose among candidate translations in a target language, typically English. These models are called phrase-based because a candidate sentence's score is in large part a product of phrase translation probabilities. These translation probabilities must be learned in an unsupervised manner from a pair of sentence-aligned corpora.

With the end goal of improving upon the published results of such systems, our project proceeded in two stages. First, we attempted to duplicate the performance of existing end-to-end translation systems by piecing together available components and engineering the remainder, guided by published techniques. Second, we identified two significant shortcomings of published systems and attempted to remedy them with machine learning techniques. In particular, we chose to learn phrase translation probabilities directly rather than deriving them heuristically. We also augmented the scoring model to relax a troublesome independence assumption across phrases.
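To make the phrase-based scoring idea concrete, here is a minimal sketch of scoring a candidate translation as a product of phrase translation probabilities, computed in log space. The phrase table, its probabilities, and the name `phrase_score` are hypothetical illustrations, not taken from the paper, and a real system would combine this term with other model components.

```python
import math

# Hypothetical phrase table: (source phrase, target phrase) -> probability.
# The entries and values are invented for illustration only.
PHRASE_TABLE = {
    ("la maison", "the house"): 0.8,
    ("bleue", "blue"): 0.5,
}


def phrase_score(phrase_pairs, table=PHRASE_TABLE):
    """Return the log of the product of phrase translation probabilities.

    Each phrase pair contributes independently to the total, which is the
    independence assumption across phrases that the abstract mentions.
    """
    log_score = 0.0
    for pair in phrase_pairs:
        prob = table.get(pair, 0.0)
        if prob <= 0.0:
            # A pair missing from the table makes the whole candidate impossible.
            return float("-inf")
        log_score += math.log(prob)
    return log_score


# A candidate translation segmented into two phrase pairs.
candidate = [("la maison", "the house"), ("bleue", "blue")]
score = phrase_score(candidate)
```

Summing log probabilities instead of multiplying raw probabilities avoids numerical underflow on long sentences. The sketch assumes the segmentation into phrase pairs is already given.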
Keywords
machine translation, machine learning