The MSR-MSRA MT System for NIST Open Machine Translation 2008 Evaluation


1. Mei Yang was an intern with MSR in the summer of 2007.

The system combination approach, which combines system outputs at the word level, is similar to the one described in (Rosti et al., 2007). Compared to the previous work, we developed a new method to generate a better alignment between multiple MT hypotheses from different individual systems, which is used to construct a high-quality confusion network. The details of our method will be elaborated in a future paper (He et al., 2008). First, a minimum Bayes risk (MBR) based method is used to select a backbone from the multiple hypotheses. All the hypotheses are then aligned to that backbone to form a confusion network, i.e., a word lattice in which each word is aligned to a list of alternative words (including null). Finally, a set of features, including language model scores, word count, and a normalized system voting score, is used to decode the confusion network.

In training, a confusion network is constructed from the multiple hypotheses of each sentence in a dev set, and the corresponding feature weights are trained using Powell's search to maximize the BLEU score on that dev set. In testing, a confusion network is constructed for each sentence in the test set, and these feature weights are applied to decode the final MT output from the confusion network. In this entry, two language models are used: a 3-gram LM trained on the English part of the parallel training data, and a 5-gram LM trained on the whole English Gigaword corpus using a scalable LM toolkit (Nguyen et al., 2007).
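The backbone selection and decoding steps can be illustrated with a minimal Python sketch. It is not the system's implementation, and it rests on several assumptions: the MBR loss is taken to be word-level edit distance (the text does not name the loss), the system priors are uniform, the confusion network is supplied by hand rather than produced by the alignment method of (He et al., 2008), and the language model features are omitted, so decoding reduces to an independent choice in each slot. The names `edit_distance`, `select_backbone`, `decode`, `w_vote`, and `w_count` are hypothetical.

```python
import math

NULL = ""  # the empty (null) arc in a confusion network slot


def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        cur = [i]
        for j, wb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete wa
                           cur[j - 1] + 1,             # insert wb
                           prev[j - 1] + (wa != wb)))  # substitute or match
        prev = cur
    return prev[-1]


def select_backbone(hyps, priors=None):
    """Return the index of the hypothesis with minimum expected loss.

    Assumption: the loss is edit distance and priors default to uniform.
    """
    n = len(hyps)
    priors = priors or [1.0 / n] * n
    toks = [h.split() for h in hyps]
    risks = [sum(priors[j] * edit_distance(toks[i], toks[j]) for j in range(n))
             for i in range(n)]
    return min(range(n), key=risks.__getitem__)


def decode(network, w_vote=1.0, w_count=0.0):
    """Decode a confusion network given as a list of slots.

    Each slot maps a word (or NULL) to its normalized voting score (> 0).
    The slot score is w_vote * log(vote) + w_count * [word is not NULL].
    Language model features are left out of this sketch.
    """
    output = []
    for slot in network:
        best = max(slot, key=lambda w: w_vote * math.log(slot[w])
                   + w_count * (w != NULL))
        if best != NULL:
            output.append(best)
    return " ".join(output)


hyps = ["the cat sat on the mat",
        "the cat sat on mat",
        "a cat sat on the mat"]
backbone = hyps[select_backbone(hyps)]

# Hand-built network for the three hypotheses aligned to the backbone.
network = [{"the": 2 / 3, "a": 1 / 3}, {"cat": 1.0}, {"sat": 1.0},
           {"on": 1.0}, {"the": 2 / 3, NULL: 1 / 3}, {"mat": 1.0}]
print(backbone)
print(decode(network))
print(decode(network, w_count=-2.0))  # a negative word-count weight favors null arcs
```

In the described system the weights would not be set by hand. They are tuned with Powell's search to maximize BLEU on the dev set, and the language model scores make the slot choices depend on each other, which calls for a search over paths through the network rather than a per-slot maximum.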