Regularized Training Objective for Continued Training for Domain Adaptation in Neural Machine Translation.

Huda Khayrallah, Brian Thompson, Kevin Duh, Philipp Koehn

NEURAL MACHINE TRANSLATION AND GENERATION (2018)

Abstract

Supervised domain adaptation, where a large generic corpus and a smaller in-domain corpus are both available for training, is a challenge for neural machine translation (NMT). Standard practice is to train a generic model, use it to initialize a second model, and then continue training that second model on in-domain data to produce an in-domain model. We add an auxiliary term to the training objective during continued training that minimizes the cross entropy between the in-domain model's output word distribution and that of the out-of-domain model, which prevents the in-domain model's output from differing too much from the original out-of-domain model. We perform experiments on EMEA (descriptions of medicines) and TED (rehearsed presentations), initializing from a general-domain (WMT) model. Our method improves over standard continued training by up to 1.5 BLEU.
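
As a rough illustration of the objective described in the abstract, the sketch below computes a per-token loss that combines the usual negative log-likelihood of the reference word with a cross-entropy term between the out-of-domain and in-domain output distributions. The abstract does not say how the two terms are weighted, so the interpolation weight `alpha`, the function names, and the exact combination `(1 - alpha) * nll + alpha * cross_entropy` are assumptions made for illustration only.

```python
import math


def log_softmax(logits):
    """Return log-probabilities for a list of raw scores (numerically stable)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def regularized_loss(in_domain_logits, out_of_domain_logits, target_index, alpha=0.1):
    """Per-token loss for continued training with a regularization term.

    alpha is a hypothetical interpolation weight (not given in the abstract):
    alpha = 0 recovers standard continued training, larger values pull the
    in-domain model's distribution toward the out-of-domain model's.
    """
    log_p_in = log_softmax(in_domain_logits)
    # The out-of-domain model is treated as fixed; only its probabilities are used.
    p_out = [math.exp(lp) for lp in log_softmax(out_of_domain_logits)]

    # Standard term: negative log-likelihood of the reference word.
    nll = -log_p_in[target_index]
    # Auxiliary term: cross entropy between the out-of-domain distribution
    # and the in-domain distribution over the vocabulary.
    cross_entropy = -sum(q * lp for q, lp in zip(p_out, log_p_in))

    return (1.0 - alpha) * nll + alpha * cross_entropy
```

In a real NMT system this quantity would be summed over target positions and minimized with respect to the in-domain model's parameters only, with the out-of-domain model's distribution held fixed.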
