Improving Non-autoregressive Machine Translation with Error Exposure and Consistency Regularization
CoRR (2024)
Abstract
As one of the IR-NAT (Iterative-refinement-based NAT) frameworks, the
Conditional Masked Language Model (CMLM) adopts the mask-predict paradigm to
re-predict the masked low-confidence tokens. However, CMLM suffers from the
data distribution discrepancy between training and inference, where the
observed tokens are generated differently in the two cases. In this paper, we
address this problem with the training approaches of error exposure and
consistency regularization (EECR). We construct the mixed sequences based on
model prediction during training, and propose to optimize over the masked
tokens under imperfect observation conditions. We also design a consistency
learning method that constrains the distributions predicted for the masked
tokens under different observation conditions, narrowing the gap between
training and inference. Experiments on five translation benchmarks yield
average improvements of 0.68 and 0.40 BLEU over the respective base models,
and our CMLMC-EECR achieves the best performance, with translation quality
comparable to the Transformer. The experimental results demonstrate the
effectiveness of our method.
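The abstract describes EECR only at a high level. The sketch below illustrates, under stated assumptions, how error exposure (building the observed context from the model's own predictions) and consistency regularization (constraining the masked-token distributions under gold vs. predicted observations) could be combined in a CMLM-style training step. The `model` interface, masking scheme, choice of KL divergence, and the weight `alpha` are assumptions for illustration, not details taken from the paper.

```python
# A minimal, self-contained sketch of the two training ideas in the abstract:
# (1) error exposure: build a mixed observed sequence from the model's own
#     predictions rather than only gold tokens, and
# (2) consistency regularization: constrain the masked-token distributions
#     predicted under the gold-observed and mixed-observed contexts to agree.
# All shapes, the `model` interface, and hyperparameters are hypothetical.

import torch
import torch.nn.functional as F


def eecr_training_step(model, src, tgt, mask_token_id, mask_ratio=0.5, alpha=1.0):
    """One hypothetical CMLM training step with error exposure and
    consistency regularization.

    model(src, observed) is assumed to return logits of shape
    (batch, tgt_len, vocab) for every target position.
    """
    batch, tgt_len = tgt.shape

    # Randomly choose which target positions are masked (to be re-predicted),
    # as in standard mask-predict training.
    masked = torch.rand(batch, tgt_len, device=tgt.device) < mask_ratio

    # --- Pass 1: gold observation (standard CMLM input). ---------------------
    gold_input = tgt.masked_fill(masked, mask_token_id)
    gold_logits = model(src, gold_input)

    # --- Error exposure: replace the observed positions with model predictions
    # so training sees the imperfect contexts it will face at inference. -------
    with torch.no_grad():
        predictions = gold_logits.argmax(dim=-1)
    mixed_input = torch.where(
        masked, torch.full_like(tgt, mask_token_id), predictions
    )
    mixed_logits = model(src, mixed_input)

    # Cross-entropy on the masked positions under the imperfect (mixed) context.
    ce = F.cross_entropy(mixed_logits[masked], tgt[masked])

    # --- Consistency regularization: the masked-token distributions under the
    # two observation conditions should agree (KL is one plausible divergence).
    consistency = F.kl_div(
        F.log_softmax(mixed_logits[masked], dim=-1),
        F.softmax(gold_logits[masked].detach(), dim=-1),
        reduction="batchmean",
    )

    return ce + alpha * consistency
```

This is only meant to make the two ideas concrete; the paper's actual sequence-mixing strategy and consistency objective may differ.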