Latent Attribute Based Hierarchical Decoder for Neural Machine Translation
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)(2019)
Abstract
Neural machine translation (NMT) has achieved state-of-the-art performance in many translation tasks. However, because the computational cost increases with the size of the search space for predicting the target words, the translation quality of NMT is constrained by its limited vocabulary. To alleviate this problem, we propose a novel dynamic hierarchical decoder for NMT that uses all of the target words in both training and decoding. In the proposed model, a target word is represented by two latent attribute vectors rather than a single word vector. The model is trained to dynamically group together words that share similar linguistic attributes. The prediction of a target word is therefore turned into the prediction of attribute vectors, with the $\mathrm{softmax}$ functions computed at the attribute level. This greatly reduces the model size and the decoding time. Our experimental results demonstrate that the proposed model significantly outperforms the NMT baselines on both Chinese-English and English-German translation tasks.
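To make the attribute-level prediction concrete, the following is a minimal sketch of a generic two-level factored softmax, not the model described in the paper. It assumes a fixed mapping from each word id to a pair of attribute ids (the paper instead learns this grouping dynamically), and the names `word_to_attributes`, `word_probability`, and `output_units` are hypothetical helpers introduced only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def word_to_attributes(word_id, num_second):
    """ASSUMPTION: a fixed grid assignment of words to attribute pairs.

    The paper learns the grouping during training; this static mapping
    is only a stand-in so the example can run.
    """
    return divmod(word_id, num_second)


def word_probability(word_id, first_logits, second_logits_by_first, num_second):
    """p(word) = p(first attribute) * p(second attribute | first attribute)."""
    a, b = word_to_attributes(word_id, num_second)
    p_first = softmax(first_logits)[a]
    p_second = softmax(second_logits_by_first[a])[b]
    return p_first * p_second


def output_units(vocab_size):
    """Compare softmax widths: one flat softmax vs. two attribute softmaxes.

    ASSUMPTION: both attribute inventories have about sqrt(vocab_size)
    entries; the paper's actual attribute sizes are not given here.
    """
    side = math.isqrt(vocab_size - 1) + 1  # ceiling of the square root
    return vocab_size, 2 * side


# Toy example: 6 words arranged as 2 first attributes x 3 second attributes.
first_logits = [0.1, 0.5]
second_logits_by_first = [[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]]
probs = [
    word_probability(w, first_logits, second_logits_by_first, num_second=3)
    for w in range(6)
]
flat_units, factored_units = output_units(1_000_000)
```

Under these assumptions the word probabilities still sum to one, while each decoding step normalizes over two short attribute lists instead of the full vocabulary. For an illustrative vocabulary of one million words, that is two softmaxes of 1,000 entries each rather than one softmax of 1,000,000 entries.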
Keywords
Vocabulary, Decoding, Computational modeling, Training, Speech processing, Linguistics, Semantics