Latent Attribute Based Hierarchical Decoder for Neural Machine Translation

IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP) (2019)

Abstract
Neural machine translation (NMT) has achieved state-of-the-art performance in many translation tasks. However, because the computational cost of predicting a target word grows with the size of the search space, the translation quality of NMT is constrained by its limited vocabulary. To alleviate this problem, we propose a novel dynamic hierarchical decoder for NMT that uses all of the target words in both training and decoding. In the proposed model, a target word is represented by two latent attribute vectors rather than a word vector. The model is trained to dynamically group words that share similar linguistic attributes. The prediction of a target word therefore becomes the prediction of attribute vectors, with the $\mathrm{softmax}$ functions computed at the attribute level. This greatly reduces the model size and the decoding time. Our experimental results demonstrate that the proposed model significantly outperforms the NMT baselines in both Chinese-English and English-German translation tasks.
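To make the attribute-level softmax idea concrete, here is a minimal sketch of a generic two-factor output layer. It is not the paper's model: the abstract does not give the architecture, so the fixed word-to-attribute mapping (`divmod`), the independence of the two attribute distributions, the sizes `N1` and `N2`, and all function names are assumptions made for illustration only. In particular, the paper learns the grouping dynamically, whereas this sketch uses a static assignment.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attribute_distributions(hidden, attr1_weights, attr2_weights):
    """Two small softmaxes (one per attribute) instead of one over all words."""
    p1 = softmax([dot(w, hidden) for w in attr1_weights])
    p2 = softmax([dot(w, hidden) for w in attr2_weights])
    return p1, p2


def word_probability(p1, p2, word_id):
    """Assumption: word_id maps to a fixed attribute pair (a1, a2)."""
    a1, a2 = divmod(word_id, len(p2))
    return p1[a1] * p2[a2]


random.seed(0)
DIM, N1, N2 = 8, 100, 100  # hypothetical sizes: 100 * 100 = 10,000 words
hidden = [random.gauss(0, 1) for _ in range(DIM)]
attr1_weights = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N1)]
attr2_weights = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N2)]

p1, p2 = attribute_distributions(hidden, attr1_weights, attr2_weights)
num_logits = len(p1) + len(p2)  # logits computed per decoding step
total = sum(word_probability(p1, p2, w) for w in range(N1 * N2))
```

Under these assumptions, each decoding step scores `N1 + N2` attribute logits rather than one logit per word, while the implied word probabilities still form a valid distribution over the full vocabulary.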
Keywords
Vocabulary, Decoding, Computational modeling, Training, Speech processing, Linguistics, Semantics