Beyond the Transformer: A Novel Polynomial Inherent Attention (PIA) Model and Its Great Impact on Neural Machine Translation

Computational Intelligence and Neuroscience (2022)

Abstract
This paper describes a novel polynomial inherent attention (PIA) model that outperforms state-of-the-art transformer models on neural machine translation (NMT) by a wide margin. PIA is based on the simple idea that natural language sentences can be transformed into a special type of binary attention context vector that accurately captures the semantic context and the relative dependencies between words in a sentence. The transformation is performed using a simple power-of-two polynomial transformation that maintains strict, consistent positioning of words in the resulting vectors. It is shown how this transformation reduces the neural machine translation process to a simple neural polynomial regression model that provides excellent solutions to the alignment and positioning problems haunting transformer models. The test BLEU scores obtained on the WMT-2014 data sets are 75.07 BLEU for EN-FR and 66.35 BLEU for EN-DE, well above the scores achieved by state-of-the-art transformer models on the same data sets. The improvements are, respectively, 65.7% and 87.42%.
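The abstract does not spell out how the power-of-two polynomial transformation is constructed, only that it preserves word positions strictly. As a minimal illustrative sketch under that assumption (the function name, the normalization by vocabulary size, and the per-position 2^-k weighting are all hypothetical choices, not the authors' definition), a position-preserving power-of-two encoding of a token sequence could look like this in Python:

```python
import numpy as np

def pia_context_vector(token_ids, vocab_size):
    """Hypothetical sketch of a position-preserving power-of-two encoding.

    Each token id, normalized to [0, 1), is weighted by 2^-(position+1),
    so word order is encoded strictly by the binary magnitude of its term.
    The actual PIA construction in the paper may differ.
    """
    ids = np.asarray(token_ids, dtype=np.float64) / vocab_size
    weights = 2.0 ** -np.arange(1, len(ids) + 1)
    return ids * weights  # one power-of-two-scaled component per position

# Example: the vector for [12, 7, 3] keeps each word separable by position.
vec = pia_context_vector([12, 7, 3], vocab_size=32000)
print(vec)
```

A fixed, monotonically decreasing weight per position is one simple way to make word order recoverable from the encoded vector, which is consistent with the abstract's claim that the transformation addresses the alignment and positioning problems of transformer models; the regression model that maps such source vectors to target vectors is not sketched here.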
Keywords
polynomial inherent attention, translation, PIA, transformer