Beyond the Transformer: A Novel Polynomial Inherent Attention (PIA) Model and Its Great Impact on Neural Machine Translation

Computational Intelligence and Neuroscience (2022)

Abstract
This paper describes a novel polynomial inherent attention (PIA) model that outperforms state-of-the-art transformer models on neural machine translation (NMT) by a wide margin. PIA is based on the simple idea that natural language sentences can be transformed into a special type of binary attention context vector that accurately captures the semantic context and the relative dependencies between words in a sentence. The transformation is performed using a simple power-of-two polynomial transformation that maintains strict, consistent positioning of words in the resulting vectors. It is shown how this transformation reduces the neural machine translation process to a simple neural polynomial regression model that provides excellent solutions to the alignment and positioning problems haunting transformer models. The test BLEU scores obtained on the WMT-2014 data sets are 75.07 BLEU for EN-FR and 66.35 BLEU for EN-DE, well above the scores achieved by state-of-the-art transformer models on the same data sets. The improvements are, respectively, 65.7% and 87.42%.
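The abstract does not spell out how the power-of-two polynomial transformation is constructed, only that it preserves word positions strictly. As a minimal illustrative sketch under that assumption (the function name, the normalization by vocabulary size, and the per-position 2^-k weighting are all hypothetical choices, not the authors' definition), a position-preserving power-of-two encoding of a token sequence could look like this in Python:

```python
import numpy as np

def pia_context_vector(token_ids, vocab_size):
    """Hypothetical sketch of a position-preserving power-of-two encoding.

    Each token id, normalized to [0, 1), is weighted by 2^-(position+1),
    so word order is encoded strictly by the binary magnitude of its term.
    The actual PIA construction in the paper may differ.
    """
    ids = np.asarray(token_ids, dtype=np.float64) / vocab_size
    weights = 2.0 ** -np.arange(1, len(ids) + 1)
    return ids * weights  # one power-of-two-scaled component per position

# Example: the vector for [12, 7, 3] keeps each word separable by position.
vec = pia_context_vector([12, 7, 3], vocab_size=32000)
print(vec)
```

A fixed, monotonically decreasing weight per position is one simple way to make word order recoverable from the encoded vector, which is consistent with the abstract's claim that the transformation addresses the alignment and positioning problems of transformer models; the regression model that maps such source vectors to target vectors is not sketched here.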
Keywords
polynomial inherent attention, translation, PIA, transformer