Automatic Arabic Grammatical Error Correction Based on Expectation-Maximization Routing and Target-Bidirectional Agreement.

Knowledge-Based Systems (2022)

Citations: 8 | Views: 22
Abstract
Automatic Grammar Error Correction (GEC) detects and corrects various types of syntax, spelling, and grammatical errors. Rule-based, Statistical Machine Translation (SMT), and Neural Machine Translation (NMT) approaches have been proposed. Among these, NMT based on the seq2seq multi-head attention architecture (Transformer) performs best. The key shortcoming of seq2seq GEC models with multiple encoder-decoder layers is that only the top layer is exploited in subsequent processing. In addition, because of the exposure bias problem, during inference some of the previous target words are replaced by words generated by the model itself, which leads to unsatisfactory output. To address these issues, this paper proposes a GEC model based on the seq2seq Transformer for low-resource languages such as Arabic. First, we propose a noising method for constructing synthetic parallel data to overcome the bottleneck caused by the lack of corpora. Furthermore, motivated by the success of capsule networks in computer vision, we use the Expectation-Maximization routing algorithm to dynamically aggregate information across layers in Arabic GEC. Moreover, to overcome the exposure bias problem, we introduce a bidirectional regularization term based on Kullback-Leibler divergence into the training objective to improve the agreement between right-to-left and left-to-right models. Experiments on two benchmarks, QALB-2014 and QALB-2015, show that the proposed model achieves the best F1 score among existing Arabic GEC systems.
Keywords
Automatic grammar error correction, Capsule network, Expectation-maximization routing