A Multi-channel Chinese Text Correction Method Based on Grammatical Error Diagnosis

2022 8th International Conference on Big Data and Information Analytics (BigDIA), 2022

Abstract
Text error correction is a key research direction in natural language processing (NLP). The current mainstream approach uses a single model to both diagnose and correct grammatical errors. Although these methods diagnose grammatical errors well, their correction performance is low. To improve correction performance, this paper proposes a multi-channel Chinese text correction method based on grammatical error diagnosis (GED-MCCTC). First, to accurately identify the types and positions of grammatical errors, a Bi-LSTM-CRF model labels each error with its type and position. Then, four correction channels, designed according to the five defined types of grammatical errors, complete the correction task. Experimental results on shared datasets show that the proposed method improves the average F1 score by 4.83% compared with the best baseline model. The multi-channel correction model greatly improves text correction performance: on the SIGHAN15 and Corpus500 test datasets, the average F1 score reaches 76.07% and 72.62%, respectively.
Keywords
grammatical error diagnosis,Bi-LSTM-CRF,Seq2Edit,MacBERT4CSC,GED-MCCTC
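
The diagnose-then-route pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the diagnosis model emits BIO-style tags per character (an assumption; the abstract does not state the tagging scheme), and the error labels `E1` to `E5` and channel names `channel_a` to `channel_d` are hypothetical placeholders. The abstract mentions four channels and five error types but does not say which types share a channel, so the mapping below is arbitrary.

```python
from typing import Dict, List, Tuple

# (start index, exclusive end index, error type)
Span = Tuple[int, int, str]

# Hypothetical mapping of five error labels onto four channels.
# Which types share a channel is NOT given in the abstract.
CHANNEL_OF: Dict[str, str] = {
    "E1": "channel_a",
    "E2": "channel_b",
    "E3": "channel_c",
    "E4": "channel_d",
    "E5": "channel_d",
}


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Convert per-character BIO tags (assumed format) into error spans."""
    spans: List[Span] = []
    start, current = None, None
    # A trailing "O" sentinel flushes a span that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        if tag.startswith("I-") and current == tag[2:]:
            continue  # still inside the current span
        if current is not None:
            spans.append((start, i, current))
            start, current = None, None
        if tag.startswith("B-") or tag.startswith("I-"):
            start, current = i, tag[2:]
    return spans


def route(spans: List[Span], channel_of: Dict[str, str]) -> Dict[str, List[Span]]:
    """Group diagnosed spans by the correction channel that would handle them."""
    routed: Dict[str, List[Span]] = {}
    for span in spans:
        routed.setdefault(channel_of[span[2]], []).append(span)
    return routed


if __name__ == "__main__":
    tags = ["O", "B-E1", "I-E1", "O", "B-E5", "O"]
    print(route(bio_to_spans(tags), CHANNEL_OF))
```

In this sketch each channel would then apply its own correction model to the spans it receives; those models are outside the scope of the example.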