DUKE: Distance Fusion and Knowledge Enhanced Framework for Chinese Spelling Check

2022 Euro-Asia Conference on Frontiers of Computer Science and Information Technology (FCSIT) (2022)

Abstract
Chinese Spelling Check (CSC) is the task of detecting and correcting spelling errors in Chinese text, which is important for downstream NLP tasks. Typical causes of Chinese spelling errors are the misuse of semantically, glyphically, or phonologically similar characters and unseen fixed expressions. Previous methods neglect the fact that characters' relative positions in multimodal spaces are not well aligned. To address these issues, we propose the DUKE CSC framework, which fuses multimodal features while retaining the relative distances between characters, and reduces errors from unseen fixed expressions with a knowledge enhancement mechanism. Specifically, to transfer multimodal candidate information into semantic space, we capture the relative distances between a character and the vocabulary in multimodal space and then construct a semantic feature with a similar distance relationship. By reconstructing input sentences, our method captures extra knowledge and offers more flexibility and controllability during prediction. Experiments on SIGHAN benchmarks demonstrate that our method improves detection F1 by 1.0%, 0.7%, and 5.0%, and correction F1 by 1.4%, 2.6%, and 4.9%, over previous models.
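The distance-preserving fusion step described in the abstract can be illustrated with a small sketch. The abstract does not give the actual formulation, so everything here is an assumption made for illustration: the function names, the use of Euclidean distance, the softmax weighting, and the `temperature` parameter are hypothetical and are not taken from the paper. The idea shown is only that a character's distances to the vocabulary in a phonetic or glyph space are turned into weights, which are then used to build a feature in semantic space that keeps a similar "who is close to whom" relationship.

```python
import numpy as np

def distance_weights(query, vocab, temperature=1.0):
    """Softmax over negative Euclidean distances from `query` to each row of `vocab`.

    Assumption: closer vocabulary entries should receive larger weights.
    """
    dists = np.linalg.norm(vocab - query, axis=1)
    logits = -dists / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

def fuse_to_semantic(query_modal, vocab_modal, vocab_semantic, temperature=1.0):
    """Build a semantic-space feature as a distance-weighted mix of vocabulary
    semantic embeddings (hypothetical stand-in for the paper's fusion step)."""
    weights = distance_weights(query_modal, vocab_modal, temperature)
    return weights @ vocab_semantic

# Toy data: 5 vocabulary characters, 4-dim phonetic/glyph space, 8-dim semantic space.
rng = np.random.default_rng(0)
vocab_modal = rng.normal(size=(5, 4))
vocab_semantic = rng.normal(size=(5, 8))

# A query character that sits very close to vocabulary entry 2 in the modal space.
query_modal = vocab_modal[2] + 0.01 * rng.normal(size=4)

feature = fuse_to_semantic(query_modal, vocab_modal, vocab_semantic, temperature=0.1)
```

With a low temperature the resulting feature is dominated by the semantic embedding of the nearest vocabulary entry; a higher temperature spreads the weight over more candidates. A real system would learn the embeddings and likely the fusion itself rather than fix them as above.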
Keywords
Chinese Spelling Check, Multi-Modal Feature Fusion, Knowledge Enhance