Confidence-based Syntax encoding network for better ancient Chinese understanding

INFORMATION PROCESSING & MANAGEMENT (2024)

Abstract
While neural-based models continue to make rapid strides, syntax remains a foundational element in the domain of Natural Language Processing (NLP), particularly in the context of Chinese language understanding. However, there exists a significant gap in research that integrates syntactic information for the understanding of ancient Chinese, primarily due to the lack of high-quality syntactic annotations. This paper explores the untapped potential of syntax to enhance ancient Chinese understanding, leveraging the "not-so-perfect" noisy syntax trees generated by unsupervised derivations and modern Chinese syntax parsers. To achieve this, we introduce a novel syntax encoding component: the confidence-based syntax encoding network (cSEN). This component is tailored to mitigate the side effects arising from the noise associated with unsupervised syntax derivations and the incompatibility between ancient and modern Chinese. We validate the importance of syntax information and the efficacy of our cSEN through experimental tasks, specifically ancient poetry theme classification and ancient-modern Chinese translation. Our findings suggest that proper implementation of syntactic information can effectively enhance model understanding of ancient Chinese. The introduced cSEN proves vital in noise-rich environments, potentially revolutionizing the way information professionals approach and utilize ancient Chinese texts.
Keywords
Ancient Chinese understanding, Syntax encoding, Syntax parse confidence, Ancient poetry thematic classification, Ancient-modern Chinese translation