Large-scale AMR Corpus with Re-generated Sentences: Domain Adaptive Pre-training on ACL Anthology Corpus

Ming Zhao, Yaling Wang, Yves Lepage

2022 International Conference on Advanced Computer Science and Information Systems (ICACSIS), 2022

Abstract
Abstract Meaning Representation (AMR) is a broad-coverage formalism for capturing the semantics of a given sentence. However, domain adaptation of AMR is limited by the shortage of annotated AMR graphs. In this paper, we build a new large-scale dataset of 2.3 million AMRs in the domain of academic writing. Additionally, we show that 30% of them are of similar quality to the annotated data in the downstream AMR-to-text task. Our results outperform previous graph-based approaches by over 11 BLEU points. We provide a pipeline that integrates automated generation and evaluation, which can help explore other AMR benchmarks.
Keywords
Abstract Meaning Representation, Academic Writing, Domain Adaptation