A Rationale-centric Counterfactual Data Augmentation Method for Cross-Document Event Coreference Resolution
arXiv (2024)
Abstract
Event coreference resolution (ECR) systems built on pre-trained language
models (PLMs) have demonstrated outstanding performance in clustering
coreferential events across documents. However, existing systems rely
excessively on the spurious "trigger lexical matching" pattern in the
input mention-pair text. We formalize the decision-making process of the
baseline ECR system using a Structural Causal Model (SCM), aiming to identify
spurious and causal associations (i.e., rationales) within the ECR task.
Leveraging the debiasing capability of counterfactual data augmentation, we
develop a rationale-centric counterfactual data augmentation method with
LLM-in-the-loop. This method is specialized for pairwise input in the ECR
system, where we conduct direct interventions on triggers and context to
mitigate the spurious association while emphasizing the causation. Our approach
achieves state-of-the-art performance on three popular cross-document ECR
benchmarks and demonstrates robustness in out-of-domain scenarios.
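
As a rough illustration of what an intervention on triggers could look like for pairwise input, the following minimal sketch rewrites the trigger of one mention in a coreferential pair while leaving its context and the pair label unchanged, so that the two triggers no longer match lexically. It is not the authors' implementation: the names `Mention`, `MentionPair`, `swap_trigger`, and `augment_trigger` are hypothetical, and a hand-written synonym table stands in for the LLM-in-the-loop generation described in the abstract.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class Mention:
    text: str     # sentence containing the event mention
    trigger: str  # trigger word as it appears inside `text`


@dataclass(frozen=True)
class MentionPair:
    a: Mention
    b: Mention
    coreferent: bool  # gold label for the pair


def swap_trigger(mention: Mention, new_trigger: str) -> Mention:
    """Replace the first occurrence of the trigger in the mention text."""
    if mention.trigger not in mention.text:
        raise ValueError("trigger not found in mention text")
    new_text = mention.text.replace(mention.trigger, new_trigger, 1)
    return Mention(text=new_text, trigger=new_trigger)


def augment_trigger(pair: MentionPair,
                    synonyms: Dict[str, str]) -> Optional[MentionPair]:
    """Intervene on the trigger of mention `b` only.

    Context and label are kept, so a coreferent pair stays coreferent
    even though the trigger words no longer match on the surface.
    `synonyms` is a stand-in (assumption) for an LLM-generated rewrite.
    Returns None when no replacement is available.
    """
    new_trigger = synonyms.get(pair.b.trigger)
    if new_trigger is None:
        return None
    return replace(pair, b=swap_trigger(pair.b, new_trigger))
```

For example, given a coreferent pair whose mentions both use the trigger "struck", calling `augment_trigger(pair, {"struck": "hit"})` would yield a new coreferent pair in which the second mention uses "hit". Training on such pairs is one way a model could be discouraged from treating trigger overlap as the deciding signal.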