Document-level denoising relation extraction with false-negative mining and reinforced positive-class knowledge distillation

Daojian Zeng, Jianling Zhu, Hongting Chen, Jianhua Dai, Lincheng Jiang

Information Processing & Management (2024)


Abstract

Many datasets for document-level relation extraction (RE) suffer from incomplete labeling, particularly the false-negative problem, which induces improper biases during training. However, existing denoising methods are either limited by the scale of the dataset or focus primarily on sentence-level RE. To tackle this prevalent issue of false negatives, we propose FM-RKD, a denoising framework for document-level RE. First, a false-negative mining mechanism identifies and re-annotates false-negative samples (FNs) in the original corpus, producing a higher-quality pseudo corpus. Then, we propose a reinforced positive-class knowledge distillation method, in which a teacher network trained on positive samples provides soft labels for a student network. This approach enables the student network to learn complete positive-class patterns and mitigates the overfitting caused by FNs. Extensive experiments on the Re-DocRED dataset show that FM-RKD outperforms the current state-of-the-art method by 1.36% in F1 score and 1.24% in Ign F1 score when the training data is incompletely annotated. Moreover, FM-RKD achieves a new best F1 score of 78.38% even when the training data is well annotated.
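The soft-label idea mentioned in the abstract can be illustrated with a generic distillation loss. The sketch below is not the FM-RKD implementation: it omits the reinforced component and the false-negative mining step, and the function names, the per-relation sigmoid formulation, and the mixing weight `alpha` are all assumptions made for illustration. It only shows how a teacher's soft labels can soften the penalty on a pair that is wrongly annotated as negative.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def bce(prob: float, target: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy between a predicted probability and a target in [0, 1]."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def distillation_loss(student_logits, teacher_logits, hard_labels, alpha=0.5):
    """Hypothetical student loss: mix of hard-label and teacher soft-label terms.

    alpha weights the (possibly noisy) annotated labels; (1 - alpha) weights
    the teacher's soft labels. Each position is one candidate relation.
    """
    assert len(student_logits) == len(teacher_logits) == len(hard_labels)
    total = 0.0
    for s, t, y in zip(student_logits, teacher_logits, hard_labels):
        p_student = sigmoid(s)
        soft_target = sigmoid(t)  # teacher probability used as a soft label
        hard_term = bce(p_student, float(y))
        soft_term = bce(p_student, soft_target)
        total += alpha * hard_term + (1.0 - alpha) * soft_term
    return total / len(hard_labels)


# A pair annotated as negative (0) that the teacher believes is positive:
# with a small alpha, a student that predicts "positive" is penalized less
# than one that follows the noisy negative label.
follows_teacher = distillation_loss([3.0], [4.0], [0], alpha=0.2)
follows_label = distillation_loss([-3.0], [4.0], [0], alpha=0.2)
print(follows_teacher < follows_label)
```

With `alpha=1.0` the loss reduces to ordinary binary cross-entropy on the annotated labels, so the weight controls how much the student trusts the teacher over the incomplete annotations.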


Keywords

Document-level relation extraction, Incomplete labeling data, False-negative mining mechanism, Reinforced positive-class knowledge distillation
