Alleviate Dataset Shift Problem in Fine-grained Entity Typing with Virtual Adversarial Training

IJCAI 2020

Cited 8 | Viewed 536
Abstract
The recent success of Distant Supervision (DS) brings abundant labeled data for the task of fine-grained entity typing (FET) without human annotation. However, the heuristically generated labels inevitably introduce a significant distribution gap, namely dataset shift, between the distantly labeled training set and the manually curated test set. Considerable efforts have been made to alleviate this problem from the label perspective, by either intelligently denoising the training labels or designing noise-aware loss functions. Despite this progress, dataset shift can hardly be eliminated completely. In this work, complementary to the label perspective, we reconsider the problem from the model perspective: can we learn a more robust typing model in the presence of dataset shift? To this end, we propose a novel regularization module based on virtual adversarial training (VAT). The proposed approach first uses a self-paced sample selection function to select suitable samples for VAT, then constructs virtual adversarial perturbations based on the selected samples, and finally regularizes the model to be robust to such perturbations. Experiments on two benchmarks demonstrate the effectiveness of the proposed method, with average improvements of 3.8%, 2.5% and 3.2% in accuracy, Macro F1 and Micro F1, respectively, over the next best method.
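
To make the regularizer concrete, below is a minimal PyTorch-style sketch of a VAT loss of the kind the abstract describes: it perturbs the (continuous) input embeddings in the direction that most changes the model's predicted type distribution and penalizes the resulting KL divergence. The function name `vat_loss`, the hyperparameters `xi`, `epsilon`, `n_power`, and the assumption that `model` maps mention embeddings to type logits are illustrative assumptions, not details from the paper; the paper's self-paced sample selection would additionally restrict this term to a chosen subset of the batch.

```python
# Minimal VAT sketch (assumptions: PyTorch; `model` maps mention embeddings to type
# logits; xi/epsilon/n_power are illustrative hyperparameters, not the paper's values).
import torch
import torch.nn.functional as F

def vat_loss(model, embeddings, xi=1e-6, epsilon=1.0, n_power=1):
    """KL(p(y|x) || p(y|x + r_adv)), where r_adv is the perturbation inside an
    epsilon-ball that most changes the model's prediction (found by power iteration)."""
    with torch.no_grad():
        logits = model(embeddings)            # clean predictions, treated as fixed targets
        p = F.softmax(logits, dim=-1)

    # Power iteration: approximate the most sensitive perturbation direction.
    d = torch.randn_like(embeddings)
    for _ in range(n_power):
        d = xi * F.normalize(d, dim=-1)
        d.requires_grad_()
        logits_hat = model(embeddings + d)
        adv_kl = F.kl_div(F.log_softmax(logits_hat, dim=-1), p, reduction="batchmean")
        d = torch.autograd.grad(adv_kl, d)[0].detach()

    # Perturb along the estimated worst-case direction and penalize the prediction change.
    r_adv = epsilon * F.normalize(d, dim=-1)
    logits_adv = model(embeddings + r_adv)
    return F.kl_div(F.log_softmax(logits_adv, dim=-1), p, reduction="batchmean")
```

In training, a term like this would typically be weighted and added to the supervised typing loss, computed only on the samples admitted by the self-paced selector.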