Using Semi-Supervised Learning and Wikipedia to Train an Event Argument Extraction System

Informatica (Slovenia) (2022)

Abstract

The paper presents a methodology for training an event argument extraction system in a semi-supervised setting. We use Wikipedia and Wikidata to automatically obtain a small, noisily labeled dataset and a large unlabeled dataset. The dataset consists of event clusters containing Wikipedia pages in multiple languages. The unlabeled data is iteratively labeled using semi-supervised learning combined with probabilistic soft logic, which infers the pseudo-label of each example from the predictions of multiple base learners. The proposed methodology is applied to Wikipedia pages about earthquakes and terrorist attacks in a cross-lingual setting. Our experiments show that the proposed methodology improves the results: the system achieves an F1-score of 0.79 when only the automatically labeled dataset is used, and an F1-score of 0.84 when trained with semi-supervised learning combined with probabilistic soft logic.
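
The iterative pseudo-labeling loop described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the probabilistic soft logic inference step is replaced here by a plain average of the base learners' probabilities, and the names `select_pseudo_labels`, `self_training`, the `threshold` parameter, and the learner interface (a function that takes the labeled data and returns a scoring function) are all hypothetical choices made for illustration.

```python
def select_pseudo_labels(predictions, threshold=0.9):
    """Pick confident pseudo-labels from several base learners.

    predictions maps an example id to a list of probabilities, one per
    base learner, that the example is a positive argument mention.
    ASSUMPTION: a simple average stands in for the probabilistic soft
    logic inference used in the paper.
    """
    pseudo = {}
    for example_id, probs in predictions.items():
        score = sum(probs) / len(probs)
        if score >= threshold:
            pseudo[example_id] = True      # confidently positive
        elif score <= 1 - threshold:
            pseudo[example_id] = False     # confidently negative
        # otherwise the example stays unlabeled for a later round
    return pseudo


def self_training(labeled, unlabeled, learners, rounds=3, threshold=0.9):
    """Iteratively move confidently predicted examples into the labeled set.

    labeled: dict mapping example -> bool (the small noisy seed set).
    unlabeled: iterable of examples without labels.
    learners: list of hypothetical fit functions; each takes the labeled
    dict and returns a function example -> probability.
    """
    labeled = dict(labeled)
    unlabeled = set(unlabeled)
    for _ in range(rounds):
        models = [fit(labeled) for fit in learners]
        predictions = {x: [model(x) for model in models] for x in unlabeled}
        new_labels = select_pseudo_labels(predictions, threshold)
        if not new_labels:
            break                          # nothing confident enough; stop early
        labeled.update(new_labels)
        unlabeled.difference_update(new_labels)
    return labeled, unlabeled
```

In this sketch an example is only pseudo-labeled when the averaged score is far from 0.5, so uncertain examples remain in the unlabeled pool and can be revisited once the learners have been retrained on more data.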

Keywords

event extraction, event argument extraction, semi-supervised learning, probabilistic soft logic
