Semi-Supervised Learning for Semantic Relation Classification using Stratified Sampling Strategy.

EMNLP '09: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3 - Volume 3(2009)

引用 10|浏览13
暂无评分
摘要
This paper presents a new approach to selecting the initial seed set using stratified sampling strategy in bootstrapping-based semi-supervised learning for semantic relation classification. First, the training data is partitioned into several strata according to relation types/subtypes, then relation instances are randomly sampled from each stratum to form the initial seed set. We also investigate different augmentation strategies in iteratively adding reliable instances to the labeled set, and find that the bootstrapping procedure may stop at a reasonable point to significantly decrease the training time without degrading too much in performance. Experiments on the ACE RDC 2003 and 2004 corpora show the stratified sampling strategy contributes more than the bootstrapping procedure itself. This suggests that a proper sampling strategy is critical in semi-supervised learning.
更多
查看译文
关键词
stratified sampling strategy,bootstrapping procedure,different augmentation strategy,initial seed set,proper sampling strategy,relation instance,relation type,semantic relation classification,bootstrapping-based semi-supervised learning,initial seed,Semi-supervised learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要