SelfORE: Self-supervised Relational Feature Learning for Open Relation Extraction

EMNLP 2020, pp. 3673–3682.


Abstract

Open relation extraction is the task of extracting open-domain relation facts from natural language sentences. Existing works either utilize heuristics or distant-supervised annotations to train a supervised classifier over pre-defined relations, or adopt unsupervised methods with additional assumptions that have less discriminative power. In this work, we propose a self-supervised framework named SelfORE, which exploits weak, self-supervised signals by leveraging a large pretrained language model for adaptive clustering on contextualized relational features, and bootstraps the self-supervised signals by improving contextualized features in relation classification. Experimental results on three datasets show the effectiveness and robustness of SelfORE on open-domain Relation Extraction when compared with competitive baselines. Source code is available at https://github.com/THU-BPM/SelfORE.

Code: https://github.com/THU-BPM/SelfORE

Introduction
  • With the huge amounts of information people generate, Relation Extraction (RE) aims to extract triplets of the form (entity 1, relation, entity 2) from sentences, discovering the semantic relation that holds between two entities mentioned in the text.
  • Given the sentence "Derek Bell was born in Belfast", the authors can extract the relation "born in" between the two entities "Derek Bell" and "Belfast".
  • The extracted triplets are used in various downstream applications like web search, question answering, and natural language understanding.
  • Existing RE methods work well on pre-defined relations that have already appeared either in human-annotated datasets or knowledge bases.
  • Human annotation can be labor-intensive to obtain and hard to scale up to a large number of relations.
Highlights
  • With the huge amounts of information people generate, Relation Extraction (RE) aims to extract triplets of the form (entity 1, relation, entity 2) from sentences, discovering the semantic relation that holds between two entities mentioned in the text
  • To further alleviate human annotation efforts while obtaining high-quality supervision for open relation extraction, in this paper, we propose a self-supervised learning framework which obtains supervision from the data itself and learns to improve the supervision quality by learning better feature representations in an iterative fashion (a minimal sketch follows this list)
  • We showed that the self-supervised model outperforms strong baselines, and is robust when no prior information is available on target relations
  • We propose a self-supervised learning model SelfORE for open-domain relation extraction
  • Different from conventional distant-supervised models, which require pre-defined Knowledge Bases or labeled instances for Relation Extraction in a closed-world setting, our model does not require annotation and can work in open-domain scenarios where the number of target relations and the relation distribution are not known in advance
  • Our model exploits the advantages of supervised models to bootstrap the discriminative power from self-supervised signals to improve contextualized relational feature learning
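
A minimal sketch of this iterative rationale, assuming pre-computed features: scikit-learn's KMeans stands in for the paper's adaptive clustering and a logistic regression for its neural relation classifier, so `selfore_loop` and its parameters are illustrative placeholders, not the authors' implementation.

```python
# Illustrative sketch only: KMeans replaces the paper's adaptive clustering,
# logistic regression replaces its neural relation classifier.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def selfore_loop(features, n_clusters=10, n_epochs=5):
    """features: (n_sentences, dim) array of contextualized relational features."""
    clf, labels = None, None
    for _ in range(n_epochs):
        # 1) Cluster the current features into pseudo relation labels.
        labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(features)
        # 2) Use the pseudo labels as self-supervised signals for a classifier.
        clf = LogisticRegression(max_iter=1000).fit(features, labels)
        # 3) (Omitted here) back-propagate the classification loss into the
        #    encoder and re-encode `features` before the next clustering round.
    return clf, labels
```

The bootstrapping effect comes from the omitted step 3: better contextualized features make the next round of pseudo labels less noisy, which in turn yields cleaner supervision.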
Methods
  • The authors conduct extensive experiments on real-world datasets to show the effectiveness of the self-supervised learning rationale on relation extraction, and give a detailed analysis to show its advantages.

    The authors use three labeled datasets to evaluate the model: NYT+FB, T-REx SPO, and T-REx DS.
  • The NYT+FB dataset is generated via distant supervision, aligning sentences from the New York Times corpus (Sandhaus, 2008) with Freebase (Bollacker et al., 2008) triplets.
  • It has been widely used in previous RE works (Marcheggiani and Titov, 2016; Yao et al., 2011; Simon et al., 2019).
  • The dataset still contains some misalignment, but should be easier for models to extract the correct semantic relation. 20% of these sentences are used as the validation set and 80% for model training (sketched below).
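
A minimal sketch of that 80/20 split, assuming scikit-learn; `sentences` is a stand-in for the distant-supervised corpus, not the actual data loader.

```python
# Hedged sketch of the 80/20 train/validation split described above.
from sklearn.model_selection import train_test_split

sentences = [f"sentence {i}" for i in range(100)]  # stand-in corpus
train_set, val_set = train_test_split(sentences, test_size=0.20, random_state=42)
print(len(train_set), len(val_set))  # 80 20
```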
Results
  • UIE-PCNN is considered the previous state-of-the-art.
  • The authors enhance this baseline by replacing the PCNN and GloVe embeddings with the proposed BERT-based contextualized features.
  • The enhanced state-of-the-art model, namely UIE-BERT, achieves the best performance among baselines.
  • The proposed SelfORE model outperforms all baseline models consistently on B3 F1/Precision, V-measure F1/Homogeneity and ARI (these clustering metrics are sketched below).
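
For reference, these clustering metrics can be computed as in the sketch below: B3 (B-cubed) precision/recall/F1 is implemented directly from its definition, while V-measure, homogeneity and ARI come from scikit-learn; the toy labels are illustrative only.

```python
# B-cubed: average, over elements, of the overlap between each element's
# predicted cluster and its gold cluster.
import numpy as np
from sklearn.metrics import adjusted_rand_score, homogeneity_score, v_measure_score

def b_cubed(gold, pred):
    gold, pred = np.asarray(gold), np.asarray(pred)
    precisions, recalls = [], []
    for i in range(len(gold)):
        same_pred = pred == pred[i]   # elements sharing i's predicted cluster
        same_gold = gold == gold[i]   # elements sharing i's gold cluster
        overlap = np.sum(same_pred & same_gold)
        precisions.append(overlap / np.sum(same_pred))
        recalls.append(overlap / np.sum(same_gold))
    p, r = np.mean(precisions), np.mean(recalls)
    return p, r, 2 * p * r / (p + r)

gold = [0, 0, 1, 1, 2, 2]   # toy gold relation ids
pred = [1, 1, 0, 0, 2, 0]   # toy predicted cluster ids
p, r, f1 = b_cubed(gold, pred)
print(f"B3: P={p:.2f} R={r:.2f} F1={f1:.2f}")
print("V-measure:", v_measure_score(gold, pred))
print("Homogeneity:", homogeneity_score(gold, pred))
print("ARI:", adjusted_rand_score(gold, pred))
```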
Conclusion
  • The authors propose a self-supervised learning model SelfORE for open-domain relation extraction.
  • Different from conventional distant-supervised models, which require pre-defined Knowledge Bases or labeled instances for Relation Extraction in a closed-world setting, the model does not require annotation and can work in open-domain scenarios where the number of target relations and the relation distribution are not known in advance.
  • The authors' model exploits the advantages of supervised models to bootstrap the discriminative power from self-supervised signals to improve contextualized relational feature learning.
  • Experiments on three real-world datasets show the effectiveness and the robustness of the proposed model over competitive baselines.
Tables
  • Table 1: Quantitative performance evaluation on three datasets
  • Table 2: Extracted vs. gold surface-form relation names on T-REx SPO
Related work
  • Relation extraction focuses on identifying the relation between two entities in a given sentence. Traditional closed-domain relation extraction methods are supervised models. They need a set of pre-defined relation labels and require large amounts of annotated triplets, making them less ideal for open-domain corpora. Distant supervision (Mintz et al., 2009; Hoffmann et al., 2011; Surdeanu et al., 2012) is a widely adopted method to alleviate human annotation: if multiple sentences contain two entities that hold a certain relation in a knowledge graph, at least one of those sentences is believed to convey the corresponding relation. However, entities convey semantic meanings according to their contexts as well; distant-supervised models do not explicitly consider contexts, and the resulting models cannot discover new relations because the supervision is adopted purely from knowledge bases.

    Unsupervised relation extraction has attracted much attention due to its ability to discover relational knowledge without access to annotations or external resources. Unsupervised models either 1) cluster the relation representations extracted from sentences, or 2) make additional assumptions that provide learning signals for classification models.
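
To make the contextualized relational features concrete, here is a hedged sketch of encoding one entity pair with BERT via the Hugging Face transformers library. The [E1]/[/E1]-style entity markers follow the common convention of Soares et al. (2019) and are an assumption about the setup, not the paper's confirmed configuration.

```python
# Sketch: a contextualized relational feature for one entity pair, built by
# concatenating BERT hidden states at the two entity-start markers.
# The [E1]/[/E1] marker scheme is assumed, not taken from the paper.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

markers = ["[E1]", "[/E1]", "[E2]", "[/E2]"]
tokenizer.add_special_tokens({"additional_special_tokens": markers})
model.resize_token_embeddings(len(tokenizer))

sentence = "[E1] Derek Bell [/E1] was born in [E2] Belfast [/E2] ."
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state[0]   # (seq_len, 768)

ids = inputs["input_ids"][0].tolist()
e1 = ids.index(tokenizer.convert_tokens_to_ids("[E1]"))
e2 = ids.index(tokenizer.convert_tokens_to_ids("[E2]"))
feature = torch.cat([hidden[e1], hidden[e2]])       # vector in R^(2*h), h = 768
print(feature.shape)                                # torch.Size([1536])
```

Such pair features are what the clustering step groups into pseudo relation labels.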
Funding
  • We stop the clustering and classification loop when the current pseudo labels differ from the previous epoch's by less than 10% (this check is sketched after this list)
  • SelfORE on average achieves 7.0% higher B3 F1, 3.4% higher V-measure F1 and 7.7% higher ARI across the three datasets when compared with the previous state-of-the-art
  • Without self-supervised signals for relational feature learning, SelfORE w/o Classification performs 14.4% worse on average over all metrics on all datasets
  • Thanks to the self-learning schema and the Adaptive Clustering, when we vary K from 10 to 1250, the model achieves a stable F1 score and is not sensitive to the initial choice of K on all three datasets
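
A minimal sketch of that stopping rule; it assumes cluster ids are directly comparable across epochs, whereas real code would first align the two labelings (e.g. with the Hungarian method).

```python
# Stop the loop once fewer than 10% of pseudo labels changed since last epoch.
import numpy as np

def should_stop(prev_labels, curr_labels, tol=0.10):
    changed = np.mean(np.asarray(prev_labels) != np.asarray(curr_labels))
    return changed < tol

print(should_stop([0, 1, 1, 2, 2], [0, 1, 1, 2, 0]))  # 20% changed -> False
```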
Study subjects and analysis
datasets: 3
In this work, we proposed a self-supervised framework named SelfORE, which exploits weak, self-supervised signals by leveraging a large pretrained language model for adaptive clustering on contextualized relational features, and bootstraps the self-supervised signals by improving contextualized features in relation classification. Experimental results on three datasets show the effectiveness and robustness of SelfORE on open-domain Relation Extraction when compared with competitive baselines. Source code is available at https://github.com/THU-BPM/SelfORE.

labeled datasets: 3
Source code is available at https://github.com/THU-BPM/SelfORE. We conduct extensive experiments on real-world datasets to show the effectiveness of our self-supervised learning rationale on relation extraction, and give a detailed analysis to show its advantages.

We use three labeled datasets to evaluate our model: NYT+FB, T-REx SPO, and T-REx DS. The NYT+FB dataset is generated via distant supervision, aligning sentences from the New York Times corpus (Sandhaus, 2008) with Freebase (Bollacker et al., 2008) triplets.

entity pairs: 50
Visualize Contextualized Features: To intuitively show how self-supervised learning helps learn better contextualized relational features on entity pairs for Relation Extraction, we visualize the contextual representation space R^(2·h_R) after dimension reduction using t-SNE (Maaten and Hinton, 2008). We randomly choose 4 relations from the NYT+FB dataset and sample 50 entity pairs. The visualization results are shown in Figure 2.
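
A hedged sketch of that visualization, with synthetic clusters standing in for the model's R^(2·h_R) features (4 relations, 50 entity pairs each, matching the setup above):

```python
# t-SNE projection of stand-in relational features; real inputs would be the
# model's contextualized entity-pair representations.
import matplotlib.pyplot as plt
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
n_rel, n_pairs, dim = 4, 50, 1536
features = np.vstack([rng.normal(loc=3.0 * r, scale=1.0, size=(n_pairs, dim))
                      for r in range(n_rel)])
labels = np.repeat(np.arange(n_rel), n_pairs)

coords = TSNE(n_components=2, random_state=0).fit_transform(features)
plt.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="tab10", s=10)
plt.title("t-SNE of relational features (synthetic stand-in)")
plt.show()
```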

datasets: 3
As shown in Figure 3, the best performance is obtained when K = 10, indicating that SelfORE can leverage the number of target relations as useful prior knowledge. Thanks to the self-learning schema and the Adaptive Clustering, when we vary K from 10 to 1250, the model achieves a stable F1 score and is not sensitive to the initial choice of K on all three datasets. The results further indicate the applicability of the proposed model when applied to an open-domain corpus where the number of target relations is not available in advance.
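
As a sketch of how such a K-sweep is set up (plain KMeans on synthetic data; it will not reproduce the paper's stability, which comes from the adaptive clustering and self-learning loop):

```python
# Sweep the initial number of clusters K and score each run against gold labels.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import v_measure_score

rng = np.random.default_rng(0)
gold = np.repeat(np.arange(10), 200)                      # 10 true relations
X = rng.normal(loc=gold[:, None] * 5.0, scale=1.0, size=(2000, 16))

for k in (10, 50, 250, 1250):
    pred = KMeans(n_clusters=k, n_init=10).fit_predict(X)
    print(k, round(v_measure_score(gold, pred), 3))
```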

real-world datasets: 3
Comparing with unsupervised models, our model exploits the advantages of supervised models to bootstrap the discriminative power from self-supervised signals to improve contextualized relational feature learning. Experiments on three real-world datasets show the effectiveness and the robustness of the proposed model over competitive baselines.

References
  • Gabor Angeli, Melvin Jose Johnson Premkumar, and Christopher D. Manning. 2015. Leveraging linguistic structure for open domain information extraction. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 344–354.
  • Michele Banko, Michael J. Cafarella, Stephen Soderland, Matthew Broadhead, and Oren Etzioni. 2007. Open information extraction from the web. In IJCAI, volume 7, pages 2670–2676.
  • Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, and Jamie Taylor. 2008. Freebase: a collaboratively created graph database for structuring human knowledge. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, pages 1247–1250. ACM.
  • Mathilde Caron, Piotr Bojanowski, Armand Joulin, and Matthijs Douze. 2018. Deep clustering for unsupervised learning of visual features. In Proceedings of the European Conference on Computer Vision (ECCV), pages 132–149.
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
  • Hady Elsahar, Pavlos Vougiouklis, Arslen Remaci, Christophe Gravier, Jonathon Hare, Frederique Laforest, and Elena Simperl. 2018. T-REx: A large scale alignment of natural language with knowledge base triples. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018).
  • Anthony Fader, Stephen Soderland, and Oren Etzioni. 2011. Identifying relations for open information extraction. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 1535–1545. Association for Computational Linguistics.
  • Raphael Hoffmann, Congle Zhang, Xiao Ling, Luke Zettlemoyer, and Daniel S. Weld. 2011. Knowledge-based weak supervision for information extraction of overlapping relations. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1, pages 541–550. Association for Computational Linguistics.
  • Lawrence Hubert and Phipps Arabie. 1985. Comparing partitions. Journal of Classification, 2(1):193–218.
  • Laurens van der Maaten and Geoffrey Hinton. 2008. Visualizing data using t-SNE. Journal of Machine Learning Research, 9(Nov):2579–2605.
  • Diego Marcheggiani and Ivan Titov. 2016. Discrete-state variational autoencoders for joint discovery and factorization of relations. Transactions of the Association for Computational Linguistics, 4:231–244.
  • Mike Mintz, Steven Bills, Rion Snow, and Dan Jurafsky. 2009. Distant supervision for relation extraction without labeled data. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - Volume 2, pages 1003–1011. Association for Computational Linguistics.
  • Maximilian Nickel, Volker Tresp, and Hans-Peter Kriegel. 2011. A three-way model for collective learning on multi-relational data. In ICML, volume 11, pages 809–816.
  • Andrew Rosenberg and Julia Hirschberg. 2007. V-measure: A conditional entropy-based external cluster evaluation measure. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 410–420.
  • Arpita Roy, Youngja Park, Taesung Lee, and Shimei Pan. 2019. Supervising unsupervised open information extraction models. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 728–737.
  • Evan Sandhaus. 2008. The New York Times annotated corpus. Linguistic Data Consortium, Philadelphia, 6(12):e26752.
  • Etienne Simon, Vincent Guigue, and Benjamin Piwowarski. 2019. Unsupervised information extraction: Regularizing discriminative approaches with relation distribution losses. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1378–1387.
  • Livio Baldini Soares, Nicholas FitzGerald, Jeffrey Ling, and Tom Kwiatkowski. 2019. Matching the blanks: Distributional similarity for relation learning. arXiv preprint arXiv:1906.03158.
  • Mihai Surdeanu, Julie Tibshirani, Ramesh Nallapati, and Christopher D. Manning. 2012. Multi-instance multi-label learning for relation extraction. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 455–465. Association for Computational Linguistics.
  • Pascal Vincent, Hugo Larochelle, Isabelle Lajoie, Yoshua Bengio, and Pierre-Antoine Manzagol. 2010. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. Journal of Machine Learning Research, 11(Dec):3371–3408.
  • Denny Vrandecic. 2012. Wikidata: A new platform for collaborative data collection. In Proceedings of the 21st International Conference on World Wide Web, pages 1063–1064. ACM.
  • Olivia Wiles, A. Koepke, and Andrew Zisserman. 2018. Self-supervised learning of a facial attribute embedding from video. arXiv preprint arXiv:1808.06882.
  • Ruidong Wu, Yuan Yao, Xu Han, Ruobing Xie, Zhiyuan Liu, Fen Lin, Leyu Lin, and Maosong Sun. 2019. Open relation extraction: Relational knowledge transfer from supervised data to unsupervised data. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 219–228.
  • Junyuan Xie, Ross Girshick, and Ali Farhadi. 2016. Unsupervised deep embedding for clustering analysis. In International Conference on Machine Learning, pages 478–487.
  • Limin Yao, Aria Haghighi, Sebastian Riedel, and Andrew McCallum. 2011. Structured relation discovery using generative models. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 1456–1466. Association for Computational Linguistics.
  • Alexander Yates, Michael Cafarella, Michele Banko, Oren Etzioni, Matthew Broadhead, and Stephen Soderland. 2007. TextRunner: open information extraction on the web. In Proceedings of Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, pages 25–. Association for Computational Linguistics.
  • Daojian Zeng, Kang Liu, Yubo Chen, and Jun Zhao. 2015. Distant supervision for relation extraction via piecewise convolutional neural networks. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 1753–1762.