Reinforcement Learning for Relation Classification From Noisy Data
AAAI, 2018.
Abstract:
Existing relation classification methods that rely on distant supervision assume that a bag of sentences mentioning an entity pair all describe a relation for that pair. Such methods, performing classification at the bag level, cannot identify the mapping between a relation and a sentence, and largely suffer from the noisy labeling problem.
Introduction
- Relation classification, aiming to categorize semantic relations between two entities given a plain text, is an important problem in natural language processing, for knowledge graph completion and question answering.
- In order to obtain large-scale training data, distant supervision (Mintz et al 2009) was proposed by assuming that if two entities have a relation in a given knowledge base, all sentences that contain the two entities will mention that relation.
- Taking the triple (Barack Obama, BornIn, United States) as an example, the noisy sentence “Barack Obama is the 44th president of the United States” will be regarded as a positive instance by distant supervision and labeled with the BornIn relation, even though it does not express that relation.
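The labeling heuristic described above can be sketched as follows; the toy knowledge base, entity strings, and sentences are illustrative only, and real systems match entity mentions far more carefully:

```python
# Sketch of the distant-supervision heuristic (Mintz et al. 2009):
# any sentence containing both entities of a KB triple gets that triple's relation.
kb = {("Barack Obama", "United States"): "BornIn"}  # toy knowledge base

def distant_label(sentence, kb):
    """Return (head, tail, relation) if the sentence mentions both entities."""
    for (head, tail), relation in kb.items():
        if head in sentence and tail in sentence:
            return head, tail, relation
    return None

sentences = [
    "Barack Obama was born in the United States.",
    "Barack Obama is the 44th president of the United States.",  # noisy match
]
labels = [distant_label(s, kb) for s in sentences]
# Both sentences receive the BornIn label, although only the first expresses it.
```

This makes the noise source concrete: the heuristic has no way to tell whether a co-occurring sentence actually expresses the relation.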
Highlights
- Relation classification, aiming to categorize semantic relations between two entities given a plain text, is an important problem in natural language processing, for knowledge graph completion and question answering
- To handle these limitations, we propose a novel relation classification model consisting of two modules: an instance selector and a relation classifier
- We propose a novel model for sentence-level relation classification from noisy data using a reinforcement learning framework
- The model consists of an instance selector and a relation classifier
- Extensive experiments demonstrate that our model can filter out the noisy sentences and perform sentence-level relation classification better than state-of-the-art baselines from noisy data
- Our solution for instance selection can be generalized to other tasks that employ noisy data or distant supervision
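The selector/classifier interaction above can be sketched as a REINFORCE-style policy gradient (Williams 1992). This is a minimal illustration, not the paper's exact formulation: the feature dimensions, the per-sentence Bernoulli keep/drop policy, and the placeholder reward are all assumptions; in the paper the reward comes from the classifier's likelihood of the bag label on the selected sentences.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sentence features for one bag and its distant-supervision label.
bag = rng.normal(size=(5, 8))    # 5 sentences, 8-dim features (illustrative)
theta = np.zeros(8)              # selector (policy) parameters

def select(bag, theta):
    """Sample a keep/drop action per sentence from a Bernoulli policy."""
    probs = 1 / (1 + np.exp(-bag @ theta))
    actions = (rng.random(len(bag)) < probs).astype(int)
    return actions, probs

def reward(actions):
    """Placeholder for the classifier's avg. log-likelihood on kept sentences."""
    if actions.sum() == 0:
        return -1.0              # selecting nothing is penalized
    return float(actions.mean())

# One REINFORCE update: push up log-probability of actions scaled by reward.
actions, probs = select(bag, theta)
r = reward(actions)
grad = ((actions - probs)[:, None] * bag).sum(axis=0) * r
theta += 0.1 * grad
```

The key design point is that the selector never sees gold sentence labels: the only training signal is the scalar reward the classifier sends back.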
Methods
- Compared methods: CNN, CNN+Max, CNN+ATT, and CNN+RL.
- CNN is a sentence-level model that is trained directly on noisy data.
- CNN+Max and CNN+ATT are bag-level models that aggregate over the sentences in a bag and can down-weight noisy sentences.
- For bag-level models (CNN+Max and CNN+ATT), the training process is the same as in the referenced papers; each sentence is then treated as a single-sentence bag and a relation is predicted for each bag.
- Results in Table 1 reveal the following observations.
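The two bag-level aggregation styles can be sketched roughly as follows; the sentence encodings and relation query vector are random stand-ins for CNN outputs, so this shows only the aggregation step, not the baselines' full architectures:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for CNN sentence encodings of one bag and a relation query vector.
S = rng.normal(size=(4, 8))   # 4 sentences, 8-dim representations (illustrative)
q = rng.normal(size=8)

scores = S @ q                # how well each sentence matches the relation

# CNN+Max-style: represent the bag by its single highest-scoring sentence.
bag_max = S[np.argmax(scores)]

# CNN+ATT-style: softmax attention softly down-weights poorly matching sentences.
weights = np.exp(scores - scores.max())
weights /= weights.sum()
bag_att = weights @ S         # attention-weighted average of sentence vectors
```

Both styles still produce one prediction per bag, which is why the paper's sentence-level selector goes further: it can discard every sentence in an entirely noisy bag.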
Conclusion
- In this paper, the authors propose a novel model for sentence-level relation classification from noisy data using a reinforcement learning framework.
- The relation classifier predicts relations at the sentence level and provides rewards to the instance selector as a weak signal to supervise the instance selection process.
- A possible attempt might be to perform sentiment classification on noisy data (Go, Bhayani, and Huang 2009).
- The authors leave this as future work.
Tables
- Table1: Performance on sentence-level relation classification
- Table2: Instance selection examples by different models. For CNN+RL and CNN+Max, 1 or 0 indicates whether the sentence is selected. For CNN+ATT, the value is the attention weight.
Related Work
- Relation classification is a common task in natural language processing. Many approaches have been developed, particularly with supervised methods (Mooney and Bunescu 2005; Zhou et al 2005; Zelenko, Aone, and Richardella 2003). However, such supervised methods heavily rely on high-quality labeled data.
Recently, neural models have been widely applied to relation classification (Zeng et al 2014; dos Santos, Xiang, and Zhou 2015; Mooney and Bunescu 2005; Yang et al 2016), including convolutional neural networks, recursive neural networks (Ebrahimi and Dou 2015; Liu et al 2015), and long short-term memory networks (Miwa and Bansal 2016; Xu et al 2015). In (Wang et al 2016), two levels of attention are proposed to better discern patterns in heterogeneous contexts for relation classification.
In general, a large amount of labeled data is required to train neural models, which is quite expensive to obtain. To address this issue, distant supervision was proposed (Mintz et al. 2009) by assuming that all sentences mentioning the two entities of a fact triple describe the relation in that triple. In spite of its success, distant supervision suffers from the noisy labeling issue. To alleviate this, many studies formulated relation classification as a multi-instance learning problem (Riedel, Yao, and McCallum 2010; Hoffmann et al 2011; Surdeanu et al 2012; Zeng et al 2015). In (Lin et al 2016; Ji et al 2017; Tianyu Liu and Sui 2017), a sentence-level attention mechanism over multiple instances was proposed so that incorrect sentences can be down-weighted. However, such multi-instance learning models all predict relations at the bag level rather than the sentence level, and they cannot handle bags in which no sentence describes the relation at all. Other approaches reduce the noise of distant supervision using active learning (Sterckx et al 2014) and negative patterns (Takamatsu, Sato, and Nakagawa 2012).
Funding
- This work was partly supported by the National Science Foundation of China under grant No.61272227/61332007.
References
- Bahdanau, D.; Brakel, P.; Xu, K.; Goyal, A.; Lowe, R.; Pineau, J.; Courville, A.; and Bengio, Y. 2016. An actorcritic algorithm for sequence prediction. arXiv preprint arXiv:1607.07086.
- Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; and Yakhnenko, O. 2013. Translating embeddings for modeling multi-relational data. In Advances in neural information processing systems, 2787–2795.
- dos Santos, C. N.; Xiang, B.; and Zhou, B. 2015. Classifying relations by ranking with convolutional neural networks. In ACL, 626–634.
- Ebrahimi, J., and Dou, D. 2015. Chain based RNN for relation classification. In NAACL, 1244–1249.
- Go, A.; Bhayani, R.; and Huang, L. 2009. Twitter sentiment classification using distant supervision. Cs224n Project Report.
- Hoffmann, R.; Zhang, C.; Ling, X.; Zettlemoyer, L.; and Weld, D. S. 2011. Knowledge-based weak supervision for information extraction of overlapping relations. In ACL, 541–550. Association for Computational Linguistics.
- Ji, G.; Liu, K.; He, S.; and Zhao, J. 2017. Distant supervision for relation extraction with sentence-level attention and entity descriptions. In AAAI, 3060–3066.
- Li, J.; Monroe, W.; Ritter, A.; Jurafsky, D.; Galley, M.; and Gao, J. 2016. Deep reinforcement learning for dialogue generation. In EMNLP, 1192–1202.
- Lillicrap, T. P.; Hunt, J. J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; and Wierstra, D. 2015. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971.
- Lin, Y.; Shen, S.; Liu, Z.; Luan, H.; and Sun, M. 2016. Neural relation extraction with selective attention over instances. In ACL, volume 1, 2124–2133.
- Liu, Y.; Wei, F.; Li, S.; Ji, H.; Zhou, M.; and Wang, H. 2015. A dependency-based neural network for relation classification. In ACL, 285–290.
- Mintz, M.; Bills, S.; Snow, R.; and Jurafsky, D. 2009. Distant supervision for relation extraction without labeled data. In ACL, 1003–1011. Association for Computational Linguistics.
- Miwa, M., and Bansal, M. 2016. End-to-end relation extraction using lstms on sequences and tree structures. In ACL.
- Mooney, R. J., and Bunescu, R. C. 2005. Subsequence kernels for relation extraction. In Advances in neural information processing systems, 171–178.
- Narasimhan, K.; Yala, A.; and Barzilay, R. 2016. Improving information extraction by acquiring external evidence with reinforcement learning. arXiv preprint arXiv:1603.07954.
- Riedel, S.; Yao, L.; and McCallum, A. 2010. Modeling relations and their mentions without labeled text. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 148–163. Springer.
- Sterckx, L.; Demeester, T.; Deleu, J.; and Develder, C. 2014. Using active learning and semantic clustering for noise reduction in distant supervision. In 4th Workshop on Automated Knowledge Base Construction at NIPS 2014 (AKBC-2014), 1–6.
- Surdeanu, M.; Tibshirani, J.; Nallapati, R.; and Manning, C. D. 2012. Multi-instance multi-label learning for relation extraction. In EMNLP, 455–465. Association for Computational Linguistics.
- Sutton, R. S., and Barto, A. G. 1998. Reinforcement learning: An introduction, volume 1. MIT press Cambridge.
- Sutton, R. S.; McAllester, D.; Singh, S.; and Mansour, Y. 1999. Policy gradient methods for reinforcement learning with function approximation. In NIPS.
- Takamatsu, S.; Sato, I.; and Nakagawa, H. 2012. Reducing wrong labels in distant supervision for relation extraction. In ACL, 721–729. Association for Computational Linguistics.
- Liu, T.; Wang, K.; Chang, B.; and Sui, Z. 2017. A soft-label method for noise-tolerant distantly supervised relation extraction. In EMNLP, 1790–1795.
- Wang, L.; Cao, Z.; de Melo, G.; and Liu, Z. 2016. Relation classification via multi-level attention cnns. In ACL.
- Williams, R. J. 1992. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine learning 8(3-4):229–256.
- Xu, Y.; Mou, L.; Li, G.; Chen, Y.; Peng, H.; and Jin, Z. 2015. Classifying relations via long short term memory networks along shortest dependency paths. In EMNLP, 1785–1794.
- Yang, Y.; Tong, Y.; Ma, S.; and Deng, Z. 2016. A position encoding convolutional neural network based on dependency tree for relation classification. In EMNLP, 65–74.
- Zelenko, D.; Aone, C.; and Richardella, A. 2003. Kernel methods for relation extraction. Journal of machine learning research 3(Feb):1083–1106.
- Zeng, D.; Liu, K.; Lai, S.; Zhou, G.; Zhao, J.; et al. 2014. Relation classification via convolutional deep neural network. In COLING, 2335–2344.
- Zeng, D.; Liu, K.; Chen, Y.; and Zhao, J. 2015. Distant supervision for relation extraction via piecewise convolutional neural networks. In EMNLP, 17–21.
- Zhou, G.; Su, J.; Zhang, J.; and Zhang, M. 2005. Exploring various knowledge in relation extraction. In ACL, 427–434. Association for Computational Linguistics.