# RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs

ICLR, 2021

Abstract

This paper studies learning logic rules for reasoning on knowledge graphs. Logic rules provide interpretable explanations when used for prediction as well as being able to generalize to other tasks, and hence are critical to learn. Existing methods either suffer from the problem of searching in a large search space (e.g., neural logic programming) ...

Introduction

- Knowledge graphs are collections of real-world facts, which are useful in various applications.
- This paper studies learning logic rules for reasoning on knowledge graphs.
- The rule can be applied to infer new hobbies of people.
- Such logic rules are able to improve interpretability and precision of reasoning (Qu & Tang, 2019; Zhang et al, 2020).
- Due to the large search space, inferring high-quality logic rules for reasoning on knowledge graphs is a challenging task
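As a toy illustration of the kind of compositional rule discussed above, consider a chain rule such as hobby(X, Y) ← friend(X, Z) ∧ hobby(Z, Y). The relation names, facts, and helper function below are hypothetical, not from the paper; this is only a minimal sketch of how such a rule infers new facts:

```python
# Toy knowledge graph of (head, relation, tail) triplets. All names are made up.
kg = {
    ("alice", "friend", "bob"),
    ("bob", "hobby", "chess"),
}

def apply_rule(kg, head_rel, body_rels):
    """Infer head_rel(X, Y) wherever the chain of body relations
    body_rels[0](X, Z1) ∧ body_rels[1](Z1, Z2) ∧ ... holds in kg."""
    entities = {e for (h, _, t) in kg for e in (h, t)}
    # reach[src] = entities reachable from src via the rule body so far.
    reach = {e: {e} for e in entities}
    for rel in body_rels:
        new_reach = {e: set() for e in entities}
        for (h, r, t) in kg:
            if r != rel:
                continue
            for src, mids in reach.items():
                if h in mids:
                    new_reach[src].add(t)
        reach = new_reach
    return {(src, head_rel, dst) for src, dsts in reach.items() for dst in dsts}

# Applying hobby(X, Y) <- friend(X, Z) ∧ hobby(Z, Y) infers alice's hobby.
inferred = apply_rule(kg, "hobby", ["friend", "hobby"])
```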

Highlights

- Knowledge graphs are collections of real-world facts, which are useful in various applications
- Given the ranks from all queries, we report the Mean Rank (MR), Mean Reciprocal Rank (MRR) and Hit@k (H@k) under the filtered setting (Bordes et al, 2013), which is used by most existing studies
- The reason is that RNNLogic is optimized with an EM-based framework, in which the reasoning predictor provides more useful feedback to the rule generator, and addresses the challenge of sparse reward
- This paper studies learning logic rules for knowledge graph reasoning, and an approach called RNNLogic is proposed
- RNNLogic treats a set of logic rules as a latent variable, and a rule generator as well as a reasoning predictor with logic rules are jointly learned
- We see that RNNLogic significantly outperforms RotatE at every embedding dimension
- Extensive experiments prove the effectiveness of RNNLogic
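The ranking metrics mentioned above (MR, MRR, H@k) can be sketched generically as follows; this is not the paper's evaluation code, just the standard definitions over a list of filtered 1-based ranks:

```python
def ranking_metrics(ranks, k=10):
    """Mean Rank, Mean Reciprocal Rank, and Hit@k from 1-based ranks
    of the true answers (assumed already computed under the filtered
    setting, i.e., other true answers removed from the candidate list)."""
    n = len(ranks)
    mr = sum(ranks) / n                          # average rank (lower is better)
    mrr = sum(1.0 / r for r in ranks) / n        # average reciprocal rank
    hit_k = sum(1 for r in ranks if r <= k) / n  # fraction ranked in top k
    return mr, mrr, hit_k

mr, mrr, h10 = ranking_metrics([1, 2, 10], k=10)
```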

Methods

**Experimental Setup of RNNLogic**

- For each training triplet (h, r, t), the authors add an inverse triplet (t, r⁻¹, h) into the training set, yielding an augmented set of training triplets T.
- To build a training instance from p_data, the authors first randomly sample a triplet (h, r, t) from T, and form an instance as (G = T \ {(h, r, t)}, q = (h, r, ?), a = t).
- The authors use the sampled triplet (h, r, t) to construct the query and answer, and use the rest of triplets in T to form the background knowledge graph G.
- During evaluation, the background knowledge graph G is formed with all the triplets in T.
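The instance construction above can be sketched as follows. The function name and the `^-1` inverse-relation naming are assumptions for illustration, not the paper's code:

```python
import random

def build_instance(triplets, inv=lambda r: r + "^-1"):
    """Augment the training triplets with inverse triplets, hold one
    triplet out as the (query, answer) pair, and use the remaining
    triplets as the background knowledge graph G."""
    T = set(triplets) | {(t, inv(r), h) for (h, r, t) in triplets}
    h, r, t = random.choice(sorted(T))  # sample the triplet to hold out
    G = T - {(h, r, t)}                 # background graph excludes it
    query, answer = (h, r), t           # query (h, r, ?) with answer t
    return G, query, answer

G, q, a = build_instance([("alice", "friend", "bob")])
```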

Results

- The authors first compare RNNLogic with rule learning methods.
- RNNLogic achieves much better results than statistical relational learning methods (MLN, Boosted RDN, PathRank) and neural differentiable methods (NeuralLP, DRUM, NLIL, CTP).
- This is because the rule generator and reasoning predictor of RNNLogic can collaborate with each other to reduce search space and learn better rules.
- The reason is that RNNLogic is optimized with an EM-based framework, in which the reasoning predictor provides more useful feedback to the rule generator, and addresses the challenge of sparse reward

Conclusion

- This paper studies learning logic rules for knowledge graph reasoning, and an approach called RNNLogic is proposed.
- RNNLogic treats a set of logic rules as a latent variable, and a rule generator as well as a reasoning predictor with logic rules are jointly learned.
- The authors develop an EM-based algorithm for optimization.
- Extensive experiments prove the effectiveness of RNNLogic.
- The authors plan to study generating more complicated logic rules rather than only compositional rules

- Table1: Results of reasoning on FB15k-237 and WN18RR. H@k is in %. [∗] means the numbers are taken from original papers. [†] means we rerun the methods with the same evaluation process
- Table2: Results of reasoning on the Kinship and UMLS datasets. H@k is in %
- Table3: Comparison between REINFORCE and EM
- Table4: Case study of the rules generated by the rule generator
- Table5: Statistics of datasets
- Table6: Comparison with MultiHopKG
- Table7: Logic rules learned by RNNLogic

Related Work

Our work is related to existing efforts on learning logic rules for knowledge graph reasoning. Most traditional methods enumerate relational paths between query entities and answer entities as candidate logic rules, and further learn a scalar weight for each rule to assess the quality. Representative methods include Markov logic networks (Kok & Domingos, 2005; Richardson & Domingos, 2006; Khot et al, 2011), relational dependency networks (Neville & Jensen, 2007; Natarajan et al, 2010), rule mining algorithms (Galarraga et al, 2013; Meilicke et al, 2019), path ranking (Lao & Cohen, 2010; Lao et al, 2011) and probabilistic personalized page rank (ProPPR) algorithms (Wang et al, 2013; 2014a;b).

Some recent methods extend the idea by simultaneously learning logic rules and the weights in a differentiable way, and most of them are based on neural logic programming (Rocktaschel & Riedel, 2017; Yang et al, 2017; Cohen et al, 2018; Sadeghian et al, 2019; Yang & Song, 2020) or neural theorem provers (Rocktaschel & Riedel, 2017; Minervini et al, 2020). These methods and our approach are similar in spirit, as they are all able to learn the weights of logic rules efficiently. However, these existing methods try to simultaneously learn logic rules and their weights, which is nontrivial in terms of optimization.

The main innovation of our approach is to separate rule generation and rule weight learning by introducing a rule generator and a reasoning predictor respectively, which can mutually enhance each other. The rule generator generates a few high-quality logic rules, and the reasoning predictor only focuses on learning the weights of such high-quality rules, which significantly reduces the search space and leads to better reasoning results. Meanwhile, the reasoning predictor can in turn help identify some useful logic rules to improve the rule generator.

Study Subjects and Analysis

datasets: 4

Then in the E-step, we select a set of high-quality rules from all generated rules with both the rule generator and reasoning predictor via posterior inference; and in the M-step, the rule generator is updated with the rules selected in the E-step. Experiments on four datasets prove the effectiveness of RNNLogic.
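The E-step/M-step loop described above can be sketched abstractly as follows. The function names (`generate_rules`, `score_rule`, `update_generator`) are hypothetical placeholders for the rule generator, the reasoning predictor's quality score, and the generator update, respectively:

```python
def em_train(generate_rules, score_rule, update_generator, num_iters=3, k=2):
    """EM-style training sketch: the generator proposes candidate rules,
    the E-step keeps the k highest-scoring rules under the predictor,
    and the M-step updates the generator on the selected rules."""
    selected = []
    for _ in range(num_iters):
        candidates = generate_rules()                       # rule generator proposes rules
        ranked = sorted(candidates, key=score_rule, reverse=True)
        selected = ranked[:k]                               # E-step: keep high-quality rules
        update_generator(selected)                          # M-step: fit generator to them
    return selected

# Toy usage with a fixed candidate pool and made-up scores.
calls = []
selected = em_train(lambda: ["a", "b", "c"],
                    {"a": 1, "b": 3, "c": 2}.get,
                    calls.append, num_iters=2, k=2)
```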

For each training triplet (h, r, t), we add an inverse triplet (t, r⁻¹, h) into the training set, yielding an augmented set of training triplets T.

Datasets. We choose four datasets for evaluation, including FB15k-237 (Toutanova & Chen, 2015), WN18RR (Dettmers et al, 2018), Kinship and UMLS (Kok & Domingos, 2007). For Kinship and UMLS, there are no standard data splits, so we randomly sample 30% of all the triplets for training, 20% for validation, and the rest 50% for testing
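The 30%/20%/50% random split described above for Kinship and UMLS can be sketched as follows (the seed and function name are made up for reproducibility of the example):

```python
import random

def split_triplets(triplets, ratios=(0.3, 0.2, 0.5), seed=0):
    """Randomly split triplets into train/validation/test by the given
    ratios, e.g., 30% train, 20% validation, 50% test."""
    items = list(triplets)
    random.Random(seed).shuffle(items)  # fixed seed for a reproducible split
    n = len(items)
    n_train = int(ratios[0] * n)
    n_valid = int(ratios[1] * n)
    return (items[:n_train],
            items[n_train:n_train + n_valid],
            items[n_train + n_valid:])

train, valid, test = split_triplets(range(100))
```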

References

- Ivana Balazevic, Carl Allen, and Timothy Hospedales. Tucker: Tensor factorization for knowledge graph completion. In EMNLP, 2019.
- Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. Translating embeddings for modeling multi-relational data. In NeurIPS, 2013.
- Liwei Cai and William Yang Wang. Kbgan: Adversarial learning for knowledge graph embeddings. In NAACL, 2018.
- Wenhu Chen, Wenhan Xiong, Xifeng Yan, and William Wang. Variational knowledge graph reasoning. In NAACL, 2018.
- William W Cohen, Fan Yang, and Kathryn Rivard Mazaitis. Tensorlog: Deep learning meets probabilistic databases. Journal of Artificial Intelligence Research, 2018.
- James Cussens. Stochastic logic programs: sampling, inference and applications. In UAI, 2000.
- Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, and Andrew McCallum. Go for a walk and arrive at the answer: Reasoning over paths in knowledge bases using reinforcement learning. In ICLR, 2018.
- Tim Dettmers, Pasquale Minervini, Pontus Stenetorp, and Sebastian Riedel. Convolutional 2d knowledge graph embeddings. In AAAI, 2018.
- Jonathan Eckstein, Noam Goldberg, and Ai Kagawa. Rule-enhanced penalized regression by column generation using rectangular maximum agreement. In ICML, 2017.
- Luis Antonio Galarraga, Christina Teflioudi, Katja Hose, and Fabian Suchanek. Amie: association rule mining under incomplete evidence in ontological knowledge bases. In WWW, 2013.
- Noam Goldberg and Jonathan Eckstein. Boosting classifiers with tightened l0-relaxation penalties. In ICML, 2010.
- Sepp Hochreiter and Jurgen Schmidhuber. Long short-term memory. Neural computation, 1997.
- Eric Jang, Shixiang Gu, and Ben Poole. Categorical reparameterization with gumbel-softmax. In ICLR, 2017.
- Tushar Khot, Sriraam Natarajan, Kristian Kersting, and Jude Shavlik. Learning markov logic networks via functional gradient boosting. In ICDM, 2011.
- Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In ICLR, 2014.
- Stanley Kok and Pedro Domingos. Learning the structure of markov logic networks. In ICML, 2005.
- Stanley Kok and Pedro Domingos. Statistical predicate invention. In ICML, 2007.
- Daphne Koller and Nir Friedman. Probabilistic graphical models: principles and techniques. MIT press, 2009.
- Timothee Lacroix, Nicolas Usunier, and Guillaume Obozinski. Canonical tensor decomposition for knowledge base completion. In ICML, 2018.
- Ni Lao and William W Cohen. Relational retrieval using a combination of path-constrained random walks. Machine learning, 2010.
- Ni Lao, Tom Mitchell, and William W Cohen. Random walk inference and learning in a large scale knowledge base. In EMNLP, 2011.
- Xi Victoria Lin, Richard Socher, and Caiming Xiong. Multi-hop knowledge graph reasoning with reward shaping. In EMNLP, 2018.
- Chris J Maddison, Andriy Mnih, and Yee Whye Teh. The concrete distribution: A continuous relaxation of discrete random variables. In ICLR, 2017.
- Christian Meilicke, Melisachew Wudage Chekol, Daniel Ruffinelli, and Heiner Stuckenschmidt. Anytime bottom-up rule learning for knowledge graph completion. In IJCAI, 2019.
- Pasquale Minervini, Sebastian Riedel, Pontus Stenetorp, Edward Grefenstette, and Tim Rocktaschel. Learning reasoning strategies in end-to-end differentiable proving. In ICML, 2020.
- Sriraam Natarajan, Tushar Khot, Kristian Kersting, Bernd Gutmann, and Jude Shavlik. Boosting relational dependency networks. In ICILP, 2010.
- Radford M Neal and Geoffrey E Hinton. A view of the em algorithm that justifies incremental, sparse, and other variants. In Learning in graphical models. Springer, 1998.
- Jennifer Neville and David Jensen. Relational dependency networks. Journal of Machine Learning Research, 2007.
- Maximilian Nickel, Lorenzo Rosasco, and Tomaso Poggio. Holographic embeddings of knowledge graphs. In AAAI, 2016.
- Meng Qu and Jian Tang. Probabilistic logic neural networks for reasoning. In NeurIPS, 2019.
- Matthew Richardson and Pedro Domingos. Markov logic networks. Machine learning, 2006.
- Tim Rocktaschel and Sebastian Riedel. End-to-end differentiable proving. In NeurIPS, 2017.
- Ali Sadeghian, Mohammadreza Armandpour, Patrick Ding, and Daisy Zhe Wang. Drum: End-to-end differentiable rule mining on knowledge graphs. In NeurIPS, 2019.
- Yelong Shen, Jianshu Chen, Po-Sen Huang, Yuqing Guo, and Jianfeng Gao. M-walk: Learning to walk over graphs using monte carlo tree search. In NeurIPS, 2018.
- Slavko Simic. On a global upper bound for jensen’s inequality. Journal of mathematical analysis and applications, 2008.
- Zhiqing Sun, Zhi-Hong Deng, Jian-Yun Nie, and Jian Tang. Rotate: Knowledge graph embedding by relational rotation in complex space. In ICLR, 2019.
- Zhiqing Sun, Shikhar Vashishth, Soumya Sanyal, Partha Talukdar, and Yiming Yang. A re-evaluation of knowledge graph completion methods. In ACL, 2020.
- Komal K Teru and William L Hamilton. Inductive relation prediction on knowledge graphs. In ICML, 2020.
- Kristina Toutanova and Danqi Chen. Observed versus latent features for knowledge base and text inference. In Workshop on Continuous Vector Space Models and their Compositionality, 2015.
- Theo Trouillon, Johannes Welbl, Sebastian Riedel, Eric Gaussier, and Guillaume Bouchard. Complex embeddings for simple link prediction. In ICML, 2016.
- William Yang Wang, Kathryn Mazaitis, and William W Cohen. Programming with personalized pagerank: a locally groundable first-order probabilistic logic. In CIKM, 2013.
- William Yang Wang, Kathryn Mazaitis, and William W Cohen. Proppr: Efficient first-order probabilistic logic programming for structure discovery, parameter learning, and scalable inference. In Workshops at AAAI, 2014a.
- William Yang Wang, Kathryn Mazaitis, and William W Cohen. Structure learning via parameter learning. In CIKM, 2014b.
- Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen. Knowledge graph embedding by translating on hyperplanes. In AAAI, 2014c.
- Ronald J Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine learning, 1992.
- Wenhan Xiong, Thien Hoang, and William Yang Wang. Deeppath: A reinforcement learning method for knowledge graph reasoning. In EMNLP, 2017.
- Bishan Yang, Wen-tau Yih, Xiaodong He, Jianfeng Gao, and Li Deng. Embedding entities and relations for learning and inference in knowledge bases. In ICLR, 2015.
- Fan Yang, Zhilin Yang, and William W Cohen. Differentiable learning of logical rules for knowledge base reasoning. In NeurIPS, 2017.
- Yuan Yang and Le Song. Learn to explain efficiently via neural logic inductive learning. In ICLR, 2020.
- Yuyu Zhang, Xinshi Chen, Yuan Yang, Arun Ramamurthy, Bo Li, Yuan Qi, and Le Song. Efficient probabilistic logic reasoning with graph neural networks. In ICLR, 2020.
- Thus, it only remains to prove Lemma 1 to complete the proof. We use Theorem 1 from (Simic, 2008) as a starting point. Theorem 1: Suppose that x = {x_i} (i = 1..n) represents a finite sequence of real numbers belonging to a fixed closed interval I = [a, b], a < b. If f is a convex function on I, then we have that: (1/n) Σ_i f(x_i) − f((1/n) Σ_i x_i) ≤ f(a) + f(b) − 2f((a + b)/2).
- In practice, we observe that the hard-assignment EM algorithm (Koller & Friedman, 2009) works better than the standard EM algorithm despite the reduced theoretical guarantees. In the hard-assignment EM algorithm, we need to draw a sample zI with the maximum posterior probability. Based on the above approximation q(zI) of the true posterior distribution pθ,w(zI |G, q, a), we could simply construct such a sample zI with K rules which have the maximum probability under the distribution qr. By definition, we have qr(rule) ∝ exp(H(rule)), and hence drawing K rules with maximum probability under qr is equivalent to choosing K rules with the maximum H values.
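The hard-assignment E-step above reduces to a top-K selection: since q_r(rule) ∝ exp(H(rule)) and exp is monotonic, the K most probable rules are exactly the K rules with the largest H values. A minimal sketch (the dictionary of H values is a hypothetical toy):

```python
def hard_em_select(rules_H, K):
    """Select the K rules with the largest H values, which equals
    selecting the K most probable rules under q_r(rule) ∝ exp(H(rule))."""
    return sorted(rules_H, key=rules_H.get, reverse=True)[:K]

top2 = hard_em_select({"r1": 0.2, "r2": 1.5, "r3": -0.3}, K=2)
```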
- For the scalar score φ_w(path) of a path, we either fix it to 1, or compute it by introducing entity and relation embeddings. In the second case, we introduce an embedding for each entity and relation in the complex space. Formally, the embedding of an entity e is denoted as x_e, and the embedding of a relation r is denoted as x_r. For a grounding path path = e_0 →r_1 e_1 →r_2 e_2 ··· →r_l e_l, we follow the idea in RotatE (Sun et al., 2019) and compute φ_w(path) in the following way: φ_w(path) = σ(δ − d(x_{e_0} ∘ x_{r_1} ∘ x_{r_2} ∘ ··· ∘ x_{r_l}, x_{e_l})), (17)
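A toy numeric sketch of the RotatE-style path score above: relation embeddings are unit-modulus complex numbers (rotations), composed along the path by element-wise product. The embedding values, dimension (1), margin δ = 6, and L1 distance are assumptions for illustration:

```python
import numpy as np

def path_score(ent_emb, rel_emb, path, delta=6.0):
    """Rotate the head entity embedding by each relation along the path
    (Hadamard product in complex space, as in RotatE), then score with
    sigma(delta - distance to the tail embedding)."""
    head, rels, tail = path[0], path[1:-1], path[-1]
    x = ent_emb[head]
    for r in rels:
        x = x * rel_emb[r]                      # composition of rotations
    d = np.sum(np.abs(x - ent_emb[tail]))       # L1 distance in complex space
    return 1.0 / (1.0 + np.exp(-(delta - d)))   # sigmoid

ents = {"a": np.array([1 + 0j]), "b": np.array([0 + 1j])}
rels = {"r": np.array([0 + 1j])}                # a 90-degree rotation
score = path_score(ents, rels, ["a", "r", "b"])  # rotated "a" lands on "b"
```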
- This paper focuses on compositional rules, which have the abbreviated form r ← r_1 ∧ ··· ∧ r_l and thus could be viewed as a sequence of relations [r, r_1, r_2, ..., r_l, r_END], where r is the query relation or the head of the rule, {r_i} (i = 1..l) are the body of the rule, and r_END is a special relation indicating the end of the relation sequence. We introduce a rule generator RNN_θ parameterized with an LSTM (Hochreiter & Schmidhuber, 1997) to model such sequences. Given the current relation sequence [r, r_1, r_2, ..., r_i], RNN_θ aims to generate the next relation r_{i+1} and meanwhile output the probability of r_{i+1}. The detailed computational process towards the goal is summarized as follows:
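The autoregressive factorization of a rule's probability described above can be sketched without the LSTM itself. Here a plain lookup table stands in for RNN_θ, mapping a sequence prefix to a next-relation distribution; all relation names and probabilities are made up:

```python
import math

def sequence_log_prob(next_rel_probs, rule):
    """Score a rule r <- r1 ∧ ... ∧ rl as the relation sequence
    [r, r1, ..., rl, END], summing log-probabilities of each next
    relation given the prefix (the autoregressive factorization)."""
    seq = list(rule) + ["END"]
    logp = 0.0
    for i in range(1, len(seq)):
        prefix, nxt = tuple(seq[:i]), seq[i]
        logp += math.log(next_rel_probs[prefix][nxt])
    return logp

# Toy stand-in for the generator: hobby <- friend ∧ hobby.
toy = {
    ("hobby",): {"friend": 0.5},
    ("hobby", "friend"): {"hobby": 0.4},
    ("hobby", "friend", "hobby"): {"END": 1.0},
}
lp = sequence_log_prob(toy, ["hobby", "friend", "hobby"])
```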
