# Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding

ACL, pp. 2713-2722, 2020.

Abstract:

Translational distance-based knowledge graph embedding has shown progressive improvements on the link prediction task, from TransE to the latest state-of-the-art RotatE. However, N-1, 1-N and N-N predictions still remain challenging. In this work, we propose a novel translational distance-based approach for knowledge graph link predicti...

Introduction

- A knowledge graph is a multi-relational graph whose nodes represent entities and whose edges denote relationships between entities.
- A large number of knowledge graphs, such as Freebase (Bollacker et al, 2008), DBpedia (Auer et al, 2007), NELL (Carlson et al, 2010) and YAGO3 (Mahdisoltani et al, 2013), have been built over the years and successfully applied to many domains such as recommendation and question answering (Bordes et al, 2014; Zhang et al, 2016).
- These knowledge graphs need to be updated with new facts periodically.
- Many knowledge graph embedding methods have been proposed for link prediction, which is used for knowledge graph completion.

Highlights

- A knowledge graph is a multi-relational graph whose nodes represent entities and whose edges denote relationships between entities
- Many knowledge graph embedding methods have been proposed for link prediction, which is used for knowledge graph completion
- We show that orthogonal transform embedding (OTE) together with graph context modeling performs consistently better than RotatE on the standard benchmark datasets FB15k-237 and WN18RR
- In this paper we propose a new distance-based knowledge graph embedding for link prediction
- Orthogonal transform embedding extends the modeling of RotatE from the 2D complex domain to a high-dimensional space with orthogonal relation transforms
- Graph context is proposed to integrate graph structure information into the distance scoring function, which measures the plausibility of triples during training and inference
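To make the idea of an orthogonal relation transform concrete, here is a minimal sketch in plain Python. It assumes that a relation is stored as an unconstrained square matrix and made orthogonal with Gram-Schmidt orthogonalization before being applied to the head embedding; this summary does not state how the authors enforce orthogonality, so that step, the function names, and the toy values are all illustrative assumptions rather than the paper's implementation.

```python
import math

def gram_schmidt(rows):
    """Orthonormalize the rows of a square matrix (assumed way to get an orthogonal transform)."""
    basis = []
    for v in rows:
        w = list(v)
        for b in basis:
            # Remove the component of w that lies along the already accepted direction b.
            proj = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm < 1e-12:
            raise ValueError("rows are linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def l2_norm(vector):
    return math.sqrt(sum(x * x for x in vector))

def orthogonal_distance(head, raw_relation, tail):
    """Distance between the orthogonally transformed head and the tail (lower = more plausible)."""
    transformed = matvec(gram_schmidt(raw_relation), head)
    return l2_norm([p - t for p, t in zip(transformed, tail)])

# Hypothetical 3-dimensional toy embeddings.
head = [1.0, 2.0, 2.0]
raw_relation = [[2.0, 0.0, 1.0], [1.0, 1.0, 0.0], [0.0, 1.0, 3.0]]
transformed_head = matvec(gram_schmidt(raw_relation), head)
```

Because the transform is orthogonal, `l2_norm(transformed_head)` equals `l2_norm(head)` up to rounding: the relation moves the entity without rescaling it, which is the property a 2D rotation in RotatE also has.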

Methods

4.1 Datasets

Two commonly used benchmark datasets (FB15k-237 and WN18RR) are employed in this study to evaluate the performance of link prediction.

- The FB15k-237 (Toutanova and Chen, 2015) dataset contains knowledge base relation triples and textual mentions of Freebase entity pairs.
- The knowledge base triples are a subset of FB15K (Bordes et al, 2013), originally derived from Freebase.
- WN18RR (Dettmers et al, 2018) is derived from WN18 (Bordes et al, 2013), which is a subset of WordNet. WN18 consists of 18 relations and 40,943 entities.
- WN18RR was created to ensure that the evaluation dataset does not have test leakage due to redundant inverse relations.
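The leakage mentioned above arises when one relation is simply the reverse of another, so a test triple can be answered by flipping a training triple. The sketch below shows one simple way such relation pairs could be detected; the function name, the 0.8 overlap threshold, and the toy triples are hypothetical and are not the procedure used to build WN18RR.

```python
from collections import defaultdict

def inverse_relation_pairs(triples, threshold=0.8):
    """Return relation pairs (r1, r2) where most (h, t) pairs of one appear reversed in the other."""
    pairs_by_relation = defaultdict(set)
    for head, relation, tail in triples:
        pairs_by_relation[relation].add((head, tail))

    found = []
    relations = sorted(pairs_by_relation)
    for r1 in relations:
        for r2 in relations:
            if r1 >= r2:
                continue  # visit each unordered pair of distinct relations once
            reversed_r2 = {(t, h) for h, t in pairs_by_relation[r2]}
            overlap = len(pairs_by_relation[r1] & reversed_r2)
            smaller = min(len(pairs_by_relation[r1]), len(reversed_r2))
            if overlap / smaller >= threshold:
                found.append((r1, r2))
    return found

# Hypothetical WordNet-style triples.
triples = [
    ("dog", "hypernym", "animal"),
    ("animal", "hyponym", "dog"),
    ("cat", "hypernym", "animal"),
    ("animal", "hyponym", "cat"),
    ("dog", "similar_to", "wolf"),
]
leaky_pairs = inverse_relation_pairs(triples)
```

In this toy data `hypernym` and `hyponym` are flagged as mutual inverses, while `similar_to` is not paired with anything.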

Results

- The authors first present the results of link prediction, followed by the ablation study and error analysis of the models.
- Table 2 compares the proposed models (OTE and the graph-context-based GC-OTE) to several state-of-the-art models: the translational-distance-based TransE (Bordes et al, 2013) and RotatE (Sun et al, 2019); the semantic-matching-based DistMult (Yang et al, 2014), ComplEx (Trouillon et al, 2016), ConvE (Dettmers et al, 2018), TuckER (Balazevic et al, 2019) and QuatE (Zhang et al, 2019); and the graph-context-based R-GCN+ (Schlichtkrull et al, 2017), SACN (Shang et al, 2019) and A2N (Bansal et al, 2019).

Conclusion

- In this paper the authors propose a new distance-based knowledge graph embedding for link prediction.
- The contribution is twofold.
- First, OTE extends the modeling of RotatE from the 2D complex domain to a high-dimensional space with orthogonal relation transforms.
- Second, graph context integrates graph structure information into the distance scoring function.
- Experimental results on the standard benchmarks FB15k-237 and WN18RR show that OTE improves consistently over RotatE, the state-of-the-art distance-based embedding model, especially on FB15k-237, which has many high in-degree nodes.
- On WN18RR the model achieves new state-of-the-art results.

- Table 1: Statistics of datasets. Only triples in the training set are used to compute graph context
- Table 2: Link prediction for FB15k-237 and WN18RR on test sets
- Table 3: Ablation study on FB15k-237 validation set
- Table 4: H@10 from FB15k-237 validation set by categories (1-to-N, N-to-1 and N-to-N)
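The relation categories and the H@10 metric named in the Table 4 caption can be sketched as follows. The 1.5 cutoff on the average number of heads per tail and tails per head is a convention commonly attributed to Bordes et al (2013); this summary does not say which rule the authors applied, so the threshold, function names, and example data below are assumptions.

```python
from collections import defaultdict

def relation_category(pairs, threshold=1.5):
    """Label a relation 1-to-1, 1-to-N, N-to-1 or N-to-N from its (head, tail) pairs."""
    tails_of = defaultdict(set)
    heads_of = defaultdict(set)
    for head, tail in pairs:
        tails_of[head].add(tail)
        heads_of[tail].add(head)
    tails_per_head = sum(len(s) for s in tails_of.values()) / len(tails_of)
    heads_per_tail = sum(len(s) for s in heads_of.values()) / len(heads_of)
    head_side = "N" if heads_per_tail >= threshold else "1"
    tail_side = "N" if tails_per_head >= threshold else "1"
    return f"{head_side}-to-{tail_side}"

def hits_at_k(ranks, k=10):
    """Fraction of test triples whose correct entity is ranked within the top k."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)

# Hypothetical data: many people map to few cities, so the relation is N-to-1.
born_in = [("alice", "paris"), ("bob", "paris"), ("carol", "rome")]
category = relation_category(born_in)

# Hypothetical ranks of the correct entity for four test triples.
h_at_10 = hits_at_k([1, 3, 12, 50], k=10)
```

Here `category` is `"N-to-1"` and `h_at_10` is `0.5`, since two of the four ranks fall within the top ten.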

Related work

2.1 Knowledge Graph Embedding

Knowledge graph embedding methods can be roughly categorized into two classes (Wang et al, 2017): distance-based models and semantic matching models. Distance-based models are also known as additive models, since they project head and tail entities into the same embedding space and use the distance between the two entity embeddings to measure the plausibility of a given triple.
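As a minimal illustration of a distance-based (additive) score, the sketch below uses a TransE-style translation: the relation vector is added to the head embedding and the result is compared with the tail embedding. The choice of the L2 norm, the function name, and the two-dimensional toy embeddings are assumptions made for the example.

```python
import math

def translation_distance(head, relation, tail):
    """L2 distance between (head + relation) and tail; smaller means more plausible."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

# Hypothetical 2-dimensional embeddings.
paris = [0.9, 0.1]
capital_of = [0.1, 0.8]
france = [1.0, 0.9]
germany = [0.2, -0.5]

plausible = translation_distance(paris, capital_of, france)
implausible = translation_distance(paris, capital_of, germany)
```

With these values `plausible` is close to zero while `implausible` is much larger, so the triple ending in `france` would be ranked above the one ending in `germany`.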

TransE (Bordes et al, 2013) is the first and most representative translational distance model. A series of works follows this line, such as TransH (Wang et al, 2014), TransR (Lin et al, 2015) and TransD (Ji et al, 2015). RotatE (Sun et al, 2019) further extends the computation into the complex domain and is currently the state of the art in this category. On the other hand, semantic matching models usually use multiplicative score functions to compute the plausibility of a given triple, such as DistMult (Yang et al, 2014), ComplEx (Trouillon et al, 2016), ConvE (Dettmers et al, 2018), TuckER (Balazevic et al, 2019) and QuatE (Zhang et al, 2019). ConvKB (Nguyen et al, 2017) and CapsE (Nguyen et al, 2019) further take the triple as a whole and feed the head, relation and tail embeddings into convolutional models or capsule networks.
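The contrast between the two families can be sketched with a RotatE-style distance and a DistMult-style multiplicative score. In this sketch each relation dimension of the rotation model is a phase angle, so it acts as a unit-modulus rotation of a complex head coordinate; summing the per-dimension moduli is an assumed choice of norm, and all names and numbers are illustrative only.

```python
import cmath
import math

def rotation_distance(head, phases, tail):
    """Rotate each complex head coordinate by its phase and sum the distances to the tail."""
    total = 0.0
    for h, phase, t in zip(head, phases, tail):
        rotated = h * cmath.exp(1j * phase)  # multiplying by e^{i*phase} keeps |h| unchanged
        total += abs(rotated - t)
    return total

def trilinear_score(head, relation, tail):
    """DistMult-style multiplicative score: sum of element-wise products (higher = more plausible)."""
    return sum(h * r * t for h, r, t in zip(head, relation, tail))

# Hypothetical complex embeddings: rotating 1 by 90 degrees gives i, rotating i by 180 degrees gives -i.
rotation_gap = rotation_distance([1 + 0j, 0 + 1j], [math.pi / 2, math.pi], [1j, -1j])

# Hypothetical real embeddings for the multiplicative model.
forward = trilinear_score([1.0, 2.0, 3.0], [1.0, 0.5, 2.0], [2.0, 2.0, 1.0])
backward = trilinear_score([2.0, 2.0, 1.0], [1.0, 0.5, 2.0], [1.0, 2.0, 3.0])
```

`rotation_gap` is zero up to rounding because the tail is exactly the rotated head. `forward` and `backward` are equal, which shows that this multiplicative form scores a triple and its reverse identically and therefore cannot distinguish asymmetric relations on its own.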

Funding

- This work is partially supported by Beijing Academy of Artificial Intelligence (BAAI)

Reference

- Soren Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, and Zachary Ives. 2007. DBpedia: A nucleus for a web of open data. In The semantic web.
- Ivana Balazevic, Carl Allen, and Timothy Hospedales. 2019. TuckER: Tensor factorization for knowledge graph completion. In EMNLP.
- Nitin Bansal, Xiaohan Chen, and Zhangyang Wang. 2018. Can we gain more from orthogonality regularizations in training deep networks? In NeurIPS.
- Trapit Bansal, Da-Cheng Juan, Shrividya Ravi, and Andrew McCallum. 2019. A2N: Attending to neighbors for knowledge graph inference. In ACL.
- Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, and Jamie Taylor. 2008. Freebase: a collaboratively created graph database for structuring human knowledge. In SIGMOD.
- Antoine Bordes, Sumit Chopra, and Jason Weston. 2014. Question answering with subgraph embeddings. In EMNLP.
- Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. 2013. Translating embeddings for modeling multi-relational data. In NeurIPS.
- Andrew Carlson, Justin Betteridge, Bryan Kisiel, Burr Settles, Estevam R Hruschka Jr, and Tom M Mitchell. 2010. Toward an architecture for never-ending language learning. In AAAI.
- Tim Dettmers, Pasquale Minervini, Pontus Stenetorp, and Sebastian Riedel. 2018. Convolutional 2D knowledge graph embeddings. In AAAI.
- Mehrtash Harandi and Basura Fernando. 2016. Generalized backpropagation, etude de cas: Orthogonality. ArXiv.
- Lei Huang, Xianglong Liu, Bo Lang, Adams Wei Yu, Yongliang Wang, and Bao Qin Li. 2017. Orthogonal weight normalization: Solution to optimization over multiple dependent stiefel manifolds in deep neural networks. In AAAI.
- Guoliang Ji, Shizhu He, Liheng Xu, Kang Liu, and Jun Zhao. 2015. Knowledge graph embedding via dynamic mapping matrix. In ACL.
- Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. In ICLR.
- Thomas N Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. In ICLR.
- Yankai Lin, Zhiyuan Liu, Maosong Sun, Yang Liu, and Xuan Zhu. 2015. Learning entity and relation embeddings for knowledge graph completion. In AAAI.
- Farzaneh Mahdisoltani, Joanna Biega, and Fabian M Suchanek. 2013. YAGO3: A knowledge base from multilingual wikipedias. In CIDR.
- Deepak Nathani, Jatin Chauhan, Charu Sharma, and Manohar Kaul. 2019. Learning attention-based embeddings for relation prediction in knowledge graphs. In ACL.
- Dai Quoc Nguyen, Tu Dinh Nguyen, Dat Quoc Nguyen, and Dinh Phung. 2017. A novel embedding model for knowledge base completion based on convolutional neural network. arXiv preprint arXiv:1712.02121.
- Dai Quoc Nguyen, Thanh Vu, Tu Dinh Nguyen, Dat Quoc Nguyen, and Dinh Phung. 2019. A capsule network-based embedding model for knowledge graph completion and search personalization. In NAACL.
- Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. 2017. Automatic differentiation in pytorch.
- Andrew M. Saxe, James L. McClelland, and Surya Ganguli. 2013. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. In ICLR.
- Michael Sejr Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, and Max Welling. 2017. Modeling relational data with graph convolutional networks. In ESWC.
- Chao Shang, Yun Tang, Jing Huang, Jinbo Bi, Xiaodong He, and Bowen Zhou. 2019. End-to-end structure-aware convolutional networks for knowledge base completion. In AAAI.
- Arkadii Slinko. 2000. A generalization of Komlos theorem on random matrices. Univ., Department of Mathematics, School of Mathematical and Information.
- Zhiqing Sun, Zhi-Hong Deng, Jing Nie, and Jian Tang. 2019. RotatE: Knowledge graph embedding by relational rotation in complex space. In ICLR.
- Kristina Toutanova and Danqi Chen. 2015. Observed versus latent features for knowledge base and text inference. In Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality.
- Theo Trouillon, Johannes Welbl, Sebastian Riedel, Eric Gaussier, and Guillaume Bouchard. 2016. Complex embeddings for simple link prediction. In ICML, pages 2071–2080.
- Petar Velickovic, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. 2017. Graph attention networks. In ICLR.
- Eugene Vorontsov, Chiheb Trabelsi, Samuel Kadoury, and Christopher Joseph Pal. 2017. On orthogonality and learning recurrent networks with long term dependencies. In ICML.
- Quan Wang, Zhendong Mao, Bin Wang, and Li Guo. 2017. Knowledge graph embedding: A survey of approaches and applications. TKDE, 29:2724–2743.
- Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen. 2014. Knowledge graph embedding by translating on hyperplanes. In AAAI.
- Bishan Yang, Wen-tau Yih, Xiaodong He, Jianfeng Gao, and Li Deng. 2014. Embedding entities and relations for learning and inference in knowledge bases. In ICLR.
- Fuzheng Zhang, Nicholas Jing Yuan, Defu Lian, Xing Xie, and Wei-Ying Ma. 2016. Collaborative knowledge base embedding for recommender systems. In KDD.
- Shuai Zhang, Yi Tay, Lina Yao, and Qi Liu. 2019. Quaternion knowledge graph embeddings. In NeurIPS.
