End-to-end Structure-Aware Convolutional Networks for Knowledge Base Completion

AAAI 2019. arXiv: abs/1811.04441.


Abstract:

Knowledge graph embedding has been an active research topic for knowledge base completion, with progressive improvement from the initial TransE, TransH, DistMult et al to the current state-of-the-art ConvE. ConvE uses 2D convolution over embeddings and multiple layers of nonlinear features to model knowledge graphs. The model can be efficiently trained and is scalable to large knowledge graphs. …

Code:

  https://github.com/JD-AI-Research-Silicon-Valley/SACN

Introduction
  • Large-scale knowledge bases (KBs), such as Freebase (Bollacker et al 2008), DBpedia (Auer et al 2007), NELL (Carlson et al 2010) and YAGO3 (Mahdisoltani, Biega, and Suchanek 2013), have been built to store structured information about common facts.
  • The relationships are organized in the form of (s, r, o) triplets (see the toy example after this list).
  • These KBs are extensively used for web search, recommendation and question answering.
  • Although these KBs already contain millions of entities and triplets, they are far from complete compared to the existing facts and newly added knowledge of the real world.
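
As a toy illustration of the (s, r, o) format, a knowledge base can be viewed as a plain list of such triples; the facts below are invented for illustration and are not entries from the cited KBs.

```python
# A knowledge base as a list of (subject, relation, object) triples.
# These example facts are invented, not taken from Freebase/DBpedia/NELL/YAGO3.
triples = [
    ("Barack_Obama", "born_in", "Honolulu"),
    ("Honolulu", "located_in", "Hawaii"),
    ("Barack_Obama", "profession", "politician"),
]
```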
Highlights
  • Over the recent years, large-scale knowledge bases (KBs), such as Freebase (Bollacker et al 2008), DBpedia (Auer et al 2007), NELL (Carlson et al 2010) and YAGO3 (Mahdisoltani, Biega, and Suchanek 2013), have been built to store structured information about common facts
  • We propose an end-to-end Structure-Aware Convolutional Network (SACN) that takes the benefits of graph convolutional networks and ConvE together
  • SACN consists of an encoder, a weighted graph convolutional network (WGCN), and a decoder, a convolutional network called Conv-TransE
  • We have introduced an end-to-end structure-aware convolutional network (SACN)
  • The weighted graph convolutional network with learnable weights has the benefit of collecting an adaptive amount of information from neighboring graph nodes (see the sketch after this list)
  • Conv-TransE uses a convolutional network to model the relation as a translation operation and to capture the translational characteristic between entities and relations
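
A minimal sketch of one WGCN layer, assuming the adaptive aggregation described above: a shared linear projection plus one learnable scalar weight per relation type, so each node collects an adaptive amount of information from its neighbors under each relation. Class and variable names are illustrative, not the authors' released code.

```python
import torch
import torch.nn as nn

class WGCNLayer(nn.Module):
    """Sketch of one weighted-GCN layer: h' = tanh(W h + sum_t alpha_t * A_t W h)."""

    def __init__(self, in_dim, out_dim, num_relations):
        super().__init__()
        self.w = nn.Linear(in_dim, out_dim, bias=False)        # shared projection
        self.alpha = nn.Parameter(torch.ones(num_relations))   # learnable weight per relation type

    def forward(self, h, adj_per_relation):
        # h: (num_nodes, in_dim); adj_per_relation: one sparse (N, N) adjacency per relation type.
        proj = self.w(h)
        out = proj                                             # self-connection term
        for t, adj in enumerate(adj_per_relation):
            # alpha[t] scales how much information flows along relation type t.
            out = out + self.alpha[t] * torch.sparse.mm(adj, proj)
        return torch.tanh(out)
```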
Methods
  • The authors describe the proposed end-to-end SACN. The encoder WGCN focuses on representing entities by aggregating the connected entities specified by the relations in the KB.
  • With node embeddings as the input, the decoder Conv-TransE network aims to represent the relations more accurately by recovering the original triplets in the KB.
  • Both the encoder and the decoder are trained jointly by minimizing the discrepancy between the embeddings es + er and eo, so as to preserve the translational property es + er ≈ eo.
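
A hedged sketch of the Conv-TransE decoder consistent with this description: es and er are stacked as a 2 × d input and convolved with kernels that span both rows, so the embedding dimensions stay aligned (no 2D reshaping) and the translational property es + er ≈ eo is preserved. The class, shapes, and 1-N scoring layout are our assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class ConvTransE(nn.Module):
    """Sketch of a Conv-TransE-style decoder (assumed shapes, odd kernel size)."""

    def __init__(self, num_entities, dim, channels=32, kernel=3):
        super().__init__()
        # The 2 input channels are the subject row and the relation row.
        self.conv = nn.Conv1d(2, channels, kernel, padding=kernel // 2)
        self.fc = nn.Linear(channels * dim, dim)
        self.entity = nn.Embedding(num_entities, dim)

    def forward(self, e_s, e_r):
        # e_s, e_r: (batch, dim) -> scores over all candidate objects (1-N scoring).
        x = torch.stack([e_s, e_r], dim=1)        # (batch, 2, dim), no 2D reshaping
        x = torch.relu(self.conv(x))              # (batch, channels, dim)
        x = self.fc(x.flatten(1))                 # project back to (batch, dim)
        return torch.sigmoid(x @ self.entity.weight.t())
```

Training this with a cross-entropy-style loss over the 1-N scores mirrors the ConvE setup from which the decoder is taken.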
Results
  • Evaluation Protocol: The authors' experiments use the proportion of correct entities ranked in the top 1, 3 and 10 (Hits@1, Hits@3, Hits@10) and the mean reciprocal rank (MRR) as the evaluation metrics (a sketch of the protocol follows this list).
  • Since a corrupted triple may itself be a valid triple in the knowledge graph, the authors use the filtered setting (Bordes et al 2013), i.e. they filter out all other valid triples before ranking.
  • Link Prediction: The authors' results on the standard FB15k-237, WN18RR and FB15k-237-Attr datasets are shown in Table 3.
  • Table 3 reports the Hits@10, Hits@3, Hits@1 and MRR results of four baseline models and the two proposed models on the three knowledge graph datasets.
  • The authors also run SACN on FB15k-237-Attr to compare it with SACN trained on FB15k-237
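
A small sketch of the filtered protocol for a single (s, r, ?) query; filtered_metrics is a hypothetical helper and assumes 1-N scores over all entities.

```python
import numpy as np

def filtered_metrics(scores, target, known_objects):
    """scores: (num_entities,) model scores for one (s, r, ?) query;
    target: the gold object id; known_objects: every valid object id for
    (s, r) across train/valid/test (the "filter" of Bordes et al 2013)."""
    scores = scores.copy()
    others = np.array([o for o in known_objects if o != target], dtype=int)
    scores[others] = -np.inf                      # filter out the other valid triples
    rank = 1 + int(np.sum(scores > scores[target]))
    return {"MRR": 1.0 / rank, **{f"Hits@{k}": float(rank <= k) for k in (1, 3, 10)}}
```

Averaging these values over all test triples (corrupting both subjects and objects) yields numbers comparable to those reported in Table 3.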
Conclusion
  • The authors have introduced an end-to-end structure-aware convolutional network (SACN).
  • The encoding network is a weighted graph convolutional network, utilizing knowledge graph connectivity structure, node attributes and relation types.
  • Entity attributes are added as nodes in the network so that attributes are transformed into structural knowledge, which is integrated into the node embeddings (see the example after this list).
  • The authors show that Conv-TransE alone already achieves state-of-the-art performance.
  • SACN achieves an overall improvement of about 10% over the state of the art, such as ConvE
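
As an illustration of this construction, an attribute value is simply added as an extra node, so attribute triples become ordinary edges that the WGCN aggregates like any other relation; the facts below are invented for illustration.

```python
# Relational triples and attribute triples merged into one multi-relational graph.
relational_triples = [
    ("marlon_brando", "profession", "actor"),     # links two entity nodes
]
attribute_triples = [
    ("marlon_brando", "has_gender", "male"),      # "male" is added as a new graph node
]
graph_edges = relational_triples + attribute_triples
nodes = {n for s, _, o in graph_edges for n in (s, o)}  # entities and attribute values alike
```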
Tables
  • Table 1: Scoring functions ψ(es, eo). Here ēs and ēr denote 2D reshapings of es and er
  • Table 2: Statistics of the datasets
  • Table 3: Link prediction results for the FB15k-237, WN18RR and FB15k-237-Attr datasets
  • Table 4: Kernel size analysis for the FB15k-237 and FB15k-237-Attr datasets. “SACN+Attr” denotes SACN trained on the FB15k-237-Attr dataset
  • Table 5: Node indegree study on the FB15k-237 dataset
Related work
  • Knowledge graph embedding learning has been an active research area with applications directly in knowledge base completion (i.e. link prediction) and relation extraction. TransE (Bordes et al 2013) started this line of work by projecting both entities and relations into the same embedding vector space, under the translational constraint es + er ≈ eo. Later works such as TransH (Wang et al 2014), TransR (Lin et al 2015), and TransD (Ji et al 2015) enhanced the KG embedding model by introducing new representations of relational translation, at the cost of increased model complexity. These models are categorized as translational distance models (Wang et al 2017) or additive models, while DistMult (Yang et al 2014) and ComplEx (Trouillon et al 2016) are multiplicative models (Sharma, Talukdar, and others 2018), owing to the multiplicative score functions used for computing the likelihood of entity-relation-entity triplets (see the sketch at the end of this section).

    The most recent KG embedding models are ConvE (Dettmers et al 2017) and ConvKB (Nguyen et al 2017). ConvE was the first model to use 2D convolutions over embeddings across different embedding dimensions, with the hope of extracting more feature interactions. ConvKB replaced the 2D convolutions of ConvE with 1D convolutions, which constrain each convolution to operate on the same embedding dimension across the triple and thus keep the translational property of TransE. ConvKB can be considered a special case of Conv-TransE that only uses filters of width 1. Although ConvKB was shown to be better than ConvE, its results on the two datasets (FB15k-237 and WN18RR) were not consistent, so we leave these results out of our comparison table. The other major difference between ConvE and ConvKB lies in the loss functions: ConvE used a cross-entropy loss that can be sped up with 1-N scoring in the decoder, while ConvKB used a hinge loss computed from positive examples and sampled negative examples. We take the decoder from ConvE because we can easily integrate the GCN encoder and the ConvE decoder into an end-to-end training framework, whereas ConvKB is not suitable for this approach.
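
To make the additive/multiplicative distinction concrete, here is a minimal sketch of the two score-function families; the function names are ours, and the formulas follow the standard TransE and DistMult definitions.

```python
import numpy as np

def transe_score(e_s, e_r, e_o):
    # Additive (translational distance): a triple is plausible
    # when e_s + e_r ≈ e_o, i.e. the distance is small.
    return -np.linalg.norm(e_s + e_r - e_o, ord=1)

def distmult_score(e_s, e_r, e_o):
    # Multiplicative: trilinear product <e_s, e_r, e_o>.
    return float(np.sum(e_s * e_r * e_o))
```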
Funding
  • This work was partially supported by NSF grants CCF-1514357 and IIS-1718738, as well as NIH grants R01DA037349 and K02DA043063 to Jinbo Bi
Reference
  • Auer, S.; Bizer, C.; Kobilarov, G.; Lehmann, J.; Cyganiak, R.; and Ives, Z. 2007. DBpedia: A nucleus for a web of open data. In The Semantic Web. Springer. 722–735.
  • Bollacker, K.; Evans, C.; Paritosh, P.; Sturge, T.; and Taylor, J. 2008. Freebase: a collaboratively created graph database for structuring human knowledge. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, 1247–1250. ACM.
  • Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; and Yakhnenko, O. 2013. Translating embeddings for modeling multi-relational data. In Advances in Neural Information Processing Systems, 2787–2795.
  • Bruna, J.; Zaremba, W.; Szlam, A.; and LeCun, Y. 2013. Spectral networks and locally connected networks on graphs. arXiv preprint arXiv:1312.6203.
  • Carlson, A.; Betteridge, J.; Kisiel, B.; Settles, B.; Hruschka Jr, E. R.; and Mitchell, T. M. 2010. Toward an architecture for never-ending language learning. In AAAI, volume 5, 3. Atlanta.
  • Defferrard, M.; Bresson, X.; and Vandergheynst, P. 2016. Convolutional neural networks on graphs with fast localized spectral filtering. In Advances in Neural Information Processing Systems, 3844–3852.
  • Dettmers, T.; Minervini, P.; Stenetorp, P.; and Riedel, S. 2017. Convolutional 2D knowledge graph embeddings. arXiv preprint arXiv:1707.01476.
  • Hamilton, W.; Ying, Z.; and Leskovec, J. 2017a. Inductive representation learning on large graphs. In Advances in Neural Information Processing Systems, 1025–1035.
  • Hamilton, W. L.; Ying, R.; and Leskovec, J. 2017b. Representation learning on graphs: Methods and applications. arXiv preprint arXiv:1709.05584.
  • Henaff, M.; Bruna, J.; and LeCun, Y. 2015. Deep convolutional networks on graph-structured data. arXiv preprint arXiv:1506.05163.
  • Ji, G.; He, S.; Xu, L.; Liu, K.; and Zhao, J. 2015. Knowledge graph embedding via dynamic mapping matrix. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), volume 1, 687–696.
  • Kingma, D. P., and Ba, J. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
  • Kipf, T. N., and Welling, M. 2016a. Variational graph auto-encoders. In NIPS Bayesian Deep Learning Workshop.
  • Kipf, T. N., and Welling, M. 2016b. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907.
  • Lin, Y.; Liu, Z.; Sun, M.; Liu, Y.; and Zhu, X. 2015. Learning entity and relation embeddings for knowledge graph completion. In AAAI, volume 15, 2181–2187.
  • Lin, Y.; Liu, Z.; and Sun, M. 2016. Knowledge representation learning with entities, attributes and relations. In IJCAI.
  • Mahdisoltani, F.; Biega, J.; and Suchanek, F. M. 2013. YAGO3: A knowledge base from multilingual Wikipedias. In CIDR.
  • Nguyen, D. Q.; Sirts, K.; Qu, L.; and Johnson, M. 2016. STransE: a novel embedding model of entities and relationships in knowledge bases. arXiv preprint arXiv:1606.08140.
  • Nguyen, D. Q.; Nguyen, T. D.; Nguyen, D. Q.; and Phung, D. 2017. A novel embedding model for knowledge base completion based on convolutional neural network. arXiv preprint arXiv:1712.02121.
  • Nguyen, D. Q. 2017. An overview of embedding models of entities and relationships for knowledge base completion. arXiv preprint arXiv:1703.08098.
  • Pham, T.; Tran, T.; Phung, D.; and Venkatesh, S. 2017. Column networks for collective classification.
  • Schlichtkrull, M.; Kipf, T. N.; Bloem, P.; van den Berg, R.; Titov, I.; and Welling, M. 2018. Modeling relational data with graph convolutional networks. In European Semantic Web Conference, 593–607. Springer.
  • Shang, C.; Liu, Q.; Chen, K.-S.; Sun, J.; Lu, J.; Yi, J.; and Bi, J. 2018. Edge attention-based multi-relational graph convolutional networks. arXiv preprint arXiv:1802.04944.
  • Sharma, A.; Talukdar, P.; et al. 2018. Towards understanding the geometry of knowledge graph embeddings. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), volume 1, 122–131.
  • Toutanova, K., and Chen, D. 2015. Observed versus latent features for knowledge base and text inference. In Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality, 57–66.
  • Trouillon, T.; Welbl, J.; Riedel, S.; Gaussier, E.; and Bouchard, G. 2016. Complex embeddings for simple link prediction. In International Conference on Machine Learning, 2071–2080.
  • Wang, Z.; Zhang, J.; Feng, J.; and Chen, Z. 2014. Knowledge graph embedding by translating on hyperplanes. In AAAI, volume 14, 1112–1119.
  • Wang, Q.; Mao, Z.; Wang, B.; and Guo, L. 2017. Knowledge graph embedding: A survey of approaches and applications. IEEE Transactions on Knowledge and Data Engineering 29(12):2724–2743.
  • Yang, B.; Yih, W.-t.; He, X.; Gao, J.; and Deng, L. 2014. Embedding entities and relations for learning and inference in knowledge bases. arXiv preprint arXiv:1412.6575.
  • Ying, R.; He, R.; Chen, K.; Eksombatchai, P.; Hamilton, W. L.; and Leskovec, J. 2018. Graph convolutional neural networks for web-scale recommender systems. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 974–983.