Translating Embeddings for Modeling Multi-relational Data.

NIPS, pp. 2787–2795, 2013.

Cited by 3659 | EI
Abstract

We consider the problem of embedding entities and relationships of multi-relational data in low-dimensional vector spaces. Our objective is to propose a canonical model which is easy to train, contains a reduced number of parameters and can scale up to very large databases. Hence, we propose TransE, a method which models relationships by interpreting them as translations operating on the low-dimensional embeddings of the entities. Despite its simplicity, this assumption proves to be powerful since extensive experiments show that TransE significantly outperforms state-of-the-art methods in link prediction on two knowledge bases. Besides, it can be successfully trained on a large scale data set with 1M entities, 25k relationships and more than 17M training samples.
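To make the translation idea concrete, the snippet below scores a triplet (h, ℓ, t) by the dissimilarity d(h + ℓ, t). This is a minimal NumPy sketch based on the model described in the abstract, not the authors' code; the function name, the choice of the L2 norm and the toy dimensions are illustrative (the paper uses either the L1 or the L2 distance).

```python
import numpy as np

def transe_energy(h, l, t):
    """Energy of a triplet: d(h + l, t), here with the L2 distance.

    TransE embeds entities and relationships in the same space and
    treats a relationship as a translation: if (h, l, t) holds, the
    translated head h + l should land close to the tail t, so a low
    energy indicates a plausible triplet.
    """
    return np.linalg.norm(h + l - t)

# Toy check: a tail built to satisfy t ~ h + l scores far lower
# than a random one.
rng = np.random.default_rng(0)
k = 50  # embedding dimension, "k" in Tables 1-2
h, l = rng.normal(size=k), rng.normal(size=k)
t_true = h + l + 0.01 * rng.normal(size=k)
t_rand = rng.normal(size=k)
print(transe_energy(h, l, t_true))  # small
print(transe_energy(h, l, t_rand))  # large
```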

Introduction
  • Multi-relational data refers to directed graphs whose nodes correspond to entities and whose edges have the form (head, label, tail) (denoted (h, ℓ, t)), each of which indicates that there exists a relationship of name label between the entities head and tail.
  • The authors' work focuses on modeling multi-relational data from KBs (Wordnet [9] and Freebase [1] in this paper), with the goal of providing an efficient tool to complete them by automatically adding new facts, without requiring extra knowledge.
  • The notion of locality for a single relationship may be purely structural, such as "the friend of my friend is my friend" in social networks.
Highlights
  • We proposed a new approach to learn embeddings of knowledge bases, focusing on the minimal parametrization of the model to primarily represent hierarchical relationships (see the back-of-the-envelope parameter count after this list).
  • We showed that it works very well compared to competing methods on two different knowledge bases, and is a highly scalable model, which we applied to a very large-scale chunk of Freebase data.
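Because TransE stores just one k-dimensional vector per entity and one per relationship, the "minimal parametrization" claim is easy to check. Below is a back-of-the-envelope in Python; the FB15k statistics (about 14.9k entities, 1.3k relationships) and k = 50 are assumptions for illustration, matching the order of magnitude reported in Table 1.

```python
# Assumed FB15k statistics and embedding dimension (illustrative).
n_e, n_r, k = 14_951, 1_345, 50

# TransE learns one k-vector per entity and one per relationship.
transe_params = (n_e + n_r) * k
print(f"{transe_params / 1e6:.2f}M parameters")  # ~0.81M
```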
Methods
  • The methods compared are Unstructured [2], RESCAL [11], SE [3], SME(LINEAR) [2], SME(BILINEAR) [2], LFM [6] and TransE; their parameter counts are listed in Table 1.
  • Statistics of the WN and FB15k data sets (numbers of entities and relationships, and train/valid/test splits) are given in Table 2. A simplified sketch of TransE's training step follows.
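The paper trains TransE by stochastic gradient descent on a margin-based ranking criterion, Σ [γ + d(h + ℓ, t) − d(h' + ℓ, t')]₊, where the corrupted triplet (h', ℓ, t') replaces the head or the tail by a random entity. Here is a minimal sketch under stated assumptions: the function name, the squared-L2 dissimilarity and the hyper-parameter defaults are mine, not the authors' implementation.

```python
import numpy as np

def sgd_step(ent, rel, batch, gamma=1.0, lr=0.01, rng=None):
    """One simplified SGD step on the margin-based ranking criterion
    [gamma + d(h + l, t) - d(h' + l, t')]_+ with squared-L2 d.
    `ent` (n_entities x k) and `rel` (n_relations x k) are embedding
    matrices updated in place; `batch` holds (h, l, t) index triplets.
    """
    rng = rng or np.random.default_rng()
    n_ent = ent.shape[0]
    for h, l, t in batch:
        # Corrupt the triplet: replace head or tail by a random entity.
        if rng.random() < 0.5:
            hc, tc = rng.integers(n_ent), t
        else:
            hc, tc = h, rng.integers(n_ent)
        pos = ent[h] + rel[l] - ent[t]    # residual of the true triplet
        neg = ent[hc] + rel[l] - ent[tc]  # residual of the corrupted one
        if gamma + pos @ pos - neg @ neg > 0:  # margin violated: update
            ent[h] -= lr * 2 * pos
            ent[t] += lr * 2 * pos
            rel[l] -= lr * 2 * (pos - neg)
            ent[hc] += lr * 2 * neg
            ent[tc] -= lr * 2 * neg
    # The paper keeps entity embeddings on the unit L2 sphere.
    ent /= np.linalg.norm(ent, axis=1, keepdims=True)
```

In the paper, embeddings are initialized following the procedure of [4] and entity embeddings are renormalized before each mini-batch; both details are compressed here into the final normalization line.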
Results
  • Results for Unstructured, SE, SME(LINEAR), SME(BILINEAR) and TransE are presented in Figure 1.
  • The performance of Unstructured is the best when no example of the unknown relationship is provided, because it does not use this information to predict.
  • However, its performance does not improve as labeled examples are provided.
  • TransE is the fastest method to learn: with only 10 examples of a new relationship, its hits@10 is already 18%, and it improves monotonically with the number of provided samples (the ranking protocol behind hits@10 is sketched after this list).
  • The authors believe the simplicity of the TransE model makes it able to generalize well, without having to modify any of the already trained embeddings.
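For reference, the hits@10 figures above come from a ranking protocol: for each test triplet, every entity is tried in place of the tail (head prediction is symmetric), candidates are sorted by energy, and one checks whether the true entity ranks in the top 10. The sketch below shows the filtered variant used in Tables 3 and 4, which drops other true triplets from the ranking; the function and argument names are illustrative assumptions.

```python
import numpy as np

def filtered_hits_at_10(ent, rel, test, all_true):
    """Filtered hits@10 for tail prediction.

    For each test triplet (h, l, t), rank all entities e by the energy
    d(h + l, e); entities forming another true triplet (h, l, e) are
    skipped (the 'filtered' setting), and we count how often the true
    tail t lands among the 10 best remaining candidates.
    `all_true` is the set of every (h, l, t) in train/valid/test.
    """
    hits = 0
    for h, l, t in test:
        scores = np.linalg.norm(ent[h] + rel[l] - ent, axis=1)
        rank = 0  # filtered candidates scoring better than t
        for e in np.argsort(scores):
            if e == t:
                break
            if (h, l, e) not in all_true:
                rank += 1
        hits += rank < 10
    return hits / len(test)
```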
Conclusion
  • The authors proposed a new approach to learn embeddings of KBs, focusing on the minimal parametrization of the model to primarily represent hierarchical relationships.
  • It remains unclear to the authors whether all relationship types can be modeled adequately by the approach; breaking down the evaluation into categories (1-to-1, 1-to-Many, Many-to-1, Many-to-Many) in Table 4 probes this question.
  • Combining KBs with text as in [2] is another important direction where the approach could prove useful.
  • The authors recently fruitfully inserted TransE into a framework for relation extraction from text [16].
Tables
  • Table 1: Numbers of parameters and their values for FB15k (in millions). n_e and n_r are the numbers of entities and relationships; k is the embedding dimension.
  • Table 2: Statistics of the data sets used in this paper and extracted from the two knowledge bases, Wordnet and Freebase.
  • Table 3: Link prediction results. Test performance of the different methods.
  • Table 4: Detailed results by category of relationship. We compare hits@10 (in %) on FB15k in the filtered evaluation setting for our model, TransE, and the baselines. (M. stands for MANY.)
  • Table 5: Example predictions on the FB15k test set using TransE. Bold indicates the test triplet's true tail and italics other true tails present in the training set.
Related Work
  • Section 1 described a large body of work on embedding KBs. We detail here the links between our model and those of [3] (Structured Embeddings or SE) and [14].
Funding
  • This work was carried out in the framework of the Labex MS2T (ANR-11-IDEX-0004-02), and funded by the French National Agency for Research (EVEREST-12-JS02-005-01).
References
  • [1] K. Bollacker, C. Evans, P. Paritosh, T. Sturge, and J. Taylor. Freebase: a collaboratively created graph database for structuring human knowledge. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, 2008.
  • [2] A. Bordes, X. Glorot, J. Weston, and Y. Bengio. A semantic matching energy function for learning with multi-relational data. Machine Learning, 2013.
  • [3] A. Bordes, J. Weston, R. Collobert, and Y. Bengio. Learning structured embeddings of knowledge bases. In Proceedings of the 25th Annual Conference on Artificial Intelligence (AAAI), 2011.
  • [4] X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2010.
  • [5] R. A. Harshman and M. E. Lundy. PARAFAC: parallel factor analysis. Computational Statistics & Data Analysis, 18(1):39–72, Aug. 1994.
  • [6] R. Jenatton, N. Le Roux, A. Bordes, G. Obozinski, et al. A latent factor model for highly multi-relational data. In Advances in Neural Information Processing Systems (NIPS 25), 2012.
  • [7] C. Kemp, J. B. Tenenbaum, T. L. Griffiths, T. Yamada, and N. Ueda. Learning systems of concepts with an infinite relational model. In Proceedings of the 21st Annual Conference on Artificial Intelligence (AAAI), 2006.
  • [8] T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems (NIPS 26), 2013.
  • [9] G. Miller. WordNet: a lexical database for English. Communications of the ACM, 38(11):39–41, 1995.
  • [10] K. Miller, T. Griffiths, and M. Jordan. Nonparametric latent feature models for link prediction. In Advances in Neural Information Processing Systems (NIPS 22), 2009.
  • [11] M. Nickel, V. Tresp, and H.-P. Kriegel. A three-way model for collective learning on multi-relational data. In Proceedings of the 28th International Conference on Machine Learning (ICML), 2011.
  • [12] M. Nickel, V. Tresp, and H.-P. Kriegel. Factorizing YAGO: scalable machine learning for linked data. In Proceedings of the 21st International Conference on World Wide Web (WWW), 2012.
  • [13] A. P. Singh and G. J. Gordon. Relational learning via collective matrix factorization. In Proceedings of the 14th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2008.
  • [14] R. Socher, D. Chen, C. D. Manning, and A. Y. Ng. Learning new facts from knowledge bases with neural tensor networks and semantic word vectors. In Advances in Neural Information Processing Systems (NIPS 26), 2013.
  • [15] I. Sutskever, R. Salakhutdinov, and J. Tenenbaum. Modelling relational data using Bayesian clustered tensor factorization. In Advances in Neural Information Processing Systems (NIPS 22), 2009.
  • [16] J. Weston, A. Bordes, O. Yakhnenko, and N. Usunier. Connecting language and knowledge bases with embedding models for relation extraction. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2013.
  • [17] J. Zhu. Max-margin nonparametric latent feature models for link prediction. In Proceedings of the 29th International Conference on Machine Learning (ICML), 2012.