Temporal Knowledge Base Completion: New Algorithms and Evaluation Protocols

Prachi Jain
Sushant Rathi
Soumen Chakrabarti

EMNLP 2020.

Abstract:

Temporal knowledge bases associate relational (s,r,o) triples with a set of times (or a single time instant) when the relation is valid. While time-agnostic KB completion (KBC) has witnessed significant research, temporal KB completion (TKBC) is in its early days. In this paper, we consider predicting missing entities (link prediction) …

Introduction
  • A knowledge base (KB) is a collection of triples (s, r, o), with a subject s, a relation type r and an object o.
  • KBs are usually incomplete, necessitating completion (KBC) of triples not provided in the collection.
  • A KBC model is often evaluated by its performance on link prediction: supplying missing arguments to queries of the form (s, r, ?) and (?, r, o).
  • Many relations are transient or impermanent.
  • Temporal KBs annotate each fact with the time period in which it holds or occurs.
  • A person is born in a city at an instant, a politician can …
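A minimal sketch of how such interval-annotated facts and link-prediction queries could be represented. This is purely illustrative; the type names, entity names, and helper function are assumptions, not from the paper:

```python
from typing import NamedTuple, Optional, List


class TemporalFact(NamedTuple):
    """A KB fact (s, r, o) annotated with a validity interval [tb, te]."""
    s: str             # subject entity
    r: str             # relation type
    o: str             # object entity
    tb: Optional[int]  # begin year (None if unknown)
    te: Optional[int]  # end year (None if unknown)


# An instantaneous fact uses tb == te; an interval fact spans years.
facts = [
    TemporalFact("PersonA", "bornIn", "CityX", 1950, 1950),
    TemporalFact("PersonA", "memberOf", "PartyY", 1980, 1992),
]


def answer_object_query(kb: List[TemporalFact], s: str, r: str, t: int) -> List[str]:
    """Answer a link-prediction query (s, r, ?) at time t: return
    objects whose validity interval contains t."""
    return [f.o for f in kb
            if f.s == s and f.r == r
            and (f.tb is None or f.tb <= t)
            and (f.te is None or t <= f.te)]
```

A TKBC model replaces the exhaustive scan above with learned scoring over embeddings, ranking all candidate objects rather than testing exact matches.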
Highlights
  • A knowledge base (KB) is a collection of triples (s, r, o), with a subject s, a relation type r and an object o
  • We propose evaluation protocols for link and time prediction queries for temporal KB completion (TKBC)
  • Intense attention on KBC has only just begun to transfer to TKBC
  • We propose a new TKBC framework, TIMEPLEX, which combines representations of time with representations of entities and relations
  • TIMEPLEX exceeds the performance of all baseline and existing TKBC systems
  • Our experiments suggest that time embeddings are temporally meaningful, and the model makes fewer temporal consistency and ordering mistakes compared to previous models
Methods
  • Methods compared: CX, TA-DM, TA-CX, HyTE, TIMEPLEX(base), and TIMEPLEX.

    5.2 Hyperparameters and policies

    The authors implemented TIMEPLEX using PyTorch.
  • The authors trained HyTE and TA-DM with 400-dimensional real vectors for fair comparison.
  • Relation embeddings increase to 3× in TIMEPLEX, but this makes little overall difference compared to the much larger number of entity and time embeddings.
  • Some instances in the YAGO11k and WIKIDATA12k datasets may have the begin time tb or the end time te missing.
  • The authors replace missing values by −∞ or +∞ as appropriate for training and link prediction.
  • For evaluating models on time prediction, the authors filter out such instances from the test set.
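The missing-endpoint policy described above can be sketched as follows. This is a hedged illustration of the stated policy, with float infinities standing in for the paper's −∞/+∞ sentinels and all function names assumed:

```python
import math
from typing import Optional, Tuple, List


def fill_missing_endpoints(tb: Optional[int], te: Optional[int]) -> Tuple[float, float]:
    """For training and link prediction, replace a missing begin time
    with -inf and a missing end time with +inf, so every fact has a
    well-defined validity interval."""
    lo = -math.inf if tb is None else float(tb)
    hi = math.inf if te is None else float(te)
    return lo, hi


def time_prediction_test_set(instances: List[tuple]) -> List[tuple]:
    """For time-prediction evaluation, filter out instances whose
    gold interval has a missing endpoint, as the paper does."""
    return [(s, r, o, tb, te) for (s, r, o, tb, te) in instances
            if tb is not None and te is not None]
```

The infinite sentinels keep interval-containment checks well defined without guessing unknown endpoints, while the time-prediction filter avoids scoring predictions against incomplete gold intervals.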
Results
  • Results and Observations: Table 3 shows that TIMEPLEX dominates across all data sets. Surprisingly, the time-agnostic CX baseline performs comparably to or better than the time-aware baselines (TA-DM and HyTE).
  • The time spans in ICEWS data sets are too short to explore temporal difference features, but TIMEPLEX performs best even without them.
  • The performance of HyTE-CX dropped for all datasets, so the authors exclude it.
  • Time queries are limited to the datasets with nontrivial intervals and time spans; results are shown in Table 4.
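The aeIOU@1 figure reported for time prediction compares a predicted interval against the gold interval. A sketch of the affinity-enhanced IoU as we understand it from the paper, assuming year-granularity closed intervals: the overlap (floored at one time unit) divided by the smallest single interval containing both intervals:

```python
from typing import Tuple


def aeiou(gold: Tuple[int, int], pred: Tuple[int, int]) -> float:
    """Affinity-enhanced IoU between two closed year intervals
    (begin, end), both ends inclusive: max(overlap, 1 year) divided
    by the size of the smallest interval enclosing both."""
    gb, ge = gold
    pb, pe = pred
    overlap = max(0, min(ge, pe) - max(gb, pb) + 1)  # years in common
    hull = max(ge, pe) - min(gb, pb) + 1             # smallest enclosing span
    return max(overlap, 1) / hull
```

The one-unit floor rewards disjoint predictions that are at least close to the gold interval (a small hull) over ones that are far away, unlike plain IoU, which scores all disjoint predictions zero.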
Conclusion
  • Intense attention on KBC has only just begun to transfer to TKBC. The authors argue that current evaluation schemes for both link and time prediction have limitations, and propose more meaningful schemes.
  • Under these schemes, a time-ignorant KBC system (CX) can perform better than many recent TKBC systems (e.g., HyTE).
  • The authors propose a new TKBC framework, TIMEPLEX, which combines representations of time with representations of entities and relations.
  • TIMEPLEX also learns temporal consistency constraints, which allow other temporal facts in the KB to influence the validity of a given fact.
  • The authors' experiments suggest that time embeddings are temporally meaningful, and the model makes fewer temporal consistency and ordering mistakes compared to previous models
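As a rough illustration of combining representations of time with representations of entities and relations, the sketch below augments a ComplEx-style trilinear score with relation-specific subject-time and object-time terms. This is an assumption-level sketch, not the paper's exact scoring function; the mixing weights alpha/beta and all names are illustrative. It also hints at why relation parameters grow to roughly 3x: each relation carries three embeddings:

```python
from typing import List


def complex_trilinear(a: List[complex], b: List[complex], c: List[complex]) -> float:
    """Re(<a, b, conj(c)>): the ComplEx bilinear score for complex
    embedding vectors given as lists of Python complex numbers."""
    return sum((x * y * z.conjugate()).real for x, y, z in zip(a, b, c))


def timeplex_style_score(e_s, e_o, e_t, r_so, r_st, r_ot,
                         alpha: float = 1.0, beta: float = 1.0) -> float:
    """A time-agnostic (s, r, o) term plus relation-specific
    subject-time and object-time terms; r_so, r_st, r_ot are the
    three per-relation embeddings, alpha/beta illustrative weights."""
    return (complex_trilinear(e_s, r_so, e_o)
            + alpha * complex_trilinear(e_s, r_st, e_t)
            + beta * complex_trilinear(e_o, r_ot, e_t))
```

Pairing both the subject and the object with the time embedding lets facts be scored down when their arguments are implausible at the queried time, independently of the time-agnostic plausibility of (s, r, o).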
Summary
  • Objectives:

    The authors aim to capture three types of temporal constraints. The first, relation recurrence, reflects that many relations do not recur for a given entity.
Tables
  • Table1: Jean is the gold answer. Rows are ranked system predictions. Candidates were also seen in other folds with various intervals that may overlap with the query interval Tev. Columns 3–4 show the effective ranks ‘lost’ by Jean to earlier candidates, using earlier filtering methods. Columns 5–8 (Method 3) show the ranks lost for each time point. The bottom row shows ranks of Jean as computed by different methods. Time-insensitive filtering over-estimates system performance, while unfiltered evaluation under-estimates it. The final rank assigned by Method 3 to Jean is 3.25, the average of the per-instant filtered ranks {4, 4, 3, 2} over [2000, 2003].
  • Table2: Details of datasets used
  • Table3: Link prediction performance of time-agnostic and time aware models. The newly-proposed time-sensitive filtering scheme is used. TIMEPLEX(base) means TIMEPLEX without relation pair / recurrent relation features
  • Table4: Time prediction performance
  • Table5: TIMEPLEX performs better (MRR) on long duration relations as compared to short duration relations on both datasets. On Wikidata12k the model performs well on instant relations
  • Table6: Time constraint violations among top predictions of various models
  • Table7: High confidence (99%) relation orderings extracted from YAGO11k
  • Table8: High confidence (99%) relation orderings extracted from WIKIDATA12k
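The time-sensitive filtered rank illustrated in Table 1 (Method 3) can be sketched as follows: for each time instant in the query interval, candidates that are valid KB answers at that instant are filtered out before ranking the gold answer, and the per-instant ranks are averaged. Function and argument names are illustrative:

```python
from typing import Dict, Iterable, List, Set


def time_sensitive_rank(scored: List[str], gold: str,
                        valid_at: Dict[int, Set[str]],
                        interval: Iterable[int]) -> float:
    """scored: candidate entities ordered best-first by model score.
    gold: the gold answer entity.
    valid_at: time instant -> set of entities that are correct KB
    answers at that instant (other than via this test fact).
    Returns the gold answer's filtered rank averaged over the interval."""
    ranks = []
    for t in interval:
        known = valid_at.get(t, set())
        rank = 1
        for cand in scored:
            if cand == gold:
                break
            if cand not in known:  # only genuine non-answers push gold down
                rank += 1
        ranks.append(rank)
    return sum(ranks) / len(ranks)
```

Reproducing Table 1's example, with per-instant filtered ranks {4, 4, 3, 2} over [2000, 2003], yields the stated final rank of 3.25.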
Funding
  • This work is supported by IBM AI Horizons Network grant, an IBM SUR award, grants by Google, Bloomberg and 1MG, and a Visvesvaraya faculty award by Govt. of India
  • Soumen Chakrabarti is supported by grants from IBM and Amazon
Study subjects and analysis
standard TKBC datasets: 4
5.1 Data sets. We report on experiments with four standard TKBC datasets. WIKIDATA12k and YAGO11k (Dasgupta et al., 2018) are two knowledge graphs with a time interval associated with each fact triple.

Reference
  • Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. 2013. Translating embeddings for modeling multi-relational data. In NIPS, pages 2787–2795.
  • Shib Sankar Dasgupta, Swayambhu Nath Ray, and Partha Talukdar. 2018. HyTE: Hyperplane-based temporally aware knowledge graph embedding. In EMNLP, pages 2001–2011.
  • Alberto Garcia-Duran, Sebastijan Dumancic, and Mathias Niepert. 2018. Learning sequence encoders for temporal knowledge graph completion. In EMNLP.
  • Alberto Garcia-Duran and Mathias Niepert. 2018. KBLRN: End-to-end learning of knowledge base representations with latent, relational, and numerical features. In UAI.
  • Prachi Jain, Pankaj Kumar, Mausam, and Soumen Chakrabarti. 2018. Type-sensitive knowledge base inference without explicit type supervision. In ACL.
  • Guoliang Ji, Shizhu He, Liheng Xu, Kang Liu, and Jun Zhao. 2015. Knowledge graph embedding via dynamic mapping matrix. In ACL, pages 687–696.
  • Heng Ji, Ralph Grishman, Hoa Trang Dang, and Kira Griffitt. 2011. Overview of the TAC2011 knowledge base population (KBP) track. In Text Analysis Conference (TAC), pages 14–15.
  • Tingsong Jiang, Tianyu Liu, Tao Ge, Lei Sha, Baobao Chang, Sujian Li, and Zhifang Sui. 2016a. Towards time-aware knowledge graph completion. In COLING, pages 1715–1724.
  • Tingsong Jiang, Tianyu Liu, Tao Ge, Lei Sha, Sujian Li, Baobao Chang, and Zhifang Sui. 2016b. Encoding temporal information for time-aware link prediction. In EMNLP, pages 2350–2354.
  • Woojeong Jin, Changlin Zhang, Pedro A. Szekely, and Xiang Ren. 2019. Recurrent event network for reasoning over temporal knowledge graphs. CoRR, abs/1904.05530.
  • Daphne Koller and Nir Friedman. 2009. Probabilistic Graphical Models: Principles and Techniques. MIT Press.
  • Timothee Lacroix, Nicolas Usunier, and Guillaume Obozinski. 2018. Canonical tensor decomposition for knowledge base completion. In ICML, pages 2863–2872.
  • Maximilian Nickel, Lorenzo Rosasco, and Tomaso A. Poggio. 2016. Holographic embeddings of knowledge graphs. In AAAI, pages 1955–1961.
  • Hamid Rezatofighi, Nathan Tsoi, JunYoung Gwak, Amir Sadeghian, Ian Reid, and Silvio Savarese. 2019. Generalized intersection over union: A metric and a loss for bounding box regression. In CVPR, pages 658–666.
  • Mihai Surdeanu. 2013. Overview of the TAC2013 knowledge base population evaluation: English slot filling and temporal slot filling. In Text Analysis Conference (TAC).
  • Rakshit Trivedi, Hanjun Dai, Yichen Wang, and Le Song. 2017. Know-Evolve: Deep temporal reasoning for dynamic knowledge graphs. In ICML, pages 3462–3471.
  • Theo Trouillon, Johannes Welbl, Sebastian Riedel, Eric Gaussier, and Guillaume Bouchard. 2016. Complex embeddings for simple link prediction. In ICML, pages 2071–2080.
  • Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen. 2014. Knowledge graph embedding by translating on hyperplanes. In AAAI.
  • Bishan Yang, Wen-tau Yih, Xiaodong He, Jianfeng Gao, and Li Deng. 2015. Embedding entities and relations for learning and inference in knowledge bases. In ICLR.