G2T: Generating Fluent Descriptions for Knowledge Graph

Yunzhou Shi
Pengcheng Zhu
Feng Ji
Wei Zhou
Yujiu Yang

SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, China, July 2020, pp. 1861-1864.

DOI: https://doi.org/10.1145/3397271.3401289

Abstract:

Generating natural language descriptions for a knowledge graph (KG) is an important category of intelligent writing. Recent models for this task substitute the sequence encoder in the commonly used encoder-decoder framework with a graph encoder. However, these models suffer from entity missing and repetition. In this paper, we propose a novel model named G2T to generate fluent descriptions by taking a KG as input.
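The page itself carries no code. As context for the graph-encoder substitution the abstract describes, below is a minimal PyTorch sketch of a single graph-attention encoder layer in the spirit of Graph Attention Networks (Velickovic et al., cited in the references); the class name, shapes, and single-head design are illustrative assumptions, not the authors' G2T encoder.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class GraphAttentionLayer(nn.Module):
        """One GAT-style layer: each node attends over its neighbors.
        A sketch under assumptions, not the authors' G2T encoder."""
        def __init__(self, dim):
            super().__init__()
            self.proj = nn.Linear(dim, dim, bias=False)
            self.attn = nn.Linear(2 * dim, 1, bias=False)

        def forward(self, h, adj):
            # h: (n_nodes, dim) node embeddings; adj: (n_nodes, n_nodes)
            # 0/1 adjacency, which should include self-loops so every
            # softmax row below is well-defined.
            z = self.proj(h)
            n = z.size(0)
            zi = z.unsqueeze(1).expand(n, n, -1)  # z_i broadcast over rows
            zj = z.unsqueeze(0).expand(n, n, -1)  # z_j broadcast over cols
            e = F.leaky_relu(self.attn(torch.cat([zi, zj], dim=-1))).squeeze(-1)
            e = e.masked_fill(adj == 0, float("-inf"))  # neighbors only
            alpha = torch.softmax(e, dim=-1)
            return F.elu(alpha @ z)

    # Usage: 4 nodes, self-loops plus one undirected edge.
    layer = GraphAttentionLayer(64)
    h = torch.randn(4, 64)
    adj = torch.eye(4)
    adj[0, 1] = adj[1, 0] = 1.0
    out = layer(h, adj)  # (4, 64) contextualized node states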

Introduction
  • Knowledge graphs (KGs) have been widely applied in the area of information retrieval, for example in question answering (Q&A) systems, web search, and recommender systems.
  • A typical KG consists of a set of entities connected by typed relations, i.e., (head entity, relation, tail entity) triples; a small sketch of this representation follows this list.
  • [Figure: an example KG whose nodes include the NYT10 dataset, Mintz++, an LSTM model, and the task Relation Extraction, linked by relations such as EVALUATE-FOR, USED-FOR, and COMPARE.]
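As referenced above, a minimal Python sketch of the standard triple representation of a KG; entity and relation names are taken from the figure, and edge directions are illustrative assumptions:

    # A KG as (head, relation, tail) triples, mirroring the example figure.
    kg = [
        ("NYT10 Dataset", "EVALUATE-FOR", "Mintz++"),
        ("LSTM Model", "USED-FOR", "Relation Extraction"),
        ("Mintz++", "COMPARE", "LSTM Model"),
    ]
    entities = {e for head, _, tail in kg for e in (head, tail)}
    relations = {rel for _, rel, _ in kg}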
Highlights
  • Knowledge graphs (KGs) have been widely applied in the area of information retrieval, for example in question answering (Q&A) systems, web search, and recommender systems
  • To tackle the above issues, we propose a novel model named G2T to generate fluent descriptions by taking a KG as input
  • We introduce a Copy Coverage Loss (CCL) in the training phase to suppress repeated copying of the same entity
  • To tackle the repetition problem, we use CCL as an auxiliary training loss (a sketch follows this list)
  • The proposed model achieves 63.0%, better than GraphWriter even without the Graph Structure Enhanced Mechanism (GSEM) or the CCL module
  • Our proposed model achieves state-of-the-art performance, demonstrating its effectiveness
  • We study the problem of generating natural language descriptions for KG
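The page does not spell CCL out. As referenced in the list above, here is a plausible minimal sketch, assuming CCL follows the coverage loss of See et al. (2017, cited in the references) applied to the copy attention over entities: attention mass that falls on already-covered entities is penalized.

    import torch

    def copy_coverage_loss(copy_attn):
        """copy_attn: (steps, n_entities) tensor of copy distributions over
        entities, one row per decoding step. Penalizes re-attending to
        entities that already accumulated attention, in the style of
        See et al.'s coverage loss. An assumed sketch, not the exact CCL."""
        coverage = torch.zeros_like(copy_attn[0])
        loss = copy_attn.new_zeros(())
        for a_t in copy_attn:
            loss = loss + torch.minimum(a_t, coverage).sum()
            coverage = coverage + a_t
        return loss / copy_attn.size(0)

    # Usage: 5 decoding steps over 3 candidate entities.
    attn = torch.softmax(torch.randn(5, 3), dim=-1)
    print(copy_coverage_loss(attn))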
Methods
  • The authors conduct experiments on the AGENDA and DuIE datasets.
  • The AGENDA dataset [2] pairs knowledge graphs with scientific abstracts.
  • The AGENDA dataset is split into 38,720, 1,000, and 1,000 samples for training, validation, and test.
  • Each sample in DuIE contains sentences and an associated knowledge graph inferred from those sentences.
  • The authors split the original training and validation data into 173,108, 10,000, and 11,639 samples for training, validation, and test (a re-splitting sketch follows this list)
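For orientation, a minimal sketch of re-cutting the pooled DuIE train+dev samples into the reported sizes; the function name, seed, and shuffling are assumptions, since the page only gives the counts:

    import random

    def resplit(samples, sizes=(173108, 10000, 11639), seed=0):
        """Shuffle pooled samples and cut them into the reported
        train/validation/test sizes. Seed and ordering are assumptions."""
        assert len(samples) >= sum(sizes)
        rng = random.Random(seed)
        rng.shuffle(samples)
        a, b = sizes[0], sizes[0] + sizes[1]
        return samples[:a], samples[a:b], samples[b:sum(sizes)]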
Results
  • The authors adopt the widely used BLEU and METEOR metrics, which reflect the similarity between generated text and reference text (an evaluation sketch follows this list).
  • The less entity repetition and omission in the generated text, the higher the BLEU and METEOR scores.
  • Table 1 shows the automatic evaluation results of each model on the AGENDA dataset and DuIE dataset.
  • Table 1 presents the result of the ablation study by either removing GSEM or CCL.
  • The authors observe that GSEM and CCL have positive effects on the model
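As referenced above, a minimal sketch of scoring generations with BLEU and METEOR using NLTK; the toolkit choice is an assumption, since the page does not name one.

    # Requires: pip install nltk, plus nltk.download("wordnet") for METEOR.
    from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction
    from nltk.translate.meteor_score import meteor_score

    def evaluate(hypotheses, references):
        """hypotheses/references: parallel lists of token lists
        (NLTK >= 3.6 expects pre-tokenized METEOR inputs)."""
        bleu = corpus_bleu([[r] for r in references], hypotheses,
                           smoothing_function=SmoothingFunction().method1)
        meteor = sum(meteor_score([r], h)
                     for h, r in zip(hypotheses, references)) / len(hypotheses)
        return bleu, meteor

    hyps = [["the", "model", "is", "evaluated", "on", "nyt10"]]
    refs = [["the", "model", "is", "evaluated", "on", "the", "nyt10", "dataset"]]
    print(evaluate(hyps, refs))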
Conclusion
  • The authors study the problem of generating natural language descriptions for KGs. To alleviate entity repetition and omission, the authors design the G2T model, with GSEM and CCL, to generate more fluent descriptions.
Tables
  • Table 1: Experimental results of different models on the AGENDA and DuIE datasets (w/o is short for without)
  • Table 2: Human evaluation of GraphWriter (GW) and our model on the AGENDA and DuIE datasets
Funding
  • This work was supported in part by the National Key Research and Development Program of China (No. 2018YFB1601102), the National Natural Science Foundation of China (No. 61802340), and the Shenzhen special fund for the strategic development of emerging industries (No. JCYJ20170412170118573)
References
  • Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural Machine Translation by Jointly Learning to Align and Translate. In ICLR 2015.
  • Rik Koncel-Kedziorski, Dhanush Bekal, Yi Luan, Mirella Lapata, and Hannaneh Hajishirzi. 2019. Text Generation from Knowledge Graphs with Graph Transformers. In NAACL-HLT 2019. 2284–2293.
  • Yi Luan, Luheng He, Mari Ostendorf, and Hannaneh Hajishirzi. 2018. Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction. In EMNLP 2018. 3219–3232.
  • Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Effective Approaches to Attention-based Neural Machine Translation. In EMNLP 2015. 1412–1421.
  • Abigail See, Peter J. Liu, and Christopher D. Manning. 2017. Get To The Point: Summarization with Pointer-Generator Networks. In ACL 2017. 1073–1083.
  • Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to Sequence Learning with Neural Networks. In NIPS 2014. 3104–3112.
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In NIPS 2017. 5998–6008.
  • Petar Velickovic, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. 2018. Graph Attention Networks. In ICLR 2018.
  • Yaoming Zhu, Sidi Lu, Lei Zheng, Jiaxian Guo, Weinan Zhang, Jun Wang, and Yong Yu. 2018. Texygen: A Benchmarking Platform for Text Generation Models. In SIGIR 2018. 1097–1100.