A Graph Reasoning Network for Multi-turn Response Selection via Customized Pre-training


Abstract:

We investigate response selection for multi-turn conversation in retrieval-based chatbots. Existing studies pay more attention to the matching between utterances and responses, calculating the matching score based on learned features, which leads to insufficient model reasoning ability. In this paper, we propose a graph-reasoning network ...

Introduction
  • As an important task in a dialogue system, response selection aims to find the best matched response from a set of candidates given the context of a conversation.
  • Most existing studies on this task pay more attention to the matching problem between utterances and responses, with insufficient concern for the reasoning issue in multi-turn response selection.
  • Matching focuses on capturing relevance features between utterances and responses, while reasoning needs to identify the key clue words spread across utterances, e.g., u1: "Good morning, two tickets to London, please."; u2: "Express train or regular one?"; u5: "I see, but how long does the express train take? Besides, as long as I get to London earlier, I don't mind paying a little extra."
Highlights
  • As an important task in a dialogue system, response selection aims to find the best matched response from a set of candidates given the context of a conversation
  • We first introduce two pre-training tasks, Next Utterance Prediction (NUP) and Utterance Order Prediction (UOP), which are specially designed for response selection (see the sketch after this list)
  • We observe that the graph-reasoning network (GRN) significantly outperforms all comparative models on both datasets, demonstrating the superior power of GCN-based reasoning over multi-turn contexts
  • Compared with ALBERT, GRN has an absolute advantage of 6.8% on R@1, approximately 2% on R@2 and approximately 3.8% on Mean Reciprocal Rank (MRR) on MuTual
  • We introduce sequence and graph reasoning structures jointly, where the sequence reasoning module captures key information from the global perspective and the graph reasoning module captures clue-word information from the local perspective
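
The page does not describe how NUP and UOP training instances are constructed, so the following is a minimal sketch of one plausible construction, assuming each dialogue is a list of utterance strings; the function names and the negative-sampling strategy are illustrative assumptions, not the authors' exact procedure.

```python
import random

def make_nup_example(dialogue, corpus):
    """Next Utterance Prediction (NUP): given a truncated context, label
    whether a candidate utterance is the true next turn (1) or not (0).
    Assumes the dialogue has at least two turns."""
    split = random.randint(1, len(dialogue) - 1)
    context = dialogue[:split]
    if random.random() < 0.5:
        candidate, label = dialogue[split], 1
    else:
        # Negative candidate: an utterance sampled from another dialogue.
        # (A fuller implementation would exclude the current dialogue.)
        other = random.choice(corpus)
        candidate, label = random.choice(other), 0
    return context, candidate, label

def make_uop_example(dialogue):
    """Utterance Order Prediction (UOP): label whether the utterances
    appear in their original order (1) or a shuffled order (0)."""
    utterances = list(dialogue)
    if random.random() < 0.5:
        random.shuffle(utterances)
        # Guard against a shuffle that happens to restore the original order.
        label = 1 if utterances == list(dialogue) else 0
    else:
        label = 1
    return utterances, label
```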
Methods
  • The comparative methods in Table 1 are Human, Random, TF-IDF, DuLSTM, SMN, DAM, BIDAF, RNET, QANET, BERT, RoBERTa, SpanBERT, GPT-2, GPT-2-FT, BERTMC, RoBERTaMC, ALBERT and GRN, evaluated by R@1, R@2 and MRR on both MuTual and MuTualplus.
  • BERT (Devlin et al. 2019): an autoencoding language model based on the transformer.
  • SpanBERT (Joshi et al. 2020): an autoencoding language model with span masking, based on the transformer.
  • RoBERTa (Liu et al. 2019): an autoencoding language model with dynamic masking, based on the transformer.
Results
  • Table 1 reports the test results of GRN and all comparative models on MuTual and MuTualplus in terms of R@1, R@2 and MRR (computed as in the sketch after this list).
  • One notable point is that the performance of the traditional representation models (i.e., TF-IDF, DuLSTM, SMN and DAM) is relatively low.
  • This indicates that these representation models have insufficient reasoning ability.
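
R@k and MRR are the standard ranking metrics over each question's four candidates; a minimal sketch of how they can be computed from per-candidate scores (these helpers are illustrative, not from the paper):

```python
def recall_at_k(scores, gold_index, k):
    """1.0 if the gold candidate is among the k highest-scoring candidates."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return 1.0 if gold_index in ranked[:k] else 0.0

def mean_reciprocal_rank(all_scores, all_gold):
    """Mean of 1/rank of the gold candidate over a set of questions."""
    total = 0.0
    for scores, gold in zip(all_scores, all_gold):
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        total += 1.0 / (ranked.index(gold) + 1)
    return total / len(all_scores)
```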
Conclusion
  • The authors propose a new architecture for multi-turn response reasoning.
  • The authors first propose NUP and UOP pre-training tasks for response selection.
  • The authors design an utterance dependency graph (UDG) for reasoning.
  • The authors introduce sequence and graph reasoning structures jointly, where the sequence reasoning module captures key information from the global perspective and the graph reasoning module captures clue-word information from the local perspective (a graph-convolution sketch follows this list).
  • The experimental results achieve a new state of the art on MuTual and MuTualplus.
  • There is still substantial room for improvement on MuTualplus.
  • The authors will further investigate how to balance safe responses against meaningful candidate responses.
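
The page gives no equations for the graph reasoning module, but the paper cites Kipf and Welling (2017); below is a minimal sketch of that standard graph-convolution update on a dense adjacency matrix over graph nodes. Whether GRN uses exactly this propagation rule is an assumption.

```python
import numpy as np

def gcn_layer(H, A, W):
    """One GCN layer: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W), where
    H holds node features, A is a binary adjacency matrix, W is learned."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)                     # node degrees (all >= 1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # D^{-1/2}
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)  # ReLU
```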
Objectives
  • The authors' goal is to learn a matching model f(U, ri) that can measure the relevance between the context U and each candidate response ri, i ∈ {a, b, c, d} (see the ranking sketch below).
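
A minimal sketch of the selection step this objective implies, where `matching_model` stands in for the learned f (any function mapping a context and a candidate to a relevance score); the helper name is hypothetical:

```python
def select_response(matching_model, context, candidates):
    """Score every candidate with f(U, r_i) and return the best one."""
    scores = [matching_model(context, r) for r in candidates]
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best], scores
```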
Tables
  • Table 1: Experimental results of different methods on the two test sets
  • Table 2: Ablation results of GRN on the MuTual validation set
  • Table 3: Performance comparison of UBERT using different pre-training methods on the validation set
  • Table 4: Performance comparison of different UDGs on the validation set. Following previous work (Devlin et al. 2019), the CLS representations of all input sequences in one instance are concatenated to calculate the matching score, denoted as BERTMC; this also applies to other BERT-like language models such as RoBERTa
  • Table 5: R@1 comparison for different numbers of turns on the test set. T denotes the number of turns
Related work
  • Response Selection. Response selection aims to select the best matched response from a set of candidates, and can be categorized into single-turn and multi-turn dialogue settings. Early studies focused on single-turn dialogues; recently, researchers have devoted more attention to multi-turn dialogues (Tao et al. 2019b,a; Lu et al. 2019). Existing methods tend to use deep matching models to capture the relationships between utterances and candidate responses, generally built on LSTM representations, attention mechanisms and hierarchical interaction techniques. These models focus on matching rather than reasoning: the key problem in matching is how to extract better matching features, whereas the key problem in reasoning is how to draw inferences from clue words scattered across different utterances, which is more complicated. Existing multi-turn response selection methods are therefore not well suited to reasoning problems.
Funding
  • The project is supported by the National Key R&D Program of China (2018YFB1004700), the National Natural Science Foundation of China (61872074, 61772122, U1811261), and the Fundamental Research Funds for the Central Universities (No. N181602013).
Study subjects and analysis
context-response pairs: 8,860
MuTual is built from Chinese high school English listening comprehension test data and consists of 8,860 context-response pairs (challenge questions), almost all of which involve reasoning; the questions are designed by linguistic experts and professional annotators. Dialogues have an average of 4.73 turns, and each context is paired with four candidate responses.
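
For concreteness, one way such a context-response pair could be represented in code; the class and field names are assumptions for illustration, not the official release format:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class MuTualInstance:
    context: List[str]     # dialogue history, 4.73 turns on average
    candidates: List[str]  # exactly four candidate responses
    answer: str            # gold label: one of "a", "b", "c", "d"
```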

References
  • Baeza-Yates, R. A.; and Ribeiro-Neto, B. 1999. Modern Information Retrieval.
  • Cui, L.; Wu, Y.; Liu, S.; Zhang, Y.; and Zhou, M. 2020. MuTual: A Dataset for Multi-Turn Dialogue Reasoning. In ACL 2020: 58th Annual Meeting of the Association for Computational Linguistics, 1406–1416.
  • Devlin, J.; Chang, M.-W.; Lee, K.; and Toutanova, K. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT 2019: Annual Conference of the North American Chapter of the Association for Computational Linguistics, 4171–4186.
  • Fang, Y.; Sun, S.; Gan, Z.; Pillai, R.; Wang, S.; and Liu, J. 2019. Hierarchical Graph Network for Multi-hop Question Answering. arXiv preprint arXiv:1911.03631.
  • Gan, Z.; Pu, Y.; Henao, R.; Li, C.; He, X.; and Carin, L. 2017. Learning Generic Sentence Representations Using Convolutional Neural Networks. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2390–2400.
  • Grosz, B. J.; Weinstein, S.; and Joshi, A. K. 1995. Centering: A Framework for Modeling the Local Coherence of Discourse. Computational Linguistics 21(2): 203–225.
  • Gururangan, S.; Marasović, A.; Swayamdipta, S.; Lo, K.; Beltagy, I.; Downey, D.; and Smith, N. A. 2020. Don't Stop Pretraining: Adapt Language Models to Domains and Tasks. In ACL 2020: 58th Annual Meeting of the Association for Computational Linguistics, 8342–8360.
  • Hill, F.; Cho, K.; and Korhonen, A. 2016. Learning Distributed Representations of Sentences from Unlabelled Data. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1367–1377.
  • Joshi, M.; Chen, D.; Liu, Y.; Weld, D. S.; Zettlemoyer, L.; and Levy, O. 2020. SpanBERT: Improving Pre-training by Representing and Predicting Spans. Transactions of the Association for Computational Linguistics 8: 64–77.
  • Kipf, T. N.; and Welling, M. 2017. Semi-Supervised Classification with Graph Convolutional Networks. In ICLR 2017: International Conference on Learning Representations.
  • Kiros, R.; Zhu, Y.; Salakhutdinov, R.; Zemel, R. S.; Torralba, A.; Urtasun, R.; and Fidler, S. 2015. Skip-Thought Vectors. In NIPS'15: Proceedings of the 28th International Conference on Neural Information Processing Systems, Volume 2, 3294–3302.
  • Klicpera, J.; Bojchevski, A.; and Günnemann, S. 2018. Predict then Propagate: Graph Neural Networks Meet Personalized PageRank. arXiv preprint arXiv:1810.05997.
  • Lan, Z.; Chen, M.; Goodman, S.; Gimpel, K.; Sharma, P.; and Soricut, R. 2020. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. In ICLR 2020: Eighth International Conference on Learning Representations.
  • Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; and Stoyanov, V. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv preprint arXiv:1907.11692.
  • Lowe, R.; Pow, N.; Serban, I.; and Pineau, J. 2015. The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems. In Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 285–294.
  • Lu, J.; Zhang, C.; Xie, Z.; Ling, G.; Zhou, T. C.; and Xu, Z. 2019. Constructing Interpretive Spatio-Temporal Features for Multi-Turn Response Selection. In ACL 2019: The 57th Annual Meeting of the Association for Computational Linguistics, 44–50.
  • Mihalcea, R.; and Tarau, P. 2004. TextRank: Bringing Order into Text. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 404–411. Barcelona, Spain: Association for Computational Linguistics. URL https://www.aclweb.org/anthology/W04-3252.
  • Qiu, L.; Xiao, Y.; Qu, Y.; Zhou, H.; Li, L.; Zhang, W.; and Yu, Y. 2019. Dynamically Fused Graph Network for Multi-hop Reasoning. In ACL 2019: The 57th Annual Meeting of the Association for Computational Linguistics, 6140–6150.
  • Qu, C.; Yang, L.; Qiu, M.; Croft, W. B.; Zhang, Y.; and Iyyer, M. 2019. BERT with History Answer Embedding for Conversational Question Answering. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 1133–1136.
  • Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; and Sutskever, I. 2019. Language Models are Unsupervised Multitask Learners.
  • Seo, M.; Kembhavi, A.; Farhadi, A.; and Hajishirzi, H. 2017. Bidirectional Attention Flow for Machine Comprehension. In ICLR 2017: International Conference on Learning Representations.
  • Su, H.; Shen, X.; Zhang, R.; Sun, F.; Hu, P.; Niu, C.; and Zhou, J. 2019. Improving Multi-turn Dialogue Modelling with Utterance ReWriter. In ACL 2019: The 57th Annual Meeting of the Association for Computational Linguistics, 22–31.
  • Tao, C.; Wu, W.; Xu, C.; Hu, W.; Zhao, D.; and Yan, R. 2019a. One Time of Interaction May Not Be Enough: Go Deep with an Interaction-over-Interaction Network for Response Selection in Dialogues. In ACL 2019: The 57th Annual Meeting of the Association for Computational Linguistics, 1–11.
  • Tao, C.; Wu, W.; Xu, C.; Hu, W.; Zhao, D.; and Yan, R. 2019b. Multi-Representation Fusion Network for Multi-Turn Response Selection in Retrieval-Based Chatbots. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, 267–275.
  • Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, L.; and Polosukhin, I. 2017. Attention is All You Need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, 5998–6008.
  • Wang, W.; Yang, N.; Wei, F.; Chang, B.; and Zhou, M. 2017. Gated Self-Matching Networks for Reading Comprehension and Question Answering. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 189–198.
  • Wu, Y.; Wu, W.; Xing, C.; Zhou, M.; and Li, Z. 2017. Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-Based Chatbots. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 496–505.
  • Xu, K.; Lai, Y.; Feng, Y.; and Wang, Z. 2019. Enhancing Key-Value Memory Neural Networks for Knowledge Based Question Answering. In NAACL-HLT 2019: Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2937–2947.
  • Ye, D.; Lin, Y.; Liu, Z.; Liu, Z.; and Sun, M. 2019. Multi-Paragraph Reasoning with Knowledge-Enhanced Graph Neural Network. arXiv preprint arXiv:1911.02170.
  • Yeh, Y. T.; and Chen, Y.-N. 2019. FlowDelta: Modeling Flow Information Gain in Reasoning for Conversational Machine Comprehension. In Proceedings of the 2nd Workshop on Machine Reading for Question Answering, 86–90.
  • Yu, A. W.; Dohan, D.; Luong, M.-T.; Zhao, R.; Chen, K.; Norouzi, M.; and Le, Q. V. 2018. QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension. In International Conference on Learning Representations.
  • Zhang, Y.; Dai, H.; Kozareva, Z.; Smola, A.; and Song, L. 2018. Variational Reasoning for Question Answering with Knowledge Graph. In AAAI-18: AAAI Conference on Artificial Intelligence, 6069–6076.
  • Zhou, X.; Li, L.; Dong, D.; Liu, Y.; Chen, Y.; Zhao, W. X.; Yu, D.; and Wu, H. 2018. Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network. In ACL 2018: 56th Annual Meeting of the Association for Computational Linguistics, Volume 1, 1118–1127.