Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference

EMNLP 2020, pp. 5064–5082.

Abstract

Intent detection is one of the core components of goal-oriented dialog systems, and detecting out-of-scope (OOS) intents is also a practically important skill. Few-shot learning is attracting much attention to mitigate data scarcity, but OOS detection becomes even more challenging. In this paper, we present a simple yet effective approach...

Introduction
  • Intent detection is one of the core components when building goal-oriented dialog systems.
  • The authors train a matching model as a pairwise binary classifier that estimates whether an input utterance belongs to the same class as a paired example.
  • The authors expect this to free the model from the OOS separation issue in Figure 1 (a) by avoiding explicit modeling of the intent classes (a minimal sketch of this pairwise inference follows this list).
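To make the pairwise idea concrete, here is a minimal sketch of this kind of nearest-neighbor inference. It is not the authors' released code: the RoBERTa checkpoint name, the OOS threshold, and the helper names are illustrative assumptions, and in practice the matcher's classification head is first fine-tuned on NLI data and on pairs synthesized from the few-shot training set.

```python
# Hedged sketch of DNNC-style inference: a pairwise binary classifier scores
# (query, example) pairs, and the intent of the best-matching training example
# is predicted unless every score falls below a threshold (then predict OOS).
# NOTE: model name, threshold, and function names are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "roberta-base"  # the paper uses RoBERTa; this exact checkpoint is an assumption
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
matcher = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
matcher.eval()  # in practice the head is first fine-tuned on NLI + synthesized pairs

def match_score(query: str, example: str) -> float:
    """Probability that `query` and `example` express the same intent."""
    inputs = tokenizer(query, example, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = matcher(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()  # class 1 = "same intent"

def predict_intent(query, support_set, oos_threshold=0.5):
    """support_set: list of (utterance, intent) few-shot training examples."""
    best_score, best_intent = 0.0, "oos"
    for utterance, intent in support_set:
        score = match_score(query, utterance)
        if score > best_score:
            best_score, best_intent = score, intent
    return best_intent if best_score >= oos_threshold else "oos"

support = [("how do i reset my pin", "pin_change"),
           ("what is my account balance", "balance")]
print(predict_intent("i forgot my pin, can i change it", support))
```

Because every prediction is tied to a specific training utterance, this setup also gives the interpretability benefit discussed in the Related work section below.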
Highlights
  • Intent detection is one of the core components when building goal-oriented dialog systems
  • We further propose to seamlessly transfer a natural language inference (NLI) model to enhance this clear separation (Figure 1 (d))
  • The comparison between DNNC-scratch (our discriminative nearest neighbor classification model without the NLI transfer) and DNNC shows that our NLI task transfer is effective
  • We have presented a simple yet efficient nearest-neighbor classification model to detect user intents and OOS intents
  • A seamless transfer from NLI and a joint approach with fast retrieval are designed to improve both accuracy and inference speed (a sketch of this retrieve-then-rerank idea follows this list)
  • RoBERTa is used as the base model because it performed significantly better and more stably than the original BERT in our few-shot experiments
  • Experimental results show superior performance of our method on a large-scale multi-domain intent detection dataset with OOS
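The retrieve-then-rerank idea behind the joint approach (Table 5 notes that DNNC-joint uses top-20 Emb-kNN retrieval) can be sketched as follows. This is an illustration, not the authors' implementation: `embed` stands for any sentence encoder that returns unit-normalized vectors, `match_score` stands for the pairwise matcher sketched under Introduction, and `top_k=20` and the OOS threshold are assumptions.

```python
# Sketch of a two-stage DNNC-joint style prediction: a cheap embedding kNN
# retrieves top-k candidate utterances, and the expensive pairwise matcher
# reranks only those candidates, keeping per-query latency roughly constant.
import numpy as np

def dnnc_joint(query, support_set, embed, match_score, top_k=20, oos_threshold=0.5):
    """support_set: list of (utterance, intent); embed: texts -> unit vectors;
    match_score: (query, example) -> probability of sharing the same intent."""
    utterances = [u for u, _ in support_set]
    support_emb = embed(utterances)        # shape (n, d), precomputable offline
    query_emb = embed([query])[0]          # shape (d,)
    sims = support_emb @ query_emb         # cosine similarity for unit vectors
    candidates = np.argsort(-sims)[:top_k] # stage 1: fast retrieval
    best_score, best_intent = 0.0, "oos"
    for i in candidates:                   # stage 2: pairwise reranking
        score = match_score(query, utterances[i])
        if score > best_score:
            best_score, best_intent = score, support_set[i][1]
    return best_intent if best_score >= oos_threshold else "oos"
```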
Methods
  • This section first describes how to directly model inter-utterance relations in the nearest neighbor classification scenario.
  • The authors introduce a binary classification strategy by synthesizing pairwise examples, and propose a seamless transfer of NLI.
  • Deep Pairwise Matching Function (Section 3.1): as shown in Figure 1, text embedding methods do not discriminate the OOS examples well enough, so a deep pairwise matching function models fine-grained relations between utterance pairs to distinguish in-domain and OOS intents.
  • Discriminative Training (Section 3.2): pairwise examples synthesized from the training data are used to train this matching function as a binary classifier (a sketch of the pair synthesis follows this list).
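A sketch of the pairwise-example synthesis referred to above: every pair of training utterances is labeled by whether the two share an intent (the Related work section notes that all training examples were paired, so no sampling is shown here). The function name and the toy data are illustrative.

```python
# Sketch of synthesizing pairwise training data for the binary matcher:
# every pair of few-shot utterances becomes a (u1, u2, label) triple,
# label 1 if both utterances share the same intent, 0 otherwise.
from itertools import combinations

def synthesize_pairs(examples):
    """examples: list of (utterance, intent) few-shot training examples."""
    return [(u1, u2, int(i1 == i2))
            for (u1, i1), (u2, i2) in combinations(examples, 2)]

data = [("book a table for two", "restaurant"),
        ("reserve a table for tonight", "restaurant"),
        ("what's the weather tomorrow", "weather"),
        ("is it going to rain today", "weather")]
for pair in synthesize_pairs(data):
    print(pair)
```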
Results
  • In the 5-shot setting, the proposed DNNC method consistently attains the best results across all four domains.
  • In the 10-shot setting, all the approaches generally see an accuracy improvement from the additional training data; the dominance of DNNC weakens, but it remains highly competitive.
  • DNNC is comparable with, or even surpasses, some of the 50-shot classifier's scores, and the data augmentation techniques are not always helpful when the strong pre-trained model is used.
Conclusion
  • The authors have presented a simple yet efficient nearest-neighbor classification model to detect user intents and OOS intents.
  • It includes paired encoding and discriminative training to model relations between the input and example utterances.
  • Experimental results show superior performance of the method on a large-scale multi-domain intent detection dataset with OOS.
  • Future work includes its cross-lingual transfer and cross-dataset generalization
Tables
  • Table1: Training examples for our model. The first two examples ((a)–(b)) come from the CLINC150 dataset (Larson et al., 2019), and the other three examples ((c)–(e)) come from the MNLI dataset (Williams et al., 2018)
  • Table2: Dataset statistics. The number of the examples is equally distributed across the intent classes
  • Table3: Testing results on the four different domains
  • Table4: Testing results on the whole dataset (5 runs)
  • Table5: Comparison among the nearest neighbor methods on the test sets for the banking domain and all the domains. The latency is measured on a single NVIDIA Tesla V100 GPU, where the batch size is 1 to simulate an online use case. DNNC-joint is based on top-20 Emb-kNN retrieval
  • Table6: Development results on three NLI datasets
  • Table7: Some hyper-parameter settings for a few models
  • Table8: Best hyper-parameter settings for a few models on the all-domain experiments, where bs is batch size, ep represents epochs, lr is learning rate
  • Table9: Best hyper-parameter settings for a few models on the four single domains, where bs is batch size, ep represents epochs, lr is learning rate
  • Table10: Examples used to train classifier-BT
  • Table11: Case studies on the development set of banking domain. The first two cases are in-domain examples from the banking domain, and the rest are OOS examples
Related work
  • Interpretability: Interpretability has recently become an important line of research (Jiang et al., 2019; Sydorova et al., 2019; Asai et al., 2020). The nearest neighbor approach (Simard et al., 1993) is appealing in that we can explicitly know which training example triggers each prediction. Table 11 in Appendix C shows some examples.

    Call for better embeddings: Emb-kNN and RNkNN are not as competitive as DNNC. This encourages future work on the task-oriented evaluation of text embeddings in kNN.
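For reference, a plain Emb-kNN baseline of the kind compared against here can be sketched as below; the sentence encoder, k, the similarity-weighted vote, and the OOS threshold are illustrative choices, not the paper's configuration.

```python
# Sketch of an embedding-based nearest neighbor (Emb-kNN) intent classifier:
# embed every training utterance once, then classify a query by cosine
# similarity; predict OOS when even the closest neighbor is too dissimilar.
import numpy as np
from sentence_transformers import SentenceTransformer

def emb_knn(query, support_set, encoder, k=5, oos_threshold=0.5):
    """support_set: list of (utterance, intent) few-shot training examples."""
    utterances = [u for u, _ in support_set]
    support_emb = encoder.encode(utterances, normalize_embeddings=True)
    query_emb = encoder.encode([query], normalize_embeddings=True)[0]
    sims = support_emb @ query_emb              # cosine similarity (unit vectors)
    top = np.argsort(-sims)[:k]
    if sims[top[0]] < oos_threshold:
        return "oos"
    votes = {}
    for i in top:                               # similarity-weighted vote
        intent = support_set[i][1]
        votes[intent] = votes.get(intent, 0.0) + sims[i]
    return max(votes, key=votes.get)

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder works here
support = [("transfer money to savings", "transfer"),
           ("send 20 dollars to my mom", "transfer"),
           ("freeze my card please", "card_freeze")]
print(emb_knn("move funds into my savings account", support, encoder))
```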

    Training time: Our DNNC method needs a longer training time than the classifier (e.g., 90 vs. 40 seconds to train a single-domain model), because we synthesize the pairwise examples. As a first step, we used all the training examples to investigate the effectiveness, but seeking more efficient pairwise training is an interesting direction.
Funding
  • This work is supported in part by NSF under grants III-1763325, III-1909323, and SaTC-1930941
Study subjects and analysis
NLI datasets: 3
RoBERTa is used as the base model because it performed significantly better and more stably than the original BERT in our few-shot experiments. We combine three NLI datasets, SNLI (Bowman et al., 2015), MNLI (Williams et al., 2018), and WNLI (Levesque et al., 2011) from the GLUE benchmark (Wang et al., 2018), to pre-train our proposed model. We apply label smoothing (Szegedy et al., 2016) to all the cross-entropy loss functions, which has been shown to improve the reliability of the model confidence (Muller et al., 2019).
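The label smoothing mentioned above can be written down in a few lines; the smoothing factor epsilon=0.1 is a common default and an assumption here, not necessarily the paper's value (recent PyTorch also exposes this directly via torch.nn.CrossEntropyLoss(label_smoothing=...)).

```python
# Sketch of label-smoothed cross-entropy: the one-hot target is replaced by a
# distribution that puts (1 - eps) + eps/K on the gold class and eps/K on the
# rest, which tends to make the model's confidence estimates better calibrated.
import torch
import torch.nn.functional as F

def label_smoothed_ce(logits, targets, epsilon=0.1):
    """logits: (batch, num_classes); targets: (batch,) integer class ids."""
    num_classes = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)
    smooth = torch.full_like(log_probs, epsilon / num_classes)
    smooth.scatter_(1, targets.unsqueeze(1), 1.0 - epsilon + epsilon / num_classes)
    return -(smooth * log_probs).sum(dim=-1).mean()

logits = torch.tensor([[2.0, 0.5], [0.1, 1.5]])
targets = torch.tensor([0, 1])
print(label_smoothed_ce(logits, targets))
```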


References
  • Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, Richard Socher, and Caiming Xiong. 2020. Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering. In 8th International Conference on Learning Representations (ICLR).
  • Samuel Bowman and Xiaodan Zhu. 2019. Deep Learning for Natural Language Inference. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorials, pages 6–8.
  • Samuel R. Bowman, Gabor Angeli, Christopher Potts, and Christopher D. Manning. 2015. A large annotated corpus for learning natural language inference. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 632–642.
  • Inigo Casanueva, Tadas Temcinas, Daniela Gerz, Matthew Henderson, and Ivan Vulic. 2020. Efficient Intent Detection with Dual Sentence Encoders. arXiv preprint arXiv:2003.04807.
  • Nathanael Chambers and Daniel Jurafsky. 2010. Improving the Use of Pseudo-Words for Evaluating Selectional Preferences. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 445–453.
  • Rajen Chatterjee, Christian Federmann, Matteo Negri, and Marco Turchi. 2019. Findings of the WMT 2019 Shared Task on Automatic Post-Editing. In Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2), pages 11–28.
  • Danqi Chen, Adam Fisch, Jason Weston, and Antoine Bordes. 2017. Reading Wikipedia to Answer Open-Domain Questions. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1870–1879.
  • Alexis Conneau, Ruty Rinott, Guillaume Lample, Adina Williams, Samuel Bowman, Holger Schwenk, and Veselin Stoyanov. 2018. XNLI: Evaluating Cross-lingual Sentence Representations. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2475–2485.
  • Padraig Cunningham and Sarah Jane Delany. 2007. k-Nearest Neighbour Classifiers. Technical Report UCD-CSI-2007-04, School of Computer Science & Informatics, University College Dublin.
  • Shumin Deng, Ningyu Zhang, Zhanlin Sun, Jiaoyan Chen, and Huajun Chen. 2019. When low resource NLP meets unsupervised language model: Meta-pretraining then meta-learning for few-shot text classification. arXiv preprint.
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186.
  • Terrance DeVries and Graham W. Taylor. 2018. Learning Confidence for Out-of-Distribution Detection in Neural Networks. arXiv preprint arXiv:1802.04865.
  • Akiko Eriguchi, Melvin Johnson, Orhan Firat, Hideto Kazawa, and Wolfgang Macherey. 2018. Zero-Shot Cross-lingual Classification Using Multilingual Neural Machine Translation. arXiv preprint arXiv:1809.04686.
  • Li Fei-Fei, Rob Fergus, and Pietro Perona. 2006. One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(4):594–611.
  • Chelsea Finn, Pieter Abbeel, and Sergey Levine. 2017. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 1126–1135. JMLR.org.
  • Juri Ganitkevitch, Benjamin Van Durme, and Chris Callison-Burch. 2013. PPDB: The Paraphrase Database. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 758–764.
  • Ruiying Geng, Binhua Li, Yongbin Li, Xiaodan Zhu, Ping Jian, and Jian Sun. 2019. Induction networks for few-shot text classification. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3895–3904.
  • Kazuma Hashimoto, Caiming Xiong, Yoshimasa Tsuruoka, and Richard Socher. 2017. A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1923–1933.
  • Matthew Henderson, Inigo Casanueva, Nikola Mrksic, Pei-Hao Su, Tsung-Hsien Wen, and Ivan Vulic. 2019. ConveRT: Efficient and Accurate Conversational Representations from Transformers. arXiv preprint arXiv:1911.03688.
  • Dan Hendrycks and Kevin Gimpel. 2017. A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks. In 5th International Conference on Learning Representations (ICLR).
  • 2020. Pretrained Transformers Improve Out-of-Distribution Robustness.
  • Yichen Jiang, Nitish Joshi, Yen-Chun Chen, and Mohit Bansal. 2019. Explore, Propose, and Assemble: An Interpretable Model for Multi-Hop Reading Comprehension. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2714–2725.
  • Jeff Johnson, Matthijs Douze, and Herve Jegou. 2017. Billion-scale similarity search with GPUs. arXiv preprint arXiv:1702.08734.
  • Stefan Larson, Anish Mahendran, Joseph J. Peper, Christopher Clarke, Andrew Lee, Parker Hill, Jonathan K. Kummerfeld, Kevin Leach, Michael A. Laurenzano, Lingjia Tang, and Jason Mars. 2019. An Evaluation Dataset for Intent Classification and Out-of-Scope Prediction. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 1311–1316.
  • Kenton Lee, Ming-Wei Chang, and Kristina Toutanova. 2019. Latent Retrieval for Weakly Supervised Open Domain Question Answering. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 6086–6096.
  • Hector J. Levesque, Ernest Davis, and Leora Morgenstern. 2011. The Winograd schema challenge. AAAI Spring Symposium: Logical Formalizations of Commonsense Reasoning, 46:47.
  • Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv preprint arXiv:1907.11692.
  • Ilya Loshchilov and Frank Hutter. 2017. Fixing Weight Decay Regularization in Adam. arXiv preprint arXiv:1711.05101.
  • Bingfeng Luo, Yansong Feng, Zheng Wang, Songfang Huang, Rui Yan, and Dongyan Zhao. 2018. Marrying Up Regular Expressions with Neural Networks: A Case Study for Spoken Language Understanding. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2083–2093.
  • Marco Marelli, Luisa Bentivogli, Marco Baroni, Raffaella Bernardi, Stefano Menini, and Roberto Zamparelli. 2014. SemEval-2014 Task 1: Evaluation of Compositional Distributional Semantic Models on Full Sentences through Semantic Relatedness and Textual Entailment. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), pages 1–8.
  • Rafael Muller, Simon Kornblith, and Geoffrey E. Hinton. 2019. When does label smoothing help? In Advances in Neural Information Processing Systems 32, pages 4694–4703.
  • Matteo Negri, Marco Turchi, Rajen Chatterjee, and Nicola Bertoldi. 2018. ESCAPE: a Large-scale Synthetic Corpus for Automatic Post-Editing. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018).
  • Yixin Nie, Songhe Wang, and Mohit Bansal. 2019. Revealing the Importance of Semantic Retrieval for Machine Reading at Scale. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 2553–2566.
  • Razvan Pascanu, Tomas Mikolov, and Yoshua Bengio. 2013. On the difficulty of training recurrent neural networks. In Proceedings of the 30th International Conference on Machine Learning, pages 1310–1318.
  • Abhinav Rastogi, Xiaoxue Zang, Srinivas Sunkara, Raghav Gupta, and Pranav Khaitan. 2019. Towards Scalable Multi-domain Conversational Agents: The Schema-Guided Dialogue Dataset. arXiv preprint arXiv:1909.05855.
  • Nils Reimers and Iryna Gurevych. 2019. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3982–3992.
  • Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf. 2019. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108.
  • Minjoon Seo, Jinhyuk Lee, Tom Kwiatkowski, Ankur Parikh, Ali Farhadi, and Hannaneh Hajishirzi. 2019. Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4430–4441.
  • Sam Shleifer. 2019. Low Resource Text Classification with ULMFit and Backtranslation. arXiv preprint arXiv:1903.09244.
  • Patrice Simard, Yann LeCun, and John S. Denker. 1993. Efficient Pattern Recognition Using a New Transformation Distance. In Advances in Neural Information Processing Systems 5, pages 50–58.
  • Jake Snell, Kevin Swersky, and Richard Zemel. 2017. Prototypical Networks for Few-shot Learning. In Advances in Neural Information Processing Systems 30, pages 4077–4087.
  • Shengli Sun, Qingfeng Sun, Kevin Zhou, and Tengchao Lv. 2019. Hierarchical Attention Prototypical Networks for Few-Shot Text Classification. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 476–485.
  • Flood Sung, Yongxin Yang, Li Zhang, Tao Xiang, Philip H.S. Torr, and Timothy M. Hospedales. 2018. Learning to Compare: Relation Network for Few-Shot Learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1199–1208.
  • Alona Sydorova, Nina Poerner, and Benjamin Roth. 2019. Interpretable Question Answering on Knowledge Bases and Text. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4943–4951.
  • Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, and Jon Shlens. 2016. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In Advances in Neural Information Processing Systems 30, pages 5998–6008.
  • Oriol Vinyals, Charles Blundell, Timothy Lillicrap, koray kavukcuoglu, and Daan Wierstra. 2016a. Matching Networks for One Shot Learning. In Advances in Neural Information Processing Systems 29, pages 3630–3638.
  • Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Daan Wierstra, et al. 2016b. Matching networks for one shot learning. In Advances in Neural Information Processing Systems, pages 3630–3638.
  • Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel Bowman. 2018. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pages 353–355.
  • Yusuke Watanabe, Bhuwan Dhingra, and Ruslan Salakhutdinov. 2017. Question Answering from Unstructured Text by Retrieval and Comprehension. arXiv preprint arXiv:1703.08885.
  • Jason Wei and Kai Zou. 2019. EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 6382–6388.
  • John Wieting and Kevin Gimpel. 2018. ParaNMT-50M: Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 451–462.
  • Adina Williams, Nikita Nangia, and Samuel Bowman. 2018. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 1112–1122.
  • Chien-Sheng Wu, Steven Hoi, Richard Socher, and Caiming Xiong. 2020. ToD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogues. arXiv preprint arXiv:2004.06871.
  • Congying Xia, Chenwei Zhang, Hoang Nguyen, Jiawei Zhang, and Philip Yu. 2020. CG-BERT: Conditional Text Generation with BERT for Generalized Few-shot Intent Detection. arXiv preprint arXiv:2004.01881.
  • Hu Xu, Bing Liu, Lei Shu, and P. Yu. 2019. Open-world learning and application to product classification. In The World Wide Web Conference, pages 3413–3419.
  • Adams Wei Yu, David Dohan, Minh-Thang Luong, Rui Zhao, Kai Chen, Mohammad Norouzi, and Quoc V. Le. 2018. QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension. In 6th International Conference on Learning Representations (ICLR).