Toward zero-shot Entity Recognition in Task-oriented Conversational Agents.

Simone Magnolini
Vevake Balaraman

SIGDIAL Conference, pp. 317-326, 2018.

Abstract:

We present a domain portable zero-shot learning approach for entity recognition in task-oriented conversational agents, which does not assume any annotated sentences at training time. Rather, we derive a neural model of the entity names based only on available gazetteers, and then apply the model to recognize new entities in the context o...
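
As a rough illustration of what "deriving a neural model of the entity names" could look like in code, the sketch below defines a generic bidirectional-LSTM token tagger in PyTorch. It is a minimal sketch under assumed hyperparameters and names (GazetteerTagger, emb_dim, hidden_dim), not the NNg architecture actually used in the paper (which is described in its Section 4).

```python
# Generic BiLSTM sequence tagger sketch; all hyperparameters are illustrative.
import torch
import torch.nn as nn

class GazetteerTagger(nn.Module):
    def __init__(self, vocab_size, num_tags, emb_dim=100, hidden_dim=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.bilstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer-encoded tokens
        embedded = self.embedding(token_ids)
        hidden, _ = self.bilstm(embedded)
        return self.classifier(hidden)  # (batch, seq_len, num_tags)

# Example: 3 tags for a single entity category (O, B-FOOD, I-FOOD).
model = GazetteerTagger(vocab_size=10_000, num_tags=3)
logits = model(torch.randint(1, 10_000, (8, 12)))          # dummy batch
loss = nn.CrossEntropyLoss()(logits.reshape(-1, 3),
                             torch.randint(0, 3, (8 * 12,)))
```
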

Introduction
  • In this paper the authors focus on user utterance understanding, where a conversational system has to interpret the content of a user dialogue turn.
  • The authors propose an entity recognition method, which they call gazetteer-based, that takes advantage of available entity names for a certain category to train a neural model, which is then applied to label new, unseen entities in a user utterance.
  • Two main entity classes are distinguished: named entities and nominal entities.
  • The authors focus on the latter, as this is more relevant for utterance understanding in the e-commerce scenario.
  • Compositionality is crucial in the approach, as the authors take advantage of it to synthetically generate negative training examples for a certain entity category, as detailed in Section 4.1; a rough illustration follows this list
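
As a rough illustration of how compositionality could be exploited to generate negative examples from a gazetteer alone, the sketch below recombines gazetteer tokens into sequences that are not themselves entity names. The gazetteer content, the function generate_negatives, and the sampling strategy are all invented for illustration; this is not the authors' procedure from Section 4.1.

```python
import random

# Hypothetical gazetteer for one entity category; real gazetteers in the
# paper contain many more names (see Table 2).
gazetteer = ["red wine vinegar", "white wine", "olive oil", "dark chocolate bar"]

entity_set = set(gazetteer)
vocabulary = [tok for name in gazetteer for tok in name.split()]

def generate_negatives(num_negatives, max_len=3, seed=0):
    """Recombine gazetteer tokens into token sequences that are NOT
    entity names themselves, to serve as synthetic negative examples."""
    rng = random.Random(seed)
    negatives = []
    while len(negatives) < num_negatives:
        length = rng.randint(1, max_len)
        candidate = " ".join(rng.sample(vocabulary, length))
        if candidate not in entity_set:
            negatives.append(candidate)
    return negatives

positives = gazetteer                           # valid entity names
negatives = generate_negatives(len(positives))  # synthetic non-entities
```
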
Highlights
  • In this paper we focus on user utterance understanding, where a conversational system has to interpret the content of a user dialogue turn
  • Our working hypothesis is that, in such scenarios, current entity recognition approaches based on supervision need a large amount of annotated data to manage the variety of entity names, which would make those approaches ineffective in most practical situations
  • We run several experiments on three e-commerce domains and two languages (English and Italian), with different characteristics in terms of entity names, and show that: (i) the gazetteer-based approach significantly outperforms the pattern-based approach in our domains and languages; (ii) the method captures linguistic properties of the entity names related to their compositionality, which are reliable indicators of the complexity of the task
  • We report additional metrics that try to capture the complexity of the entity names in a gazetteer: (i) the normalized type-token ratio (TTR), a rough measure of how much lexical diversity there is among the nominal entities in a gazetteer, see (Richards, 1987); (ii) the ratio of type1 tokens, i.e. tokens that can appear both in the first position of an entity name and in other positions, and of type2 tokens, i.e. tokens appearing both at the end and elsewhere; (iii) the ratio of entities that contain another entity as a sub-part of their name (a sketch of these metrics follows this list)
  • We have provided experimental evidence that zero-shot entity recognition based on gazetteers performs well
  • To our knowledge, this is the first time that a neural model has been applied to capture the compositionality of entity names
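
The gazetteer metrics mentioned above can be made concrete with a short sketch. The functions below (normalized_ttr, type_ratios, sub_entity_ratio) are one plausible reading of the definitions given in the highlights and in the Table 2 caption; the exact formulas used by the authors may differ.

```python
import random

def normalized_ttr(tokens, sample_size=200, n_samples=1000, seed=0):
    """Normalized type-token ratio, estimated by repeatedly sampling
    `sample_size` tokens (assumes the gazetteer has at least that many)."""
    rng = random.Random(seed)
    ratios = [len(set(rng.sample(tokens, sample_size))) / sample_size
              for _ in range(n_samples)]
    return sum(ratios) / len(ratios)

def type_ratios(entities):
    """type1: tokens seen both entity-initially and in other positions;
    type2: tokens seen both entity-finally and in other positions."""
    first, non_first, last, non_last = set(), set(), set(), set()
    for name in entities:
        toks = name.split()
        first.add(toks[0]); non_first.update(toks[1:])
        last.add(toks[-1]); non_last.update(toks[:-1])
    all_tokens = first | non_first
    return (len(first & non_first) / len(all_tokens),
            len(last & non_last) / len(all_tokens))

def sub_entity_ratio(entities):
    """Fraction of entity names containing another entity as a sub-part."""
    entity_set = set(entities)
    def contains_other(name):
        toks = name.split()
        return any(" ".join(toks[i:j]) in entity_set and " ".join(toks[i:j]) != name
                   for i in range(len(toks)) for j in range(i + 1, len(toks) + 1))
    return sum(contains_other(e) for e in entities) / len(entities)
```
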
Methods
  • The authors run a set of experiments to assess the best feature configuration for the gazetteer-based approach.
  • It should be noted that the effect of entity name complexity emerges clearly from the experiments: all the approaches tend to be affected by it.
  • In both languages the authors observe the following ordering in terms of performance: food < furniture < clothing.
  • Still, since the other token-type ratio is almost 0, either the beginning or the end of an entity name is unambiguous, and in the case of adjacent entities in a sentence this is enough to recognize the boundary between the two; a toy illustration follows this list
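
The boundary observation above can be illustrated with a toy example: when tokens that open an entity name never occur in other positions, a run of adjacent entities can be segmented at those tokens. The gazetteer and function below are invented for illustration and are not taken from the paper.

```python
# Toy gazetteer whose entity-initial tokens never occur elsewhere.
gazetteer = ["red wine", "white chocolate", "olive oil"]

first_tokens = {name.split()[0] for name in gazetteer}
other_tokens = {tok for name in gazetteer for tok in name.split()[1:]}
unambiguous_starts = first_tokens - other_tokens   # begin-only tokens

def segment_adjacent(entity_tokens):
    """Split a run of entity tokens into adjacent entities at begin tokens."""
    entities, current = [], []
    for tok in entity_tokens:
        if tok in unambiguous_starts and current:
            entities.append(" ".join(current))
            current = []
        current.append(tok)
    if current:
        entities.append(" ".join(current))
    return entities

# "red wine white chocolate" -> ["red wine", "white chocolate"]
print(segment_adjacent("red wine white chocolate".split()))
```
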
Results
  • The authors run two different sets of experiments to explore the impact of compositionality on the task of entity recognition.
  • The first set was meant to find the optimal feature configuration for NNg, and the second one was the comparison of the three main approaches over the six SU datasets
Conclusion
  • The authors have provided experimental evidence that zero-shot entity recognition based on gazetteers performs well.
  • To their knowledge, this is the first time that a neural model has been applied to capture the compositionality of entity names.
  • Given the scarcity of annotated utterances, the proposed approach is attractive for its portability across different domains and languages.
  • As for the future, the authors intend to test the approach on natural utterances
Tables
  • Table 1: IOB annotation of food entities inside a user request (an illustrative example follows this list)
  • Table 2: Gazetteers used in the experiments, described in terms of number of entity names, total number of tokens, average length and standard deviation (SD) of entities, type-token ratio (TTR, normalized by repeated sampling of 200 tokens), type1 and type2 unique-token ratios, and sub-entity ratio
  • Table 3: Examples of intents and corresponding templates used to generate test utterances
  • Table 4: Average F1 and standard deviation for various feature configurations of NNg over the six SG data sets (three domains and two languages)
  • Table 5: Experimental results (F1) over the six domain-language data sets
  • Table 6: Results (F1) of the three approaches according to the number of entities in the SU data sets
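
For readers unfamiliar with the IOB scheme referenced in Table 1, here is an invented example encoded as parallel Python lists; it is not the actual content of the table.

```python
# B-FOOD opens a food entity, I-FOOD continues it, O marks non-entity tokens.
tokens = ["I", "would", "like", "dark",   "chocolate", "and", "red",    "wine"]
tags   = ["O", "O",     "O",    "B-FOOD", "I-FOOD",    "O",   "B-FOOD", "I-FOOD"]
```
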
Funding
  • This work has been partially supported by the AdeptMind scholarship, and by the CBF EIT Digital project
Study subjects and analysis
SG data sets: 6
Experiments with NNg on SG

We run a set of experiments to assess the best feature configuration for the gazetteer-based approach. In Table 4 we report the overall results of NNg using different feature configurations, over the six SG data sets. The topological configuration of NNg is kept constant, as described in Section 4

SU data sets: 5
The NNg version that uses only gazetteer features (i.e., no linguistic knowledge is assumed), although not reported in Table 5, performed more poorly than the version using all features. Still, it is competitive against NNp, outperforming it in five SU data sets out of six and providing an average F1 improvement of 10 points. Finally, in Table 6 we report the results of an additional analysis, where we computed the F1 scores according to the number of entities present in the test sentences (all domains and languages).
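
A minimal sketch of how such a per-entity-count breakdown could be computed, assuming IOB-tagged gold and predicted sequences and the third-party seqeval package; the function name and grouping logic are illustrative, not the authors' evaluation code.

```python
from collections import defaultdict
from seqeval.metrics import f1_score  # pip install seqeval

def f1_by_entity_count(gold_tags, pred_tags):
    """gold_tags / pred_tags: one IOB tag sequence per test sentence.
    Returns span-level F1 for each group of sentences sharing the same
    number of gold entities."""
    groups = defaultdict(lambda: ([], []))
    for gold, pred in zip(gold_tags, pred_tags):
        n_entities = sum(tag.startswith("B-") for tag in gold)
        groups[n_entities][0].append(gold)
        groups[n_entities][1].append(pred)
    return {n: f1_score(g, p) for n, (g, p) in sorted(groups.items())}
```
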

Reference
  • Sadaf Abdul-Rauf, Holger Schwenk, Patrik Lambert, and Mohammad Nawaz. 2016. Empirical use of information retrieval to build synthetic data for SMT domain adaptation. IEEE/ACM Transactions on Audio, Speech & Language Processing 24(4):745–754.
  • Ankur Bapna, Gokhan Tur, Dilek Hakkani-Tur, and Larry Heck. 2017. Towards zero shot frame semantic parsing for domain scaling. In Interspeech 2017.
  • Joe Cheri and Pushpak Bhattacharyya. 2017. Towards harnessing memory networks for coreference resolution. In Proceedings of the 2nd Workshop on Representation Learning for NLP, pages 37–42.
  • George Doddington, Alexis Mitchell, Mark Przybocki, Lance Ramshaw, Stephanie Strassel, and Ralph Weischedel. 2004. The automatic content extraction (ACE) program: tasks, data, and evaluation. In Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC-2004). European Language Resources Association (ELRA), Lisbon, Portugal. ACL Anthology Identifier: L04-1011. http://www.lrec-conf.org/proceedings/lrec2004/pdf/5.pdf.
  • Tome Eftimov, Barbara Korousic Seljak, and Peter Korosec. 2017. A rule-based named-entity recognition method for knowledge extraction of evidence-based dietary recommendations. PLoS ONE 12(6):e0179488.
  • Layla El Asri, Hannes Schulz, Shikhar Sharma, Jeremie Zumer, Justin Harris, Emery Fine, Rahul Mehrotra, and Kaheer Suleman. 2017. Frames: a corpus for adding memory to goal-oriented dialogue systems. In Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue. Association for Computational Linguistics, pages 207–219. http://aclweb.org/anthology/W17-5526.
  • Mohamed Elhoseiny, Babak Saleh, and Ahmed Elgammal. 2013. Write a classifier: Zero-shot learning using purely textual descriptions. In 2013 IEEE International Conference on Computer Vision (ICCV). IEEE, pages 2584–2591.
  • 2017. Key-value retrieval networks for task-oriented dialogue. http://arxiv.org/abs/1705.05414.
  • Shizhu He, Cao Liu, Kang Liu, and Jun Zhao. 2017. Generating natural answers by incorporating copying and retrieving mechanisms in sequence-to-sequence learning. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 199–208.
  • Joseph E. Hoag. 2008. Synthetic Data Generation: Theory, Techniques and Applications. Ph.D. thesis, Fayetteville, AR, USA. AAI3317844.
  • Max Jaderberg, Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. 2014. Synthetic data and artificial neural networks for natural scene text recognition. http://arxiv.org/abs/1406.2227.
  • Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, and Chris Dyer. 2016. Neural architectures for named entity recognition. http://arxiv.org/abs/1603.01360.
  • Mohammad Norouzi, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea Frome, Greg Corrado, and Jeffrey Dean. 2013. Zero-shot learning by convex combination of semantic embeddings. CoRR abs/1312.5650. http://arxiv.org/abs/1312.5650.
  • Emanuele Pianta, Christian Girardi, and Roberto Zanoli. 2008. The TextPro tool suite. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. http://www.lrec-conf.org/proceedings/lrec2008/.
  • Lance A. Ramshaw and Mitchell P. Marcus. 1995. Text chunking using transformation-based learning. CoRR cmp-lg/9505040. http://arxiv.org/abs/cmp-lg/9505040.
  • Brian Richards. 1987. Type/token ratios: What do they really tell us? Journal of Child Language 14(2):201–209.
  • M. Schuster and K. K. Paliwal. 1997. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing 45(11):2673–2681. https://doi.org/10.1109/78.650093.
  • Pararth Shah, Dilek Hakkani-Tur, Gokhan Tur, Abhinav Rastogi, Ankur Bapna, Neha Nayak, and Larry P. Heck. 2018. Building a conversational agent overnight with dialogue self-play. CoRR abs/1801.04871.
  • Richard Socher, Milind Ganjoo, Christopher D. Manning, and Andrew Ng. 2013. Zero-shot learning through cross-modal transfer. In Advances in Neural Information Processing Systems 26. Curran Associates, Inc., pages 935–943. http://papers.nips.cc/paper/5027-zero-shot-learning-through-cross-modal-transfer.pdf.
  • Erik F. Tjong Kim Sang and Fien De Meulder. 2003. Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003, Volume 4. Association for Computational Linguistics, pages 142–147.
  • Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merrienboer, Armand Joulin, and Tomas Mikolov. 2015. Towards AI-complete question answering: A set of prerequisite toy tasks. arXiv preprint arXiv:1502.05698.
  • Sihong Xie, Shaoxiong Wang, and Philip S. Yu. 2016. Active zero-shot learning. In Proceedings of the 25th ACM International Conference on Information and Knowledge Management (CIKM '16). ACM, New York, NY, USA, pages 1889–1892. https://doi.org/10.1145/2983323.2983866.