Active Learning for Coreference Resolution using Discrete Annotation

ACL, pp. 8320-8331, 2020.

Discrete annotation offers an attractive alternative to pairwise annotation for active learning of coreference resolution in low-resource domains.

Abstract:

We improve upon pairwise annotation for active learning in coreference resolution, by asking annotators to identify mention antecedents if a presented mention pair is deemed not coreferent. This simple modification, when combined with a novel mention clustering algorithm for selecting which examples to label, is extremely cost-efficient…
Introduction
  • Coreference resolution is the task of resolving anaphoric expressions to their antecedents.
  • Coreference resolution model: The authors use the span-ranking model introduced by Lee et al. (2017), later implemented in the AllenNLP framework (Gardner et al., 2018).
  • This model computes span embeddings for all possible spans i in a document, and uses them to compute a probability distribution P(y = ant(i)) over the set of all candidate antecedents Y(i) = {K previous mentions in the document} ∪ {ε}, where ε is a dummy antecedent signifying that span i has no antecedent.
  • To incorporate these binary annotations into their clustering coreference model, Sachan et al. (2015) introduced the notion of must-link and cannot-link penalties, which the authors describe and extend in Section 4.
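The antecedent distribution described above can be sketched as a softmax over pairwise coreference scores, with the dummy antecedent ε conventionally fixed at score 0. This is a minimal illustration; `pair_scores` and the function name are assumptions, not the paper's code.

```python
import numpy as np

def antecedent_distribution(pair_scores):
    """Sketch of the span-ranking antecedent distribution (after Lee et al., 2017).

    pair_scores[j] is a hypothetical coreference score s(i, y_j) between
    span i and its j-th candidate antecedent; the dummy antecedent epsilon
    is conventionally assigned a fixed score of 0.
    """
    scores = np.append(pair_scores, 0.0)   # last entry = dummy antecedent epsilon
    exp = np.exp(scores - scores.max())    # numerically stable softmax
    return exp / exp.sum()                 # P(y = ant(i)) over Y(i) ∪ {epsilon}

# Example: three candidate antecedents plus the dummy; probs sums to 1,
# and probs[-1] is the probability that span i has no antecedent.
probs = antecedent_distribution(np.array([1.2, -0.5, 0.3]))
```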
Highlights
  • Coreference resolution is the task of resolving anaphoric expressions to their antecedents
  • Active learning consists of two components: (1) a task-specific learning algorithm, and (2) an iterative sample selection algorithm, which examines the performance of the model trained at the previous iteration and selects samples to add to the annotated set.
  • Our work relies on two main components: a coreference resolution model and a sample selection algorithm
  • The three non-random active learning frameworks outperform the fully-labelled baseline, showing that active learning is more effective for coreference resolution when the annotation budget is limited.
  • Discrete annotation offers an attractive alternative to pairwise annotation in active learning of coreference resolution in low-resource domains.
  • Our work suggests that improvements in annotation interfaces can elicit responses which are highly cost-effective in terms of the obtained performance versus the invested annotation time.
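The sample selection component described above can be sketched as an uncertainty selector that ranks mentions by the entropy of their current antecedent distributions and queries the most uncertain ones. All names here are illustrative assumptions, not the paper's code.

```python
import math

def entropy(dist):
    """Shannon entropy of an antecedent probability distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)

def select_samples(mention_dists, budget):
    """Entropy-based selection sketch: query the mentions whose antecedent
    distributions the current model is least certain about.

    mention_dists maps a mention id to its distribution P(y = ant(i));
    budget is the number of questions to ask this iteration.
    """
    ranked = sorted(mention_dists,
                    key=lambda m: entropy(mention_dists[m]),
                    reverse=True)
    return ranked[:budget]

# m3 is near-uniform (most uncertain), m1 is nearly decided (least uncertain).
dists = {"m1": [0.9, 0.1], "m2": [0.5, 0.5], "m3": [0.34, 0.33, 0.33]}
chosen = select_samples(dists, 2)  # → ["m3", "m2"]
```

In a real loop, the model would be retrained on the newly annotated samples and the selector re-run at each iteration.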
Results
  • Figure 2 plots the performance of discrete annotation with the various selectors from Section 4, against the performance of pairwise annotation, calibrated according to the timing experiments.
  • The three non-random active learning frameworks outperform the fully-labelled baseline, showing that active learning is more effective for coreference resolution when the annotation budget is limited.
  • Figure 2 shows that every nonrandom discrete selection protocol outperforms pairwise annotation.
  • Figure 2 compares discrete annotation, its ablations without clustered probabilities and without incremental link closures, and pairwise annotation.
  • Where the gap in performance is largest (> 15 minutes of annotation per document), the authors consistently improve by ∼4% absolute F1 over pairwise selection.
Conclusion
  • Discrete annotation offers an attractive alternative to pairwise annotation in active learning of coreference resolution in low-resource domains.
  • By adding a simple question to the annotation interface, the authors obtained significantly better models per human-annotation hour.
  • The authors introduced a clustering technique which further optimizes sample selection during the annotation process.
  • The authors' work suggests that improvements in annotation interfaces can elicit responses which are highly cost-effective in terms of the obtained performance versus the invested annotation time.
Tables
  • Table 1: Timing-experiment sampling. For each of the two datasets, we collected 60 active learning questions from 20 documents: 5 documents and 15 questions for each of four categories (models trained with many/few labels per document, early/late in the active learning process). The 15 questions were sampled randomly from within an iteration.
  • Table 2: Average annotation time for the initial pairwise question, the discrete follow-up question, and the discrete question on its own.
  • Table 3: Ablations over the different model elements at a single point (∼315 annotation hours). The entropy selector was used for all experiments.
Reference
  • Pradeep Dasigi, Nelson F. Liu, Ana Marasovic, Noah A. Smith, and Matt Gardner. 2019. Quoref: A reading comprehension dataset with questions requiring coreferential reasoning. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).
  • Matt Gardner, Joel Grus, Mark Neumann, Oyvind Tafjord, Pradeep Dasigi, Nelson F. Liu, Matthew E. Peters, Michael Schmitz, and Luke S. Zettlemoyer. 2018. AllenNLP: A deep semantic natural language processing platform. CoRR, abs/1803.07640.
  • Dan Garrette and Jason Baldridge. 2013. Learning a part-of-speech tagger from two hours of annotation. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 138–147, Atlanta, Georgia. Association for Computational Linguistics.
  • Caroline Gasperin. 2009. Active learning for anaphora resolution. In Proceedings of the NAACL HLT 2009 Workshop on Active Learning for Natural Language Processing, HLT '09, pages 1–8, Stroudsburg, PA, USA. Association for Computational Linguistics.
  • Matthew Honnibal and Ines Montani. 2017. spaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing. To appear.
  • Mahnoosh Kholghi, Laurianne Sitbon, Guido Zuccon, and Anthony Nguyen. 2015. Active learning: a step towards automating medical concept extraction. Journal of the American Medical Informatics Association, 23(2):289–296.
  • Florian Laws, Florian Heimerl, and Hinrich Schütze. 2012. Active learning for coreference resolution. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 508–512, Montreal, Canada. Association for Computational Linguistics.
  • Kenton Lee, Luheng He, Mike Lewis, and Luke S. Zettlemoyer. 2017. End-to-end neural coreference resolution. ArXiv, abs/1707.07045.
  • Sameer Pradhan, Alessandro Moschitti, Nianwen Xue, Olga Uryupina, and Yuchen Zhang. 2012. CoNLL-2012 shared task: Modeling multilingual unrestricted coreference in OntoNotes. In Joint Conference on EMNLP and CoNLL – Shared Task, pages 1–40. Association for Computational Linguistics.
  • Mrinmaya Sachan, Eduard Hovy, and Eric P. Xing. 2015. An active learning approach to coreference resolution. In Proceedings of the 24th International Conference on Artificial Intelligence, IJCAI'15, pages 1312–1318. AAAI Press.
  • Burr Settles. 2010. Active learning literature survey. University of Wisconsin, Madison, 52(55-66):11.
  • Gabriel Stanovsky, Noah A. Smith, and Luke Zettlemoyer. 2019. Evaluating gender bias in machine translation. In ACL, page (to appear), Florence, Italy. Association for Computational Linguistics.
  • A. R. Syed, A. Rosenberg, and E. Kislal. 2016. Supervised and unsupervised active learning for automatic speech recognition of low-resource languages. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 5320–5324.
  • A. R. Syed, A. Rosenberg, and M. Mandel. 2017. Active learning for low-resource speech recognition: Impact of selection size and language modeling data. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 5315–5319.
  • Shanheng Zhao and Hwee Tou Ng. 2014. Domain adaptation with active learning for coreference resolution. In Proceedings of the 5th International Workshop on Health Text Mining and Information Analysis (Louhi), pages 21–29, Gothenburg, Sweden. Association for Computational Linguistics.
Appendix excerpts
  • If pair (a, b) is added to must-link, both the must-link and cannot-link sets need to be updated. First, resolve the must-links by adding an ML relation between every element of A and every element of B.
  • If pair (a, b) is added to cannot-link, only the cannot-link set needs to be updated: add a CL relation between every element of A and every element of B.
  • Clustered query-by-committee: to ensure that a mention which has already been queried is not chosen again, after each user judgment, for every ML(a, b) relation we set V(A(a) = b) = M and V(A(a) = c) = 0 for all other c ≠ b. Moreover, for every CL(a, b) relation, we set V(A(a) = b) = 0, which decreases the vote entropy of a, making it less likely for the selector to choose a.