Treebank Embedding Vectors for Out-of-domain Dependency Parsing

ACL, pp. 8812-8818, 2020.

Abstract:

A recent advance in monolingual dependency parsing is the idea of a treebank embedding vector, which allows all treebanks for a particular language to be used as training data while at the same time allowing the model to prefer training data from one treebank over others and to select the preferred treebank at test time. We build on this...
Introduction
  • The Universal Dependencies project (Nivre et al., 2016) has made available multiple treebanks for the same language annotated according to the same scheme, leading to a new wave of research which explores ways to use multiple treebanks in monolingual parsing (Shi et al., 2017; Sato et al., 2017; Che et al., 2017; Stymne et al., 2018).

    Stymne et al. (2018) introduced treebank embeddings.
  • A single model is trained on the concatenation of the available treebanks for a language, and the input vector for each training token includes the treebank embedding, which encodes the treebank the token comes from (a minimal sketch of this token representation follows this list).
  • Stymne et al. (2018) find that treebank embeddings perform at about the same level as training on multiple treebanks and fine-tuning on one, but argue that the treebank embedding approach is preferable since it results in just one model per language.
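To make the mechanism concrete, the following is a minimal sketch, not the authors' code, of how such a treebank embedding can be appended to every token's input vector; the treebank names, dimensions and the numpy setup are illustrative assumptions.

```python
# Sketch: appending a treebank embedding to each token's input vector.
# Treebank names, dimensions and the random vectors are illustrative only;
# in a real parser the treebank embeddings are learned jointly with the model.
import numpy as np

rng = np.random.default_rng(0)
WORD_DIM, TB_DIM = 100, 12
treebanks = ["cs_pdt", "cs_cac", "cs_fictree"]  # e.g. training treebanks for one language
tb_embedding = {tb: rng.normal(size=TB_DIM) for tb in treebanks}

def token_input(word_vector: np.ndarray, treebank_id: str) -> np.ndarray:
    """Concatenate a token's word vector with the embedding of its source treebank."""
    return np.concatenate([word_vector, tb_embedding[treebank_id]])

# Each training token carries the vector of the treebank it comes from;
# at test time one of the fixed vectors (or an interpolated one) must be chosen.
x = token_input(rng.normal(size=WORD_DIM), "cs_pdt")
print(x.shape)  # (112,)
```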
Highlights
  • We explore the usefulness of interpolated treebank vectors, which are computed as a weighted combination of the predefined fixed ones.
  • We develop a simple k-NN method based on sentence similarity to choose a treebank vector, either fixed or interpolated, for individual sentences or entire test sets; for 9 of our 10 test languages, this matches the performance of the best proxy treebank (a sketch of the interpolation and the k-NN predictor follows this list).
  • The k-NN predictor clearly outperforms the random predictor for English and French, but not for Czech, suggesting that the treebank vector itself plays less of a role for Czech, perhaps due to high domain overlap between the treebanks.
  • In experiments with Czech, English and French, we investigated treebank embedding vectors, exploring the ideas of interpolated vectors and vector weight prediction.
  • Labelled attachment score (LAS) is usually constant over large areas of the treebank vector weight space, with clear, sharp steps between LAS levels.
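The sketch below illustrates the two ideas from the highlights: an interpolated treebank vector formed as a weighted combination of the fixed vectors, and a simple k-NN predictor that averages the weights of the most similar training sentences. It is an assumption-laden illustration, not the authors' implementation; the sentence representation, the notion of "best-known weights" per sentence, and all names are hypothetical.

```python
# Sketch: interpolated treebank vectors and k-NN weight prediction.
# All data below are random stand-ins; a real system would use actual sentence
# representations and weights found to give the best LAS on held-out sentences.
import numpy as np

rng = np.random.default_rng(1)
TB_DIM = 12
fixed_vectors = np.stack([rng.normal(size=TB_DIM) for _ in range(3)])  # one per training treebank

def interpolate(weights: np.ndarray) -> np.ndarray:
    """Weighted combination of the fixed treebank vectors (weights sum to 1)."""
    return weights @ fixed_vectors

def knn_predict_weights(query_repr, train_reprs, train_weights, k=5):
    """Average the weight vectors of the k training sentences most similar to the query
    (cosine similarity over some sentence representation)."""
    sims = train_reprs @ query_repr / (
        np.linalg.norm(train_reprs, axis=1) * np.linalg.norm(query_repr) + 1e-9)
    top = np.argsort(-sims)[:k]
    w = train_weights[top].mean(axis=0)
    return w / w.sum()

# Toy usage with random stand-in data:
train_reprs = rng.normal(size=(50, 300))             # sentence representations
train_weights = rng.dirichlet(np.ones(3), size=50)   # best-known weights per training sentence
query = rng.normal(size=300)
weights = knn_predict_weights(query, train_reprs, train_weights)
tb_vector = interpolate(weights)                     # treebank vector used for this sentence
```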
Results
  • The development results, averaged over the four development sets for each language, are shown in Tables 1 and 2. As discussed above, upper bounds for k-NN prediction are calculated by including an oracle setting in which the query item is added to the set of items to be retrieved and k is restricted to 1 (see the sketch after this list).
  • The authors are curious to see what happens when an equal combination of the three fixed vectors is used, and when treebank vectors are selected at random.
  • Table 1 shows the se-se (per-sentence) results.
  • The oracle k-NN results indicate substantial room for improvement for the predictor, as well as the potential of interpolated vectors, since the results improve as the sample space is extended beyond the three fixed vectors.
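For concreteness, a small sketch of the comparison settings mentioned above: the oracle k-NN with k = 1 includes the query item in the retrieval set and therefore simply returns that item's best-known weights, while the baselines use an equal combination of the three fixed vectors or a randomly chosen one. Function names are hypothetical.

```python
# Sketch of the oracle and baseline settings (illustrative assumptions only).
import numpy as np

rng = np.random.default_rng(2)

def oracle_choice(query_idx, best_weights):
    # k-NN with k = 1 over a retrieval set containing the query itself:
    # the nearest neighbour is the query, so its best-known weights come back.
    return best_weights[query_idx]

def equal_combination(n_treebanks=3):
    # Equal weight on each of the fixed treebank vectors.
    return np.full(n_treebanks, 1.0 / n_treebanks)

def random_fixed_vector(n_treebanks=3):
    # One-hot weights: pick one of the fixed treebank vectors at random.
    w = np.zeros(n_treebanks)
    w[rng.integers(n_treebanks)] = 1.0
    return w
```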
Conclusion
  • In experiments with Czech, English and French, the authors investigated treebank embedding vectors, exploring the ideas of interpolated vectors and vector weight prediction.
  • Testing on PUD languages, the authors match the performance of the best fixed treebank embedding vector in nine of ten cases within the bounds of statistical significance, and in five cases match it exactly.
  • On the whole, the predictor is not yet good enough to find interpolated treebank vectors that are clearly superior to the basic fixed vectors, even though the oracle runs show that such vectors exist.
  • The authors plan to explore other methods to predict treebank vectors, e.g. neural sequence modelling, and to apply the ideas to the related task of language embedding prediction for zero-shot learning.
Tables
  • Table 1: Development set LAS with per-sentence treebank vectors.
  • Table 2: Development set LAS with one treebank vector for all input sentences. A k-NN oracle with k = 1 knows exactly which treebank vector is best for each test item, while a basic k-NN model has to predict the best vector based on the training data. In the tr-tr setting, our k-NN classifier selects one of the three training treebanks for the fourth, test treebank. In the oracle k-NN setting, it selects the test treebank itself and parses the sentences in that treebank with its best-performing treebank vector. When the treebank vector sample space is limited to the vectors for the three training treebanks (fixed), this method is the same as the best-proxy method of Stymne et al. (2018).
  • Table 3: PUD test set results. Statistically significant differences between proxy-best and our best method are marked with †.
Funding
  • This research is supported by Science Foundation Ireland through the ADAPT Centre for Digital Content Technology, which is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.
References
  • Miguel Ballesteros, Chris Dyer, and Noah A. Smith. 2015. Improved transition-based parsing by modeling characters instead of words with lstms. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 349–359, Lisbon, Portugal. Association for Computational Linguistics.
  • Wanxiang Che, Jiang Guo, Yuxuan Wang, Bo Zheng, Huaipeng Zhao, Yang Liu, Dechuan Teng, and Ting Liu. 2017. The HIT-SCIR system for end-to-end parsing of universal dependencies. In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 52–62. Association for Computational Linguistics.
  • Wanxiang Che, Yijia Liu, Yuxuan Wang, Bo Zheng, and Ting Liu. 2018. Towards better UD parsing: Deep contextualized word embeddings, ensemble, and treebank concatenation. In Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 55–64, Brussels, Belgium. Association for Computational Linguistics.
  • Danqi Chen and Christopher Manning. 2014. A fast and accurate dependency parser using neural networks. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 740–750. Association for Computational Linguistics.
  • Eliyahu Kiperwasser and Yoav Goldberg. 2016. Simple and accurate dependency parsing using bidirectional lstm feature representations. Transactions of the Association for Computational Linguistics, 4:313–327.
  • Miryam de Lhoneux, Sara Stymne, and Joakim Nivre. 2017. Arc-hybrid non-projective dependency parsing with a static-dynamic oracle. In Proceedings of the 15th International Conference on Parsing Technologies, pages 99–104, Pisa, Italy. Association for Computational Linguistics.
  • Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Yoav Goldberg, Jan Hajic, Christopher D Manning, Ryan McDonald, Slav Petrov, Sampo Pyysalo, Natalia Silveira, Reut Tsarfaty, and Daniel Zeman. 2016. Universal dependencies v1: A multilingual treebank collection. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), pages 1659–1666, Paris, France. European Language Resources Association (ELRA).
  • Matthew Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 2227–2237, New Orleans, Louisiana. Association for Computational Linguistics.
  • Daniel Zeman, Jan Hajic, Martin Popel, Martin Potthast, Milan Straka, Filip Ginter, Joakim Nivre, and Slav Petrov. 2018. CoNLL 2018 shared task: Multilingual parsing from raw text to universal dependencies. In Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 1–21, Brussels, Belgium. Association for Computational Linguistics.
  • Motoki Sato, Hitoshi Manabe, Hiroshi Noji, and Yuji Matsumoto. 2017. Adversarial training for cross-domain universal dependency parsing. In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 71–79. Association for Computational Linguistics.
  • Tianze Shi, Felix G. Wu, Xilun Chen, and Yao Cheng. 2017. Combining global models for parsing universal dependencies. In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 31–39. Association for Computational Linguistics.
  • Sara Stymne, Miryam de Lhoneux, Aaron Smith, and Joakim Nivre. 2018. Parser training with heterogeneous treebanks. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 619–625. Association for Computational Linguistics.