XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization

EMNLP 2020, pp. 7193–7206 (2020)

Abstract

The ability to correctly model distinct meanings of a word is crucial for the effectiveness of semantic representation techniques. However, most existing evaluation benchmarks for assessing this criterion are tied to sense inventories (usually WordNet), restricting their usage to a small subset of knowledge-based representation techniques...

Introduction
  • One of the desirable properties of contextualized models, such as BERT (Devlin et al., 2019) and its derivatives, lies in their ability to associate dynamic representations with words, i.e., embeddings that can change depending on the context (see the short sketch after this list)
  • This provides the basis for the model to distinguish different meanings of words without the need to resort to an explicit sense disambiguation step.
  • Despite allowing a significantly wider range of models for direct WSD evaluations, WiC is limited to the English language only, preventing the evaluation of models in other languages and in cross-lingual settings
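The contrast between static and dynamic representations is easy to see empirically. The sketch below is an illustration added here, not the authors' code: it encodes the same word in two contexts with multilingual BERT via the Hugging Face transformers library and compares the two target-word vectors. The model name and the example sentences are assumptions made for the demo.

    # Minimal sketch: the same surface word receives different contextual vectors.
    # Model name and example sentences are illustrative assumptions.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
    model = AutoModel.from_pretrained("bert-base-multilingual-cased")

    def target_vector(sentence: str, target: str) -> torch.Tensor:
        """Contextual embedding of the first sub-token of `target` in `sentence`."""
        enc = tokenizer(sentence, return_tensors="pt")
        first_subtoken = tokenizer(target, add_special_tokens=False)["input_ids"][0]
        ids = enc["input_ids"][0].tolist()
        pos = ids.index(first_subtoken)  # position of the target's first sub-token
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]  # (seq_len, hidden_size)
        return hidden[pos]

    v1 = target_vector("He sat on the bank of the river.", "bank")
    v2 = target_vector("She deposited the money at the bank.", "bank")
    print(torch.cosine_similarity(v1, v2, dim=0).item())  # typically well below 1.0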
Highlights
  • One of the desirable properties of contextualized models, such as BERT (Devlin et al., 2019) and its derivatives, lies in their ability to associate dynamic representations with words, i.e., embeddings that can change depending on the context
  • Evaluation benchmarks for WSD are usually tied to external sense inventories (often WordNet (Fellbaum, 1998)), making it extremely difficult to evaluate systems that do not explicitly model sense distinctions in the inventory, effectively restricting the benchmark to inventory-based sense representation techniques and WSD systems
  • Even though current language models are effective performers in the zero-shot cross-lingual setting, there is still room for improvement, especially for distant languages such as Japanese or Korean
Methods
  • The authors implemented a simple, yet effective, baseline based on a Transformer-based text encoder (Vaswani et al., 2017) and a logistic regression classifier, following Wang et al. (2019); a minimal sketch of this baseline follows this list.
  • The encoded representations of the target words are concatenated and fed to the logistic classifier
  • For those cases where the target word was split by the tokenizer into multiple sub-tokens, the authors followed Devlin et al. (2019) and considered the representation of its first sub-token.
  • As regards the text encoder, the authors carried out the experiments with three different multilingual models, i.e., the multilingual version of BERT (Devlin et al., 2019) and the base and large versions of XLM-RoBERTa (Conneau et al., 2020) (XLMR-base and XLMR-large, respectively).
  • As for all the other languages covered by the WordNet datasets, i.e., Bulgarian, Chinese, Croatian, Danish, Dutch, Estonian, Japanese and Korean, the authors used the pre-trained models made available by TurkuNLP. The authors refer to each language-specific model as L-BERT
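The following is a minimal sketch of the baseline described above, assuming PyTorch and the Hugging Face transformers library: each context is encoded, the first-sub-token vector of the target word is extracted from each, the two vectors are concatenated, and a logistic-regression head predicts whether the target word carries the same meaning. The model name, class name and hyper-parameters are illustrative assumptions, not the authors' released code.

    # Sketch of the WiC baseline: text encoder + logistic regression over the
    # concatenated target-word representations (first sub-token of each context).
    # Model name and all hyper-parameters are illustrative assumptions.
    import torch
    import torch.nn as nn
    from transformers import AutoModel, AutoTokenizer

    class WicBaseline(nn.Module):
        def __init__(self, model_name: str = "xlm-roberta-base"):
            super().__init__()
            self.encoder = AutoModel.from_pretrained(model_name)
            hidden = self.encoder.config.hidden_size
            # Logistic regression = one linear layer + sigmoid over 2*hidden features.
            self.classifier = nn.Linear(2 * hidden, 1)

        def forward(self, ctx1, ctx2, pos1, pos2):
            # pos1/pos2: index of the target word's first sub-token in each context.
            h1 = self.encoder(**ctx1).last_hidden_state  # (batch, seq, hidden)
            h2 = self.encoder(**ctx2).last_hidden_state
            rows = torch.arange(h1.size(0))
            pair = torch.cat([h1[rows, pos1], h2[rows, pos2]], dim=-1)
            return torch.sigmoid(self.classifier(pair)).squeeze(-1)

    # Illustrative usage on a single instance (target-word positions are placeholders):
    tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
    model = WicBaseline()
    ctx1 = tok("He sat on the bank of the river.", return_tensors="pt")
    ctx2 = tok("She deposited the money at the bank.", return_tensors="pt")
    prob_same_sense = model(ctx1, ctx2, torch.tensor([5]), torch.tensor([8]))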
Results
  • The authors report the results for the configurations discussed in the previous section on the XL-WiC benchmark.
  • Table 4 shows results on the XL-WiC WordNet test sets when only English training data is used.
  • Train: EN+Target Language – Dev: EN.
  • Train: EN+All Languages – Dev: EN.
Conclusion
  • In this paper the authors have introduced XL-WiC, a large benchmark for evaluating context-sensitive models.
  • XL-WiC extends WiC (Pilehvar and Camacho-Collados, 2019) to new languages, providing an evaluation framework for contextualized models in those languages, and for experimentation in a cross-lingual transfer setting.
  • Even though current language models are effective performers in the zero-shot cross-lingual setting, there is still room for improvement, especially for distant languages such as Japanese or Korean.
  • While in the comparative analysis the authors have focused on a quantitative evaluation for all languages, an additional error analysis per language would be beneficial in revealing the weaknesses and limitations of cross-lingual models
Tables
  • Table1: Sample instances from XL-WiC for different languages
  • Table2: Statistics for WordNet and Wiktionary datasets for different languages
  • Table3: Human performance (in terms of accuracy) for different languages in XL-WiC. ∗From the original WiC dataset
  • Table4: Results on the WordNet test sets when using only English training data in WiC, either in zero-shot crosslingual setting (top block) or translation-based settings (the lower two blocks). T-EN is a target language dataset, automatically constructed by translating English instances in WiC
  • Table5: Results on the WordNet test sets when using language-specific data, either for training or for tuning
  • Table6: Results on the Wiktionary test sets in different training settings: zero-shot (Z-Shot) and monolingual training (Mono). L-BERT stands for language-specific models, i.e., BERT-de, CamemBERT-large and BERT-it for German, French and Italian, respectively
  • Table7: Results on the in-vocabulary (IV) and out-of-vocabulary (OOV) Wiktionary test sets. L-BERT stands for each language-specific model, i.e., BERT-base-de, CamemBERT and BERT-base-xxl-it for German, French and Italian, respectively
  • Table8: Statistics and comparison between the Italian WordNet and the Italian Wiktionary WiC datasets. Zero-shot results are computed by using the original English WiC (Pilehvar and Camacho-Collados, 2019) for training and development
  • Table9: Number of parameters for our comparison systems
  • Table10: Zero-shot results on mBERT, XLMR-base and XLMR-large on the WordNet-based datasets when using the English WiC training and development sets
  • Table11: Results on mBERT, XLMR-base and XLMR-large on the Wiktionary-based datasets when using the language-specific training and development data
  • Table12: Results on the WordNet test sets when using automatically-translated data with a multilingual dictionary-alignment technique
  • Table13: BLEU score of the translation models
Related work
  • XL-WiC is a benchmark for inventory-independent evaluation of WSD models (Section 2.1), while the multilingual nature of the dataset makes it an interesting resource for experimenting with crosslingual transfer (Section 2.2).

    2.1 Word Sense Disambiguation

    The ability to identify the intended sense of a polysemous word in a given context is one of the fundamental problems in lexical semantics. It is usually addressed with two kinds of approaches, relying either on sense-annotated corpora (Bevilacqua and Navigli, 2020; Scarlini et al., 2020; Blevins and Zettlemoyer, 2020) or on knowledge bases (Moro et al., 2014; Agirre et al., 2014; Scozzafava et al., 2020). Both are usually evaluated on dedicated benchmarks, including at least five WSD tasks in the Senseval and SemEval series, from 2001 (Edmonds and Cotton, 2001) to 2015 (Moro and Navigli, 2015a), that are included in the test suite of Raganato et al. (2017). All these tasks are framed as classification problems, where disambiguating a word is defined as selecting one of its predefined senses listed by a sense inventory. This brings about different limitations, such as restricting senses only to those defined by the inventory, or forcing the WSD system to explicitly model sense distinctions at the granularity level defined by the inventory.

    Stanford Contextual Word Similarity (Huang et al., 2012) is one of the first datasets that focuses on ambiguity outside the boundaries of sense inventories, framing it as a similarity measurement between two words in their contexts. Pilehvar and Camacho-Collados (2019) highlighted some of the limitations of that dataset which prevent a reliable evaluation, and proposed the Word-in-Context (WiC) dataset. WiC is the dataset closest to ours: it provides around 10K instances (1,400 instances for 1,184 unique target nouns and verbs in the test set), but for the English language only.
Funding
  • Alessandro gratefully acknowledges the support of the FoTran project, funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 771113), and the CSC – IT Center for Science, Finland, for computational resources
  • Tommaso gratefully acknowledges the support of the ERC Consolidator Grant MOUSSE No 726487 under the European Union’s Horizon 2020 research and innovation programme
Study subjects and analysis
datasets: 8
Each instance is assigned a True or False label, depending on whether the intended meanings of the word in the two contexts coincide. Table 3 reports human performance for eight datasets in XL-WiC. All accuracy figures are around 80%, i.e., in the same ballpark as the original WiC English dataset, which attests to the reliability of the underlying resources and of the construction procedure
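For concreteness, here is a small sketch of how a WiC-style instance and the accuracy metric used throughout the paper can be represented; the field names and the example are assumptions for illustration, not the official XL-WiC file format.

    # Illustrative representation of a WiC-style instance and accuracy scoring.
    # Field names are assumptions, not the official XL-WiC data format.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class WicInstance:
        target: str     # the word whose meaning is compared
        context1: str   # first sentence containing the target word
        context2: str   # second sentence containing the target word
        label: bool     # True if the target has the same meaning in both contexts

    def accuracy(predictions: List[bool], instances: List[WicInstance]) -> float:
        correct = sum(p == inst.label for p, inst in zip(predictions, instances))
        return correct / len(instances)

    # Example with a single made-up German instance:
    data = [WicInstance("Bank", "Er saß auf der Bank im Park.",
                        "Sie brachte das Geld zur Bank.", label=False)]
    print(accuracy([False], data))  # 1.0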

Reference
  • Eneko Agirre, Oier Lopez de Lacalle, and Aitor Soroa. 2014. Random walks for knowledge-based word sense disambiguation. Computational Linguistics, 40(1):57–84.
  • Antonios Anastasopoulos and Graham Neubig. 2020. Should all cross-lingual embeddings speak English? In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 8658–8679, Online. Association for Computational Linguistics.
  • Mikel Artetxe, Sebastian Ruder, and Dani Yogatama. 2020a. On the cross-lingual transferability of monolingual representations. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4623–4637, Online. Association for Computational Linguistics.
  • Mikel Artetxe, Sebastian Ruder, Dani Yogatama, Gorka Labaka, and Eneko Agirre. 2020b. A call for more rigor in unsupervised cross-lingual learning. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7375–7388, Online. Association for Computational Linguistics.
  • Jordi Atserias, Luís Villarejo, and German Rigau. 2004. Spanish WordNet 1.6: Porting the Spanish Wordnet across Princeton versions. In Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04), Lisbon, Portugal. European Language Resources Association (ELRA).
  • Siamak Barzegar, Brian Davis, Manel Zarrouk, Siegfried Handschuh, and Andre Freitas. 2018. SemR-11: A multi-lingual gold-standard for semantic similarity and relatedness for eleven languages. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018), Miyazaki, Japan. European Languages Resources Association (ELRA).
  • Laura Benítez, Sergi Cervell, Gerard Escudero, Monica Lopez, German Rigau, and Mariona Taule. 1998. Methods and tools for building the Catalan WordNet. In Proceedings of the ELRA Workshop on Language Resources for European Minority Languages.
  • Michele Bevilacqua and Roberto Navigli. 2020. Breaking through the 80% Glass Ceiling: Raising the State of the Art in Word Sense Disambiguation by Incorporating Knowledge Graph Information. In Proc. of ACL, pages 2854–2864.
  • Terra Blevins and Luke Zettlemoyer. 2020. Moving Down the Long Tail of Word Sense Disambiguation with Gloss Informed Bi-encoders. In Proc. of ACL, pages 1006–1017.
  • Francis Bond and Kyonghee Paik. 2012. A survey of wordnets and their licenses. In GWC 2012 6th International Global Wordnet Conference, volume 8, pages 64–71.
  • Jose Camacho-Collados, Mohammad Taher Pilehvar, Nigel Collier, and Roberto Navigli. 2017. SemEval-2017 task 2: Multilingual and cross-lingual semantic word similarity. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 15–26, Vancouver, Canada. Association for Computational Linguistics.
  • Daniel Cer, Mona Diab, Eneko Agirre, Inigo Lopez-Gazpio, and Lucia Specia. 2017. SemEval-2017 task 1: Semantic textual similarity multilingual and crosslingual focused evaluation. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 1–14, Vancouver, Canada. Association for Computational Linguistics.
  • Christos Christodouloupoulos and Mark Steedman. 2015. A massively parallel corpus: the bible in 100 languages. Language resources and evaluation, 49(2):375–395.
  • Jonathan H. Clark, Eunsol Choi, Michael Collins, Dan Garrette, Tom Kwiatkowski, Vitaly Nikolaev, and Jennimaria Palomaki. 2020. TyDi QA: A benchmark for information-seeking question answering in typologically diverse languages. Transactions of the Association for Computational Linguistics.
  • Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzman, Edouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin Stoyanov. 2020. Unsupervised cross-lingual representation learning at scale. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 8440– 8451, Online. Association for Computational Linguistics.
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186.
  • Cecily Jill Duffield, Jena D Hwang, Susan Windisch Brown, Dmitriy Dligach, Sarah Vieweg, Jenny Davis, and Martha Palmer. 2007. Criteria for the manual grouping of verb senses. In Proceedings of the Linguistic Annotation Workshop, pages 49–52.
  • Philip Edmonds and Scott Cotton. 2001. SENSEVAL-2: Overview. In Proceedings of SENSEVAL-2 Second International Workshop on Evaluating Word Sense Disambiguation Systems, pages 1–5, Toulouse, France. Association for Computational Linguistics.
  • Mehrdad Farahani, Mohammad Gharachorloo, Marzieh Farahani, and Mohammad Manthouri. 2020. Parsbert: Transformer-based model for persian language understanding.
  • Christiane Fellbaum, editor. 1998. WordNet: An Electronic Database. MIT Press, Cambridge, MA.
  • Darja Fiser, Jernej Novak, and Tomaz Erjavec. 2012. slownet 3.0: development, extension and cleaning. In Proceedings of 6th International Global Wordnet Conference (GWC 2012), pages 113–117.
  • Hamidreza Ghader and Christof Monz. 2017. What does attention in neural machine translation pay attention to? In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 30–39, Taipei, Taiwan. Asian Federation of Natural Language Processing.
  • Xavier Gomez Guinovart. 2011. Galnet: Wordnet 3.0 do galego. Linguamatica, 3(1):61–67.
  • Birgit Hamp and Helmut Feldweg. 1997. Germanet-a lexical-semantic net for german. In Automatic information extraction and building of lexical semantic resources for NLP applications.
  • Daniel Hershcovich, Zohar Aizenbud, Leshem Choshen, Elior Sulem, Ari Rappoport, and Omri Abend. 2019. SemEval-2019 task 1: Cross-lingual semantic parsing with UCCA. In Proceedings of the 13th International Workshop on Semantic Evaluation, pages 1–10, Minneapolis, Minnesota, USA. Association for Computational Linguistics.
  • Junjie Hu, Sebastian Ruder, Aditya Siddhant, Graham Neubig, Orhan Firat, and Melvin Johnson. 2020. XTREME: A massively multilingual multitask benchmark for evaluating cross-lingual generalization. Proceedings of the 37th International Conference on Machine Learning (ICML).
  • Chu-Ren Huang, Shu-Kai Hsieh, Jia-Fei Hong, Yun-Zhu Chen, I-Li Su, Yong-Xiang Chen, and Sheng-Wei Huang. 2010. Chinese Wordnet: Design, Implementation and Application of an Infrastructure for Cross-Lingual Knowledge Processing. Journal of Chinese Information Processing, 24(2):14.
  • Eric Huang, Richard Socher, Christopher Manning, and Andrew Ng. 2012. Improving word representations via global context and multiple word prototypes. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 873–882, Jeju Island, Korea. Association for Computational Linguistics.
  • Hitoshi Isahara, Francis Bond, Kiyotaka Uchimoto, Masao Utiyama, and Kyoko Kanzaki. 2008. Development of the Japanese WordNet. In Sixth International conference on Language Resources and Evaluation.
  • Marcin Junczys-Dowmunt, Roman Grundkiewicz, Tomasz Dwojak, Hieu Hoang, Kenneth Heafield, Tom Neckermann, Frank Seide, Ulrich Germann, Alham Fikri Aji, Nikolay Bogoychev, Andre F. T. Martins, and Alexandra Birch. 2018. Marian: Fast neural machine translation in C++. In Proceedings of ACL 2018, System Demonstrations, pages 116– 121, Melbourne, Australia. Association for Computational Linguistics.
  • Diederik P. Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings.
  • Philipp Koehn and Rebecca Knowles. 2017. Six challenges for neural machine translation. In Proceedings of the First Workshop on Neural Machine Translation, pages 28–39, Vancouver. Association for Computational Linguistics.
  • Patrick Lewis, Barlas Oguz, Ruty Rinott, Sebastian Riedel, and Holger Schwenk. 2020. MLQA: Evaluating cross-lingual extractive question answering. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7315– 7330, Online. Association for Computational Linguistics.
  • Xintong Li, Guanlin Li, Lemao Liu, Max Meng, and Shuming Shi. 2019. On the word alignment from neural machine translation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1293–1303, Florence, Italy. Association for Computational Linguistics.
  • Louis Martin, Benjamin Muller, Pedro Javier Ortiz Suárez, Yoann Dupont, Laurent Romary, Éric de la Clergerie, Djamé Seddah, and Benoît Sagot. 2020. CamemBERT: a tasty French language model. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7203–7219, Online. Association for Computational Linguistics.
  • Marton Mihaltz, Csaba Hatvani, Judit Kuti, Gyorgy Szarvas, Janos Csirik, Gabor Proszeky, and Tamas Varadi. 2008. Methods and Results of the Hungarian WordNet Project. In Proceedings of The Fourth Global WordNet Conference, pages 311–321.
  • George A Miller. 1995. WordNet: a lexical database for english. Communications of the ACM, 38(11):39–41.
  • Andrea Moro and Roberto Navigli. 2015a. SemEval-2015 task 13: Multilingual all-words sense disambiguation and entity linking. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), pages 288–297, Denver, Colorado. Association for Computational Linguistics.
  • Andrea Moro and Roberto Navigli. 2015b. SemEval-2015 task 13: Multilingual all-words sense disambiguation and entity linking. In Proceedings of SemEval-2015.
  • Andrea Moro, Alessandro Raganato, and Roberto Navigli. 2014. Entity Linking meets Word Sense Disambiguation: a Unified Approach. Transactions of the Association for Computational Linguistics (TACL), 2:231–244.
  • Roberto Navigli. 2009. Word Sense Disambiguation: A survey. ACM Computing Surveys, 41(2):1–69.
  • Roberto Navigli, David Jurgens, and Daniele Vannella. 2013. SemEval-2013 task 12: Multilingual word sense disambiguation. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013), pages 222–231, Atlanta, Georgia, USA. Association for Computational Linguistics.
  • Roberto Navigli and Simone Paolo Ponzetto. 2012. BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network. Artificial Intelligence, 193:217–250.
  • Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318, Philadelphia, Pennsylvania, USA. Association for Computational Linguistics.
  • Bolette S. Pedersen, Sanni Nimb, Jørg Asmussen, Nicolai Hartvig Sørensen, Lars Trap-Jensen, and Henrik Lorentzen. 2009. DanNet: the challenge of compiling a wordnet for Danish by reusing a monolingual dictionary. Language Resources and Evaluation, 43:269–299.
  • Emanuele Pianta, Luisa Bentivogli, and Christian Girardi. 2002. MultiWordNet: Developing an Aligned Multilingual Database. In Proceedings of the 1st International Conference on Global WordNet, pages 293–302.
  • Mohammad Taher Pilehvar and Jose Camacho-Collados. 2019. WiC: the Word-in-Context dataset for evaluating context-sensitive meaning representations. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 1267–1273, Minneapolis, Minnesota.
  • Eli Pociello, Antton Gurrutxaga, Eneko Agirre, Izaskun Aldezabal, and German Rigau. 2008. WNTERM: Enriching the MCR with a terminological dictionary. In LREC 2008.
  • Marten Postma, Emiel van Miltenburg, Roxane Segers, Anneleen Schoen, and Piek Vossen. 2016. Open Dutch WordNet. In Proceedings of the Eight Global Wordnet Conference.
  • Quentin Pradet, Gaël de Chalendar, and Jeanne Desormeaux Baguenier. 2014. WoNeF, an improved, expanded and evaluated automatic French translation of WordNet. In Proceedings of the Seventh Global Wordnet Conference (GWC2014), pages 32–39.
  • Peng Qi, Yuhao Zhang, Yuhui Zhang, Jason Bolton, and Christopher D. Manning. 2020. Stanza: A Python Natural Language Processing Toolkit for Many Human Languages. In Association for Computational Linguistics (ACL) System Demonstrations.
  • Ida Raffaelli, Marko Tadic, Bozo Bekavac, and Zeljko Agic. 2008. Building croatian wordnet. In Fourth global wordnet conference (gwc 2008).
  • Alessandro Raganato, Jose Camacho-Collados, and Roberto Navigli. 2017. Word Sense Disambiguation: A Unified Evaluation Framework and Empirical Comparison. In Proc. of EACL, pages 99–110.
  • Ervin Ruci. 2008. On the current state of Albanet and related applications. Technical report, University of Vlora. (http://fjalnet.com/technicalreportalbanet.pdf).
  • Bianca Scarlini, Tommaso Pasini, and Roberto Navigli. 2020. With More Contexts Comes Better Performance: Contextualized Sense Embeddings for All-Round Word Sense Disambiguation. In Proc. of EMNLP.
  • Karin Kipper Schuler, Anna Korhonen, and Susan Brown. 2009. VerbNet overview, extensions, mappings and applications. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Tutorial Abstracts, NAACL-Tutorials ’09, page 13–14, USA. Association for Computational Linguistics.
  • Federico Scozzafava, Marco Maru, Fabrizio Brignone, Giovanni Torrisi, and Roberto Navigli. 2020. Personalized pagerank with syntagmatic information for multilingual word sense disambiguation. In Proc. of ACL, pages 37–46.
  • Mehrnoush Shamsfard, Akbar Hesabi, Hakimeh Fadaei, Niloofar Mansoory, Payam Noor, Ali Reza Gholi Famian, Somayeh Bagherbeigi, Elham Fekri, and Maliheh Monshizadeh. 2010. Semi automatic development of FarsNet; the Persian WordNet. In Proceedings of 5th global WordNet conference.
  • Kiril Simov and Petya Osenova. 2010. Constructing of an Ontology-based Lexicon for Bulgarian. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC’10), Valletta, Malta. European Language Resources Association (ELRA).
  • Jorg Tiedemann. 2012. Parallel data, tools and interfaces in OPUS. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC’12), pages 2214–2218, Istanbul, Turkey. European Language Resources Association (ELRA).
  • Jorg Tiedemann and Santhosh Thottingal. 2020. OPUS-MT — Building open translation services for the World. In Proceedings of the 22nd Annual Conference of the European Association for Machine Translation (EAMT), Lisbon, Portugal.
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in neural information processing systems, pages 5998–6008.
  • Kadri Vider and Heili Orav. 2002. Estonian WordNet and Lexicography. In Proceedings of the Eleventh International Symposium on Lexicography, pages 549–555.
  • Alex Wang, Yada Pruksachatkun, Nikita Nangia, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman. 2019. SuperGLUE: A stickier benchmark for general-purpose language understanding systems. In Proceedings of NeurIPS.
  • Ae-Sun Yoon, Soon-Hee Hwang, Eun-Ryoung Lee, and Hyuk-Chul Kwon. 2009. Construction of Korean WordNet. Journal of KIISE: Software and Applications, 36(1):92–108.