
The Secret is in the Spectra: Predicting Cross-lingual Task Performance with Spectral Similarity Measures

EMNLP 2020, pp. 2377–2390


Abstract

Performance in cross-lingual NLP tasks is impacted by the (dis)similarity of the languages at hand: e.g., previous work has suggested there is a connection between the expected success of bilingual lexicon induction (BLI) and the assumption of (approximate) isomorphism between monolingual embedding spaces. In this work we present a large-scale study focused on the correlations between monolingual embedding space similarity and task performance, covering thousands of language pairs and four different tasks: bilingual lexicon induction, parsing, POS tagging and machine translation.

Introduction
  • The effectiveness of joint multilingual modeling and cross-lingual transfer in cross-lingual NLP is critically impacted by the actual languages in consideration (Bender, 2011; Ponti et al., 2019).
  • Selecting suitable source languages is a prerequisite for successful cross-lingual transfer of dependency parsers or POS taggers (Naseem et al., 2012; Ponti et al., 2018; de Lhoneux et al., 2018)
  • In another example, with all other factors kept similar, the quality of machine translation depends heavily on the properties and language proximity of the actual language pair (Kudugunta et al., 2019).
  • The authors derive measures for the isomorphism between two embedding spaces based on these statistics
Highlights
  • The effectiveness of joint multilingual modeling and cross-lingual transfer in cross-lingual NLP is critically impacted by the actual languages in consideration (Bender, 2011; Ponti et al., 2019)
  • We further show that our findings generalize beyond bilingual lexicon induction (BLI), to cross-lingual transfer in dependency parsing and POS tagging, and we demonstrate strong correlations with machine translation (MT) performance
  • The only exception is the MT task, where our measures fall short of typological distance (TYP); even so, they still hold a strong advantage over the baseline Gromov-Hausdorff distance (GH) and Isospectrality (IS) measures, which do not seem to capture any useful language similarity properties needed for the MT task
  • This work introduces two spectral-based measures, Singular Value Gap (SVG) and the effective condition number harmonic mean (ECOND-HM), that excel in predicting performance on a variety of cross-lingual tasks
  • Both measures leverage information from singular values in different ways: ECOND-HM uses the ratio between two singular values, and is grounded in linear algebra and numerical analysis (Blum, 2014; Roy and Vetterli, 2007), while SVG directly utilizes the full range of singular values
  • While the spectral methods are computed solely on word vectors from Wikipedia, the results in the downstream tasks are computed with different sets of embeddings, or the embeddings are learnt during training
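The two spectral statistics above can be sketched in a few lines. This is a minimal illustration under our reading of the definitions summarized here, not the authors' released code: we take SVG as the squared gap between sorted log singular values of the two embedding matrices, and ECOND-HM as the harmonic mean of the two spaces' effective condition numbers, with the effective rank of Roy and Vetterli (2007) derived from the entropy of the normalized singular values. All function names are ours.

```python
import numpy as np

def svg(X1, X2):
    """Singular Value Gap (sketch): squared difference between the
    sorted log singular values of two embedding matrices."""
    s1 = np.linalg.svd(X1, compute_uv=False)  # returned sorted descending
    s2 = np.linalg.svd(X2, compute_uv=False)
    k = min(len(s1), len(s2))
    return float(np.sum((np.log(s1[:k]) - np.log(s2[:k])) ** 2))

def effective_rank(s):
    """Effective rank (Roy and Vetterli, 2007): exponential of the
    Shannon entropy of the normalized singular-value distribution."""
    p = s / s.sum()
    return float(np.exp(-np.sum(p * np.log(p))))

def econd(X):
    """Effective condition number: largest singular value divided by
    the singular value at the (floored) effective-rank index."""
    s = np.linalg.svd(X, compute_uv=False)
    r = int(np.floor(effective_rank(s)))
    return float(s[0] / s[r - 1])

def econd_hm(X1, X2):
    """Harmonic mean of the two spaces' effective condition numbers."""
    c1, c2 = econd(X1), econd(X2)
    return 2 * c1 * c2 / (c1 + c2)
```

Lower values of either statistic indicate more similar (more nearly isomorphic) spectra; identical spaces give an SVG of zero.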
Methods
  • BLI Methods in Comparison

    The scores in each BLI setup were computed by several state-of-the-art BLI methods based on cross-lingual word embeddings, briefly described here. 1) SUP is the standard supervised method (Artetxe et al., 2016; Smith et al., 2017) that learns a mapping between two embedding spaces based on a training dictionary by solving the orthogonal Procrustes problem (Schönemann, 1966). 2) SUP+ is another standard supervised method that applies a variety of pre-processing and post-processing steps before and after learning the mapping matrix; see Artetxe et al. (2018). 3) UNSUP is a fully unsupervised method based on the “similarity of monolingual similarities” heuristic to extract the seed dictionary from monolingual data.
  • The authors' analyses are conducted in three BLI setups (PanLex, MUSE, GTrans) and examine three types of state-of-the-art mapping-based methods, both supervised and unsupervised (SUP, SUP+, UNSUP).
  • These span 556 language pairs, and cover both related and distant languages.
  • The authors note that identical findings emerge from running the correlation analyses based on Precision@1 scores in lieu of MRR
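The orthogonal Procrustes step behind SUP has a closed-form SVD solution (Schönemann, 1966). A minimal sketch follows; the function name is ours, and `X_src`/`Y_tgt` stand for hypothetical dictionary-aligned rows of the source and target embedding matrices:

```python
import numpy as np

def procrustes_map(X_src, Y_tgt):
    """Solve min_W ||X_src @ W - Y_tgt||_F subject to W orthogonal,
    via the SVD of the cross-covariance matrix (Schönemann, 1966)."""
    U, _, Vt = np.linalg.svd(X_src.T @ Y_tgt)
    return U @ Vt  # the optimal orthogonal mapping
```

When the two spaces really are related by a rotation, the learned `W` recovers it exactly; in practice the residual of this fit is one symptom of non-isomorphism.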
Results
  • The results are summarized in Tables 1 and 2.
  • Spectral-based isomorphism measures are strongly correlated with performance across all tasks and settings.
  • They show the strongest individual correlations with task performance among all isomorphism measures and linguistic distances alike.
  • A general finding across all tasks is that the spectral measures are the most robust isomorphism measures: they substantially outperform the widely used baselines GH and IS
Conclusion
  • Further Discussion and Conclusion

    This work introduces two spectral-based measures, SVG and ECOND-HM, that excel in predicting performance on a variety of cross-lingual tasks.
  • Another approach, on the other hand, is to extract the true embedding dimensionality directly from the embedding space
  • Another recent study (Yin and Shen, 2018) employed perturbation analysis to study the robustness of embedding spaces to noise in monolingual settings, and established that it is related to effective dimensionality of the embedding space.
  • All these inspired the authors to replace the standard matrix rank with the effective rank when computing the condition number, and to introduce the effective condition number statistic in §2.1
Summary
  • Introduction:

    The effectiveness of joint multilingual modeling and cross-lingual transfer in cross-lingual NLP is critically impacted by the actual languages in consideration (Bender, 2011; Ponti et al., 2019).
  • Selecting suitable source languages is a prerequisite for successful cross-lingual transfer of dependency parsers or POS taggers (Naseem et al., 2012; Ponti et al., 2018; de Lhoneux et al., 2018)
  • In another example, with all other factors kept similar, the quality of machine translation depends heavily on the properties and language proximity of the actual language pair (Kudugunta et al., 2019).
  • The authors derive measures for the isomorphism between two embedding spaces based on these statistics
  • Objectives:

    The authors' aim is to quantify the difference between two embedding spaces by comparing statistics of their singular values.
  • Methods:

    BLI Methods in Comparison

    The scores in each BLI setup were computed by several state-of-the-art BLI methods based on cross-lingual word embeddings, briefly described here. 1) SUP is the standard supervised method (Artetxe et al., 2016; Smith et al., 2017) that learns a mapping between two embedding spaces based on a training dictionary by solving the orthogonal Procrustes problem (Schönemann, 1966). 2) SUP+ is another standard supervised method that applies a variety of pre-processing and post-processing steps before and after learning the mapping matrix; see Artetxe et al. (2018). 3) UNSUP is a fully unsupervised method based on the “similarity of monolingual similarities” heuristic to extract the seed dictionary from monolingual data.
  • The authors' analyses are conducted in three BLI setups (PanLex, MUSE, GTrans) and examine three types of state-of-the-art mapping-based methods, both supervised and unsupervised (SUP, SUP+, UNSUP).
  • These span 556 language pairs, and cover both related and distant languages.
  • The authors note that identical findings emerge from running the correlation analyses based on Precision@1 scores in lieu of MRR
  • Results:

    The results are summarized in Tables 1 and 2.
  • Spectral-based isomorphism measures are strongly correlated with performance across all tasks and settings.
  • They show the strongest individual correlations with task performance among all isomorphism measures and linguistic distances alike.
  • A general finding across all tasks is that the spectral measures are the most robust isomorphism measures: they substantially outperform the widely used baselines GH and IS
  • Conclusion:

    Further Discussion and Conclusion

    This work introduces two spectral-based measures, SVG and ECOND-HM, that excel in predicting performance on a variety of cross-lingual tasks.
  • Another approach, on the other hand, is to extract the true embedding dimensionality directly from the embedding space
  • Another recent study (Yin and Shen, 2018) employed perturbation analysis to study the robustness of embedding spaces to noise in monolingual settings, and established that it is related to effective dimensionality of the embedding space.
  • All these inspired the authors to replace the standard matrix rank with the effective rank when computing the condition number, and to introduce the effective condition number statistic in §2.1
Tables
  • Table 1: Correlations with BLI performance in three BLI setups, see §4.1. The best distance measure for each setup and BLI method is bolded. r is the score from the stepwise regression model, see §4.3. Superscripts indicate the distance measures that are statistically significant and included in the stepwise regression model (e.g., .91^{1,3,6−8} means: SVG, ECOND-HM and all the linguistic distances have a combined contribution equivalent to 0.91 Pearson). *See the scatter plot in Appendix C
  • Table 2: Correlations with performance in three other cross-lingual tasks: machine translation (MT), dependency parsing (DEP), and POS tagging. Results for the best distance measure are highlighted in bold. r is computed using the stepwise regression model (see §4.3)
  • Table 3: Correlation scores in source-language (Source) and target-language (Target) selection analyses. The best distance measure per column is provided in bold. The percentage of cases in which a measure topped the others is shown in superscript (see details in Appendix B). r refers to the unified correlation coefficient from the multiple regression model (see details in Appendix B)
  • Table 4: Summary of all the languages included in our analyses. The numbers in each cell indicate the number of different language pairs where each language was included, per task and dataset. IE refers to the Indo-European language group
Related Work
  • Related Work and Baselines

    We now provide an overview of prior research that focused on two relevant themes: 1) measuring approximate isomorphism between two embedding spaces, and 2) more generally, quantifying the (dis)similarity between languages, going beyond isomorphism measures. The discussed approaches will also be used as the main baselines later in §5.

    Measuring Approximate Isomorphism. We focus on two standard isomorphism measures from prior work which are most similar to our work, and use them as our main baselines. The first measure, termed Isospectrality (IS) (Søgaard et al., 2018), is based on spectral analysis as well, but of the Laplacian eigenvalues of the nearest neighborhood graphs that originate from the initial embedding spaces X1 and X2 (for further technical details see Appendix A). Søgaard et al. (2018) argue that these eigenvalues are compact representations of the graph Laplacian, and that their comparison reveals the degree of (approximate) isomorphism. Although similar in spirit to our approach, constructing nearest neighborhood graphs (and then analyzing their eigenvalues) removes useful information on the interaction between all vectors from the initial space, which our spectral method retains.
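The IS baseline described above can be roughed out as follows. This is an illustrative sketch only: the graph construction details in Søgaard et al. (2018) differ (vocabulary subsets, neighborhood size, and the number of compared eigenvalues are chosen by heuristics there), and all names and parameter values here are our own assumptions.

```python
import numpy as np

def laplacian_eigs(X, n_neighbors=3):
    """Eigenvalues of the unnormalized Laplacian of a symmetrized
    kNN graph built from cosine similarity (illustrative sketch)."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    sim = Xn @ Xn.T
    np.fill_diagonal(sim, -np.inf)            # no self-edges
    A = np.zeros_like(sim)
    idx = np.argsort(-sim, axis=1)[:, :n_neighbors]
    for i, nbrs in enumerate(idx):
        A[i, nbrs] = 1.0
    A = np.maximum(A, A.T)                    # symmetrize adjacency
    L = np.diag(A.sum(axis=1)) - A            # unnormalized Laplacian
    return np.sort(np.linalg.eigvalsh(L))

def isospectrality(X1, X2, k=5):
    """Sum of squared differences between the k smallest Laplacian
    eigenvalues of the two neighborhood graphs (k chosen here ad hoc)."""
    e1, e2 = laplacian_eigs(X1), laplacian_eigs(X2)
    return float(np.sum((e1[:k] - e2[:k]) ** 2))
```

The contrast with the spectral measures of this paper is visible in the code: IS only sees the binarized neighborhood graph, whereas SVG and ECOND-HM operate on the singular values of the full embedding matrix.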
Funding
  • The work of IV and AK is supported by the ERC Consolidator Grant LEXICAL: Lexical Acquisition Across Languages (no. 648909) awarded to AK
  • HD is supported by the Blavatnik Postdoctoral Fellowship Programme
Study Subjects and Analysis
language pairs: 556
For more technical details on the fully unsupervised model, we refer the reader to prior work (Ruder et al., 2019a; Vulić et al., 2019).

In sum, our analyses are conducted in three BLI setups (PanLex, MUSE, GTrans) and examine three types of state-of-the-art mapping-based methods, both supervised and unsupervised (SUP, SUP+, UNSUP). Altogether, these span 556 language pairs, and cover both related and distant languages. Following prior work (Glavaš et al., 2019), our BLI evaluation measure is Mean Reciprocal Rank (MRR).
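MRR averages the reciprocal rank at which the gold translation appears among a method's predictions. A minimal sketch (function name ours):

```python
def mean_reciprocal_rank(ranks):
    """MRR over gold-translation ranks, where rank 1 means the gold
    translation was the top prediction for that source word."""
    return sum(1.0 / r for r in ranks) / len(ranks)
```

Unlike Precision@1, MRR still rewards a method when the gold translation is near, but not at, the top of the ranked list.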

pairs: 8
While both IS and GH were reported to have strong correlations with BLI performance in prior work, they have not been evaluated in large-scale experiments before. In fact, the correlations were computed on a very small number of language pairs (IS: 8 pairs, GH: 10 pairs). Further, neither measure scales well computationally

language pairs: 100
The conducted empirical analyses can be divided into two major parts. First, we run large-scale BLI analyses across several hundred language pairs from dozens of languages, comparing the correlation of spectral-based isomorphism measures (§2.2) and all baselines (§3) with the performance of a wide spectrum of state-of-the-art BLI methods

language pairs: 210
BLI Setups and Scores. Vulić et al. (2019) ran BLI experiments on 210 language pairs, spanning 15 diverse languages. Their training and test dictionaries (5k and 2k translation pairs) are derived from PanLex (Baldwin et al., 2010; Kamholz et al., 2014)

additional language pairs: 210
Their training and test dictionaries (5k and 2k translation pairs) are derived from PanLex (Baldwin et al., 2010; Kamholz et al., 2014). We complement the original 210 pairs with an additional 210 language pairs of 15 closely related (European) languages, using dictionaries extracted from PanLex following the procedure of Vulić et al. (2019). With the additional language set, the aim is to probe whether isomorphism measures can also capture more subtle and smaller language differences.

language pairs: 108
We also analyze the BLI results of 108 language pairs from MUSE (Conneau et al., 2018). This dataset systematically covers English, with 88 translation pairs that involve English as either the source or target language

translation pairs: 88
We also analyze the BLI results of 108 language pairs from MUSE (Conneau et al., 2018). This dataset systematically covers English, with 88 translation pairs that involve English as either the source or target language. Finally, we analyze the available BLI results of Glavaš et al. (2019) (referred to as GTrans) that are based on dictionaries obtained from Google Translate and include 28 language pairs spanning 8 different languages

language pairs: 28
This dataset systematically covers English, with 88 translation pairs that involve English as either the source or target language. Finally, we analyze the available BLI results of Glavaš et al. (2019) (referred to as GTrans) that are based on dictionaries obtained from Google Translate and include 28 language pairs spanning 8 different languages. For the full list of language pairs involved in previous BLI studies, we refer the reader to prior work (Conneau et al., 2018; Glavaš et al., 2019; Vulić et al., 2019)

language pairs: 556
In sum, our analyses are conducted in three BLI setups (PanLex, MUSE, GTrans) and examine three types of state-of-the-art mapping-based methods, both supervised and unsupervised (SUP, SUP+, UNSUP). Altogether, these span 556 language pairs, and cover both related and distant languages. The initial set of Vulić et al. (2019) comprises Bulgarian, Catalan, Esperanto, Estonian, Basque, Finnish, Hebrew, Hungarian, Indonesian, Georgian, Korean, Lithuanian, Norwegian, Thai, Turkish

language pairs: 210
The initial set of Vulić et al. (2019) comprises Bulgarian, Catalan, Esperanto, Estonian, Basque, Finnish, Hebrew, Hungarian, Indonesian, Georgian, Korean, Lithuanian, Norwegian, Thai, Turkish. The additional 210 language pairs are only composed of Germanic, Romance and Slavic languages. For a full list of the languages see Table 4 in the appendix

pairs: 930
We base our analysis on the cross-lingual zero-shot parser transfer results of Lin et al. (2019): the standard biaffine dependency parser (Dozat and Manning, 2017; Dozat et al., 2017) is trained on the training portions of Universal Dependencies (UD) treebanks from 31 languages (Nivre et al., 2018), and is then used to parse the test treebank of each language, now used as the target language. We report correlations between the language distance measures and the Labeled Attachment Scores (LAS) for all combinations of 31 languages, resulting in 930 pairs

language pairs: 840
These scores span 26 low-resource target languages and 60 source languages, which measure the utility of each source language to each of the 26 target languages in POS tagging. We use a sample of 840 language pairs for the correlation analysis, as 16 low-resource target languages and 49 source languages have readily available pretrained fastText vectors. We report all results for each BLI method, dictionary and language pair in the supplementary material (also available at https://tinyurl.com/skn5cf7)

BLI datasets: 3
The only exception is the MT task, where our measures fall short of TYP (see Table 2), although we note that they still hold a strong advantage over the baseline GH and IS isomorphism measures that do not seem to capture any useful language similarity properties needed for the MT task. ECOND-HM systematically outperforms COND-HM on 2 of 3 BLI datasets and 2 of 3 downstream tasks, validating our assumption that discarding the smallest singular values reduces noise. Additionally, SVG shows greater stability across tasks and datasets than ECOND-HM

language pairs: 420
The results demonstrate this across all tasks and settings (see bottom rows of the tables). For instance, when combining spectral measures with the linguistic distances, the regression model reaches outstanding correlation scores up to r = .91 on PanLex BLI (Table 1); with 420 language pairs, PanLex is the most comprehensive BLI dataset in our study. In addition, GH and IS are not chosen as significant regressors in the stepwise regression model, which indicates that they capture less information than our spectral methods
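The stepwise regression referenced in §4.3 adds regressors greedily. The sketch below is a simplified stand-in under assumptions of ours: the paper selects regressors by statistical significance, while here we use a plain R² gain threshold, and all function names are ours.

```python
import numpy as np

def r2(X, y):
    """R^2 of an ordinary least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

def forward_stepwise(X, y, min_gain=0.01):
    """Greedy forward selection: repeatedly add the feature (column
    of X) that most improves R^2, stopping when the gain is small."""
    selected, best = [], 0.0
    while len(selected) < X.shape[1]:
        gains = {j: r2(X[:, selected + [j]], y)
                 for j in range(X.shape[1]) if j not in selected}
        j, score = max(gains.items(), key=lambda kv: kv[1])
        if score - best < min_gain:
            break
        selected.append(j)
        best = score
    return selected, best
```

Applied to a feature matrix of distance measures and a vector of task scores, the procedure returns which measures carry independent predictive signal and the combined correlation they reach.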

References
  • Željko Agić. 2017. Cross-lingual parser selection for low-resource languages. In Proceedings of the NoDaLiDa 2017 Workshop on Universal Dependencies (UDW 2017), pages 1–10.
  • Sanjeev Arora, Nadav Cohen, Wei Hu, and Yuping Luo. 2019. Implicit regularization in deep matrix factorization. In Proceedings of NeurIPS, pages 7411–7422.
  • Mikel Artetxe, Gorka Labaka, and Eneko Agirre. 2016. Learning principled bilingual mappings of word embeddings while preserving monolingual invariance. In Proceedings of EMNLP, pages 2289–2294.
  • Mikel Artetxe, Gorka Labaka, and Eneko Agirre. 2018. A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings. In Proceedings of ACL, pages 789–798.
  • Mikel Artetxe, Sebastian Ruder, and Dani Yogatama. 2020. On the cross-lingual transferability of monolingual representations. In Proceedings of ACL.
  • Timothy Baldwin, Jonathan Pool, and Susan Colowick. 2010. PanLex and LEXTRACT: Translating all words of all languages of the world. In Proceedings of COLING (Demo Papers), pages 37–40.
  • Antonio Valerio Miceli Barone. 2016. Towards cross-lingual distributed representations without parallel text trained with adversarial autoencoders. In Proceedings of the 1st Workshop on Representation Learning for NLP, pages 121–126.
  • Emily M. Bender. 2011. On achieving and evaluating language-independence in NLP. Linguistic Issues in Language Technology, 6(3):1–26.
  • Malavika Bhaskaranand and Jerry D. Gibson. 2010. Spectral entropy-based quantization matrices for H.264/AVC video coding. In 2010 Conference Record of the Forty Fourth Asilomar Conference on Signals, Systems and Computers, pages 421–425.
  • Johannes Bjerva, Robert Östling, Maria Han Veiga, Jörg Tiedemann, and Isabelle Augenstein. 2019. What do language representations really represent? Computational Linguistics, 45(2):381–389.
  • Lenore Blum. 2014. Alan Turing and the other theory of computation (expanded), volume 42 of Lecture Notes in Logic, pages 48–69. Cambridge University Press.
  • Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2017. Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5:135–146.
  • Rachel Carrington, Karthik Bharath, and Simon Preston. 2019. Invariance and identifiability issues for word embeddings. In Proceedings of NeurIPS, pages 15114–15123.
  • Frédéric Chazal, David Cohen-Steiner, Leonidas J. Guibas, Facundo Mémoli, and Steve Y. Oudot. 2009. Gromov-Hausdorff stable signatures for shapes using persistence. In Computer Graphics Forum, volume 28, pages 1393–1403.
  • Alexis Conneau, Guillaume Lample, Marc'Aurelio Ranzato, Ludovic Denoyer, and Hervé Jégou. 2018. Word translation without parallel data. In Proceedings of ICLR.
  • Ryan Cotterell and Georg Heigold. 2017. Cross-lingual character-level neural morphological tagging. In Proceedings of EMNLP, pages 748–759.
  • Yerai Doval, José Camacho-Collados, Luis Espinosa-Anke, and Steven Schockaert. 2019. On the robustness of unsupervised and semi-supervised cross-lingual word embedding learning. CoRR, abs/1908.07742.
  • Timothy Dozat and Christopher D. Manning. 2017. Deep biaffine attention for neural dependency parsing. In Proceedings of ICLR.
  • Timothy Dozat, Peng Qi, and Christopher D. Manning. 2017. Stanford's graph-based neural dependency parser at the CoNLL 2017 shared task. In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 20–30.
  • Norman R. Draper and Harry Smith. 1998. Applied Regression Analysis, 3rd Edition. John Wiley & Sons.
  • Matthew S. Dryer and Martin Haspelmath, editors. 2013. WALS Online. Max Planck Institute for Evolutionary Anthropology, Leipzig.
  • Julian Eisenschlos, Sebastian Ruder, Piotr Czapla, Marcin Kadras, Sylvain Gugger, and Jeremy Howard. 2019. MultiFiT: Efficient multi-lingual language model fine-tuning. In Proceedings of EMNLP, pages 5702–5707.
  • John Rupert Firth. 1957. A synopsis of linguistic theory, 1930–1955. Studies in Linguistic Analysis.
  • William Ford. 2015. Chapter 15 - The singular value decomposition. In William Ford, editor, Numerical Linear Algebra with Applications, pages 299–320. Academic Press, Boston.
  • Daniela Gerz, Ivan Vulić, Edoardo Maria Ponti, Jason Naradowsky, Roi Reichart, and Anna Korhonen. 2018. Language modeling for morphologically rich languages: Character-aware modeling for word-level prediction. Transactions of the Association for Computational Linguistics, 6:451–465.
  • Goran Glavaš, Robert Litschko, Sebastian Ruder, and Ivan Vulić. 2019. How to (properly) evaluate cross-lingual word embeddings: On strong baselines, comparative analyses, and some misconceptions. In Proceedings of ACL, pages 710–721.
  • Zellig S. Harris. 1954. Distributional structure. Word, 10(23):146–162.
  • N.J. Higham, M.R. Dennis, P. Glendinning, P.A. Martin, F. Santosa, and J. Tanner. 2015. The Princeton Companion to Applied Mathematics. Princeton University Press.
  • Ronald R. Hocking. 1976. The analysis and selection of variables in linear regression. Biometrics, 32(1):1–49.
  • Armand Joulin, Piotr Bojanowski, Tomas Mikolov, Hervé Jégou, and Edouard Grave. 2018. Loss in translation: Learning bilingual word mapping with a retrieval criterion. In Proceedings of EMNLP, pages 2979–2984.
  • David Kamholz, Jonathan Pool, and Susan M. Colowick. 2014. PanLex: Building a resource for panlingual lexical translation. In Proceedings of LREC, pages 3145–3150.
  • Sneha Kudugunta, Ankur Bapna, Isaac Caswell, and Orhan Firat. 2019. Investigating multilingual NMT representations at scale. In Proceedings of EMNLP-IJCNLP, pages 1565–1575.
  • Olwijn Leeuwenburgh and Rob Arts. 2014. Distance parameterization for efficient seismic history matching with the ensemble Kalman filter. Computational Geosciences, 18(3-4):535–548.
  • Miryam de Lhoneux, Johannes Bjerva, Isabelle Augenstein, and Anders Søgaard. 2018. Parameter sharing between dependency parsers for related languages. In Proceedings of EMNLP, pages 4992–4997.
  • Yu-Hsiang Lin, Chian-Yu Chen, Jean Lee, Zirui Li, Yuyan Zhang, Mengzhou Xia, Shruti Rijhwani, Junxian He, Zhisong Zhang, Xuezhe Ma, Antonios Anastasopoulos, Patrick Littell, and Graham Neubig. 2019. Choosing transfer languages for cross-lingual learning. In Proceedings of ACL, pages 3125–3135.
  • Patrick Littell, David R. Mortensen, Ke Lin, Katherine Kairis, Carlisle Turner, and Lori Levin. 2017. URIEL and lang2vec: Representing languages as typological, geographical, and phylogenetic vectors. In Proceedings of EACL, pages 8–14.
  • Tomas Mikolov, Quoc V. Le, and Ilya Sutskever. 2013. Exploiting similarities among languages for machine translation. arXiv preprint arXiv:1309.4168.
  • Tahira Naseem, Regina Barzilay, and Amir Globerson. 2012. Selective sharing for multilingual dependency parsing. In Proceedings of ACL, pages 629–637.
  • Joakim Nivre, Mitchell Abrams, Željko Agić, Lars Ahrenberg, Lene Antonsen, Katya Aplonova, Maria Jesus Aranzabe, et al. 2018. Universal Dependencies 2.3.
  • Helen O'Horan, Yevgeni Berzak, Ivan Vulić, Roi Reichart, and Anna Korhonen. 2016. Survey on the use of typological information in natural language processing. In Proceedings of COLING, pages 1297–1308.
  • Barun Patra, Joel Ruben Antony Moniz, Sarthak Garg, Matthew R. Gormley, and Graham Neubig. 2019. Bilingual lexicon induction with semi-supervision in non-isometric embedding spaces. In Proceedings of ACL, pages 184–193.
  • Telmo Pires, Eva Schlinger, and Dan Garrette. 2019. How multilingual is multilingual BERT? In Proceedings of ACL, pages 4996–5001.
  • Edoardo Maria Ponti, Helen O'Horan, Yevgeni Berzak, Ivan Vulić, Roi Reichart, Thierry Poibeau, Ekaterina Shutova, and Anna Korhonen. 2019. Modeling language variation and universals: A survey on typological linguistics for natural language processing. Computational Linguistics, 45(3):559–601.
  • Edoardo Maria Ponti, Roi Reichart, Anna Korhonen, and Ivan Vulić. 2018. Isomorphic transfer of syntactic structures in cross-lingual NLP. In Proceedings of ACL, pages 1531–1542.
  • Vikas Raunak, Vivek Gupta, and Florian Metze. 2019. Effective dimensionality reduction for word embeddings. In Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019), pages 235–243.
  • Olivier Roy and Martin Vetterli. 2007. The effective rank: A measure of effective dimensionality. In Proceedings of the 15th European Signal Processing Conference, pages 606–610.
  • Sebastian Ruder, Anders Søgaard, and Ivan Vulić. 2019a. Unsupervised cross-lingual representation learning. In Proceedings of ACL: Tutorial Abstracts, pages 31–38.
  • Sebastian Ruder, Ivan Vulić, and Anders Søgaard. 2019b. A survey of cross-lingual word embedding models. Journal of Artificial Intelligence Research, 65:569–631.
  • Peter H. Schönemann. 1966. A generalized solution of the orthogonal Procrustes problem. Psychometrika, 31(1):1–10.
  • Samuel L. Smith, David H.P. Turban, Steven Hamblin, and Nils Y. Hammerla. 2017. Offline bilingual word vectors, orthogonal transformations and the inverted softmax. In Proceedings of ICLR.
  • Anders Søgaard, Sebastian Ruder, and Ivan Vulić. 2018. On the limitations of unsupervised bilingual dictionary induction. In Proceedings of ACL, pages 778–788.
  • Vladimir Tourbabin and Boaz Rafaely. 2015. Direction of arrival estimation using microphone array processing for moving humanoid robots. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 23(11):2046–2058.
  • Ivan Vulić, Goran Glavaš, Roi Reichart, and Anna Korhonen. 2019. Do we really need fully unsupervised cross-lingual embeddings? In Proceedings of EMNLP, pages 4398–4409.
  • Ivan Vulić, Sebastian Ruder, and Anders Søgaard. 2020. Are all good word vector spaces isomorphic? In Proceedings of EMNLP.
  • Yu Wang. 2019. Single training dimension selection for word embedding with PCA. In Proceedings of EMNLP-IJCNLP, pages 3588–3593.
  • Søren Wichmann, André Müller, Viveka Velupillai, Cecil H. Brown, Eric W. Holman, Pamela Brown, Sebastian Sauppe, Oleg Belyaev, Matthias Urban, Zarina Molochieva, et al. 2018. The ASJP database (version 18).
  • Zi Yin and Yuanyuan Shen. 2018. On the dimensionality of word embedding. In Proceedings of NeurIPS, pages 887–898.
  • Meng Zhang, Yang Liu, Huanbo Luan, and Maosong Sun. 2017. Earth mover's distance minimization for unsupervised bilingual lexicon induction. In Proceedings of EMNLP, pages 1934–1945.
  • Isospectrality (IS): After length-normalizing the vectors, Søgaard et al. (2018) compute the nearest neighbor graphs using a subset of the top N most frequent words in each space, and then calculate the Laplacian matrices LP1 and LP2 of each graph.
  • Computing GH directly is computationally intractable in practice, but it can be tractably approximated by computing the Bottleneck distance between the metric spaces (Chazal et al., 2009).
  • We also observe interesting patterns in the selection analyses for the POS tagging task in Table 3: while the results in the target-language selection analysis largely follow the main-text results, the same does not hold for source-language selection (Table 3, POS Target and Source columns). We speculate that this is in fact an artefact of the experimental design of Lin et al. (2019). Their set of target languages deliberately comprises only truly low-resource languages, and such languages are expected to have lower-quality embedding spaces. Transferring to such languages is bound to fail with most source languages regardless of the actual source-target language similarity. The difficulty of this setting is reflected in the actual scores: the average accuracy score for the best source-target combination is 0.55 in the source-language selection analysis, and 0.92 for target-language selection.