We use off-the-shelf distributed word representation tools to encourage a subset of translation table entries that are common between semantically similar words.
Using Word Vectors to Improve Word Alignments for Low Resource Machine Translation.
NAACL-HLT, pp. 524-528 (2018)
We present a method for improving word alignments using word similarities. This method is based on encouraging common alignment links between semantically similar words. We use word vectors trained on monolingual data to estimate similarity. Our experiments on translating fifteen languages into English show consistent BLEU score improveme...
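As a rough illustration of the idea in the abstract, the Python sketch below estimates word similarity with cosine similarity over word vectors, then nudges each word's translation distribution toward those of its nearest neighbors. The linear interpolation, the function names (`cosine`, `nearest_neighbors`, `smooth_with_neighbors`), the parameters (`k`, `min_sim`, `weight`), and the toy vectors and probabilities are all assumptions made for illustration. This is not the authors' actual procedure, only a minimal stand-in for "encouraging common alignment links between semantically similar words".

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_neighbors(word, vectors, k=2, min_sim=0.5):
    """Return up to k (neighbor, similarity) pairs with similarity >= min_sim.

    `word` must be a key of `vectors`; `k` and `min_sim` are hypothetical knobs.
    """
    scored = [(other, cosine(vectors[word], vec))
              for other, vec in vectors.items() if other != word]
    scored = [(w, s) for w, s in scored if s >= min_sim]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def smooth_with_neighbors(t_table, vectors, weight=0.3, k=2, min_sim=0.5):
    """Interpolate each word's translation distribution with its neighbors'.

    Assumption: a plain linear interpolation, with neighbors weighted by their
    normalized similarity. Words without usable neighbors are left unchanged.
    """
    smoothed = {}
    for word, dist in t_table.items():
        neighbors = []
        if word in vectors:
            neighbors = [(n, s)
                         for n, s in nearest_neighbors(word, vectors, k, min_sim)
                         if n in t_table]
        total_sim = sum(s for _, s in neighbors)
        if total_sim == 0.0:
            smoothed[word] = dict(dist)
            continue
        mixed = defaultdict(float)
        for target, p in dist.items():
            mixed[target] += (1.0 - weight) * p
        for n, s in neighbors:
            for target, p in t_table[n].items():
                mixed[target] += weight * (s / total_sim) * p
        smoothed[word] = dict(mixed)
    return smoothed


# Toy data (hypothetical): "automobile" is rare and has a noisy distribution.
vectors = {
    "car": [1.0, 0.0],
    "automobile": [0.9, 0.1],
    "banana": [0.0, 1.0],
}
t_table = {
    "car": {"voiture": 0.9, "wagon": 0.1},
    "automobile": {"voiture": 0.5, "bruit": 0.5},
    "banana": {"banane": 1.0},
}

smoothed = smooth_with_neighbors(t_table, vectors)
print(nearest_neighbors("automobile", vectors))
print(smoothed["automobile"])
```

In this toy setting the rare word "automobile" borrows probability mass from its neighbor "car", so the entry the two words share ("voiture") is strengthened, while "banana", which has no sufficiently similar neighbor, keeps its original distribution.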
- Word alignments are essential for statistical machine translation (MT), especially in low-resource settings where neural MT systems often do not compete with phrase-based and syntax-based MT (Koehn and Knowles, 2017).
- Works that deal with the rare-word problem in word alignment include those that alter the probability distribution of IBM models’ parameters by adding prior distributions (Vaswani et al., 2012; Mermer and Saraclar, 2011), smoothing the probabilities (Moore, 2004; Zhang and Chiang, 2014; Van Bui and Le, 2016), or introducing symmetrization (Liang et al., 2006; Pourdamghani et al., 2014).
- These works, although effective, rely only on information extracted from the parallel data.
- Other methods need language-specific knowledge or tools, such as morphological analyzers or syntax parsers, that are costly and time-consuming to obtain for any given language.
- Our work addresses a major problem of previous work: treating substitutability as synonymy without discrimination.
- Machine translation accuracy is tested on fifteen languages, where we show a consistent BLEU score improvement.
- The authors improve the alignment of rare words by encouraging them to align to what their semantic neighbors align to.
- Distributed word representation methods such as those of Mikolov et al. (2013) and Pennington et al. (2014) often define word similarity as the ability to substitute one word for another given a context.
- Some words might have multiple meanings and a semantically simi-
- In this paper the authors present a method for improving word alignments using word similarities.
- The method is simple and yet efficient.
- The authors use off-the-shelf distributed word representation tools to encourage a subset of translation table entries that are common between semantically similar words.
- End-to-end experiments on translating 15 languages into English, as well as alignment-accuracy experiments for three languages, show consistent improvement over the baseline.
- Table 1: Data split and size of monolingual data (tokens) for different languages. For parallel data, size refers to the number of English plus foreign language tokens.
- Table 2: Machine translation experiments (BLEU). For languages with fewer than 10M monolingual tokens (first five) we only use Le; otherwise we use both lexicons, Le + Lf. This way we improve over the baseline for almost all languages.
- Table 3: Word alignment experiments (alignment precision/recall/F-score). The proposed method (Le + Lf) improves over the baseline in all cases.
- If we put the threshold at 10M tokens of monolingual data, we im-
- This work was supported by DARPA contract HR0011-15-C-0115.
- Peter F. Brown, Vincent J. Della Pietra, Stephen A. Della Pietra, and Robert L. Mercer. 1993. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics 19(2).
- Ciprian Chelba, Tomas Mikolov, Mike Schuster, Qi Ge, Thorsten Brants, Philipp Koehn, and Tony Robinson. 2013. One billion word benchmark for measuring progress in statistical language modeling. arXiv preprint arXiv:1312.3005.
- Colin Cherry and Dekang Lin. 2006. Soft syntactic constraints for word alignment through discriminative training. In Proc. COLING.
- Adria De Gispert, Deepa Gupta, Maja Popovic, Patrik Lambert, Jose B Marino, Marcello Federico, Hermann Ney, and Rafael Banchs. 2006. Improving statistical word alignments with morpho-syntactic transformations. In Advances in Natural Language Processing.
- Victoria Fossum, Kevin Knight, and Steven Abney. 2008. Using syntax to improve word alignment precision for syntax-based machine translation. In Proc. Workshop on Statistical Machine Translation.
- Ulf Hermjakob. 2009. Improved word alignment with statistics and linguistic heuristics. In Proc. EMNLP.
- Tomas Kocisky, Karl Moritz Hermann, and Phil Blunsom. 2014. Learning bilingual word representations by marginalizing alignments. arXiv preprint arXiv:1405.0947.
- Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, et al. 2007. Moses: Open source toolkit for statistical machine translation. In Proc. ACL, interactive poster and demonstration sessions.
- Philipp Koehn and Rebecca Knowles. 2017. Six challenges for neural MT. In Proc. Workshop on Neural Machine Translation.
- Robert C Moore. 2004. Improving IBM word-alignment model 1. In Proc. ACL.
- Franz Josef Och and Hermann Ney. 2003. A systematic comparison of various statistical alignment models. Computational Linguistics 29(1).
- Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. Glove: Global vectors for word representation. In Proc. EMNLP.
- Mohammad Taher Pilevar, Heshaam Faili, and Abdol Hamid Pilevar. 2011. Tep: Tehran English-Persian parallel corpus. In Proc. CICLing.
- Nima Pourdamghani, Yang Gao, Ulf Hermjakob, and Kevin Knight. 2014. Aligning English strings with abstract meaning representation graphs. In Proc. EMNLP.
- Theerawat Songyot and David Chiang. 2014. Improving word alignment using word similarity. In Proc. EMNLP.
- Jorg Tiedemann. 2003. Combining clues for word alignment. In Proc. EACL.
- Kristina Toutanova, H Tolga Ilhan, and Christopher D Manning. 2002. Extensions to HMM-based statistical word alignment models. In Proc. EMNLP.
- Vuong Van Bui and Cuong Anh Le. 2016. Smoothing parameter estimation framework for IBM word alignment models. arXiv preprint arXiv:1601.03650.
- Ashish Vaswani, Liang Huang, and David Chiang. 2012. Smaller alignment models for better translations: Unsupervised word alignment with the l0-norm. In Proc. ACL.
- Hui Zhang and David Chiang. 2014. Kneser-Ney smoothing on expected counts. In Proc. ACL.
- Young-Suk Lee. 2004. Morphological analysis for statistical machine translation. In Proc. NAACL.
- Percy Liang, Ben Taskar, and Dan Klein. 2006. Alignment by agreement. In Proc. NAACL.
- Jeff Ma, Spyros Matsoukas, and Richard Schwartz. 2011. Improving low-resource statistical machine translation with a novel semantic word clustering algorithm. In Proc. MT Summit XIII.
- Coskun Mermer and Murat Saraclar. 2011. Bayesian word alignment for statistical machine translation. In Proc. ACL.
- Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.