Nematus: a Toolkit for Neural Machine Translation

EACL (Software Demonstrations), 2017.

Abstract:

We present Nematus, a toolkit for Neural Machine Translation. The toolkit prioritizes high translation accuracy, usability, and extensibility. Nematus has been used to build top-performing submissions to shared translation tasks at WMT and IWSLT, and has been used to train systems for production environments.

Introduction
  • Neural Machine Translation (NMT) (Bahdanau et al., 2015; Sutskever et al., 2014) has recently established itself as a new state of the art in machine translation (a sketch of the underlying attention model follows this list).
  • The authors found the codebase of the dl4mt-tutorial, from which Nematus originated, to be compact, simple and easy to extend, while producing high translation quality.
  • These characteristics make it a good starting point for research in NMT.
  • Nematus has been extended with new functionality based on recent research, and has been used to build top-performing systems for last year’s shared translation tasks at WMT (Sennrich et al., 2016) and IWSLT (Junczys-Dowmunt and Birch, 2016).
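
To make the architectural starting point concrete: the attention-based encoder-decoder of Bahdanau et al. (2015) scores each encoder annotation against the current decoder state and feeds the resulting weighted sum (the context vector) into the next decoding step. The following is a minimal NumPy sketch of that additive attention step, not Nematus code (the toolkit itself is written in Theano and operates on minibatches); the names attention_context, W_a, U_a and v_a are illustrative only.

    import numpy as np

    def attention_context(annotations, decoder_state, W_a, U_a, v_a):
        # Additive ("Bahdanau") attention for a single sentence, no batching.
        # annotations:   (src_len, 2*hidden) encoder annotations h_i
        # decoder_state: (hidden,)           previous decoder state s_{j-1}
        # Scores e_ij = v_a . tanh(W_a s_{j-1} + U_a h_i) for every source position i.
        scores = np.tanh(decoder_state @ W_a.T + annotations @ U_a.T) @ v_a
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                 # softmax over source positions
        context = weights @ annotations          # attention-weighted sum of annotations
        return context, weights

    # Toy usage with random parameters, only to exercise the shapes.
    rng = np.random.default_rng(0)
    src_len, hidden = 5, 4
    annotations = rng.normal(size=(src_len, 2 * hidden))
    s_prev = rng.normal(size=hidden)
    W_a = rng.normal(size=(hidden, hidden))       # projects the decoder state
    U_a = rng.normal(size=(hidden, 2 * hidden))   # projects the annotations
    v_a = rng.normal(size=hidden)
    context, weights = attention_context(annotations, s_prev, W_a, U_a, v_a)
    print(context.shape, round(weights.sum(), 6))  # (8,) 1.0

The sketch only shows the information flow of the attention step; the actual toolkit expresses this computation symbolically in Theano and adds the implementation refinements discussed in the paper.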
Highlights
  • Neural Machine Translation (NMT) (Bahdanau et al., 2015; Sutskever et al., 2014) has recently established itself as a new state of the art in machine translation.
  • We present Nematus, a new toolkit for Neural Machine Translation.
  • We found the dl4mt-tutorial codebase, from which Nematus originated, to be compact, simple and easy to extend, while producing high translation quality.
  • Nematus has been extended with new functionality based on recent research, and has been used to build top-performing systems for last year’s shared translation tasks at WMT (Sennrich et al., 2016) and IWSLT (Junczys-Dowmunt and Birch, 2016).
  • We have described implementation differences relative to the architecture of Bahdanau et al. (2015); due to the empirically strong performance of Nematus, we consider these to be of wider interest (a hedged sketch of one such difference follows this list).
  • Nematus is available under a permissive BSD license.
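
One example of the kind of implementation difference mentioned above, as far as it can be reconstructed from published descriptions, concerns the decoder recurrence: the Nematus decoder is commonly described as a conditional GRU with attention, where a first GRU transition reads the previous target embedding, attention is computed from that intermediate state, and a second GRU transition folds in the resulting context vector, whereas the base architecture of Bahdanau et al. (2015) uses a single recurrent transition. The sketch below illustrates this reading in plain NumPy; it is not the toolkit's code, and names such as decoder_step and gru_step are illustrative only.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def gru_step(x, h_prev, p):
        # One GRU transition (single example, no batching, biases omitted).
        r = sigmoid(x @ p["Wr"] + h_prev @ p["Ur"])              # reset gate
        z = sigmoid(x @ p["Wz"] + h_prev @ p["Uz"])              # update gate
        h_tilde = np.tanh(x @ p["Wh"] + (r * h_prev) @ p["Uh"])  # candidate state
        return (1.0 - z) * h_prev + z * h_tilde

    def decoder_step(y_prev_emb, s_prev, annotations, attend, gru1, gru2):
        # Conditional-GRU-with-attention reading of one decoder step.
        s_mid = gru_step(y_prev_emb, s_prev, gru1)   # 1) condition on previous target word
        context = attend(annotations, s_mid)         # 2) attend with the intermediate state
        s_next = gru_step(context, s_mid, gru2)      # 3) fold the context vector back in
        return s_next, context

    # Toy usage with random parameters and a stand-in (mean-pooling) attention.
    rng = np.random.default_rng(1)
    hid, emb, ann_dim, src_len = 4, 3, 8, 5
    def rand_gru(in_dim):
        return {k: rng.normal(size=((in_dim if k[0] == "W" else hid), hid))
                for k in ("Wr", "Ur", "Wz", "Uz", "Wh", "Uh")}
    attend = lambda annotations, state: annotations.mean(axis=0)
    s1, c1 = decoder_step(rng.normal(size=emb), rng.normal(size=hid),
                          rng.normal(size=(src_len, ann_dim)), attend,
                          rand_gru(emb), rand_gru(ann_dim))
    print(s1.shape, c1.shape)   # (4,) (8,)

Table 1 in the paper summarises further decoder-phase differences; the sketch above covers only the recurrence structure and should be read as an assumption-laden illustration rather than a specification of the toolkit.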
Conclusion
  • The authors have presented Nematus, a toolkit for Neural Machine Translation.
  • The authors have described implementation differences relative to the architecture of Bahdanau et al. (2015); due to the empirically strong performance of Nematus, the authors consider these to be of wider interest.
  • The authors hope that researchers will find Nematus an accessible and well-documented toolkit to support their research.
  • The toolkit is by no means limited to research, and has been used to train MT systems that are currently in production (WIPO, 2016).
  • Nematus is available under a permissive BSD license.
Tables
  • Table 1: Decoder phase differences
Funding
  • This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreements 645452 (QT21), 644333 (TraMOOC), 644402 (HimL) and 688139 (SUMMA).
References
  • Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural Machine Translation by Jointly Learning to Align and Translate. In Proceedings of the International Conference on Learning Representations (ICLR).
  • Boxing Chen and Colin Cherry. 2014. A Systematic Comparison of Smoothing Techniques for Sentence-Level BLEU. In Proceedings of the Ninth Workshop on Statistical Machine Translation, pages 362–367, Baltimore, Maryland, USA.
  • Michael Denkowski and Alon Lavie. 2011. Meteor 1.3: Automatic Metric for Reliable Optimization and Evaluation of Machine Translation Systems. In Proceedings of the Sixth Workshop on Statistical Machine Translation, pages 85–91, Edinburgh, Scotland.
  • Salah El Hihi and Yoshua Bengio. 1995. Hierarchical Recurrent Neural Networks for Long-Term Dependencies. In NIPS, volume 409.
  • Yarin Gal. 2015. A Theoretically Grounded Application of Dropout in Recurrent Neural Networks. arXiv e-prints.
  • Alex Graves. 2013. Generating Sequences with Recurrent Neural Networks. arXiv preprint arXiv:1308.0850.
  • Hakan Inan, Khashayar Khosravi, and Richard Socher. 2016. Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling. CoRR, abs/1611.01462.
  • Marcin Junczys-Dowmunt and Alexandra Birch. 2016. The University of Edinburgh’s Systems Submission to the MT Task at IWSLT. In The International Workshop on Spoken Language Translation (IWSLT), Seattle, USA.
  • Diederik Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980.
  • Razvan Pascanu, Çaglar Gülçehre, Kyunghyun Cho, and Yoshua Bengio. 2014. How to Construct Deep Recurrent Neural Networks. In International Conference on Learning Representations 2014 (Conference Track).
  • Ofir Press and Lior Wolf. 2017. Using the Output Embedding to Improve Language Models. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL), Valencia, Spain.
  • Jürgen Schmidhuber. 1992. Learning Complex, Extended Sequences Using the Principle of History Compression. Neural Computation, 4(2):234–242.
  • Rico Sennrich and Barry Haddow. 2016. Linguistic Input Features Improve Neural Machine Translation. In Proceedings of the First Conference on Machine Translation, Volume 1: Research Papers, pages 83–91, Berlin, Germany.
  • Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016. Edinburgh Neural Machine Translation Systems for WMT 16. In Proceedings of the First Conference on Machine Translation, Volume 2: Shared Task Papers, pages 368–373, Berlin, Germany.
  • Shiqi Shen, Yong Cheng, Zhongjun He, Wei He, Hua Wu, Maosong Sun, and Yang Liu. 2016. Minimum Risk Training for Neural Machine Translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Berlin, Germany.
  • Milos Stanojevic and Khalil Sima’an. 2014. BEER: BEtter Evaluation as Ranking. In Proceedings of the Ninth Workshop on Statistical Machine Translation, pages 414–419, Baltimore, Maryland, USA.
  • Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to Sequence Learning with Neural Networks. In Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, pages 3104–3112, Montreal, Quebec, Canada.
  • Theano Development Team. 2016. Theano: A Python Framework for Fast Computation of Mathematical Expressions. arXiv e-prints, abs/1605.02688.
  • Tijmen Tieleman and Geoffrey Hinton. 2012. Lecture 6.5 - rmsprop: Divide the Gradient by a Running Average of Its Recent Magnitude. COURSERA: Neural Networks for Machine Learning.
  • WIPO. 2016. WIPO Develops Cutting-Edge Translation Tool For Patent Documents, Oct. http://www.wipo.int/pressroom/en/articles/2016/article_0014.html
  • Matthew D. Zeiler. 2012. ADADELTA: An Adaptive Learning Rate Method. arXiv preprint arXiv:1212.5701.
  • Julian Georg Zilly, Rupesh Kumar Srivastava, Jan Koutník, and Jürgen Schmidhuber. 2016. Recurrent Highway Networks. arXiv preprint arXiv:1607.03474.
  • Note: pre-trained Nematus models for 8 translation directions are available at http://statmt.org/rsennrich/wmt16_systems/