Uncertainty Aware Semantic Augmentation for Neural Machine Translation

EMNLP 2020, pp. 2724–2735.

Abstract

As a sequence-to-sequence generation task, neural machine translation (NMT) naturally contains intrinsic uncertainty, where a single sentence in one language has multiple valid counterparts in the other. However, the dominant methods for NMT only observe one of them from the parallel corpora for the model training but have to deal with ad…

Introduction
  • In recent years, neural machine translation (NMT) has demonstrated state-of-the-art performance on many language pairs with advanced architectures and large-scale data (Bahdanau et al., 2015; Wu et al., 2016; Vaswani et al., 2017).
  • The NMT model should be trained under the guidance of the same latent semantics that it will access at inference time.
  • In seminal work, variational models (Blunsom et al., 2008; Zhang et al., 2016; Shah and Barber, 2018) introduce a continuous latent variable to serve as a global semantic signal guiding the generation of target translations (the objective is sketched after this list).
  • Although these methods yield notable results, they are still limited to one-to-one parallel sentence pairs.
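For reference, these variational objectives take the standard evidence-lower-bound form; a minimal sketch in generic notation (following Zhang et al., 2016), with source x, target y, and latent semantic variable z:

```latex
\log p_{\theta}(y \mid x)
  \;\geq\;
  \mathbb{E}_{q_{\phi}(z \mid x,\, y)}\!\left[\log p_{\theta}(y \mid x,\, z)\right]
  - \mathrm{KL}\!\left(q_{\phi}(z \mid x,\, y) \,\middle\|\, p_{\theta}(z \mid x)\right)
```

Maximizing this bound pulls the approximate posterior toward a prior conditioned only on the source, which is how the latent variable can act as a global semantic signal at inference time.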
Highlights
  • In recent years, neural machine translation (NMT) has demonstrated state-of-the-art performance on many language pairs with advanced architectures and large-scale data (Bahdanau et al., 2015; Wu et al., 2016; Vaswani et al., 2017).
  • Typically there are several semantically-equivalent source sentences that can be translated to the same target sentence, but the model only observes one of them at training time.
  • We model the inherent uncertainty by representing multiple source sentences as a closed semantic region, and use this semantic information to enhance NMT models so that diverse literal expressions are intuitively supported by their underlying semantics (a minimal sketch of such a region follows this list).
  • We present an uncertainty-aware semantic augmentation method to bridge the discrepancy of the data distribution between the training and the inference phases for dominant NMT models.
  • Extensive experiments on various translation tasks reveal that our approach significantly outperforms the strong baselines and the existing methods.
  • We first synthesize a proper number of source sentences for each target sentence, to play the role of intrinsic uncertainty, via controllable sampling.
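This page does not reproduce how the closed semantic region is parameterized. As a minimal sketch, assuming a diagonal Gaussian fitted over the sentence encodings of the k synthetic sources (the Gaussian prior is suggested by the conclusion below; the helper names here are hypothetical):

```python
import numpy as np

def semantic_region(encodings: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Fit a diagonal Gaussian over the k x d matrix of sentence encodings
    of the synthetic source sentences for one target sentence; the Gaussian
    stands in for the 'closed semantic region'."""
    mu = encodings.mean(axis=0)
    sigma = encodings.std(axis=0) + 1e-6  # keep the region non-degenerate
    return mu, sigma

def sample_semantics(mu: np.ndarray, sigma: np.ndarray,
                     rng: np.random.Generator) -> np.ndarray:
    """Reparameterized draw z = mu + sigma * eps from the region, so the
    same kind of latent semantics is available in training and inference."""
    eps = rng.standard_normal(mu.shape)
    return mu + sigma * eps
```

A draw such as `z = sample_semantics(*semantic_region(enc), np.random.default_rng(0))` could then be fed to the decoder as the semantic signal; the paper's actual parameterization may differ.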
Methods
  • For En→De/NIST Zh→En, each model was run 4 times and the authors reported the average BLEU, while each model was trained only once on the larger WMT18 Zh→En dataset.
  • Translation examples under the same meaning (cf. Table 8): 我认为我们可以重新启动这些品牌，而且现在时间正合适。 ("I think we can restart these brands, and the time is right.")
  • "I think we can relaunch these brands, and now is the right time."
  • 我认为我们可以重新上新这些品牌，而且现在时间正合适。 ("I think we can renew these brands, and now is the right time.")
  • The authors argue that reasonable uncertainties can be mined via the controllable sampling strategy (sketched below).
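The paper's exact formulation of controllable sampling is not reproduced on this page. One plausible parameterization with the same two limiting cases listed in Table 1 is temperature-scaled sampling, where the control tau → 0 recovers greedy search and tau = 1 recovers plain multinomial sampling (the temperature form is an assumption, not necessarily the authors' definition):

```python
import numpy as np

def controllable_sample(logits: np.ndarray, tau: float,
                        rng: np.random.Generator) -> int:
    """Pick the next token id from a vocabulary-sized vector of logits.

    tau -> 0 degenerates to greedy search (argmax); tau = 1 is plain
    multinomial sampling -- the two special cases noted in Table 1.
    """
    if tau < 1e-6:
        return int(np.argmax(logits))  # greedy search
    scaled = logits / tau
    scaled -= scaled.max()             # stabilize the softmax
    probs = np.exp(scaled)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))
```

Decoding the target-to-source model several times with different draws would then yield the multiple synthetic source sentences described above.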
Results
  • Table 2 shows the results on Zh→En tasks.
  • For NIST Zh→En, the authors first compare their approach with the TRANSFORMER model on which it is built.
  • Their method brings substantial improvements, achieving a notable BLEU gain over this baseline.
  • Representative rows from Table 2 (NIST Zh→En BLEU; other columns N/A): Cheng et al. (2019), 28.34; Gao et al. (2019) with the Transformer big model, 29.70; Cheng et al. (2019) with the Transformer big model, 30.01.
Conclusion
  • The authors present an uncertainty-aware semantic augmentation method to bridge the discrepancy of the data distribution between the training and the inference phases for dominant NMT models.
  • While the authors showed that uncertainty-aware semantic augmentation with Gaussian priors is effective, more work is required to investigate if such an approach will be successful for more sophisticated priors.
  • Learning universal representations among semantically-equivalent source and target sentences (Wei et al., 2020) can complement the proposed method.
Tables
  • Table 1: Multinomial sampling and greedy search, as special cases covered by controllable sampling.
  • Table 2: BLEU [%] on Zh→En tasks. † denotes replicated results using the tensor2tensor (T2T) toolkit. Both the training time and the number of parameters refer to the NIST Zh→En task. ‡ The time spent synthesizing pseudo data is included. § Both the time spent generating synthetic data and training the models is included.
  • Table 3: BLEU [%] on En→De and En→Fr translation tasks. ‡ denotes our replicated results.
  • Table 4: Effect of various numbers of synthetic source sentences on validation sets.
  • Table 5: Effect of the sampling control parameter on validation sets, with respect to BLEU scores as well as edit distances between synthetic and real source sentences. "BS-3" indicates that synthetic sentences are generated by beam search with a beam size of 3 (a minimal edit-distance sketch follows this list).
  • Table 6: Ablation study on the WMT18 Zh→En validation set. "✓" means the loss function is included in the training objective.
  • Table 7: Effect of different methods to generate multiple synthetic data. Experiments are conducted on the WMT18 Zh→En validation set.
  • Table 8: Translation examples of TRANSFORMERsyn (TRANSsyn for short) and our method on various inputs under the same meaning on WMT18 Zh→En.
  • Table 9: BLEU scores [%] on WMT16 En→De test sets (newstest2014–2016) with monolingual data. Wang et al. (2019) used 2M extra back-translated data and Edunov et al. (2018) used 226M German monolingual sentences during back-translation.
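For reference on the Table 5 metric, a minimal word-level Levenshtein edit distance (the standard definition, not code from the paper):

```python
def edit_distance(a: list[str], b: list[str]) -> int:
    """Word-level Levenshtein distance between two tokenized sentences."""
    # dp[j] holds the distance between a[:i] and b[:j] as i advances.
    dp = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        prev, dp[0] = dp[0], i
        for j, wb in enumerate(b, start=1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,            # delete wa
                        dp[j - 1] + 1,        # insert wb
                        prev + (wa != wb))    # substitute or match
            prev = cur
    return dp[-1]
```

Larger distances between a synthetic source and the real source indicate more diverse synthetic sentences.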
Related Work
Funding
  • This work is supported by the National Key Research and Development Programs under Grants No. 2017YFB0803301, No. 2016YFB0801003, and No. 2018YFB1403202.
References
  • Mikel Artetxe, Gorka Labaka, Eneko Agirre, and Kyunghyun Cho. 2018. Unsupervised neural machine translation. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings.
  • Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings.
  • Phil Blunsom, Trevor Cohn, and Miles Osborne. 2008. A discriminative latent variable model for statistical machine translation. In Proceedings of ACL-08: HLT, pages 200–208, Columbus, Ohio. Association for Computational Linguistics.
  • Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew Dai, Rafal Jozefowicz, and Samy Bengio. 2016. Generating sentences from a continuous space. In Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning, pages 10–21, Berlin, Germany. Association for Computational Linguistics.
  • Yong Cheng, Lu Jiang, and Wolfgang Macherey. 2019. Robust neural machine translation with doubly adversarial inputs. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4324–4333, Florence, Italy. Association for Computational Linguistics.
  • Yong Cheng, Zhaopeng Tu, Fandong Meng, Junjie Zhai, and Yang Liu. 2018. Towards robust neural machine translation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1756–1766, Melbourne, Australia. Association for Computational Linguistics.
  • Yong Cheng, Wei Xu, Zhongjun He, Wei He, Hua Wu, Maosong Sun, and Yang Liu. 2016. Semi-supervised learning for neural machine translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1965–1974, Berlin, Germany. Association for Computational Linguistics.
  • Sufeng Duan, Hai Zhao, Dongdong Zhang, and Rui Wang. 2020. Syntax-aware data augmentation for neural machine translation. arXiv:2004.14200.
  • Sergey Edunov, Myle Ott, Michael Auli, and David Grangier. 2018. Understanding back-translation at scale. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 489–500, Brussels, Belgium. Association for Computational Linguistics.
  • Marzieh Fadaee, Arianna Bisazza, and Christof Monz. 2017. Data augmentation for low-resource neural machine translation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 567–573, Vancouver, Canada. Association for Computational Linguistics.
  • Marzieh Fadaee and Christof Monz. 2018. Back-translation sampling by targeting difficult words in neural machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 436–446, Brussels, Belgium. Association for Computational Linguistics.
  • Fei Gao, Jinhua Zhu, Lijun Wu, Yingce Xia, Tao Qin, Xueqi Cheng, Wengang Zhou, and Tie-Yan Liu. 2019. Soft contextual data augmentation for neural machine translation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5539–5544, Florence, Italy. Association for Computational Linguistics.
  • Di He, Yingce Xia, Tao Qin, Liwei Wang, Nenghai Yu, Tie-Yan Liu, and Wei-Ying Ma. 2016. Dual learning for machine translation. In Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5-10, 2016, Barcelona, Spain, pages 820–828.
  • Vu Cong Duy Hoang, Philipp Koehn, Gholamreza Haffari, and Trevor Cohn. 2018. Iterative back-translation for neural machine translation. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation, pages 18–24, Melbourne, Australia. Association for Computational Linguistics.
  • Kenji Imamura, Atsushi Fujita, and Eiichiro Sumita. 2018. Enhancement of encoder and attention using target monolingual corpora in neural machine translation. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation, pages 55–63, Melbourne, Australia. Association for Computational Linguistics.
  • Mohit Iyyer, Varun Manjunatha, Jordan Boyd-Graber, and Hal Daumé III. 2015. Deep unordered composition rivals syntactic methods for text classification. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1681–1691, Beijing, China. Association for Computational Linguistics.
  • Alex Kendall, Vijay Badrinarayanan, and Roberto Cipolla. 2017. Bayesian SegNet: Model uncertainty in deep convolutional encoder–decoder architectures for scene understanding. In British Machine Vision Conference 2017, BMVC 2017, London, UK, September 4-7, 2017.
  • Alex Kendall and Yarin Gal. 2017. What uncertainties do we need in Bayesian deep learning for computer vision? In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4-9 December 2017, Long Beach, CA, USA, pages 5574–5584.
  • Diederik P. Kingma, Shakir Mohamed, Danilo Jimenez Rezende, and Max Welling. 2014. Semi-supervised learning with deep generative models. In Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, December 8-13 2014, Montreal, Quebec, Canada, pages 3581–3589.
  • Sosuke Kobayashi. 2018. Contextual augmentation: Data augmentation by words with paradigmatic relations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pages 452–457, New Orleans, Louisiana. Association for Computational Linguistics.
  • Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin, and Evan Herbst. 2007. Moses: Open source toolkit for statistical machine translation. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions, pages 177–180, Prague, Czech Republic. Association for Computational Linguistics.
  • Guillaume Lample, Alexis Conneau, Ludovic Denoyer, and Marc’Aurelio Ranzato. 2018. Unsupervised machine translation using monolingual corpora only. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings.
  • Guanlin Li, Lemao Liu, Guoping Huang, Conghui Zhu, and Tiejun Zhao. 2019. Understanding data augmentation in neural machine translation: Two perspectives towards generalization. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5693–5699, Hong Kong, China. Association for Computational Linguistics.
  • Xing Niu, Michael Denkowski, and Marine Carpuat. 2018. Bi-directional neural machine translation with synthetic parallel data. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation, pages 84–91, Melbourne, Australia. Association for Computational Linguistics.
  • Myle Ott, Michael Auli, David Grangier, and Marc’Aurelio Ranzato. 2018. Analyzing uncertainty in neural machine translation. In Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2018, pages 3953–3962.
  • Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016a. Improving neural machine translation models with monolingual data. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 86–96, Berlin, Germany. Association for Computational Linguistics.
  • Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016b. Neural machine translation of rare words with subword units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1715–1725, Berlin, Germany. Association for Computational Linguistics.
  • Harshil Shah and David Barber. 2018. Generative neural machine translation. In Advances in Neural Information Processing Systems, pages 1346–1355.
  • Aili Shen, Daniel Beck, Bahar Salehi, Jianzhong Qi, and Timothy Baldwin. 2019. Modelling uncertainty in collaborative document quality assessment. In Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019), pages 191–201, Hong Kong, China. Association for Computational Linguistics.
  • Huihsin Tseng, Pichuan Chang, Galen Andrew, Daniel Jurafsky, and Christopher Manning. 2005. A conditional random field word segmenter for SIGHAN Bakeoff 2005. In Proceedings of the Fourth SIGHAN Workshop on Chinese Language Processing.
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4-9 December 2017, Long Beach, CA, USA, pages 5998–6008.
  • Shuo Wang, Yang Liu, Chao Wang, Huanbo Luan, and Maosong Sun. 2019. Improving back-translation with uncertainty-based confidence estimation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 791–802, Hong Kong, China. Association for Computational Linguistics.
  • Xinyi Wang, Hieu Pham, Zihang Dai, and Graham Neubig. 2018. SwitchOut: an efficient data augmentation algorithm for neural machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 856–861, Brussels, Belgium. Association for Computational Linguistics.
  • Xiangpeng Wei, Yue Hu, Luxi Xing, Yipeng Wang, and Li Gao. 2019. Translating with bilingual topic knowledge for neural machine translation. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, pages 7257–7264. AAAI Press.
  • Xiangpeng Wei, Yue Hu, Rongxiang Weng, Luxi Xing, Heng Yu, and Weihua Luo. 2020. On learning universal representations across languages. CoRR, abs/2007.15960.
  • Lijun Wu, Fei Tian, Tao Qin, Jianhuang Lai, and Tie-Yan Liu. 2018. A study of reinforcement learning for neural machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3612–3621, Brussels, Belgium. Association for Computational Linguistics.
  • Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, and Klaus Macherey. 2016. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv:1609.08144.
  • Mengzhou Xia, Xiang Kong, Antonios Anastasopoulos, and Graham Neubig. 2019. Generalized data augmentation for low-resource translation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5786–5796, Florence, Italy. Association for Computational Linguistics.
  • Yijun Xiao and William Yang Wang. 2018. Quantifying uncertainties in natural language processing tasks. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, Honolulu, Hawaii, USA, January 27 - February 1, 2019, pages 7322–7329.
  • Ziang Xie, Sida I. Wang, Jiwei Li, Daniel Levy, Aiming Nie, Dan Jurafsky, and Andrew Y. Ng. 2017. Data noising as smoothing in neural network language models. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings.
  • Mingming Yang, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Min Zhang, and Tiejun Zhao. 2019. Sentence-level agreement for neural machine translation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3076–3082, Florence, Italy. Association for Computational Linguistics.
  • Poorya Zaremoodi and Gholamreza Haffari. 2018. Incorporating syntactic uncertainty in neural machine translation with a forest-to-sequence model. In Proceedings of the 27th International Conference on Computational Linguistics, pages 1421–1429, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
  • Biao Zhang, Deyi Xiong, Jinsong Su, Hong Duan, and Min Zhang. 2016. Variational neural machine translation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 521–530, Austin, Texas. Association for Computational Linguistics.
  • Xiang Zhang, Shizhu He, Kang Liu, and Jun Zhao. 2019a. AdaNSP: Uncertainty-driven adaptive decoding in neural semantic parsing. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4265–4270, Florence, Italy. Association for Computational Linguistics.
  • Xuchao Zhang, Fanglan Chen, Chang-Tien Lu, and Naren Ramakrishnan. 2019b. Mitigating uncertainty in document classification. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 3126–3136, Minneapolis, Minnesota. Association for Computational Linguistics.
  • Zhirui Zhang, Shujie Liu, Mu Li, Ming Zhou, and Enhong Chen. 2018. Joint training for neural machine translation models with monolingual data. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018, pages 555–562.
  • Tiancheng Zhao, Ran Zhao, and Maxine Eskenazi. 2017. Learning discourse-level diversity for neural dialog models using conditional variational autoencoders. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 654–664, Vancouver, Canada. Association for Computational Linguistics.