Big Bidirectional Insertion Representations for Documents.

NGT@EMNLP-IJCNLP 2019, pp. 194–198.

Abstract

The Insertion Transformer is well suited for long-form text generation due to its parallel generation capabilities, requiring $O(\log_2 n)$ generation steps to generate $n$ tokens. However, modeling long sequences is difficult, as there is more ambiguity captured in the attention mechanism. This work proposes the Big Bidirectional Insertion Representations for Documents (Big BIRD) …

Introduction
  • Insertion-based models (Stern et al., 2019; Welleck et al., 2019; Gu et al., 2019; Chan et al., 2019) have been introduced for text generation.
  • An autoregressive left-to-right model requires O(n) generation steps to generate n tokens, whereas the Insertion Transformer (Stern et al., 2019) and KERMIT (Chan et al., 2019), following a balanced binary tree policy, require only O(log2 n) generation steps to generate n tokens (see the sketch after this list).
  • This is especially important for long-form text generation, for example, document-level machine translation.
  • There are two primary methods to include context in a document-level machine translation model compared to a sentence-level translation model.
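The O(log2 n) step count follows from the balanced binary tree policy: at every generation step the model inserts the centermost missing token of each remaining gap, so all gaps are filled after roughly log2(n) rounds of parallel insertions. The snippet below is a minimal sketch of that counting argument, a simulation of the policy written for this summary rather than the authors' model code; the token list and function name are illustrative.

```python
# Minimal sketch (not the authors' code): simulate the balanced binary tree
# insertion policy. At each parallel step, the midpoint token of every
# remaining span is inserted, so n tokens need about ceil(log2(n + 1)) steps.
import math

def balanced_tree_steps(target):
    """Return the sequence of parallel insertion steps that rebuild `target`."""
    spans = [(0, len(target))]        # half-open index ranges still to fill
    steps = []
    while spans:
        inserted, next_spans = [], []
        for lo, hi in spans:
            mid = (lo + hi) // 2      # centermost position of this span
            inserted.append(target[mid])
            if lo < mid:
                next_spans.append((lo, mid))
            if mid + 1 < hi:
                next_spans.append((mid + 1, hi))
        steps.append(inserted)        # all insertions in one step happen in parallel
        spans = next_spans
    return steps

tokens = ["the", "cat", "sat", "on", "the", "mat", "."]
for i, step in enumerate(balanced_tree_steps(tokens), 1):
    print(f"step {i}: insert {step}")
# 7 tokens are produced in ceil(log2(7 + 1)) = 3 parallel steps.
print(math.ceil(math.log2(len(tokens) + 1)))
```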
Highlights
  • Insertion-based models (Stern et al, 2019; Welleck et al, 2019; Gu et al, 2019; Chan et al, 2019) have been introduced for text generation
  • We present Big Bidirectional Insertion Representations for Documents (Big BIRD)
  • The Big BIRD model is as described in Section 2, and the baseline Insertion Transformer model has exactly the same configurations except without sentence-positional embeddings
  • We presented Big BIRD, an adaptation of the Insertion Transformer to document-level translation
  • In addition to a large context window, Big BIRD uses sentence-positional embeddings to directly capture sentence alignment between source and target documents (illustrated in the sketch after this list). We show both quantitatively and qualitatively the promise of Big BIRD, with a +4.3 BLEU improvement over the baseline model and examples where Big BIRD achieves better translation quality via sentence alignment
  • We believe Big BIRD is a promising direction for document-level understanding and generation
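The sentence-positional embeddings mentioned above can be pictured as a third learned embedding table, indexed by which sentence a token belongs to, added on top of the usual word and token-position embeddings; because the i-th source sentence and the i-th target sentence share the same index, the model receives a direct signal for sentence alignment. The sketch below illustrates this idea as a hypothetical PyTorch layer written for this summary, with made-up sizes; it is not the authors' implementation.

```python
# Hypothetical sketch of sentence-positional embeddings (not the authors' code):
# each token representation is the sum of its word embedding, its token-position
# embedding within the document, and an embedding of its sentence index.
import torch
import torch.nn as nn

class DocumentEmbedding(nn.Module):
    def __init__(self, vocab_size, max_tokens, max_sentences, d_model):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_tokens, d_model)      # token position in the document
        self.sent = nn.Embedding(max_sentences, d_model)  # index of the containing sentence

    def forward(self, token_ids, sentence_ids):
        # token_ids, sentence_ids: [batch, length]
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        return self.tok(token_ids) + self.pos(positions) + self.sent(sentence_ids)

# Usage: a ten-token document with two sentences (sentence 0 = first four tokens).
emb = DocumentEmbedding(vocab_size=32000, max_tokens=1024, max_sentences=128, d_model=512)
token_ids = torch.randint(0, 32000, (1, 10))
sentence_ids = torch.tensor([[0, 0, 0, 0, 1, 1, 1, 1, 1, 1]])
print(emb(token_ids, sentence_ids).shape)  # torch.Size([1, 10, 512])
```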
Methods
  • The authors experiment with the WMT'19 English→German document-level translation task (Barrault et al., 2019).
  • The training dataset consists of parallel document-level data (Europarl, Rapid, News-Commentary) and parallel sentence-level data (WikiTitles, Common Crawl, Paracrawl).
  • The authors' baseline Insertion Transformer model is given prior knowledge of the number of source sentences in the document.
  • All models were trained with the SM3 optimizer (Anil et al., 2019) with momentum 0.9, learning rate 0.1, and a quadratic learning-rate warm-up schedule with 10k warm-up steps (a sketch of this schedule follows this list).
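A quadratic warm-up is commonly implemented by scaling the peak learning rate by (step / warmup_steps)^2 until the warm-up ends and holding it constant afterwards; the exact formula the authors used is not given in this summary, so the sketch below assumes that common form.

```python
# Sketch of a quadratic learning-rate warm-up, assuming the common form
# lr(step) = peak_lr * min(step / warmup_steps, 1)^2; the authors' exact
# schedule may differ.
def learning_rate(step, peak_lr=0.1, warmup_steps=10_000):
    """Quadratic ramp to peak_lr over warmup_steps, constant afterwards."""
    warmup_fraction = min(step / warmup_steps, 1.0)
    return peak_lr * warmup_fraction ** 2

for step in (0, 2_500, 5_000, 10_000, 50_000):
    print(step, learning_rate(step))
# 0 -> 0.0, 2500 -> 0.00625, 5000 -> 0.025, 10000 -> 0.1, 50000 -> 0.1
```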
Conclusion
  • The authors presented Big BIRD, an adaptation of the Insertion Transformer to document-level translation.
  • In addition to a large context window, Big BIRD uses sentence-positional embeddings to directly capture sentence alignment between source and target documents.
  • The authors believe Big BIRD is a promising direction for document-level understanding and generation.
Tables
  • Table 1: WMT'19 English→German Document-Level Translation
  • Table 2: An example where the Insertion Transformer gets confused with sentence alignment: it maps one sentence from the source into two sentences in the translation and loses semantic accuracy. When given sentence alignment explicitly, i.e. Big BIRD, it translates the sentence coherently
References
  • Rohan Anil, Vineet Gupta, Tomer Koren, and Yoram Singer. 2019. Memory-Efficient Adaptive Optimization for Large-Scale Learning. In arXiv.
  • Loïc Barrault, Ondřej Bojar, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Philipp Koehn, Shervin Malmasi, Christof Monz, Mathias Müller, Santanu Pal, Matt Post, and Marcos Zampieri. 2019. Findings of the 2019 Conference on Machine Translation. In ACL.
  • William Chan, Nikita Kitaev, Kelvin Guu, Mitchell Stern, and Jakob Uszkoreit. 2019. KERMIT: Generative Insertion-Based Modeling for Sequences. In arXiv.
  • Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. In EMNLP.
  • Jiatao Gu, Qi Liu, and Kyunghyun Cho. 2019. Insertion-based Decoding with Automatically Inferred Generation Order. In arXiv.
  • Hany Hassan, Anthony Aue, Chang Chen, Vishal Chowdhary, Jonathan Clark, Christian Federmann, Xuedong Huang, Marcin Junczys-Dowmunt, William Lewis, Mu Li, Shujie Liu, Tie-Yan Liu, Renqian Luo, Arul Menezes, Tao Qin, Frank Seide, Xu Tan, Fei Tian, Lijun Wu, Shuangzhi Wu, Yingce Xia, Dongdong Zhang, Zhirui Zhang, and Ming Zhou. 2018. Achieving Human Parity on Automatic Chinese to English News Translation. In arXiv.
  • Marcin Junczys-Dowmunt. 2019. Microsoft Translator at WMT 2019: Towards Large-Scale Document-Level Neural Machine Translation. In WMT.
  • Taku Kudo and John Richardson. 2018. SentencePiece: A Simple and Language Independent Subword Tokenizer and Detokenizer for Neural Text Processing. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 66–71.
  • Samuel Läubli, Rico Sennrich, and Martin Volk. 2018. Has Machine Translation Achieved Human Parity? A Case for Document-level Evaluation. In EMNLP.
  • Sameen Maruf and Gholamreza Haffari. 2018. Document Context Neural Machine Translation with Memory Networks. In ACL.
  • Matt Post. 2018. A Call for Clarity in Reporting BLEU Scores. In WMT.
  • Mitchell Stern, William Chan, Jamie Kiros, and Jakob Uszkoreit. 2019. Insertion Transformer: Flexible Sequence Generation via Insertion Operations. In ICML.
  • Ilya Sutskever, Oriol Vinyals, and Quoc Le. 2014. Sequence to Sequence Learning with Neural Networks. In NIPS.
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention Is All You Need. In NIPS.
  • Sean Welleck, Kianté Brantley, Hal Daumé III, and Kyunghyun Cho. 2019. Non-Monotonic Sequential Text Generation. In ICML.