Seq2Edits: Sequence Transduction Using Span-level Edit Operations

EMNLP 2020, pp. 5147–5159


Abstract

We propose Seq2Edits, an open-vocabulary approach to sequence editing for natural language processing (NLP) tasks with a high degree of overlap between input and output texts. In this approach, each sequence-to-sequence transduction is represented as a sequence of edit operations, where each operation either replaces an entire source span with target tokens or keeps it unchanged.
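To make this representation concrete, below is a minimal Python sketch, not the authors' implementation; the (tag, span_end, replacement) layout and the tag names are illustrative assumptions:

    from typing import List, Optional, Tuple

    # One edit = (tag, source span end, replacement). A replacement of None means
    # the span is copied unchanged (a "self" edit); the tag labels the edit type.
    Edit = Tuple[str, int, Optional[List[str]]]

    def apply_edits(source: List[str], edits: List[Edit]) -> List[str]:
        """Apply a left-to-right sequence of span-level edits to the source tokens."""
        target, start = [], 0
        for tag, span_end, replacement in edits:
            span = source[start:span_end]
            target.extend(span if replacement is None else replacement)
            start = span_end
        return target

    source = "After a few week I saw him .".split()
    edits = [
        ("SELF", 3, None),            # keep "After a few"
        ("NOUN:NUM", 4, ["weeks"]),   # replace "week" with "weeks"
        ("SELF", 8, None),            # keep "I saw him ."
    ]
    print(" ".join(apply_edits(source, edits)))  # -> After a few weeks I saw him .

Because long unchanged spans collapse into single "keep" operations, the edit sequence stays much shorter than the full target sequence when input and output overlap heavily.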

Introduction
  • Neural models that generate a target sequence conditioned on a source sequence were initially proposed for machine translation (MT) (Sutskever et al, 2014; Kalchbrenner and Blunsom, 2013; Bahdanau et al, 2015; Vaswani et al, 2017), but are used widely as a central component of a variety of NLP systems (e.g. Tan et al (2017); Chollampatt and Ng (2018)). Raffel et al (2019) argue that even problems that are traditionally not viewed from a sequence transduction perspective can benefit from massive pre-training when framed as a text-to-text problem.
  • Employing a full sequence model in these cases is often wasteful as most tokens are copied over from the input to the output
  • Another disadvantage of a full sequence model is that it does not provide an explanation for why it proposes a particular target sequence.
  • The authors apply their edit-operation-based model to five NLP tasks: text normalization, sentence fusion, sentence splitting & rephrasing, text simplification, and grammatical error correction (GEC).
  • The authors' model is competitive across all of these tasks, and improves the state of the art on text normalization (Sproat and Jaitly, 2016), sentence splitting & rephrasing (Botha et al, 2018), and the JFLEG test set (Napoles et al, 2017) for GEC
Highlights
  • Neural models that generate a target sequence conditioned on a source sequence were initially proposed for machine translation (MT) (Sutskever et al, 2014; Kalchbrenner and Blunsom, 2013; Bahdanau et al, 2015; Vaswani et al, 2017), but are used widely as a central component of a variety of natural language processing (NLP) systems (e.g. Tan et al (2017); Chollampatt and Ng (2018)). Raffel et al (2019) argue that even problems that are traditionally not viewed from a sequence transduction perspective can benefit from massive pre-training when framed as a text-to-text problem
  • We have presented a neural model that represents sequence transduction using span-based edit operations
  • We reported competitive results on five different NLP problems, improving the state of the art on text normalization, sentence splitting, and the JFLEG test set for grammatical error correction
  • We showed that our approach is 2.0-5.2 times faster than a full sequence model for grammatical error correction
  • The underlying neural model in Seq2Edits is as much of a black-box as a regular full sequence model
Methods
  • The authors evaluate the edit model on five NLP tasks:

    Text normalization for speech applications (Sproat and Jaitly, 2016) – converting number expressions such as “123” to their verbalizations (e.g. “one two three” or “one hundred twenty three”, etc.) depending on the context.

    Sentence fusion (Geva et al, 2019) – merging two independent sentences into a single coherent one (e.g. a sentence pair beginning "The author needs his spirit to be free.").
  • The authors' models are trained on packed examples with Adafactor (Shazeer and Stern, 2018) using the Tensor2Tensor (Vaswani et al, 2018) library.
  • The authors report results both with and without pre-training.
  • On the other tasks the authors use a minimum edit distance heuristic to find a token-level edit sequence and convert it to span-level edits by merging neighboring edits, as sketched below.
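The sketch below illustrates this step under stated assumptions: difflib's sequence matcher stands in for the minimum-edit-distance alignment, and the KEEP/CHANGE labels and function names are invented for illustration rather than taken from the paper:

    import difflib
    from typing import List, Tuple

    def token_level_edits(src: List[str], tgt: List[str]) -> List[Tuple[str, int, int, List[str]]]:
        """One entry per aligned region: ('KEEP' | 'CHANGE', src_start, src_end, replacement)."""
        sm = difflib.SequenceMatcher(a=src, b=tgt, autojunk=False)
        return [("KEEP" if op == "equal" else "CHANGE", i1, i2, tgt[j1:j2])
                for op, i1, i2, j1, j2 in sm.get_opcodes()]

    def merge_into_spans(edits: List[Tuple[str, int, int, List[str]]]) -> List[Tuple[int, int, List[str]]]:
        """Merge neighboring CHANGE regions into single span-level replacements."""
        spans: List[Tuple[int, int, List[str]]] = []
        for op, i1, i2, rep in edits:
            if op == "KEEP":
                continue
            if spans and spans[-1][1] == i1:      # adjacent to the previous change
                s, _, prev = spans[-1]
                spans[-1] = (s, i2, prev + rep)
            else:
                spans.append((i1, i2, rep))
        return spans

    src = "After a few week I saw him in the market .".split()
    tgt = "After a few weeks , I saw him at the market .".split()
    print(merge_into_spans(token_level_edits(src, tgt)))
    # -> [(3, 4, ['weeks', ',']), (7, 8, ['at'])]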
Results
  • The authors' span-constrained model achieves a recall of 52.4%, i.e. more than half of the non-self tags are classified correctly (28 tags).
Conclusion
  • The authors have presented a neural model that represents sequence transduction using span-based edit operations.
  • The authors reported competitive results on five different NLP problems, improving the state of the art on text normalization, sentence splitting, and the JFLEG test set for grammatical error correction.
  • The authors showed that the approach is 2.0-5.2 times faster than a full sequence model for grammatical error correction.
  • The authors' model can predict labels that explain each edit to improve the interpretability for the end-user.
  • The authors do not make any claim that Seq2Edits can provide insights into the internal mechanics of the neural model.
  • The underlying neural model in Seq2Edits is as much of a black-box as a regular full sequence model
Tables
  • Table 1: The “Base” and “Big” configurations
  • Table 2: Statistics for the task-specific training data. The I, J, and N variables are introduced in Sec. 2.1. Our subword-based systems use the implementation available in Tensor2Tensor (Vaswani et al, 2018) with a vocabulary size of 32K. The pre-training data is described in the text. See Appendix A for the full tag vocabularies
  • Table 3: Decoding parameters that are tuned on the respective development sets. Feature weight tuning refers to the λ-parameters in Sec. 2.2. We use the length normalization scheme from Wu et al (2016) with the parameter α (the scheme is written out after this list)
  • Table 4: Single model results. For metrics marked with “↑” (SARI, P(recision), R(ecall), F0.5) high scores are favorable, whereas the sentence error rate (SER) is marked with “↓” to indicate the preference for low values. Tuning refers to optimizing the decoding parameters listed in Table 3 on the development sets
  • Table 5: Sentence error rates on the English and Russian text normalization test sets of Sproat and Jaitly (2016). ∗: best system from Mansfield et al (2019) without access to gold semiotic class labels
  • Table 6: Sentence fusion results on the DiscoFuse (Geva et al, 2019) test set
  • Table 7: Sentence splitting results (Botha et al, 2018)
  • Table 8: Text simplification results
  • Table 9: Single model results for grammatical error correction
  • Table 10: Ensemble results for grammatical error correction. Our full sequence baseline achieves 68.2 F0.5 on BEA-test, 63.8 F0.5 on CoNLL-14, and 62.4 GLEU on JFLEG-test
  • Table 11: CPU decoding speeds without iterative refinement on BEA-dev averaged over three runs. Speed-ups compared to the full sequence baseline are in parentheses
  • Table 12: Partially constraining the decoder with oracle tags and/or span positions (no iterative refinement)
  • Table 13: Tagging accuracy on BEA-dev (no iterative refinement)
  • Table 14: Semiotic class tags for text normalization copied verbatim from the Table 3 caption of Sproat and Jaitly (2016)
  • Table 15: DiscoFuse discourse types. The type descriptions are copied verbatim from Table 7 of Geva et al (2019). The SINGLE and PAIR prefixes indicate whether the input is a single sentence or two consecutive sentences
  • Table 16: ERRANT tag vocabulary for grammatical error correction copied verbatim from Table 2 of Bryant et al (2017)
  • Table 17: Sentence fusion examples from the DiscoFuse dataset (Geva et al, 2019)
  • Table 18: English text normalization examples from the dataset provided by Sproat and Jaitly (2016)
  • Table 19: Grammatical error correction examples from BEA-dev (Bryant et al, 2019)
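As context for the α parameter mentioned for Table 3 above, the length normalization of Wu et al (2016) is commonly written as follows (reproduced from that paper's formulation; whether Seq2Edits applies it in exactly this form is not stated here):

    % Wu et al. (2016) length penalty, parameterized by \alpha
    % (\alpha = 0 recovers the unnormalized log-probability).
    s(Y, X) = \frac{\log P(Y \mid X)}{\mathrm{lp}(Y)},
    \qquad
    \mathrm{lp}(Y) = \frac{(5 + |Y|)^{\alpha}}{(5 + 1)^{\alpha}}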
References
  • Abhijeet Awasthi, Sunita Sarawagi, Rasna Goyal, Sabyasachi Ghosh, and Vihari Piratla. 2019. Parallel iterative edit models for local sequence transduction. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 4260–4270, Hong Kong, China. Association for Computational Linguistics.
  • Dzmitry Bahdanau, Kyung Hyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In 3rd International Conference on Learning Representations, ICLR 2015.
  • Jan A. Botha, Manaal Faruqui, John Alex, Jason Baldridge, and Dipanjan Das. 2018. Learning to split and rephrase from Wikipedia edit history. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 732–737, Brussels, Belgium. Association for Computational Linguistics.
  • Christopher Bryant, Mariano Felice, Øistein E. Andersen, and Ted Briscoe. 2019. The BEA-2019 shared task on grammatical error correction. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 52–75, Florence, Italy. Association for Computational Linguistics.
  • Christopher Bryant, Mariano Felice, and Ted Briscoe. 2017. Automatic annotation and evaluation of error types for grammatical error correction. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 793–805, Vancouver, Canada. Association for Computational Linguistics.
  • Yen-Chun Chen and Mohit Bansal. 2018. Fast abstractive summarization with reinforce-selected sentence rewriting. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 675–686, Melbourne, Australia. Association for Computational Linguistics.
  • Yo Joong Choe, Jiyeon Ham, Kyubyong Park, and Yeoil Yoon. 2019. A neural grammatical error correction system built on better pre-training and sequential transfer learning. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 213–227, Florence, Italy. Association for Computational Linguistics.
  • Shamil Chollampatt and Hwee Tou Ng. 2018. A multilayer convolutional encoder-decoder neural network for grammatical error correction. In Thirty-Second AAAI Conference on Artificial Intelligence.
  • Ronan Collobert and Jason Weston. 2008. A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the 25th International Conference on Machine Learning, ICML '08, pages 160–167, New York, NY, USA. Association for Computing Machinery.
  • Daniel Dahlmeier and Hwee Tou Ng. 2012. Better evaluation for grammatical error correction. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 568–572, Montreal, Canada. Association for Computational Linguistics.
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota. Association for Computational Linguistics.
  • Daxiang Dong, Hua Wu, Wei He, Dianhai Yu, and Haifeng Wang. 2015. Multi-task learning for multiple language translation. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1723–1732, Beijing, China. Association for Computational Linguistics.
  • Yue Dong, Zichao Li, Mehdi Rezagholizadeh, and Jackie Chi Kit Cheung. 2019. EditNTS: An neural programmer-interpreter model for sentence simplification through explicit editing. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3393–3402, Florence, Italy. Association for Computational Linguistics.
  • Micha Elsner, Andrea D Sims, Alexander Erdmann, Antonio Hernandez, Evan Jaffe, Lifeng Jin, Martha Booker Johnson, Shuan Karim, David L King, Luana Lamberti Nunes, et al. 2019. Modeling morphological learning, typology, and change: What can the neural sequence-to-sequence framework contribute? Journal of Language Modelling, 7(1):53–98.
  • Mariano Felice, Christopher Bryant, and Ted Briscoe. 2016. Automatic extraction of learner errors in ESL sentences using linguistically enhanced alignments. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 825–835, Osaka, Japan. The COLING 2016 Organizing Committee.
  • Tao Ge, Furu Wei, and Ming Zhou. 2018. Reaching human-level performance in automatic grammatical error correction: An empirical study. arXiv preprint arXiv:1807.01270.
  • Mor Geva, Eric Malmi, Idan Szpektor, and Jonathan Berant. 2019. DiscoFuse: A large-scale dataset for discourse-based sentence fusion. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 3443–3455, Minneapolis, Minnesota. Association for Computational Linguistics.
  • Roman Grundkiewicz, Marcin Junczys-Dowmunt, and Kenneth Heafield. 2019. Neural grammatical error correction systems with unsupervised pre-training on synthetic data. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 252–263, Florence, Italy. Association for Computational Linguistics.
  • Jiatao Gu, Qi Liu, and Kyunghyun Cho. 2019a. Insertion-based decoding with automatically inferred generation order. Transactions of the Association for Computational Linguistics, 7:661–676.
  • Jiatao Gu, Zhengdong Lu, Hang Li, and Victor O.K. Li. 2016. Incorporating copying mechanism in sequence-to-sequence learning. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1631–1640, Berlin, Germany. Association for Computational Linguistics.
  • Jiatao Gu, Changhan Wang, and Junbo Zhao. 2019b. Levenshtein transformer. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems 32, pages 11181–11191. Curran Associates, Inc.
  • Caglar Gulcehre, Sungjin Ahn, Ramesh Nallapati, Bowen Zhou, and Yoshua Bengio. 2016. Pointing the unknown words. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 140–149, Berlin, Germany. Association for Computational Linguistics.
  • Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778.
  • Robin Jia and Percy Liang. 2016. Data recombination for neural semantic parsing. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 12–22, Berlin, Germany. Association for Computational Linguistics.
  • Nal Kalchbrenner and Phil Blunsom. 2013. Recurrent continuous translation models. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 1700–1709, Seattle, Washington, USA. Association for Computational Linguistics.
  • Abdul Khan, Subhadarshi Panda, Jia Xu, and Lampros Flokas. 2018. Hunter NMT system for WMT18 biomedical translation task: Transfer learning in neural machine translation. In Proceedings of the Third Conference on Machine Translation: Shared Task Papers, pages 655–661, Belgium, Brussels. Association for Computational Linguistics.
  • Shun Kiyono, Jun Suzuki, Masato Mita, Tomoya Mizumoto, and Kentaro Inui. 2019. An empirical study of incorporating pseudo data into grammatical error correction. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 1236–1242, Hong Kong, China. Association for Computational Linguistics.
  • Jared Lichtarge, Chris Alberti, Shankar Kumar, Noam Shazeer, Niki Parmar, and Simon Tong. 2019. Corpora generation for grammatical error correction. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 3291–3301, Minneapolis, Minnesota. Association for Computational Linguistics.
  • Minh-Thang Luong, Quoc V Le, Ilya Sutskever, Oriol Vinyals, and Lukasz Kaiser. 2015. Multi-task sequence to sequence learning. arXiv preprint arXiv:1511.06114.
  • Jonathan Mallinson, Aliaksei Severyn, Eric Malmi, and Guillermo Garrido. 2020. Felix: Flexible text editing through tagging and insertion. arXiv preprint arXiv:2003.10687.
  • Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, and Aliaksei Severyn. 2019. Encode, tag, realize: High-precision text editing. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5054–5065, Hong Kong, China. Association for Computational Linguistics.
  • Courtney Mansfield, Ming Sun, Yuzong Liu, Ankur Gandhe, and Björn Hoffmeister. 2019. Neural text normalization with subword units. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Industry Papers), pages 190–196, Minneapolis, Minnesota. Association for Computational Linguistics.
  • Tomoya Mizumoto, Yuta Hayashibe, Mamoru Komachi, Masaaki Nagata, and Yuji Matsumoto. 2012. The effect of learner corpus size in grammatical error correction of ESL writings. In Proceedings of COLING 2012: Posters, pages 863–872, Mumbai, India. The COLING 2012 Organizing Committee.
  • Ramesh Nallapati, Bowen Zhou, Cicero dos Santos, Caglar Gulcehre, and Bing Xiang. 2016. Abstractive text summarization using sequence-to-sequence RNNs and beyond. In Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning, pages 280–290, Berlin, Germany. Association for Computational Linguistics.
  • Courtney Napoles, Keisuke Sakaguchi, and Joel Tetreault. 2017. JFLEG: A fluency corpus and benchmark for grammatical error correction. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pages 229–234, Valencia, Spain. Association for Computational Linguistics.
  • Hwee Tou Ng, Siew Mei Wu, Ted Briscoe, Christian Hadiwinoto, Raymond Hendy Susanto, and Christopher Bryant. 2014. The CoNLL-2014 shared task on grammatical error correction. In Proceedings of the Eighteenth Conference on Computational Natural Language Learning: Shared Task, pages 1–14, Baltimore, Maryland. Association for Computational Linguistics.
  • Robert Östling and Jörg Tiedemann. 2017. Neural machine translation for low-resource languages. arXiv preprint arXiv:1708.05729.
  • Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J Liu. 2019. Exploring the limits of transfer learning with a unified text-to-text transformer. arXiv preprint arXiv:1910.10683.
  • Joana Ribeiro, Shashi Narayan, Shay B. Cohen, and Xavier Carreras. 2018. Local string transduction as sequence labeling. In Proceedings of the 27th International Conference on Computational Linguistics, pages 1360–1371, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
  • Sascha Rothe, Shashi Narayan, and Aliaksei Severyn. 2019. Leveraging pre-trained checkpoints for sequence generation tasks. arXiv preprint arXiv:1907.12461.
  • Danielle Saunders, Felix Stahlberg, and Bill Byrne. 2019. UCAM biomedical translation at WMT19: Transfer learning multi-domain ensembles. In Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2), pages 169–174, Florence, Italy. Association for Computational Linguistics.
  • Abigail See, Peter J. Liu, and Christopher D. Manning. 2017. Get to the point: Summarization with pointer-generator networks. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1073–1083, Vancouver, Canada. Association for Computational Linguistics.
  • Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016. Neural machine translation of rare words with subword units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1715–1725, Berlin, Germany. Association for Computational Linguistics.
  • Noam Shazeer and Mitchell Stern. 2018. Adafactor: Adaptive learning rates with sublinear memory cost. In Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 4596–4604, Stockholmsmässan, Stockholm, Sweden. PMLR.
  • Anders Søgaard and Yoav Goldberg. 2016. Deep multitask learning with low level tasks supervised at lower layers. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 231–235, Berlin, Germany. Association for Computational Linguistics.
  • Richard Sproat and Navdeep Jaitly. 2016. RNN approaches to text normalization: A challenge. arXiv preprint arXiv:1611.00068.
  • Felix Stahlberg, Danielle Saunders, and Bill Byrne. 2018. An operation sequence model for explainable neural machine translation. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pages 175–186, Brussels, Belgium. Association for Computational Linguistics.
  • Mitchell Stern, William Chan, Jamie Kiros, and Jakob Uszkoreit. 2019. Insertion transformer: Flexible sequence generation via insertion operations. In Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 5976–5985, Long Beach, California, USA. PMLR.
  • Ilya Sutskever, Oriol Vinyals, and Quoc V Le. 2014. Sequence to sequence learning with neural networks. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 27, pages 3104–3112. Curran Associates, Inc.
  • Jiwei Tan, Xiaojun Wan, and Jianguo Xiao. 2017. Abstractive document summarization with a graph-based attentional neural model. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pages 1171–1181, Vancouver, Canada. Association for Computational Linguistics.
  • Ashish Vaswani, Samy Bengio, Eugene Brevdo, Francois Chollet, Aidan Gomez, Stephan Gouws, Llion Jones, Łukasz Kaiser, Nal Kalchbrenner, Niki Parmar, Ryan Sepassi, Noam Shazeer, and Jakob Uszkoreit. 2018. Tensor2Tensor for neural machine translation. In Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Papers), pages 193–199, Boston, MA. Association for Machine Translation in the Americas.
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 5998–6008. Curran Associates, Inc.
  • Oriol Vinyals, Meire Fortunato, and Navdeep Jaitly. 2015. Pointer networks. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing Systems 28, pages 2692–2700. Curran Associates, Inc.
  • Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, et al. 2016. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144.
  • Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, and Chris Callison-Burch. 2016. Optimizing statistical machine translation for text simplification. Transactions of the Association for Computational Linguistics, 4:401–415.
  • Helen Yannakoudakis, Ted Briscoe, and Ben Medlock. 2011. A new dataset and method for automatically grading ESOL texts. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 180–189, Portland, Oregon, USA. Association for Computational Linguistics.
  • Hao Zhang, Richard Sproat, Axel H. Ng, Felix Stahlberg, Xiaochang Peng, Kyle Gorman, and Brian Roark. 2019. Neural models of text normalization for speech applications. Computational Linguistics, 45(2):293–337.
  • Xingxing Zhang and Mirella Lapata. 2017. Sentence simplification with deep reinforcement learning. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 584–594, Copenhagen, Denmark. Association for Computational Linguistics.
  • Yuan Zhang and David Weiss. 2016. Stackpropagation: Improved representation learning for syntax. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1557–1566, Berlin, Germany. Association for Computational Linguistics.
  • Wei Zhao, Liang Wang, Kewei Shen, Ruoyu Jia, and Jingming Liu. 2019. Improving grammatical error correction via pre-training a copy-augmented architecture with unlabeled data. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 156–165, Minneapolis, Minnesota. Association for Computational Linguistics.
Authors
Felix Stahlberg
Shankar Kumar