ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations

ACL, pp. 4668-4679, 2020.

Abstract:

In order to simplify a sentence, human editors perform multiple rewriting transformations: they split it into several shorter sentences, paraphrase words (i.e. replace complex words or phrases with simpler synonyms), reorder components, and/or delete information deemed unnecessary. Despite this varied range of possible text alterations…

Introduction
  • Sentence Simplification (SS) consists in modifying the content and structure of a sentence to make it easier to understand, while retaining its main idea and most of its original meaning (Alva-Manchego et al., 2020).
  • Simplified texts can benefit non-native speakers (Paetzold, 2016) and people suffering from aphasia (Carroll et al., 1998), dyslexia (Rello et al., 2013) or autism (Evans et al., 2014).
  • The Newsela corpus (Xu et al., 2015) contains simplifications produced by professionals applying multiple rewriting transformations, but its sentence alignments are automatically computed and imperfect, and its data can only be accessed after signing a restrictive public-sharing licence and cannot be redistributed, which hampers reproducibility.
Highlights
  • Sentence Simplification (SS) consists in modifying the content and structure of a sentence to make it easier to understand, while retaining its main idea and most of its original meaning (Alva-Manchego et al., 2020).
  • In this paper we introduce ASSET (Abstractive Sentence Simplification Evaluation and Tuning), a new dataset for tuning and evaluation of automatic Sentence Simplification models.
  • We extended TurkCorpus (Xu et al., 2016) by using the same original sentences, but crowdsourced manual simplifications that encompass a richer set of rewriting transformations.
  • We have introduced ASSET, a new dataset for tuning and evaluation of Sentence Simplification models.
  • Simplifications in ASSET were crowdsourced, and annotators were instructed to apply multiple rewriting transformations. This improves on current publicly available evaluation datasets, which focus on only one type of transformation.
  • We hope that ASSET’s multi-transformation features will motivate the development of Sentence Simplification models that benefit a variety of target audiences according to their specific needs, such as people with low literacy or cognitive disabilities.
Methods
  • SARI measures improvement in the simplicity of a sentence based on the n-grams added, deleted and kept by the simplification system.
  • It does so by comparing the output of the simplification model to multiple references and the original sentence, using both precision and recall.
  • The authors used a smoothed sentence-level version of BLEU so that comparison is possible; metrics were computed with the EASSE toolkit (https://github.com/feralvam/easse). A minimal scoring sketch for both metrics follows this list.
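  • As a rough, hedged illustration of this evaluation setup, the sketch below scores one invented system output with SARI via EASSE and with NLTK's smoothed sentence-level BLEU. The EASSE call (corpus_sari with orig_sents/sys_sents/refs_sents) is an assumption about its public API and may differ between versions; the sentences and references are made up.

    # Hedged sketch: SARI via EASSE and smoothed sentence-level BLEU via NLTK.
    from easse.sari import corpus_sari              # assumed EASSE entry point
    from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

    orig = ["The cat perched on the mat because it was tired."]   # original sentence
    sys_out = ["The cat sat on the mat. It was tired."]           # system simplification
    refs = [                                                      # one list per reference annotator
        ["The cat sat on the mat because it was tired."],
        ["The cat was on the mat. It was tired."],
    ]

    # SARI compares the output to the original sentence and to multiple references,
    # rewarding n-grams that are correctly added, kept and deleted.
    sari = corpus_sari(orig_sents=orig, sys_sents=sys_out, refs_sents=refs)

    # Smoothed sentence-level BLEU compares the output to the references only.
    bleu = sentence_bleu(
        [r[0].split() for r in refs],               # tokenised references
        sys_out[0].split(),                         # tokenised hypothesis
        smoothing_function=SmoothingFunction().method3,
    )
    print(f"SARI: {sari:.2f}  sentence-BLEU: {bleu:.3f}")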
Results
  • Results and Analysis

    Figure 1 shows the density of all features in ASSET, and compares them with those in TurkCorpus and HSplit (footnote 4: github.com/explosion/spaCy).

    [Figure 1: density plots of sentence splits, added words (%), deleted words (%), compression levels and replace-only Levenshtein distance for HSplit, TurkCorpus and ASSET. A sketch of how such surface features can be computed appears at the end of this section.]
  • Judges preferred ASSET’s simplifications in terms of fluency and simplicity.
  • They found TurkCorpus’ simplifications more meaning-preserving.
  • This is expected, since those were produced mainly by replacing words/phrases with virtually no deletion of content.
  • A similar behaviour was observed when comparing ASSET to HSplit.
  • In this case, the differences in preferences are greater than with TurkCorpus.
  • This could indicate that changes in syntactic structure alone are not enough for a sentence to be considered simpler.
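  • As a rough illustration of the kind of surface features plotted in Figure 1, the self-contained sketch below computes a compression level, added/deleted word proportions, a character-level Levenshtein distance and a crude sentence-split count for one original–simplification pair. The helper names and exact definitions are illustrative only, not the paper's implementation (which tokenises with spaCy and reports a replace-only Levenshtein variant).

    # Illustrative surface features for an original-simplification pair (stdlib only).

    def levenshtein(a: str, b: str) -> int:
        """Classic dynamic-programming edit distance between two strings."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                curr.append(min(prev[j] + 1,                 # deletion
                                curr[j - 1] + 1,             # insertion
                                prev[j - 1] + (ca != cb)))   # substitution
            prev = curr
        return prev[-1]

    def surface_features(original: str, simplification: str) -> dict:
        orig_tokens = original.lower().split()
        simp_tokens = simplification.lower().split()
        orig_set, simp_set = set(orig_tokens), set(simp_tokens)
        return {
            # character-length ratio; Table 3 counts a pair as "compressed" below 0.75
            "compression_level": len(simplification) / len(original),
            "added_words_pct": len(simp_set - orig_set) / max(len(simp_tokens), 1),
            "deleted_words_pct": len(orig_set - simp_set) / max(len(orig_tokens), 1),
            "levenshtein_distance": levenshtein(original, simplification),
            # crude proxy for sentence splitting: extra full stops vs. the original
            "sentence_splits": max(simplification.count(".") - original.count("."), 0),
        }

    print(surface_features(
        "The cat perched on the mat because it was tired.",
        "The cat sat on the mat. It was tired.",
    ))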
Conclusion
  • Simplifications in ASSET were crowdsourced, and annotators were instructed to apply multiple rewriting transformations.
  • This improves on current publicly available evaluation datasets, which focus on only one type of transformation.
  • The authors have motivated the need to develop new metrics for the automatic evaluation of SS models, especially when evaluating simplifications with multiple rewriting operations.
  • The authors hope that ASSET’s multi-transformation features will motivate the development of SS models that benefit a variety of target audiences according to their specific needs, such as people with low literacy or cognitive disabilities.
Tables
  • Table 1: Examples of simplifications collected for ASSET, together with their corresponding versions from TurkCorpus and HSplit for the same original sentences.
  • Table 2: General surface statistics for ASSET compared with TurkCorpus and HSplit. A simplification instance is an original–simplified sentence pair.
  • Table 3: Percentage of simplifications featuring each of the different rewriting transformations in ASSET, TurkCorpus and HSplit. A simplification is considered compressed when its character length is less than 75% of that of the original sentence.
  • Table 4: Percentages of human judges who preferred simplifications in ASSET or TurkCorpus, and in ASSET or HSplit, out of 359 comparisons. * indicates a statistically significant difference between the two datasets (binomial test with p-value < 0.001).
  • Table 5: Pearson correlation of human ratings with automatic metrics on system simplifications. * indicates a significance level of p-value < 0.05.
  • Table 6: Pearson correlation of human ratings with text features on system simplifications. * indicates a significance level of p-value < 0.01. A sketch of the statistical tests used in Tables 4–6 follows this list.
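  • As a hedged aside, the significance tests reported in Tables 4–6 are standard and can be sketched with SciPy (binomtest requires SciPy 1.7 or later; older releases expose binom_test instead). The counts and scores below are invented placeholders, not the paper's data.

    # Sketch of the statistics behind Tables 4-6, with invented placeholder numbers.
    from scipy.stats import binomtest, pearsonr

    # Table 4-style test: were preferences for ASSET significantly different
    # from chance (p = 0.5) over 359 pairwise comparisons?
    prefer_asset, total = 230, 359                      # hypothetical counts
    print("binomial p-value:", binomtest(prefer_asset, total, p=0.5).pvalue)

    # Table 5/6-style correlation between human ratings and an automatic metric.
    human_ratings = [72.0, 55.5, 80.2, 40.1, 63.3]      # hypothetical 0-100 judgements
    metric_scores = [34.2, 28.9, 40.5, 22.3, 31.8]      # hypothetical metric scores
    r, p = pearsonr(human_ratings, metric_scores)
    print(f"Pearson r = {r:.2f} (p = {p:.3f})")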
Related work
  • 2.1 Studies on Human Simplification

    A few corpus studies have been carried out to analyse how humans simplify sentences, and to attempt to determine the rewriting transformations that are performed.

    Petersen and Ostendorf (2007) analysed a corpus of 104 original and professionally simplified news articles in English. Sentences were manually aligned and each simplification instance was categorised as dropped (1-to-0 alignment), split (1-to-N), total (1-to-1) or merged (2-to-1). Some splits were further sub-categorised as edited (i.e. the sentence was split and some part was dropped) or different (i.e. same information but very different wording). This provides evidence that sentence splitting and deletion of information can be performed simultaneously. A toy sketch of this alignment categorisation appears at the end of this subsection.

    Aluísio et al. (2008) studied six corpora of simple texts (of different genres) and a corpus of complex news texts in Brazilian Portuguese, in order to produce a manual for Portuguese text simplification (Specia et al., 2008). The manual contains several rules for performing the task, focused on syntactic alterations: splitting adverbial/coordinated/subordinated sentences, reordering clauses into a subject-verb-object structure, and transforming passive into active voice, among others.
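    As a toy illustration (not taken from the paper), the sketch below maps Petersen and Ostendorf's alignment counts to their categories; the function name and the "other" fallback are invented for clarity.

    # Toy categorisation of sentence alignments into dropped/split/total/merged.

    def alignment_category(n_original: int, n_simplified: int) -> str:
        if n_original == 1 and n_simplified == 0:
            return "dropped"    # 1-to-0: the sentence was removed entirely
        if n_original == 1 and n_simplified == 1:
            return "total"      # 1-to-1: rewritten as a single sentence
        if n_original == 1 and n_simplified > 1:
            return "split"      # 1-to-N: split into several shorter sentences
        if n_original == 2 and n_simplified == 1:
            return "merged"     # 2-to-1: two sentences fused into one
        return "other"          # fallback for alignments outside the scheme

    for pair in [(1, 0), (1, 3), (1, 1), (2, 1)]:
        print(pair, "->", alignment_category(*pair))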
Funding
  • This work was partly supported by Benoît Sagot’s chair in the PRAIRIE institute, funded by the French national agency ANR as part of the “Investissements d’avenir” programme under the reference ANR-19-P3IA-0001.
References
  • Sandra M. Aluísio, Lucia Specia, Thiago A. S. Pardo, Erick G. Maziero, Helena M. Caseli, and Renata P. M. Fortes. 2008. A corpus analysis of simple account texts and the proposal of simplification strategies: First steps towards text simplification systems. In Proceedings of the 26th Annual ACM International Conference on Design of Communication, SIGDOC ’08, pages 15–22, Lisbon, Portugal. ACM.
  • Fernando Alva-Manchego, Joachim Bingel, Gustavo Paetzold, Carolina Scarton, and Lucia Specia. 2017. Learning how to simplify from explicit labeling of complex-simplified text pairs. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 295–305, Taipei, Taiwan. Asian Federation of Natural Language Processing.
  • Fernando Alva-Manchego, Louis Martin, Carolina Scarton, and Lucia Specia. 2019. EASSE: Easier automatic sentence simplification evaluation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations, pages 49–54, Hong Kong, China. Association for Computational Linguistics.
  • Fernando Alva-Manchego, Carolina Scarton, and Lucia Specia. 2020. Data-driven sentence simplification: Survey and benchmark. Computational Linguistics, 46(1):135–187.
  • Loïc Barrault, Ondřej Bojar, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Philipp Koehn, Shervin Malmasi, Christof Monz, Mathias Müller, Santanu Pal, Matt Post, and Marcos Zampieri. 2019. Findings of the 2019 conference on machine translation (WMT19). In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), pages 1–61, Florence, Italy. Association for Computational Linguistics.
  • Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2016. Enriching word vectors with subword information. arXiv preprint arXiv:1607.04606.
  • Ondřej Bojar, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Philipp Koehn, and Christof Monz. 2018. Findings of the 2018 conference on machine translation (WMT18). In Proceedings of the Third Conference on Machine Translation: Shared Task Papers, pages 272–303, Brussels, Belgium. Association for Computational Linguistics.
  • Stefan Bott and Horacio Saggion. 2011. Spanish text simplification: An exploratory study. Procesamiento del Lenguaje Natural, 47:87–95.
  • Dominique Brunato, Lorenzo De Mattei, Felice Dell’Orletta, Benedetta Iavarone, and Giulia Venturi. 2018. Is this sentence difficult? Do you agree? In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2690–2699, Brussels, Belgium. Association for Computational Linguistics.
  • John Carroll, Guido Minnen, Yvonne Canning, Siobhan Devlin, and John Tait. 1998. Practical simplification of English newspaper text to assist aphasic readers. In Proceedings of the AAAI-98 Workshop on Integrating Artificial Intelligence and Assistive Technology, pages 7–10.
  • R. Chandrasekar, Christine Doran, and B. Srinivas. 1996. Motivations and methods for text simplification. In Proceedings of the 16th Conference on Computational Linguistics, volume 2 of COLING ’96, pages 1041–1044, Copenhagen, Denmark. Association for Computational Linguistics.
  • Jacob Cohen. 1968. Weighted kappa: Nominal scale agreement provision for scaled disagreement or partial credit. Psychological Bulletin, 70(4):213–220.
  • William Coster and David Kauchak. 2011. Simple English Wikipedia: A new text simplification task. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: Short Papers - Volume 2, HLT ’11, pages 665–669, Stroudsburg, PA, USA. Association for Computational Linguistics.
  • Richard Evans, Constantin Orasan, and Iustin Dornescu. 2014. An evaluation of syntactic simplification rules for people with autism. In Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR 2014), pages 131–140, Gothenburg, Sweden. Association for Computational Linguistics.
  • Yvette Graham, Timothy Baldwin, Alistair Moffat, and Justin Zobel. 2013. Continuous measurement scales in human evaluation of machine translation. In Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, pages 33–41, Sofia, Bulgaria. Association for Computational Linguistics.
  • Eva Hasler, Adrià de Gispert, Felix Stahlberg, Aurelien Waite, and Bill Byrne. 2017. Source sentence simplification for statistical machine translation. Computer Speech & Language, 45(C):221–235.
  • William Hwang, Hannaneh Hajishirzi, Mari Ostendorf, and Wei Wu. 2015. Aligning Sentences from Standard Wikipedia to Simple Wikipedia. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 211–217, Denver, Colorado. Association for Computational Linguistics.
  • V. I. Levenshtein. 1966. Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady, 10:707.
  • Edward Loper and Steven Bird. 2002. NLTK: The Natural Language Toolkit. CoRR, cs.CL/0205028.
  • Louis Martin, Samuel Humeau, Pierre-Emmanuel Mazaré, Éric de La Clergerie, Antoine Bordes, and Benoît Sagot. 2018. Reference-less quality estimation of text simplification systems. In Proceedings of the 1st Workshop on Automatic Text Adaptation (ATA), pages 29–38, Tilburg, the Netherlands. ACL.
  • Louis Martin, Benoît Sagot, Éric de la Clergerie, and Antoine Bordes. 2020. Controllable sentence simplification. In Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020).
  • Shashi Narayan and Claire Gardent. 2014. Hybrid simplification using deep semantics and machine translation. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 435–445, Baltimore, Maryland. Association for Computational Linguistics.
  • Sergiu Nisioi, Sanja Štajner, Simone Paolo Ponzetto, and Liviu P. Dinu. 2017. Exploring neural text simplification models. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 85–91, Vancouver, Canada. Association for Computational Linguistics.
  • Charles Kay Ogden. 1930. Basic English: A General Introduction with Rules and Grammar. Kegan Paul, Trench, Trubner & Co.
  • Gustavo Paetzold and Lucia Specia. 2016. SemEval 2016 task 11: Complex word identification. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pages 560–569, San Diego, California. Association for Computational Linguistics.
  • Gustavo Henrique Paetzold. 2016. Lexical Simplification for Non-Native English Speakers. Ph.D. thesis, University of Sheffield, Sheffield, UK.
  • Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, ACL ’02, pages 311–318, Philadelphia, Pennsylvania. ACL.
  • Ellie Pavlick and Joel Tetreault. 2016. An empirical analysis of formality in online communication. Transactions of the Association for Computational Linguistics, 4:61–74.
  • David Pellow and Maxine Eskenazi. 2014a. An open corpus of everyday documents for simplification tasks. In Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR), pages 84–93, Gothenburg, Sweden. Association for Computational Linguistics.
  • David Pellow and Maxine Eskenazi. 2014b. Tracking human process using crowd collaboration to enrich data. In Human Computation and Crowdsourcing: Works in Progress and Demonstration Abstracts. An Adjunct to the Proceedings of the Second AAAI Conference on Human Computation and Crowdsourcing, pages 52–53.
  • Sarah E. Petersen. 2007. Natural Language Processing Tools for Reading Level Assessment and Text Simplification for Bilingual Education. Ph.D. thesis, University of Washington, Seattle, WA, USA. AAI3275902.
  • Sarah E. Petersen and Mari Ostendorf. 2007. Text simplification for language learners: A corpus analysis. In Proceedings of the Speech and Language Technology for Education Workshop, SLaTE 2007, pages 69–72.
  • Luz Rello, Clara Bayarri, Azuki Gorriz, Ricardo Baeza-Yates, Saurabh Gupta, Gaurang Kanvinde, Horacio Saggion, Stefan Bott, Roberto Carlini, and Vasile Topac. 2013. DysWebxia 2.0! More accessible text for people with dyslexia. In Proceedings of the 10th International Cross-Disciplinary Conference on Web Accessibility, W4A ’13, pages 25:1–25:2, Rio de Janeiro, Brazil. ACM.
  • Carolina Scarton, Gustavo H. Paetzold, and Lucia Specia. 2018. SimPA: A sentence-level simplification corpus for the public administration domain. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association (ELRA).
  • Sara Botelho Silveira and António Branco. 2012. Enhancing multi-document summaries with sentence simplification. In Proceedings of the 14th International Conference on Artificial Intelligence, ICAI 2012, pages 742–748, Las Vegas, USA.
  • Lucia Specia, Sandra Maria Aluísio, and Thiago A. Salgueiro Pardo. 2008. Manual de simplificação sintática para o português. Technical Report NILC-TR-08-06, NILC–ICMC–USP, São Carlos, SP, Brasil. Available at http://www.nilc.icmc.usp.br/nilc/download/NILC_TR_08_06.pdf.
  • Elior Sulem, Omri Abend, and Ari Rappoport. 2018a. BLEU is not suitable for the evaluation of text simplification. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 738–744. Association for Computational Linguistics.
  • Elior Sulem, Omri Abend, and Ari Rappoport. 2018b. Semantic structural evaluation for text simplification. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 685–696, New Orleans, Louisiana. Association for Computational Linguistics.
  • Sanja Štajner, Marc Franco-Salvador, Paolo Rosso, and Simone Paolo Ponzetto. 2018. CATS: A tool for customized alignment of text simplification corpora. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association (ELRA).
  • Sanja Štajner, Ruslan Mitkov, and Horacio Saggion. 2014. One step closer to automatic evaluation of text simplification systems. In Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR), pages 1–10, Gothenburg, Sweden. Association for Computational Linguistics.
  • Sanja Štajner, Maja Popović, Horacio Saggion, Lucia Specia, and Mark Fishel. 2016. Shared task on quality assessment for text simplification. In Proceedings of the Workshop on Quality Assessment for Text Simplification (QATS 2016), pages 22–31, Portorož, Slovenia. European Language Resources Association (ELRA).
  • Sander Wubben, Antal van den Bosch, and Emiel Krahmer. 2012. Sentence simplification by monolingual machine translation. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1, ACL ’12, pages 1015–1024, Stroudsburg, PA, USA. Association for Computational Linguistics.
  • Wei Xu, Chris Callison-Burch, and Courtney Napoles. 2015. Problems in current text simplification research: New data can help. Transactions of the Association for Computational Linguistics, 3:283–297.
  • Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, and Chris Callison-Burch. 2016. Optimizing statistical machine translation for text simplification. Transactions of the Association for Computational Linguistics, 4:401–415.
  • Taha Yasseri, András Kornai, and János Kertész. 2012. A practical approach to language complexity: A Wikipedia case study. PLOS ONE, 7(11):1–8.
  • Xingxing Zhang and Mirella Lapata. 2017. Sentence simplification with deep reinforcement learning. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 595–605, Copenhagen, Denmark. Association for Computational Linguistics.
  • Sanqiang Zhao, Rui Meng, Daqing He, Andi Saptono, and Bambang Parmanto. 2018. Integrating transformer and paraphrase rules for sentence simplification. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3164–3173, Brussels, Belgium. Association for Computational Linguistics.
  • Zhemin Zhu, Delphine Bernhard, and Iryna Gurevych. 2010. A monolingual tree-based translation model for sentence simplification. In Proceedings of the 23rd International Conference on Computational Linguistics, COLING ’10, pages 1353–1361, Stroudsburg, PA, USA. Association for Computational Linguistics.