PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction

Xinyao Ma, Maarten Sap, Hannah Rashkin, Yejin Choi

EMNLP 2020, pp. 7426–7441.

Other Links: arxiv.org | academic.microsoft.com
We introduce a new text revision task, Controllable Debiasing, to help debias the portrayal of characters through the lens of connotation frames of power and agency.

Abstract:

Unconscious biases continue to be prevalent in modern text and media, calling for algorithms that can assist writers with bias correction. For example, a female character in a story is often portrayed as passive and powerless (“She daydreams about being a doctor”) while a man is portrayed as more proactive and powerful (“He pursues his…”).

Introduction
  • Narratives and news texts often reflect societal biases and stereotypes, such as the traditional gender role that women are passive and submissive (Lakoff, 1973; Fiske, 1993; Fast et al., 2016).
  • Connotation frames assign agency levels to predicates: "AGENT to daydream" carries agency(AG) = low, while "AGENT to pursue" carries agency(AG) = high.
  • For example (a toy lexicon-lookup sketch follows this list):
  • Low agency: "Alex received a book from their friend." (Alex is portrayed as passively receiving things, not actively asking for the book.)
  • "Alex calls their friend." (Alex picked up the phone but did not actively initiate the conversation.)
  • "Alex took a book from the friend." (Alex is actively participating in borrowing the book.)
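To ground these examples, the following is a toy sketch of a connotation-frame lexicon lookup. The dictionary and its agency labels are hypothetical stand-ins for the Sap et al. (2017) verb lexicon (available at http://maartensap.com/movie-bias/), not the actual lexicon entries.

```python
# Toy stand-in for the Sap et al. (2017) agency lexicon; entries below are
# illustrative assumptions, not the real annotations.
AGENCY = {
    "daydream": "low",      # low-agency predicate
    "receive":  "low",      # passively receiving
    "call":     "neutral",
    "take":     "high",     # actively participating
    "pursue":   "high",     # high-agency predicate
}

def agency_of(verb: str) -> str:
    """Agency level a predicate projects onto its AGENT, per the (toy) lexicon."""
    return AGENCY.get(verb, "unknown")

print(agency_of("daydream"))  # low
print(agency_of("pursue"))    # high
```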
Highlights
  • Narratives and news texts often reflect societal biases and stereotypes, such as the traditional gender role that women are passive and submissive (Lakoff, 1973; Fiske, 1993; Fast et al., 2016)
  • We study portrayal biases through the lens of connotation frames of power and agency (Sap et al., 2017), which provide pragmatic knowledge about the implied power and agency levels a predicate projects onto characters
  • Using POWERTRANSFORMER, we revise movie scripts and significantly increase the agency levels of female characters, thereby reducing gender bias
  • Our results show that using the joint objective with boosting increases the diversity of the output but causes marginally more bigram repetition
  • We introduce a new text revision task, Controllable Debiasing, to help debias the portrayal of characters through the lens of connotation frames of power and agency
  • We create POWERTRANSFORMER, a transformer-based encoder-decoder trained on a joint reconstruction and paraphrasing objective
Methods
  • The authors shuffle the ROC story and paraphrase data together and use the OpenAI GPT language model as the pretrained model.
  • The paraphrase corpus (OpusParcus; Creutz, 2018) contains paraphrases of spoken dialogue extracted from movie and TV subtitles (from http://www.opensubtitles.org). OpusParcus was created by automatically aligning subtitle sentences using several probabilistic metrics, including likelihood under a round-trip translation paraphrasing model (Bannard and Callison-Burch, 2005) and pointwise mutual information. For the paraphrasing dataset, the authors apply the same filtering as with the ROC story corpus to the English portion of the OpusParcus training corpus and select the top 10% highest-scoring paraphrases using the PMI scoring from the original paper. They extract agency levels for each pair of paraphrases, select pairs to obtain a roughly equal number of agency-level pairs (i.e., 1/9th positive-neutral, 1/9th positive-negative, etc.), and preprocess the text by stripping any leading periods and commas.
  • Implementation: the Hugging Face (Wolf et al., 2019) implementation of OpenAI's GPT model (117M parameters; Radford et al., 2018), trained with AdamW (Loshchilov and Hutter, 2019) at a learning rate of 1e-5, a batch size of 4, and a maximum sequence length of 64. In preliminary experiments, β = 5 aptly steers the generation while avoiding repetition issues. A minimal sketch of the joint objective and the boosted decoding step follows this list.
  • The authors evaluate on the development set and investigate the importance of various components of the approach through ablation analyses.
  • They first investigate the importance of the reconstruction objective by comparing the joint-objective model (Joint) with a model trained with only the paraphrasing objective.
  • Note that ParaOnly+noBoost is equivalent to a GPT-based encoder-decoder model, similar to seq2seq frameworks commonly used in paraphrasing tasks (Cao et al., 2017; Li et al., 2018b; Prakash et al., 2016).
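To make the setup concrete, here is a minimal sketch of the joint training objective and of vocabulary boosting at decode time. This is not the released implementation: the `lm_loss` helper, the toy batch, and `boost_ids` (token ids associated with the target agency level) are illustrative assumptions; only the pretrained model, optimizer, learning rate, and β = 5 come from the paper.

```python
# Sketch only: a GPT-style causal LM fine-tuned on shuffled reconstruction
# (ROC stories) and paraphrase (OpusParcus) pairs, then decoded with logit
# boosting toward target-agency tokens.
import torch
from transformers import OpenAIGPTLMHeadModel, OpenAIGPTTokenizer

tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")   # 117M GPT
model = OpenAIGPTLMHeadModel.from_pretrained("openai-gpt")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)     # paper's settings

def lm_loss(source: str, target: str) -> torch.Tensor:
    """Causal-LM cross-entropy over `source` followed by `target`.
    Reconstruction pairs reuse the (agency-masked) sentence as its own target;
    paraphrase pairs use the two sides of an OpusParcus pair."""
    ids = tokenizer(source + " " + target, return_tensors="pt").input_ids
    return model(ids, labels=ids).loss  # labels are shifted inside the model

# Joint objective: one shuffled batch mixes both data sources.
batch = [
    ("she ___ of being a doctor", "she daydreams of being a doctor"),  # reconstruction
    ("could you close the door", "please shut the door"),              # paraphrase
]
optimizer.zero_grad()
loss = torch.stack([lm_loss(s, t) for s, t in batch]).mean()
loss.backward()
optimizer.step()

def boosted_decode(prompt: str, boost_ids: list[int], beta: float = 5.0,
                   max_new: int = 20) -> str:
    """Greedy decoding with vocabulary boosting: add `beta` to the logits of
    tokens tied to the target agency level before picking each next token."""
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        for _ in range(max_new):
            logits = model(ids).logits[0, -1]
            logits[boost_ids] += beta        # steer toward target-agency verbs
            next_id = logits.argmax().view(1, 1)
            ids = torch.cat([ids, next_id], dim=1)
    return tokenizer.decode(ids[0])
```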
Results
  • In Table 2, the results show that the full model (Joint+Boost) yields text revisions with the most accurate target agency and the best meaning preservation.
  • While the BST revisions obtain slightly higher accuracy on the output agency levels, they have both the lowest diversity and the lowest meaning preservation, suggesting that the model ignores the input (Table 4); sketches of such diversity and meaning-preservation metrics follow this list.
  • While this seemingly contradicts BST's low perplexity scores, it is in line with previous work showing that automatic fluency metrics can favor degenerate, bland, or repetitive language (Holtzman et al., 2020)
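As a concrete reading of these metrics, below are minimal sketches: a distinct-n diversity statistic, and a BERTScore-based meaning-preservation check (Zhang et al., 2020, which the paper cites). Both are assumptions about the style of metric involved, not the paper's exact evaluation code.

```python
from collections import Counter

def distinct_n(sentences, n=2):
    """Fraction of unique n-grams across all outputs; lower values indicate
    repetitive, low-diversity generations."""
    counts = Counter()
    for s in sentences:
        toks = s.split()
        counts.update(zip(*(toks[i:] for i in range(n))))
    return len(counts) / max(1, sum(counts.values()))

outs = ["alex calls their friend", "alex calls their friend", "alex takes the book"]
print(round(distinct_n(outs), 2))  # repeated outputs drag the score down
```

And a hypothetical meaning-preservation check (requires `pip install bert-score`; the sentence pair is a toy example):

```python
from bert_score import score  # Zhang et al. (2020)

inputs    = ["she daydreams about being a doctor"]
revisions = ["she pursues her dream of being a doctor"]
P, R, F1 = score(revisions, inputs, lang="en")  # F1 near 1.0 = meaning preserved
print(f"BERTScore F1: {F1.item():.3f}")
```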
Conclusion
  • The authors introduce a new text revision task, Controllable Debiasing, to help debias the portrayal of characters through the lens of connotation frames of power and agency.
  • To this end, the authors create POWERTRANSFORMER, a transformer-based encoder-decoder trained on a joint reconstruction and paraphrasing objective.
  • The authors' findings highlight the potential of neural models as a tool for editing out social biases in text.
Summary
  • Objectives: Given the known bias that female characters are portrayed with less agency (Sap et al., 2017), the goal is to re-balance their agency levels to be more on par with those of male characters.
Tables
  • Table 1: Statistics for our main story sentences dataset (ROC) and for the external paraphrase corpus (Para.)
  • Table 2: Ablation study results on the development set. We present separate metrics for evaluating the change in agency, the meaning preservation, fluency, repetitiveness, and diversity of the output (bolding the best performance). (↑) indicates that higher is better and (↓) indicates that lower is better
  • Table 3: Performance of different re-writing methods on the neg-to-pos and pos-to-neg subsets of the test set (bolding the best performance). We evaluate the change in agency and the meaning preservation. As secondary metrics, we include fluency, repetitiveness, and diversity of the output
  • Table 4: Example sentences from our dev. set, along with their revisions from various models and the achieved agency levels (Agency(out)). Examples (a)-(c) should be rewritten from high to low agency, and (d)-(f) from low to high agency. Confirming our quantitative results in Tables 2 and 3, POWERTRANSFORMER (Joint+Boost) is the most effective at making purposeful and precise changes to the input sentences to alter their agency while minimally changing their meaning. Revisions from more models are listed in Table 6 (in the appendix)
  • Table 5: POWERTRANSFORMER hyperparameters
  • Table 6: Full version of Table 4. Example revisions from various models for sentences from the dev. set. Columns are: the target change in agency from the original to the target agency, the input sentence, the model, the generated output, and the actual agency level of the output as measured by the connotation frame lexicon
Related work
  • Controllable Debiasing is a new formalization of the unsupervised stylistic rewriting task, contrasting with supervised approaches that benefit from parallel corpora (e.g., Xu et al., 2012, 2015; Rao and Tetreault, 2018; Pryzant et al., 2020). In unsupervised settings, a majority of work has dealt with the dearth of parallel data by using encoder-decoder setups paired with discriminators to disentangle style from content and steer generations (e.g., Shen et al., 2017; Zhang et al., 2018; Fu et al., 2018; Yang et al., 2018; Niu and Bansal, 2018; Romanov et al., 2019; Dai et al., 2019; John et al., 2019) or back-translation setups (Prabhumoye et al., 2018; Lample et al., 2018). In contrast, Li et al. (2018a) introduce a modular approach (later adapted to transformer models by Sudhakar et al., 2019) that relies on drop-in replacement of attribute markers followed by language correction. POWERTRANSFORMER improves on this approach with an additional out-of-domain paraphrasing objective.

    While a majority of related existing stylistic rewriting work defines style as sentiment (e.g., on reviews), a notable exception is Nogueira dos Santos et al. (2018), who use stylistic rewriting to make text less hateful or offensive. Similar in spirit, Controllable Debiasing is a novel formalization that aims to address and revise social biases expressed in text, but using the nuanced implications distilled in connotation frames of power and agency instead of binary offensiveness.
Funding
  • This research was supported in part by NSF (IIS-1524371, IIS-1714566), DARPA under the CwC program through the ARO (W911NF-15-1-0543), DARPA under the MCS program through NIWC Pacific (N66001-19-2-4031), and the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE-1256082.
Study subjects and analysis
This bias in representation is also present at the narrative level. Specifically, female characters are mentioned in only n_narr,f = 27 narrations on average, compared to n_narr,m = 34 narrations for male characters (Cohen's |d| = 0.13, p < 0.001). Similarly, compared to their male counterparts, female characters are described in significantly fewer words (n_words,f = 329 vs. n_words,m = 435, |d| = 0.14, p < 0.001) and with fewer verbs (n_verbs,f = 41 vs. n_verbs,m = 54, |d| = 0.13, p < 0.001). A standard definition of Cohen's d is given below.
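For reference, Cohen's d is the standardized mean difference between the female and male character statistics; a standard two-sample definition (the exact variance pooling used in the paper is an assumption):

```latex
d = \frac{\bar{x}_m - \bar{x}_f}{s_{\mathrm{pooled}}},
\qquad
s_{\mathrm{pooled}} = \sqrt{\frac{(n_m - 1)\,s_m^2 + (n_f - 1)\,s_f^2}{n_m + n_f - 2}}
```

Under this definition, the narration counts imply s_pooled ≈ |34 − 27| / 0.13 ≈ 54 narrations, consistent with a small effect size despite the significance.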

Reference
  • Colin Bannard and Chris Callison-Burch. 2005. Paraphrasing with bilingual parallel corpora. In ACL.
  • Elizabeth Behm-Morawitz and Dana E. Mastro. 2008. Mean girls? The influence of gender portrayals in teen movies on emerging adults' gender-based attitudes and beliefs. Journalism & Mass Communication Quarterly, 85(1):131–146.
  • Steven Bird, Ewan Klein, and Edward Loper. 2009. Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit. O'Reilly Media, Inc.
  • Ziqiang Cao, Chuwei Luo, Wenjie Li, and Sujian Li. 2017. Joint copying and restricted generation for paraphrase. In AAAI.
  • Sapna Cheryan and Hazel Rose Markus. 2020. Masculine defaults: Identifying and mitigating hidden cultural biases. Psychological Review.
  • Elizabeth Clark, Anne Spencer Ross, Chenhao Tan, Yangfeng Ji, and Noah A. Smith. 2018. Creative writing with a machine in the loop: Case studies on slogans and stories. In IUI.
  • Mathias Creutz. 2018. Open Subtitles paraphrase corpus for six languages. In LREC. Corpus available at http://urn.fi/urn:nbn:fi:lb-201804191.
  • Ning Dai, Jianze Liang, Xipeng Qiu, and Xuanjing Huang. 2019. Style transformer: Unpaired text style transfer without disentangled latent representation. In ACL.
  • Sumanth Dathathri, Andrea Madotto, Janice Lan, Jane Hung, Eric Frank, Piero Molino, Jason Yosinski, and Rosanne Liu. 2020. Plug and play language models: A simple approach to controlled text generation. In ICLR.
  • Daniel Clement Dennett. 1989. The Intentional Stance. MIT Press.
  • Ethan Fast, Tina Vachovsky, and Michael S. Bernstein. 2016. Shirtless and dangerous: Quantifying linguistic signals of gender bias in an online fiction writing community. In ICWSM.
  • Jessica Ficler and Yoav Goldberg. 2017. Controlling linguistic style aspects in neural language generation. In EMNLP Workshop on Stylistic Variation.
  • Anjalie Field, Gayatri Bhat, and Yulia Tsvetkov. 2019. Contextual affective analysis: A case study of people portrayals in online #metoo stories. In ICWSM.
  • Anjalie Field and Yulia Tsvetkov. 2019. Entity-centric contextual affective analysis. In ACL.
  • Susan T. Fiske. 1993. Controlling other people: The impact of power on stereotyping. American Psychologist, 48(6):621–628.
  • Zhenxin Fu, Xiaoye Tan, Nanyun Peng, Dongyan Zhao, and Rui Yan. 2018. Style transfer in text: Exploration and evaluation. In AAAI.
  • Marjan Ghazvininejad, Xing Shi, Yejin Choi, and Kevin Knight. 2016. Generating topical poetry. In EMNLP.
  • Marjan Ghazvininejad, Xing Shi, Jay Priyadarshi, and Kevin Knight. 2017. Hafez: An interactive poetry generation system. In ACL Demonstrations.
  • Sayan Ghosh, Mathieu Chollet, Eugene Laksana, Louis-Philippe Morency, and Stefan Scherer. 2017. Affect-LM: A neural language model for customizable affective text generation. In ACL.
  • Google. 2017. Using technology to address gender bias in film. https://www.google.com/about/main/gender-equality-films/index.html.
  • Philip Gorinski and Mirella Lapata. 2015. Movie script summarization as graph-based scene extraction. In NAACL.
  • Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, and Yejin Choi. 2020. The curious case of neural text degeneration. In ICLR.
  • Zhiting Hu, Zichao Yang, Xiaodan Liang, Ruslan Salakhutdinov, and Eric P. Xing. 2017. Toward controlled generation of text. In ICML.
  • Vineet John, Lili Mou, Hareesh Bahuleyan, and Olga Vechtomova. 2019. Disentangled representation learning for non-parallel text style transfer. In ACL.
  • Svetlana Kiritchenko and Saif Mohammad. 2017. Best-worst scaling more reliable than rating scales: A case study on sentiment intensity annotation. In ACL.
  • Rik Koncel-Kedziorski, Ioannis Konstas, Luke Zettlemoyer, and Hannaneh Hajishirzi. 2016. A theme-rewriting approach for generating algebra word problems. In EMNLP.
  • Robin Lakoff. 1973. Language and woman's place. Language in Society, 2(1):45–79.
  • Guillaume Lample, Sandeep Subramanian, Eric Smith, Ludovic Denoyer, Marc'Aurelio Ranzato, and Y-Lan Boureau. 2018. Multiple-attribute text rewriting. In ICLR.
  • Juncen Li, Robin Jia, He He, and Percy Liang. 2018a. Delete, retrieve, generate: A simple approach to sentiment and style transfer. In NAACL.
  • Zichao Li, Xin Jiang, Lifeng Shang, and Hang Li. 2018b. Paraphrase generation with deep reinforcement learning. In EMNLP.
  • Chia-Wei Liu, Ryan Lowe, Iulian Serban, Mike Noseworthy, Laurent Charlin, and Joelle Pineau. 2016. How NOT to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation. In EMNLP.
  • Judith Lorber, Susan A. Farrell, et al. 1991. The Social Construction of Gender. Newbury Park, 5.
  • Ilya Loshchilov and Frank Hutter. 2019. Decoupled weight decay regularization. In ICLR.
  • Remi Mir, Bjarke Felbo, Nick Obradovich, and Iyad Rahwan. 2019. Evaluating style transfer for text. In NAACL.
  • Nasrin Mostafazadeh, Nathanael Chambers, Xiaodong He, Devi Parikh, Dhruv Batra, Lucy Vanderwende, Pushmeet Kohli, and James Allen. 2016. A corpus and cloze evaluation for deeper understanding of commonsense stories. In NAACL. Corpus available at https://www.cs.rochester.edu/nlp/rocstories/.
  • Tong Niu and Mohit Bansal. 2018. Polite dialogue generation without parallel data. TACL.
  • Cicero Nogueira dos Santos, Igor Melnyk, and Inkit Padhi. 2018. Fighting offensive language on social media with unsupervised text style transfer. In ACL.
  • Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global vectors for word representation. In EMNLP.
  • Shrimai Prabhumoye, Yulia Tsvetkov, Ruslan Salakhutdinov, and Alan W. Black. 2018. Style transfer through back-translation. In ACL. Code available at https://github.com/shrimai/Style-Transfer-Through-Back-Translation.
  • Aaditya Prakash, Sadid A. Hasan, Kathy Lee, Vivek Datla, Ashequl Qadir, Joey Liu, and Oladimeji Farri. 2016. Neural paraphrase generation with stacked residual LSTM networks. In COLING.
  • Reid Pryzant, Richard Diehl Martinez, Nathan Dass, Sadao Kurohashi, Dan Jurafsky, and Diyi Yang. 2020. Automatically neutralizing subjective bias in text. In AAAI.
  • Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving language understanding by generative pre-training. Unpublished.
  • Anil Ramakrishna, Victor R. Martínez, Nikolaos Malandrakis, Karan Singla, and Shrikanth Narayanan. 2017. Linguistic analysis of differences in portrayal of movie characters. In ACL.
  • Sudha Rao and Joel Tetreault. 2018. Dear sir or madam, may I introduce the GYAFC dataset: Corpus, benchmarks and metrics for formality style transfer. In NAACL.
  • Hannah Rashkin, Sameer Singh, and Yejin Choi. 2016. Connotation frames: A data-driven investigation. In ACL.
  • Alexey Romanov, Anna Rumshisky, Anna Rogers, and David Donahue. 2019. Adversarial decomposition of text representation. In NAACL.
  • Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A. Smith, and Yejin Choi. 2020. Social bias frames: Reasoning about social and power implications of language. In ACL.
  • Maarten Sap, Marcella Cindy Prasettio, Ari Holtzman, Hannah Rashkin, and Yejin Choi. 2017. Connotation frames of power and agency in modern films. In EMNLP. Connotation frames downloaded from http://maartensap.com/movie-bias/.
  • Tianxiao Shen, Tao Lei, Regina Barzilay, and Tommi Jaakkola. 2017. Style transfer from non-parallel text by cross-alignment. In NeurIPS.
  • Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, and Nanyun Peng. 2019. The woman worked as a babysitter: On biases in language generation. In EMNLP.
  • Akhilesh Sudhakar, Bhargav Upadhyay, and Arjun Maheswaran. 2019. Transforming delete, retrieve, generate approach for controlled text style transfer. In EMNLP.
  • Thomas Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, et al. 2019. HuggingFace's Transformers: State-of-the-art natural language processing. Unpublished.
  • Wei Xu, Chris Callison-Burch, and Courtney Napoles. 2015. Problems in current text simplification research: New data can help. TACL.
  • Wei Xu, Alan Ritter, Bill Dolan, Ralph Grishman, and Colin Cherry. 2012. Paraphrasing for style. In COLING.
  • Zichao Yang, Zhiting Hu, Chris Dyer, Eric P. Xing, and Taylor Berg-Kirkpatrick. 2018. Unsupervised text style transfer using language models as discriminators. In NeurIPS.
  • Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, and Yoav Artzi. 2020. BERTScore: Evaluating text generation with BERT. In ICLR.
  • Ye Zhang, Nan Ding, and Radu Soricut. 2018. SHAPED: Shared-private encoder-decoder for text style adaptation. In NAACL.