s few-shot capabilities. Inspired by the recent success of leveraging a retrieval module to augment large-scale neural network models, we propose to retrieve examples that are semantically-similar to a test sample to formulate its corresponding prompt. Intuitively, the in-context examples selected with such a strategy may serve as more informative inputs to unleash GPT-$3 s extensive knowledge. We evaluate the proposed approach on several natural language understanding and generation benchmarks, where the retrieval-based prompt selection approach consistently outperforms the random baseline. Moreover, it is observed that the sentence encoders fine-tuned on task-related datasets yield even more helpful retrieval results. Notably, significant gains are observed on tasks such as table-to-text generation (41.9% on the ToTTo dataset) and open-domain question answering (45.5% on the NQ dataset). We hope our investigation could help understand the behaviors of GPT-$3$ and large-scale pre-trained LMs in general and enhance their few-shot capabilities. ","authors":[{"name":"Jiachang Liu"},{"id":"561d7d0145cedb33980841c8","name":"Dinghan Shen"},{"id":"562f456b45cedb33995dbe96","name":"Yizhe Zhang"},{"id":"53f43f9cdabfaedd74ddb705","name":"Bill Dolan"},{"id":"53f58452dabfaeaca9f8045b","name":"Lawrence Carin"},{"id":"53f44b16dabfaedf435df98d","name":"Weizhu Chen"}],"id":"6006bb4891e0111a1b6a2346","num_citation":0,"order":3,"pdf":"https:\u002F\u002Fstatic.aminer.cn\u002Fstorage\u002Fpdf\u002Farxiv\u002F21\u002F2101\u002F2101.06804.pdf","title":"What Makes Good In-Context Examples for GPT-$3$?","urls":["https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.06804"],"versions":[{"id":"6006bb4891e0111a1b6a2346","sid":"2101.06804","src":"arxiv","year":2021}],"year":2021},{"abstract":" Adversarial examples expose the vulnerabilities of natural language processing (NLP) models, and can be used to evaluate and improve their robustness. Existing techniques of generating such examples are typically driven by local heuristic rules that are agnostic to the context, often resulting in unnatural and ungrammatical outputs. This paper presents CLARE, a ContextuaLized AdversaRial Example generation model that produces fluent and grammatical outputs through a mask-then-infill procedure. CLARE builds on a pre-trained masked language model and modifies the inputs in a context-aware manner. We propose three contextualized perturbations, Replace, Insert and Merge, allowing for generating outputs of varied lengths. With a richer range of available strategies, CLARE is able to attack a victim model more efficiently with fewer edits. Extensive experiments and human evaluation demonstrate that CLARE outperforms the baselines in terms of attack success rate, textual similarity, fluency and grammaticality. ","authors":[{"id":"54590eb4dabfaeb0fe2d049a","name":"Dianqi Li"},{"id":"562f456b45cedb33995dbe96","name":"Yizhe Zhang"},{"id":"5603e6cf45cedb339628c644","name":"Hao Peng"},{"id":"54055501dabfae8faa5c3a71","name":"Liqun Chen"},{"id":"53f4804edabfae963d2596a1","name":"Chris Brockett"},{"id":"53f5687cdabfae65cff804a4","name":"Ming-Ting Sun"},{"id":"53f43f9cdabfaedd74ddb705","name":"Bill Dolan"}],"flags":[{"flag":"affirm_author","person_id":"53f43f9cdabfaedd74ddb705"}],"id":"5f632df091e011242e3f2b42","num_citation":4,"order":6,"pages":{"end":"5069","start":"5053"},"pdf":"https:\u002F\u002Fstatic.aminer.cn\u002Fstorage\u002Fpdf\u002Farxiv\u002F20\u002F2009\u002F2009.07502.pdf","title":"Contextualized Perturbation for Textual Adversarial Attack","urls":["https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.07502","https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Fnaacl\u002FLiZPCBSD21","https:\u002F\u002Fwww.aclweb.org\u002Fanthology\u002F2021.naacl-main.400\u002F"],"venue":{"info":{"name":"NAACL-HLT"}},"versions":[{"id":"5f632df091e011242e3f2b42","sid":"2009.07502","src":"arxiv","year":2020},{"id":"60bf411491e0110bd0f6c9cc","sid":"conf\u002Fnaacl\u002FLiZPCBSD21","src":"dblp","vsid":"conf\u002Fnaacl","year":2021}],"year":2021},{"abstract":"Existing language models excel at writing from scratch, but many real-world scenarios require rewriting an existing document to fit a set of constraints. Although sentence-level rewriting has been fairly well-studied, little work has addressed the challenge of rewriting an entire document coherently. In this work, we introduce the task of document-level targeted content transfer and address it in the recipe domain, with a recipe as the document and a dietary restriction (such as vegan or dairy-free) as the targeted constraint. We propose a novel model for this task based on the generative pre-trained language model (GPT-2) and train on a large number of roughly-aligned recipe pairs. Both automatic and human evaluations show that our model out-performs existing methods by generating coherent and diverse rewrites that obey the constraint while remaining close to the original document. Finally, we analyze our model’s rewrites to assess progress toward the goal of making language generation more attuned to constraints that are substantive rather than stylistic.","authors":[{"name":"Allison Hegel"},{"name":"Sudha Rao"},{"id":"53f46d9cdabfaee2a1dcb9b3","name":"Asli Celikyilmaz"},{"id":"53f43f9cdabfaedd74ddb705","name":"Bill Dolan"}],"doi":"10.18653\u002FV1\u002F2020.EMNLP-MAIN.526","id":"5f7fe6d80205f07f68973308","num_citation":0,"order":3,"pages":{"end":"6504","start":"6485"},"pdf":"https:\u002F\u002Fstatic.aminer.cn\u002Fstorage\u002Fpdf\u002Farxiv\u002F20\u002F2010\u002F2010.08618.pdf","title":"Substance over Style: Document Level Targeted Content Transfer","urls":["https:\u002F\u002F2020.emnlp.org\u002Fpapers\u002Fmain","https:\u002F\u002Farxiv.org\u002Fabs\u002F2010.08618","https:\u002F\u002Fdblp.uni-trier.de\u002Fdb\u002Fjournals\u002Fcorr\u002Fcorr2010.html#abs-2010-08618","https:\u002F\u002Fwww.aclweb.org\u002Fanthology\u002F2020.emnlp-main.526\u002F"],"venue":{"info":{"name":"EMNLP 2020"}},"versions":[{"id":"5f7fe6d80205f07f68973308","sid":"emnlp2020#496","src":"conf_emnlp","year":2020},{"id":"5f8ea9fe91e01153024c4b78","sid":"2010.08618","src":"arxiv","year":2020},{"id":"5ff68b88d4150a363ccf9b67","sid":"3099500276","src":"mag","vsid":"1192655580","year":2020}],"year":2020},{"abstract":" Large-scale pre-trained language models, such as BERT and GPT-2, have achieved excellent performance in language representation learning and free-form text generation. However, these models cannot be directly employed to generate text under specified lexical constraints. To address this challenge, we present POINTER, a simple yet novel insertion-based approach for hard-constrained text generation. The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner. This procedure is recursively applied until a sequence is completed. The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable. Since our training objective resembles the objective of masked language modeling, BERT can be naturally utilized for initialization. We pre-train our model with the proposed progressive insertion-based objective on a 12GB Wikipedia dataset, and fine-tune it on downstream hard-constrained generation tasks. Non-autoregressive decoding yields a logarithmic time complexity during inference time. Experimental results on both News and Yelp datasets demonstrate that POINTER achieves state-of-the-art performance on constrained text generation. We intend to release the pre-trained model to facilitate future research. ","authors":[{"id":"562f456b45cedb33995dbe96","name":"Zhang Yizhe"},{"id":"562d0a9545cedb3398d38079","name":"Wang Guoyin"},{"id":"53f42d9adabfaee1c0a36753","name":"Li Chunyuan"},{"id":"5622795d45cedb33983cc300","name":"Gan Zhe"},{"id":"53f4804edabfae963d2596a1","name":"Brockett Chris"},{"id":"53f43f9cdabfaedd74ddb705","name":"Dolan Bill"}],"flags":[{"flag":"affirm_author","person_id":"53f43f9cdabfaedd74ddb705"}],"id":"5eb78919da5629cf24430377","num_citation":2,"order":5,"pdf":"https:\u002F\u002Fstatic.aminer.cn\u002Fstorage\u002Fpdf\u002Farxiv\u002F20\u002F2005\u002F2005.00558.pdf","title":"POINTER: Constrained Progressive Text Generation via Insertion based Generative Pre training","urls":["https:\u002F\u002Farxiv.org\u002Fabs\u002F2005.00558","https:\u002F\u002F2020.emnlp.org\u002Fpapers\u002Fmain"],"venue":{"info":{"name":"EMNLP 2020"}},"versions":[{"id":"5eb78919da5629cf24430377","sid":"2005.00558","src":"arxiv","year":2020},{"id":"5f7fe6d80205f07f689732b5","sid":"emnlp2020#413","src":"conf_emnlp","year":2020}],"year":2020},{"authors":[{"name":"Z Li"},{"id":"53f4804edabfae963d2596a1","name":"CJ Brockett"},{"id":"53f43f9cdabfaedd74ddb705","name":"WB Dolan"},{"name":"CB Quirk"},{"name":"AY Lai"},{"name":"SM Hendrich"},{"name":"O Gauthier"}],"flags":[{"flag":"affirm_author","person_id":"53f43f9cdabfaedd74ddb705"}],"id":"60586c3c9e795e4ac8d31a35","lang":"en","num_citation":0,"order":2,"title":"Targeted rewrites","urls":["https:\u002F\u002Fscholar.google.com.hk\u002Fcitations?view_op=view_citation&hl=zh-CN&user=xBvFANIAAAAJ&pagesize=100&sortby=pubdate&citation_for_view=xBvFANIAAAAJ:V3AGJWp-ZtQC"],"versions":[{"id":"60586c3c9e795e4ac8d31a35","sid":"60586c3c9e795e4ac8d31a35","src":"user-5d4bc4a8530c70a9b361c870","year":2020}],"year":2020},{"abstract":" While recent state-of-the-art results for adversarial imitation-learning algorithms are encouraging, recent works exploring the imitation learning from observation (ILO) setting, where trajectories \\textit{only} contain expert observations, have not been met with the same success. Inspired by recent investigations of $f$-divergence manipulation for the standard imitation learning setting(Ke et al., 2019; Ghasemipour et al., 2019), we here examine the extent to which variations in the choice of probabilistic divergence may yield more performant ILO algorithms. We unfortunately find that $f$-divergence minimization through reinforcement learning is susceptible to numerical instabilities. We contribute a reparameterization trick for adversarial imitation learning to alleviate the optimization challenges of the promising $f$-divergence minimization framework. Empirically, we demonstrate that our design choices allow for ILO algorithms that outperform baseline approaches and more closely match expert performance in low-dimensional continuous-control tasks. ","authors":[{"name":"Dilip Arumugam"},{"id":"53f4324ddabfaedce5503fde","name":"Debadeepta Dey"},{"id":"53f4c2c2dabfaedce565c711","name":"Alekh Agarwal"},{"id":"53f46d9cdabfaee2a1dcb9b3","name":"Asli Celikyilmaz"},{"name":"Elnaz Nouri"},{"id":"53f43f9cdabfaedd74ddb705","name":"Bill Dolan"}],"id":"5ef0816891e0112aee0429db","num_citation":0,"order":5,"title":"Reparameterized Variational Divergence Minimization for Stable Imitation","urls":["https:\u002F\u002Farxiv.org\u002Fabs\u002F2006.10810"],"versions":[{"id":"5ef0816891e0112aee0429db","sid":"2006.10810","src":"arxiv","year":2020}],"year":2020},{"abstract":"A “Facet Recommender” creates conversational recommendations for facets of particular conversational topics, and optionally for things associated with those facets, from consumer reviews or other social media content. The Facet Recommender applies a machine-learned facet model and optional sentiment-model, to identify facets associated with spans or segments of the content and to determine neutral, positive, or negative consumer sentiment associated with those facets and, optionally, things associated with those facets. These facets are selected by the facet model from a list or set of manually defined or machine-learned facets for particular conversational topic types. The Facet Recommender then generates new conversational utterances (i.e., short neutral, positive or negative suggestions) about particular facets based on the sentiments associated with those facets. In various implementations, utterances are fit to one or more predefined conversational frameworks. Further, responses or suggestions provided as utterances may be personalized to individual users.","authors":[{"id":"53f43f9cdabfaedd74ddb705","name":"Bill Dolan"},{"name":"Margaret Mitchell"},{"name":"Jay Banerjee"},{"name":"Pallavi Choudhury"},{"name":"Susan Hendrich"},{"name":"Rebecca Mason"},{"name":"Ron Owens"},{"name":"Mouni Reddy"},{"name":"Yaxiao Song"},{"id":"53f46967dabfaedf43651f51","name":"Kristina Toutanova"},{"name":"Liang Xu"},{"name":"Xuetao Yin"}],"id":"605873d69e795e4ac8d38505","lang":"en","num_citation":1,"order":0,"title":"Sentiment-based recommendations as a function of grounding factors associated with a user","urls":["https:\u002F\u002Flens.org\u002F028-190-172-560-327","https:\u002F\u002Fscholar.google.com.hk\u002Fcitations?view_op=view_citation&hl=zh-CN&user=KbD1YlQAAAAJ&pagesize=100&sortby=pubdate&citation_for_view=KbD1YlQAAAAJ:SP6oXDckpogC","https:\u002F\u002Facademic.microsoft.com\u002Fpaper\u002F2926762489\u002Fcitedby\u002Fsearch?q=Sentiment-based+recommendations+as+a+function+of+grounding+factors+associated+with+a+user&qe=RId%253D1985105481&f=&orderBy=0"],"versions":[{"id":"605873d69e795e4ac8d38505","sid":"605873d69e795e4ac8d38505","src":"user-5d4bc4a8530c70a9b361c870","year":2020}],"year":2020},{"authors":[{"name":"Roshan Rao"},{"name":"Sudha Rao"},{"name":"Elnaz Nouri"},{"id":"53f4324ddabfaedce5503fde","name":"Debadeepta Dey"},{"id":"53f46d9cdabfaee2a1dcb9b3","name":"Asli Çelikyilmaz"},{"id":"53f43f9cdabfaedd74ddb705","name":"Bill Dolan"}],"doi":"10.1109\u002FCVPRW50498.2020.00486","id":"5f2e783e91e011fa4e2aeee9","num_citation":0,"order":5,"pages":{"end":"4116","start":"4109"},"title":"Quality and Relevance Metrics for Selection of Multimodal Pretraining Data.","urls":["https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Fcvpr\u002FRaoRNDCD20","https:\u002F\u002Fdoi.org\u002F10.1109\u002FCVPRW50498.2020.00486","http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPRW_2020\u002Fpapers\u002Fw56\u002FRao_Quality_and_Relevance_Metrics_for_Selection_of_Multimodal_Pretraining_Data_CVPRW_2020_paper.pdf","http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPRW_2020\u002Fhtml\u002Fw56\u002FRao_Quality_and_Relevance_Metrics_for_Selection_of_Multimodal_Pretraining_Data_CVPRW_2020_paper.html","https:\u002F\u002Fwww.microsoft.com\u002Fen-us\u002Fresearch\u002Fpublication\u002Fquality-and-relevance-metrics-for-selection-of-multimodal-pretraining-data\u002F"],"venue":{"info":{"name":"CVPR Workshops"}},"versions":[{"id":"5f2e783e91e011fa4e2aeee9","sid":"conf\u002Fcvpr\u002FRaoRNDCD20","src":"dblp","vsid":"conf\u002Fcvpr","year":2020},{"id":"5fae6ea4d4150a363cedbbf5","sid":"3036723255","src":"mag","vsid":"1158167855","year":2020}],"year":2020},{"abstract":"Examples are generally directed towards context-sensitive generation of conversational responses. Context-message-response n-tuples are extracted from at least one source of conversational data to generate a set of training context-message-response n-tuples. A response generation engine is trained on the set of training context-message-response n-tuples. The trained response generation engine automatically generates a context-sensitive response based on a user generated input message and conversational context data. A digital assistant utilizes the trained response generation engine to generate context-sensitive, natural language responses that are pertinent to user queries.","authors":[{"id":"53f328e7dabfae9a8448179a","name":"Michel Galley"},{"id":"53f427f4dabfaeb2acfb005c","name":"Alessandro Sordoni"},{"id":"53f4804edabfae963d2596a1","name":"Christopher John Brockett"},{"id":"53f428e8dabfaec22b9e1c5d","name":"Jianfeng Gao"},{"id":"53f43f9cdabfaedd74ddb705","name":"William Brennan Dolan"},{"id":"562c7baf45cedb3398c36c85","name":"Yangfeng Ji"},{"id":"53f31d15dabfae9a8444042a","name":"Michael Auli"},{"name":"Margaret Ann Mitchell"},{"name":"Jian-Yun Nie"}],"flags":[{"flag":"affirm_author","person_id":"53f43f9cdabfaedd74ddb705"}],"id":"6051648c9e795e33e8588dc9","lang":"en","num_citation":3,"order":4,"title":"Context-sensitive generation of conversational responses","urls":["https:\u002F\u002Flens.org\u002F012-845-912-238-012","https:\u002F\u002Fscholar.google.com.hk\u002Fcitations?view_op=view_citation&hl=zh-CN&user=CQ1cqKkAAAAJ&pagesize=100&sortby=pubdate&citation_for_view=CQ1cqKkAAAAJ:LDvJswV7GG4C","https:\u002F\u002Facademic.microsoft.com\u002Fpaper\u002F2560111285\u002Fcitedby\u002Fsearch?q=Context-sensitive+generation+of+conversational+responses&qe=RId%253D1985105481&f=&orderBy=0"],"versions":[{"id":"6051648c9e795e33e8588dc9","sid":"6051648c9e795e33e8588dc9","src":"user-5fe1a78c4c775e6ec07359f9","year":2020}],"year":2020},{"authors":[{"name":"Donald Brinkman"},{"id":"53f43f9cdabfaedd74ddb705","name":"Bill Dolan"},{"name":"Kathleen Fitzpatrick"},{"name":"Jonathan Grudin"},{"name":"Mark Sample"},{"name":"Allison Hegel"}],"id":"605873d79e795e4ac8d38506","lang":"en","num_citation":0,"order":1,"title":"Being Human, Seeming Human","urls":["https:\u002F\u002Fscholar.google.com.hk\u002Fcitations?view_op=view_citation&hl=zh-CN&user=KbD1YlQAAAAJ&pagesize=100&sortby=pubdate&citation_for_view=KbD1YlQAAAAJ:dQ2og3OwTAUC","https:\u002F\u002Facademic.microsoft.com\u002Fpaper\u002F2995921316\u002Fcitedby\u002Fsearch?q=Being+Human%2C+Seeming+Human&qe=RId%253D1985105481&f=&orderBy=0"],"versions":[{"id":"605873d79e795e4ac8d38506","sid":"605873d79e795e4ac8d38506","src":"user-5d4bc4a8530c70a9b361c870","year":2020}],"year":2020},{"abstract":"Large-scale pre-trained language models, such as BERT and GPT-2, have achieved excellent performance in language representation learning and free-form text generation. However, these models cannot be directly employed to generate text under specified lexical constraints. To address this challenge, we present POINTER (PrOgressive INsertion-based TransformER), a simple yet novel insertion-based approach for hard-constrained text generation. The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner. This procedure is recursively applied until a sequence is completed. The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable. We pre-train our model with the proposed progressive insertion-based objective on a 12GB Wikipedia dataset, and fine-tune it on downstream hard-constrained generation tasks. Non-autoregressive decoding yields a logarithmic time complexity during inference time. Experimental results on both News and Yelp datasets demonstrate that Pointer achieves state-of-the-art performance on constrained text generation. We released the pre-trained models and the source code to facilitate future research.","authors":[{"id":"562f456b45cedb33995dbe96","name":"Yizhe Zhang"},{"id":"562d0a9545cedb3398d38079","name":"Guoyin Wang"},{"id":"53f42d9adabfaee1c0a36753","name":"Chunyuan Li"},{"id":"5622795d45cedb33983cc300","name":"Zhe Gan"},{"id":"53f4804edabfae963d2596a1","name":"Chris Brockett"},{"id":"53f43f9cdabfaedd74ddb705","name":"Bill Dolan"}],"doi":"10.18653\u002FV1\u002F2020.EMNLP-MAIN.698","id":"5ff68ba2d4150a363ccfe88d","num_citation":0,"order":5,"pages":{"end":"8670","start":"8649"},"title":"POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training.","urls":["https:\u002F\u002Fwww.aclweb.org\u002Fanthology\u002F2020.emnlp-main.698\u002F","https:\u002F\u002Fdblp.uni-trier.de\u002Fdb\u002Fconf\u002Femnlp\u002Femnlp2020-1.html#ZhangWLGBD20"],"venue":{"info":{"name":"empirical methods in natural language processing"}},"versions":[{"id":"5ff68ba2d4150a363ccfe88d","sid":"3099872554","src":"mag","vsid":"1192655580","year":2020}],"year":2020},{"abstract":" Motivated by the increasing popularity of intelligent editing assistant, we introduce and investigate the task of narrative incoherence detection: Given a (corrupted) long-form narrative, decide whether there exists some semantic discrepancy in the narrative flow. Specifically, we focus on the missing sentence and incoherent sentence detection. Despite its simple setup, this task is challenging as the model needs to understand and analyze a multi-sentence narrative text, and make decisions at the sentence level. As an initial step towards this task, we implement several baselines either directly analyzing the raw text (\\textit{token-level}) or analyzing learned sentence representations (\\textit{sentence-level}). We observe that while token-level modeling enjoys greater expressive power and hence better performance, sentence-level modeling possesses an advantage in efficiency and flexibility. With pre-training on large-scale data and cycle-consistent sentence embedding, our extended sentence-level model can achieve comparable detection accuracy to the token-level model. As a by-product, such a strategy enables simultaneous incoherence detection and infilling\u002Fmodification suggestions. ","authors":[{"name":"Deng Cai"},{"name":"Yizhe Zhang"},{"name":"Yichen Huang"},{"id":"53f43a32dabfaeb22f4949d4","name":"Wai Lam"},{"id":"53f43f9cdabfaedd74ddb705","name":"Bill Dolan"}],"flags":[{"flag":"affirm_author","person_id":"53f43f9cdabfaedd74ddb705"}],"id":"5fe1d8cb91e0119a161ede7a","num_citation":0,"order":4,"pdf":"https:\u002F\u002Fstatic.aminer.cn\u002Fstorage\u002Fpdf\u002Farxiv\u002F20\u002F2012\u002F2012.11157.pdf","title":"Narrative Incoherence Detection","urls":["https:\u002F\u002Farxiv.org\u002Fabs\u002F2012.11157"],"versions":[{"id":"5fe1d8cb91e0119a161ede7a","sid":"2012.11157","src":"arxiv","year":2020}],"year":2020},{"abstract":" We present a large, tunable neural conversational response generation model, DialoGPT (dialogue generative pre-trained transformer). Trained on 147M conversation-like exchanges extracted from Reddit comment chains over a period spanning from 2005 through 2017, DialoGPT extends the Hugging Face PyTorch transformer to attain a performance close to human both in terms of automatic and human evaluation in single-turn dialogue settings. We show that conversational systems that leverage DialoGPT generate more relevant, contentful and context-consistent responses than strong baseline systems. The pre-trained model and training pipeline are publicly released to facilitate research into neural response generation and the development of more intelligent open-domain dialogue systems. ","authors":[{"id":"562f456b45cedb33995dbe96","name":"Zhang Yizhe"},{"id":"562c845e45cedb3398c4a8db","name":"Sun Siqi"},{"id":"53f328e7dabfae9a8448179a","name":"Galley Michel"},{"id":"560ccb3945ce1e59609a7f1b","name":"Chen Yen-Chun"},{"id":"53f4804edabfae963d2596a1","name":"Brockett Chris"},{"id":"53f4397adabfaefedbae5b5a","name":"Gao Xiang"},{"id":"53f428e8dabfaec22b9e1c5d","name":"Gao Jianfeng"},{"id":"5631d31345cedb3399f2c18f","name":"Liu Jingjing"},{"id":"53f43f9cdabfaedd74ddb705","name":"Dolan Bill"}],"doi":"10.18653\u002FV1\u002F2020.ACL-DEMOS.30","flags":[{"flag":"affirm_author","person_id":"53f43f9cdabfaedd74ddb705"}],"id":"5dc149983a55acb75f3913be","num_citation":154,"order":8,"pages":{"end":"278","start":"270"},"pdf":"https:\u002F\u002Fstatic.aminer.cn\u002Fupload\u002Fpdf\u002F1366\u002F124\u002F597\u002F5dc149983a55acb75f3913be_0.pdf","title":"DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation","urls":["https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.00536","https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Facl\u002FZhangSGCBGGLD20","https:\u002F\u002Fwww.aclweb.org\u002Fanthology\u002F2020.acl-demos.30\u002F","https:\u002F\u002Farxiv.org\u002Fpdf\u002F1911.00536.pdf","https:\u002F\u002Fdblp.uni-trier.de\u002Fdb\u002Fjournals\u002Fcorr\u002Fcorr1911.html#abs-1911-00536","https:\u002F\u002Fwww.microsoft.com\u002Fen-us\u002Fresearch\u002Fpublication\u002Fdialogpt-large-scale-generative-pre-training-for-conversational-response-generation\u002F","https:\u002F\u002Fwww.aclweb.org\u002Fanthology\u002F2020.acl-demos.30.pdf","https:\u002F\u002Fwww.arxiv-vanity.com\u002Fpapers\u002F1911.00536\u002F"],"venue":{"info":{"name":"ACL"}},"versions":[{"id":"5dc149983a55acb75f3913be","sid":"1911.00536","src":"arxiv","year":2019},{"id":"5ef876eb91e0115941835e03","sid":"conf\u002Facl\u002FZhangSGCBGGLD20","src":"dblp","vsid":"conf\u002Facl","year":2020},{"id":"60741a5ae4510cd7c86d5a3f","sid":"2988937804","src":"mag","vsid":"1188739475","year":2020}],"year":2020},{"abstract":" Many high-level procedural tasks can be decomposed into sequences of instructions that vary in their order and choice of tools. In the cooking domain, the web offers many partially-overlapping text and video recipes (i.e. procedures) that describe how to make the same dish (i.e. high-level task). Aligning instructions for the same dish across different sources can yield descriptive visual explanations that are far richer semantically than conventional textual instructions, providing commonsense insight into how real-world procedures are structured. Learning to align these different instruction sets is challenging because: a) different recipes vary in their order of instructions and use of ingredients; and b) video instructions can be noisy and tend to contain far more information than text instructions. To address these challenges, we first use an unsupervised alignment algorithm that learns pairwise alignments between instructions of different recipes for the same dish. We then use a graph algorithm to derive a joint alignment between multiple text and multiple video recipes for the same dish. We release the Microsoft Research Multimodal Aligned Recipe Corpus containing 150K pairwise alignments between recipes across 4,262 dishes with rich commonsense information. ","authors":[{"id":"5629e94045ce1e5966603185","name":"Angela Lin"},{"name":"Sudha Rao"},{"id":"53f46d9cdabfaee2a1dcb9b3","name":"Asli Celikyilmaz"},{"name":"Elnaz Nouri"},{"id":"53f4804edabfae963d2596a1","name":"Chris Brockett"},{"id":"53f4324ddabfaedce5503fde","name":"Debadeepta Dey"},{"id":"53f43f9cdabfaedd74ddb705","name":"Bill Dolan"}],"id":"5ec49a639fced0a24b4de721","num_citation":2,"order":6,"pages":{"end":"4884","start":"4871"},"pdf":"https:\u002F\u002Fstatic.aminer.cn\u002Fupload\u002Fpdf\u002F933\u002F650\u002F1668\u002F5ec49a639fced0a24b4de721_7.pdf","title":"A Recipe for Creating Multimodal Aligned Datasets for Sequential Tasks","urls":["https:\u002F\u002Facl2020.org\u002Fprogram\u002Faccepted\u002F","https:\u002F\u002Farxiv.org\u002Fabs\u002F2005.09606","https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Facl\u002FLinRCNBDD20","https:\u002F\u002Fwww.aclweb.org\u002Fanthology\u002F2020.acl-main.440\u002F","https:\u002F\u002Facl2020.org\u002Fprogram\u002Faccepted\u002F#22","https:\u002F\u002Fdblp.uni-trier.de\u002Fdb\u002Fjournals\u002Fcorr\u002Fcorr2005.html#abs-2005-09606","https:\u002F\u002Fwww.microsoft.com\u002Fen-us\u002Fresearch\u002Fpublication\u002Fa-recipe-for-creating-multimodal-aligned-datasets-for-sequential-tasks\u002F"],"venue":{"info":{"name":"ACL"}},"versions":[{"id":"5ec49a639fced0a24b4de721","sid":"acl2020#22","src":"conf_acl","year":2020},{"id":"5ec48cc4da5629efe0884f05","sid":"2005.09606","src":"arxiv","year":2020},{"id":"5ef5c81691e011b33003b757","sid":"conf\u002Facl\u002FLinRCNBDD20","src":"dblp","vsid":"conf\u002Facl","year":2020},{"id":"5fae6d80d4150a363cebb5f9","sid":"3034728660","src":"mag","vsid":"1188739475","year":2020}],"year":2020},{"abstract":"Existing open-domain dialog models are generally trained to minimize the perplexity of target human responses. However, some human replies are more engaging than others, spawning more followup interactions. Current conversational models are increasingly capable of producing turns that are context-relevant, but in order to produce compelling agents, these models need to be able to predict and optimize for turns that are genuinely engaging. We leverage social media feedback data (number of replies and upvotes) to build a large-scale training dataset for feedback prediction. To alleviate possible distortion between the feedback and engagingness, we convert the ranking problem to a comparison of response pairs which involve few confounding factors. We trained DialogRPT, a set of GPT-2 based models on 133M pairs of human feedback data and the resulting ranker outperformed several baselines. Particularly, our ranker outperforms the conventional dialog perplexity baseline with a large margin on predicting Reddit feedback. We finally combine the feedback prediction models and a human-like scoring model to rank the machine-generated dialog responses. Crowd-sourced human evaluation shows that our ranking method correlates better with real human preferences than baseline models.","authors":[{"id":"53f4397adabfaefedbae5b5a","name":"Xiang Gao"},{"id":"562f456b45cedb33995dbe96","name":"Yizhe Zhang"},{"id":"53f328e7dabfae9a8448179a","name":"Michel Galley"},{"id":"53f4804edabfae963d2596a1","name":"Chris Brockett"},{"id":"53f43f9cdabfaedd74ddb705","name":"Bill Dolan"}],"doi":"10.18653\u002FV1\u002F2020.EMNLP-MAIN.28","flags":[{"flag":"affirm_author","person_id":"53f43f9cdabfaedd74ddb705"}],"id":"5f61e3aa91e011fae8fd6a87","num_citation":2,"order":4,"pages":{"end":"395","start":"386"},"pdf":"https:\u002F\u002Fstatic.aminer.cn\u002Fupload\u002Fpdf\u002F1619\u002F393\u002F1653\u002F5f61e3aa91e011fae8fd6a87_0.pdf","title":"Dialogue Response Ranking Training with Large Scale Human Feedback Data","urls":["https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.06978","https:\u002F\u002F2020.emnlp.org\u002Fpapers\u002Fmain","https:\u002F\u002Fdblp.uni-trier.de\u002Fdb\u002Fjournals\u002Fcorr\u002Fcorr2009.html#abs-2009-06978","https:\u002F\u002Fwww.aclweb.org\u002Fanthology\u002F2020.emnlp-main.28\u002F","https:\u002F\u002Fwww.arxiv-vanity.com\u002Fpapers\u002F2009.06978\u002F"],"venue":{"info":{"name":"EMNLP 2020"}},"versions":[{"id":"5f61e3aa91e011fae8fd6a87","sid":"2009.06978","src":"arxiv","year":2020},{"id":"5f7fe6d80205f07f689731b1","sid":"emnlp2020#153","src":"conf_emnlp","year":2020},{"id":"5ff68b25d4150a363cce7552","sid":"3098258760","src":"mag","vsid":"1192655580","year":2020}],"year":2020},{"abstract":" Current end-to-end neural conversation models inherently lack the flexibility to impose semantic control in the response generation process. This control is essential to ensure that users' semantic intents are satisfied and to impose a degree of specificity on generated outputs. Attempts to boost informativeness alone come at the expense of factual accuracy, as attested by GPT-2's propensity to \"hallucinate\" facts. While this may be mitigated by access to background knowledge, there is scant guarantee of relevance and informativeness in generated responses. We propose a framework that we call controllable grounded response generation (CGRG), in which lexical control phrases are either provided by an user or automatically extracted by a content planner from dialogue context and grounding knowledge. Quantitative and qualitative results show that, using this framework, a GPT-2 based model trained on a conversation-like Reddit dataset outperforms strong generation baselines. ","authors":[{"name":"Wu Zeqiu"},{"id":"53f328e7dabfae9a8448179a","name":"Galley Michel"},{"id":"53f4804edabfae963d2596a1","name":"Brockett Chris"},{"id":"562f456b45cedb33995dbe96","name":"Zhang Yizhe"},{"id":"53f4397adabfaefedbae5b5a","name":"Gao Xiang"},{"id":"53f4a061dabfaec18c77b735","name":"Quirk Chris"},{"id":"562ce83845cedb3398cfacb8","name":"Koncel-Kedziorski Rik"},{"id":"53f428e8dabfaec22b9e1c5d","name":"Gao Jianfeng"},{"id":"53f36800dabfae4b349a10d9","name":"Hajishirzi Hannaneh"},{"id":"5434c27bdabfaebba5862524","name":"Ostendorf Mari"},{"id":"53f43f9cdabfaedd74ddb705","name":"Dolan Bill"}],"flags":[{"flag":"affirm_author","person_id":"53f43f9cdabfaedd74ddb705"}],"id":"5eb78919da5629cf244303ae","num_citation":7,"order":10,"pdf":"https:\u002F\u002Fstatic.aminer.cn\u002Fupload\u002Fpdf\u002F1791\u002F269\u002F1430\u002F5eb78919da5629cf244303ae_0.pdf","title":"A Controllable Model of Grounded Response Generation","urls":["https:\u002F\u002Farxiv.org\u002Fabs\u002F2005.00613"],"versions":[{"id":"5eb78919da5629cf244303ae","sid":"2005.00613","src":"arxiv","year":2020}],"year":2020},{"abstract":" We present MixingBoard, a platform for quickly building demos with a focus on knowledge grounded stylized text generation. We unify existing text generation algorithms in a shared codebase and further adapt earlier algorithms for constrained generation. To borrow advantages from different models, we implement strategies for cross-model integration, from the token probability level to the latent space level. An interface to external knowledge is provided via a module that retrieves on-the-fly relevant knowledge from passages on the web or any document collection. A user interface for local development, remote webpage access, and a RESTful API are provided to make it simple for users to build their own demos. ","authors":[{"id":"53f4397adabfaefedbae5b5a","name":"Gao Xiang"},{"id":"53f328e7dabfae9a8448179a","name":"Galley Michel"},{"id":"53f43f9cdabfaedd74ddb705","name":"Dolan Bill"}],"id":"5ec3ae5291e0112b16089f30","num_citation":3,"order":2,"pages":{"end":"231","start":"224"},"pdf":"https:\u002F\u002Fstatic.aminer.cn\u002Fstorage\u002Fpdf\u002Farxiv\u002F20\u002F2005\u002F2005.08365.pdf","title":"MixingBoard: a Knowledgeable Stylized Integrated Text Generation Platform","urls":["https:\u002F\u002Farxiv.org\u002Fabs\u002F2005.08365","https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Facl\u002FGaoGD20","https:\u002F\u002Fwww.aclweb.org\u002Fanthology\u002F2020.acl-demos.26\u002F","https:\u002F\u002Fdblp.uni-trier.de\u002Fdb\u002Fjournals\u002Fcorr\u002Fcorr2005.html#abs-2005-08365","https:\u002F\u002Farxiv.org\u002Fpdf\u002F2005.08365","https:\u002F\u002Fwww.arxiv-vanity.com\u002Fpapers\u002F2005.08365\u002F"],"venue":{"info":{"name":"ACL"}},"versions":[{"id":"5ec3ae5291e0112b16089f30","sid":"2005.08365","src":"arxiv","year":2020},{"id":"5ef876eb91e0115941835de0","sid":"conf\u002Facl\u002FGaoGD20","src":"dblp","vsid":"conf\u002Facl","year":2020},{"id":"5fae6f46d4150a363ceef62b","sid":"3037969532","src":"mag","vsid":"1188739475","year":2020}],"year":2020},{"abstract":"We present Vision-based Navigation with Language-based Assistance (VNLA), a grounded vision-language task where an agent with visual perception is guided via language to find objects in photorealistic indoor environments. The task emulates a real-world scenario in that (a) the requester may not know how to navigate to the target objects and thus makes requests by only specifying high-level endgoals, and (b) the agent is capable of sensing when it is lost and querying an advisor, who is more qualified at the task, to obtain language subgoals to make progress. To model language-based assistance, we develop a general framework termed Imitation Learning with Indirect Intervention (I3L), and propose a solution that is effective on the VNLA task. Empirical results show that this approach significantly improves the success rate of the learning agent over other baselines on both seen and unseen environments.","authors":[{"id":"562ceadc45cedb3398d001c2","name":"Khanh Nguyen"},{"id":"53f4324ddabfaedce5503fde","name":"Debadeepta Dey"},{"id":"53f4804edabfae963d2596a1","name":"Chris Brockett"},{"id":"53f43f9cdabfaedd74ddb705","name":"Bill Dolan"}],"doi":"","flags":[{"flag":"affirm_author","person_id":"53f43f9cdabfaedd74ddb705"}],"id":"5c2c7a9217c44a4e7cf3134d","lang":"en","num_citation":23,"order":3,"pages":{"end":"12537","start":"12527"},"pdf":"https:\u002F\u002Fstatic.aminer.cn\u002Fupload\u002Fpdf\u002Fprogram\u002F5c2c7a9217c44a4e7cf3134d_0.pdf","title":"Vision-based Navigation with Language-based Assistance via Imitation Learning with Indirect Intervention.","urls":["http:\u002F\u002Farxiv.org\u002Fabs\u002F1812.04155","https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Fcvpr\u002FNguyenDBD19","http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fhtml\u002FNguyen_Vision-Based_Navigation_With_Language-Based_Assistance_via_Imitation_Learning_With_Indirect_CVPR_2019_paper.html","https:\u002F\u002Farxiv.org\u002Fabs\u002F1812.04155"],"venue":{"info":{"name":"CVPR"},"issue":"","volume":"abs\u002F1812.04155"},"versions":[{"id":"5e8d92789fced0a24b5fc1ee","sid":"journals\u002Fcorr\u002Fabs-1812-04155","src":"dblp","vsid":"journals\u002Fcorr","year":2018},{"id":"5ce2d092ced107d4c639322d","sid":"2903932979","src":"mag","vsid":"2597173376","year":2018},{"id":"5da2f8aa3a55ac3402d8c28b","sid":"conf\u002Fcvpr\u002FNguyenDBD19","src":"dblp","vsid":"conf\u002Fcvpr","year":2019},{"id":"5db9278f47c8f766461c8a1d","sid":"2967186499","src":"mag","vsid":"1158167855","year":2018},{"id":"5cac51bfda5629269c8f0de2","sid":"1812.04155","src":"arxiv","year":2019}],"year":2019}],"profilePubsTotal":143,"profilePatentsPage":1,"profilePatents":[],"profilePatentsTotal":0,"profilePatentsEnd":true,"profileProjectsPage":0,"profileProjects":null,"profileProjectsTotal":null,"newInfo":null,"checkDelPubs":[]}};