Recipes for Safety in Open-domain Chatbots
Abstract:
Models trained on large unlabeled corpora of human interactions will learn patterns and mimic behaviors therein, which include offensive or otherwise toxic behavior and unwanted biases. We investigate a variety of methods to mitigate these issues in the context of open-domain generative dialogue models. We introduce a new human-and-mode...