The Curious Case of Neural Text Degeneration
arXiv: Computation and Language, 2019.
Our results show that (1) maximization is an inappropriate decoding objective for open-ended text generation, (2) the probability distributions of the best current language models have an unreliable tail which needs to be truncated during generation, and (3) Nucleus Sampling is currently the best available decoding strategy for generating long-form text that is both high-quality, as measured by human evaluation, and as diverse as human-written text.
Despite considerable advances in neural language modeling, it remains an open question what the best decoding strategy is for text generation from a language model (e.g., to generate a story). The counter-intuitive empirical observation is that even though the use of likelihood as a training objective leads to high-quality models for a broad range of language understanding tasks, using likelihood as a decoding objective leads to text that is bland and strangely repetitive.
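To make the truncation idea concrete, here is a minimal sketch of top-p (Nucleus) sampling: the next token is drawn only from the smallest set of tokens whose cumulative probability exceeds a threshold p, so the unreliable low-probability tail is discarded. The function name and interface are illustrative, not taken from the paper.

```python
import numpy as np

def nucleus_sample(logits, p=0.9, rng=None):
    """Sample a token id from the smallest set of tokens whose
    cumulative probability exceeds p (top-p / Nucleus sampling).
    `logits` is a 1-D array of raw, unnormalized scores."""
    rng = rng or np.random.default_rng()
    # Softmax with max-subtraction for numerical stability.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]              # tokens by descending probability
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1  # smallest nucleus covering mass p
    nucleus = order[:cutoff]
    # Renormalize over the nucleus, then sample from it.
    nucleus_probs = probs[nucleus] / probs[nucleus].sum()
    return int(rng.choice(nucleus, p=nucleus_probs))
```

With a sharply peaked distribution and a small p, the nucleus collapses to the single most likely token, which illustrates how truncation suppresses the tail.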