Adopting principles from behavioral testing in software engineering, we propose CheckList, a model-agnostic and task-agnostic testing methodology that tests individual capabilities of the model using three different test types
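As a minimal sketch of the three CheckList test types (Minimum Functionality, Invariance, and Directional Expectation), the code below assumes a hypothetical `predict_sentiment` function that maps a sentence to a score in [0, 1]; the stand-in model and thresholds are illustrative, not the paper's setup.

```python
# A minimal sketch of the three CheckList-style test types, assuming a
# hypothetical predict_sentiment(text) -> float in [0, 1]. Any real model
# exposing that interface would slot in the same way.

def predict_sentiment(text: str) -> float:
    """Stand-in model: counts positive vs. negative cue words."""
    pos = sum(w in text.lower() for w in ("great", "love", "good"))
    neg = sum(w in text.lower() for w in ("awful", "hate", "bad"))
    return 0.5 + 0.1 * (pos - neg)

def mft():
    # Minimum Functionality Test: simple inputs with known expected labels.
    assert predict_sentiment("The food was great.") > 0.5
    assert predict_sentiment("The food was awful.") < 0.5

def inv():
    # Invariance test: a label-preserving perturbation (a name change)
    # should not move the prediction.
    a = predict_sentiment("Mary loved the movie.")
    b = predict_sentiment("John loved the movie.")
    assert abs(a - b) < 0.05

def dir_test():
    # Directional Expectation test: appending a negative clause should
    # not increase the sentiment score.
    base = predict_sentiment("The service was good.")
    harder = predict_sentiment("The service was good, but the wait was awful.")
    assert harder <= base

for test in (mft, inv, dir_test):
    test()
print("all CheckList-style tests passed")
```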
Recent empirical improvements due to transfer learning with language models have demonstrated that rich, unsupervised pre-training is an integral part of many language understanding systems
We have introduced a general approach for learning high-quality deep context-dependent representations from bidirectional language models, and shown large improvements when applying ELMo to a broad range of NLP tasks
We presented the Transformer, the first sequence transduction model based entirely on attention, replacing the recurrent layers most commonly used in encoder-decoder architectures with multi-headed self-attention
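The following NumPy sketch shows the scaled dot-product multi-head self-attention at the Transformer's core. The random projection matrices stand in for learned parameters, and the dimensions are illustrative.

```python
# Scaled dot-product multi-head self-attention, per head:
#   Attention(Q, K, V) = softmax(Q K^T / sqrt(d_head)) V
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, num_heads, rng):
    seq_len, d_model = x.shape
    d_head = d_model // num_heads
    # Random projections stand in for learned weight matrices.
    w_q, w_k, w_v, w_o = (rng.standard_normal((d_model, d_model)) * 0.02
                          for _ in range(4))
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Split into heads: (num_heads, seq_len, d_head).
    split = lambda t: t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    q, k, v = map(split, (q, k, v))
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    heads = softmax(scores) @ v
    # Concatenate heads and apply the output projection.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ w_o

rng = np.random.default_rng(0)
out = multi_head_self_attention(rng.standard_normal((5, 64)), num_heads=8, rng=rng)
print(out.shape)  # (5, 64)
```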
We introduce the first fully convolutional model for sequence to sequence learning that outperforms strong recurrent models on very large benchmark datasets while running an order of magnitude faster
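A minimal sketch of the gated convolutional building block used by such fully convolutional encoders follows; the weights are random placeholders, only a single block is shown, and the attention and decoder stack are omitted.

```python
# One gated convolutional block: a width-3 1-D convolution whose output
# channels are split and gated (GLU), plus a residual connection.
import numpy as np

def glu(x):
    # Gated linear unit: split channels in half, gate one half by the other.
    a, b = np.split(x, 2, axis=-1)
    return a * (1.0 / (1.0 + np.exp(-b)))

def conv_block(x, kernel, width=3):
    # x: (seq_len, d); kernel: (width, d, 2 * d). Pad to preserve length.
    seq_len, d = x.shape
    pad = width // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.stack([
        sum(xp[t + i] @ kernel[i] for i in range(width))
        for t in range(seq_len)
    ])
    return glu(out) + x  # residual connection

rng = np.random.default_rng(0)
d, width = 16, 3
x = rng.standard_normal((7, d))
kernel = rng.standard_normal((width, d, 2 * d)) * 0.1
h = conv_block(x, kernel, width)
print(h.shape)  # (7, 16)
```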
We proposed a sequence generation method, SeqGAN, to effectively train generative adversarial nets for structured sequence generation via policy gradient
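The policy-gradient idea can be sketched in a few lines: treat the generator as a stochastic policy over next tokens and increase the log-probability of sampled sequences in proportion to a discriminator's reward. The tabular "generator" and hand-coded "discriminator" below are toy stand-ins, not the paper's LSTM generator and CNN discriminator.

```python
# REINFORCE-style update for a toy sequence generator: sample a sequence,
# score it with a (stand-in) discriminator, and nudge per-position logits
# by reward * grad log pi(a).
import numpy as np

rng = np.random.default_rng(0)
vocab, seq_len, lr = 5, 4, 0.1
logits = np.zeros((seq_len, vocab))  # position-wise policy parameters

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def discriminator_reward(tokens):
    # Stand-in reward: fraction of tokens equal to a "real-looking" token 0.
    return np.mean(np.array(tokens) == 0)

for step in range(200):
    probs = [softmax(logits[t]) for t in range(seq_len)]
    tokens = [rng.choice(vocab, p=p) for p in probs]
    r = discriminator_reward(tokens)
    for t, a in enumerate(tokens):
        grad = -probs[t]          # gradient of log-softmax ...
        grad[a] += 1.0            # ... w.r.t. the logits
        logits[t] += lr * r * grad

print("reward after training:",
      discriminator_reward([np.argmax(logits[t]) for t in range(seq_len)]))
```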
Extractive methods assemble summaries exclusively from passages taken directly from the source text, while abstractive methods may generate novel words and phrases not featured in the source text – as a human-written abstract usually does
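To make the extractive side of this distinction concrete, here is a toy extractive summarizer: it can only copy whole sentences from the source, exactly the restriction that abstractive systems lift. The word-frequency scoring heuristic is purely illustrative.

```python
# A minimal extractive summarizer: rank source sentences by the average
# document frequency of their words and copy the top ones verbatim.
from collections import Counter
import re

def extractive_summary(text, n_sentences=1):
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(s):
        toks = re.findall(r"\w+", s.lower())
        return sum(freq[t] for t in toks) / max(len(toks), 1)

    return " ".join(sorted(sentences, key=score, reverse=True)[:n_sentences])

doc = ("The storm closed the airport on Monday. "
       "Hundreds of flights were cancelled. "
       "The airport is expected to reopen on Tuesday.")
print(extractive_summary(doc))
```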
We describe in detail the implementation of Google’s Neural Machine Translation system, including all the techniques that are critical to its accuracy, speed, and robustness
We argued that trust is crucial for effective human interaction with machine learning systems, and that explaining individual predictions is important in assessing trust
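A bare-bones sketch of the perturbation idea behind explaining an individual prediction: randomly mask words, query the black-box model on each variant, and fit a local linear surrogate whose weights indicate each word's contribution. The toy classifier, uniform sample weighting, and plain least-squares fit are simplifications of the actual method.

```python
# Perturbation-based local explanation of one prediction: mask words,
# re-query the model, and regress predictions on the word masks.
import numpy as np

def black_box(text):
    # Stand-in classifier: probability of "positive" sentiment.
    return 0.9 if "great" in text else 0.2

def explain(text, n_samples=500, rng=np.random.default_rng(0)):
    words = text.split()
    masks = rng.integers(0, 2, size=(n_samples, len(words)))
    preds = np.array([
        black_box(" ".join(w for w, m in zip(words, mask) if m))
        for mask in masks
    ])
    # Local linear surrogate: least-squares fit with an intercept column.
    X = np.hstack([masks, np.ones((n_samples, 1))])
    coef, *_ = np.linalg.lstsq(X, preds, rcond=None)
    return dict(zip(words, coef[:-1]))

for word, weight in explain("the service was great").items():
    print(f"{word:10s} {weight:+.3f}")
```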
We introduce a variant of byte pair encoding for word segmentation, which is capable of encoding open vocabularies with a compact symbol vocabulary of variable-length subword units
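The core of the learning procedure is a simple loop that repeatedly merges the most frequent adjacent symbol pair; the sketch below follows that algorithm on a toy vocabulary, with words represented as space-separated symbol sequences ending in an end-of-word marker.

```python
# Byte pair encoding for word segmentation: iteratively merge the most
# frequent pair of adjacent symbols into a new symbol.
import re
from collections import Counter

def get_pair_stats(vocab):
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for i in range(len(symbols) - 1):
            pairs[(symbols[i], symbols[i + 1])] += freq
    return pairs

def merge_pair(pair, vocab):
    bigram = re.escape(" ".join(pair))
    pattern = re.compile(r"(?<!\S)" + bigram + r"(?!\S)")
    return {pattern.sub("".join(pair), word): freq
            for word, freq in vocab.items()}

vocab = {"l o w </w>": 5, "l o w e r </w>": 2,
         "n e w e s t </w>": 6, "w i d e s t </w>": 3}

for _ in range(10):
    pairs = get_pair_stats(vocab)
    if not pairs:
        break
    best = max(pairs, key=pairs.get)
    vocab = merge_pair(best, vocab)
    print("merged:", best)
```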
Towards the end goal of natural language understanding, we introduce the Stanford Question Answering Dataset, a large reading comprehension dataset on Wikipedia articles with crowdsourced question-answer pairs
We propose a language model that leverages subword information through a character-level convolutional neural network, whose output is used as an input to a recurrent neural network language model
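A condensed PyTorch sketch of that architecture: characters are embedded, a 1-D convolution with max-over-time pooling produces a word representation, and an LSTM language model consumes the word representations. All sizes are illustrative, a single filter width is used, and pieces of the full model (multiple filter widths, the highway layer) are omitted.

```python
# Character-level CNN whose pooled output feeds a word-level RNN LM.
import torch
import torch.nn as nn

class CharCNNLM(nn.Module):
    def __init__(self, n_chars=60, char_dim=15, n_filters=100,
                 hidden=256, vocab_size=10000, width=3):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.conv = nn.Conv1d(char_dim, n_filters, kernel_size=width)
        self.lstm = nn.LSTM(n_filters, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)  # predicts the next word

    def forward(self, chars):
        # chars: (batch, seq_len, max_word_len) of character ids.
        b, t, w = chars.shape
        x = self.char_emb(chars.view(b * t, w))   # (b*t, w, char_dim)
        x = self.conv(x.transpose(1, 2))          # (b*t, filters, w')
        x = torch.relu(x).max(dim=2).values       # max-over-time pooling
        h, _ = self.lstm(x.view(b, t, -1))        # word-level LSTM
        return self.out(h)                        # next-word logits

model = CharCNNLM()
logits = model(torch.randint(0, 60, (2, 7, 12)))
print(logits.shape)  # torch.Size([2, 7, 10000])
```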
Like earlier neural SEQ2SEQ models, our framework captures the compositional semantics of a dialogue turn and generates semantically appropriate responses
We presented a simple yet powerful model architecture that produces state-of-the-art results for part-of-speech tagging, dependency parsing, and sentence compression
English-German translations:
  src:  Orlando Bloom and Miranda Kerr still love each other
  ref:  Orlando Bloom und Miranda Kerr lieben sich noch immer
  best: Orlando Bloom und Miranda Kerr lieben einander noch immer.
  base: Orlando Bloom und Lucas Miranda lieben einander noch immer.
  src:  ′′ W...