Must-Read NLP Papers

Natural Language Processing (NLP) is an important branch of computer science and artificial intelligence. It studies theories and methods that enable effective communication between humans and computers in natural language, and it draws on linguistics, computer science, and mathematics. Because research in this field deals with natural language, the language people use every day, it is closely related to linguistics, yet it differs from it in important ways: NLP does not study natural language in general, but aims to build computer systems, and in particular software systems, that can carry out natural-language communication effectively. With the rise of deep learning the field has flourished, and reading its classic papers has become an essential step for anyone entering it.
ACL, pp. 4902-4912, (2020)
Adopting principles from behavioral testing in software engineering, we propose CheckList, a model-agnostic and task-agnostic testing methodology that tests individual capabilities of the model using three different test types
Cited by 62
North American Chapter of the Association for Computational Linguistics (NAACL), (2019)
Recent empirical improvements due to transfer learning with language models have demonstrated that rich, unsupervised pre-training is an integral part of many language understanding systems
Cited by 19480
North American Chapter of the Association for Computational Linguistics (NAACL), (2018)
We have introduced a general approach for learning high-quality deep context-dependent representations from bidirectional language model, and shown large improvements when applying ELMo to a broad range of NLP tasks
Cited by 6721
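Below is a minimal NumPy sketch (not the authors' code) of how ELMo-style representations are typically combined downstream: a softmax-normalized, task-learned weighting over the biLM's layer outputs, scaled by a learned scalar. The layer count, dimensions, and weights are illustrative placeholders.

```python
import numpy as np

def elmo_combine(layer_outputs, s_logits, gamma):
    """layer_outputs: (n_layers, seq_len, dim); s_logits: (n_layers,); gamma: scalar."""
    s = np.exp(s_logits - s_logits.max())
    s /= s.sum()                                            # softmax over layers
    return gamma * np.tensordot(s, layer_outputs, axes=1)   # (seq_len, dim)

rng = np.random.default_rng(0)
layers = rng.normal(size=(3, 10, 8))      # e.g. token layer + two biLSTM layers (toy sizes)
print(elmo_combine(layers, np.zeros(3), 1.0).shape)   # (10, 8)
```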
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
Advances in Neural Information Processing Systems 30 (NIPS 2017), (2017): 5998-6008
We presented the Transformer, the first sequence transduction model based entirely on attention, replacing the recurrent layers most commonly used in encoder-decoder architectures with multi-headed self-attention
Cited by 18886
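As a concrete reference point, here is a minimal NumPy sketch of scaled dot-product attention, the building block of the multi-head self-attention described above; masking and the per-head projections are omitted, and the toy input is illustrative.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K: (seq_len, d_k); V: (seq_len, d_v)."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])               # similarity between positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)         # softmax over key positions
    return weights @ V                                     # weighted sum of value vectors

x = np.random.default_rng(0).normal(size=(4, 8))           # 4 positions, dimension 8
print(scaled_dot_product_attention(x, x, x).shape)         # self-attention: (4, 8)
```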
Trans. Assoc. Comput. Linguistics, (2017): 135-146
We show that our model outperforms baselines that do not take into account subword information, as well as methods relying on morphological analysis
Cited by 5503
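A hedged sketch of the subword idea: a word vector is the sum of vectors for its character n-grams (with boundary markers), so rare or unseen words still receive a representation. The n-gram range follows the paper, while the tiny hash table below is a toy stand-in for the much larger bucket count used in practice.

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    w = f"<{word}>"                          # boundary markers
    return [w[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

buckets, dim = 10_000, 100                   # toy sizes, not the paper's settings
ngram_table = np.random.default_rng(0).normal(scale=0.1, size=(buckets, dim))

def word_vector(word):
    idx = [hash(g) % buckets for g in char_ngrams(word)]
    return ngram_table[idx].sum(axis=0)

print(word_vector("unbelievable").shape)     # (100,) even if the word was never seen
```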
ICML, (2017): 1243-1252
We introduce the first fully convolutional model for sequence to sequence learning that outperforms strong recurrent models on very large benchmark datasets at an order of magnitude faster speed
Cited by 2078
Extractive methods assemble summaries exclusively from passages taken directly from the source text, while abstractive methods may generate novel words and phrases not featured in the source text – as a human-written abstract usually does
Cited by 1454
National Conference on Artificial Intelligence (AAAI), (2017)
We proposed a sequence generation method, SeqGAN, to effectively train generative adversarial nets for structured sequences generation via policy gradient
Cited by 1391
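To make the "policy gradient" part concrete, here is a toy REINFORCE-style update: sampled sequences that a (here, made-up) discriminator rewards have their token choices reinforced. The real method also scores partial sequences with Monte Carlo rollouts, which this sketch omits.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, seq_len, lr = 5, 4, 0.1
logits = np.zeros((seq_len, vocab_size))     # stand-in generator parameters

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def discriminator_reward(tokens):            # hypothetical discriminator: prefers token 0
    return float(np.mean(tokens == 0))

for _ in range(200):
    probs = softmax(logits)
    tokens = np.array([rng.choice(vocab_size, p=probs[t]) for t in range(seq_len)])
    reward = discriminator_reward(tokens)
    grad = -probs                            # REINFORCE: (one_hot(token) - probs) * reward
    grad[np.arange(seq_len), tokens] += 1.0
    logits += lr * reward * grad

print(softmax(logits)[:, 0].round(2))        # token-0 probability rises at each position
```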
TACL, (2017)
We propose a simple solution to use a single Neural Machine Translation model to translate between multiple languages
Cited by 1125
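The core trick is small enough to show directly: prepend an artificial token naming the desired target language to the source sentence, then train one model on the mixed multilingual data. The "<2xx>" token format follows the paper's convention; the helper function name is ours.

```python
def tag_source(src_sentence: str, target_lang: str) -> str:
    # Mark the desired target language with an artificial token, as in the paper.
    return f"<2{target_lang}> {src_sentence}"

print(tag_source("How are you?", "es"))   # -> "<2es> How are you?"
print(tag_source("How are you?", "ja"))   # -> "<2ja> How are you?"
```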
EMNLP, (2017): 188-197
Our final model ensemble improves performance on the OntoNotes benchmark by over 3 F1 without external preprocessing tools used by previous systems
Cited by 481
International Conference on Learning Representations (ICLR), (2016)
We extended the basic encoder-decoder by letting a model search for a set of input words, or their annotations computed by an encoder, when generating each target word
Cited by 17940
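A minimal NumPy sketch of that search step: for each target word, the decoder scores every encoder annotation with a small additive scorer and attends to their weighted mix. The weight matrices and dimensions below are illustrative placeholders.

```python
import numpy as np

def additive_attention(decoder_state, annotations, W_s, W_h, v):
    """annotations: (src_len, enc_dim); decoder_state: (dec_dim,)."""
    scores = np.tanh(annotations @ W_h.T + decoder_state @ W_s.T) @ v   # (src_len,)
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                      # attention weights over source positions
    return alpha @ annotations, alpha         # context vector and weights

rng = np.random.default_rng(0)
enc_dim, dec_dim, attn_dim, src_len = 16, 16, 8, 5
h = rng.normal(size=(src_len, enc_dim))       # encoder annotations
s = rng.normal(size=(dec_dim,))               # current decoder state
W_h = rng.normal(size=(attn_dim, enc_dim))
W_s = rng.normal(size=(attn_dim, dec_dim))
v = rng.normal(size=(attn_dim,))
context, alpha = additive_attention(s, h, W_s, W_h, v)
print(context.shape, round(alpha.sum(), 3))   # (16,) 1.0
```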
KDD, (2016): 1135-1144
We argued that trust is crucial for effective human interaction with machine learning systems, and that explaining individual predictions is important in assessing trust
Cited by 5014
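To illustrate the kind of per-prediction explanation the paper argues for, here is a toy local-surrogate sketch: mask words in a single input, query a (made-up) black-box classifier, and fit a weighted linear model whose coefficients rank the words. It mirrors the general idea, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
words = "this movie was surprisingly good".split()

def black_box_prob(present):
    # Hypothetical classifier: positive only when "surprisingly" and "good" are kept.
    return 0.9 if (present[3] and present[4]) else 0.2

Z = rng.integers(0, 2, size=(500, len(words)))            # binary masks over words
y = np.array([black_box_prob(z) for z in Z])
w = np.exp(-(len(words) - Z.sum(axis=1)))                  # favor samples near the original

# Weighted least squares with an intercept; coefficients approximate local importance.
A = np.hstack([Z, np.ones((len(Z), 1))]) * np.sqrt(w)[:, None]
coef, *_ = np.linalg.lstsq(A, y * np.sqrt(w), rcond=None)
for word, c in sorted(zip(words, coef[:-1]), key=lambda t: -t[1]):
    print(f"{word:14s} {c:+.3f}")                          # "surprisingly" and "good" rank highest
```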
Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah
arXiv: Computation and Language, (2016)
We describe in detail the implementation of Google’s Neural Machine Translation system, including all the techniques that are critical to its accuracy, speed, and robustness
Cited by 3957
Meeting of the Association for Computational Linguistics (ACL), (2016)
We introduce a variant of byte pair encoding for word segmentation, which is capable of encoding open vocabularies with a compact symbol vocabulary of variable-length subword units
Cited by 3461
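The merge-learning loop is compact enough to sketch directly; this broadly mirrors the reference snippet in the paper, while the toy word-frequency dictionary and the number of merge operations are illustrative.

```python
import re
from collections import Counter

def get_pair_stats(vocab):
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    # Replace the most frequent adjacent symbol pair with a single merged symbol.
    pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    return {pattern.sub("".join(pair), word): freq for word, freq in vocab.items()}

# Words start as character sequences with an end-of-word marker.
vocab = {"l o w </w>": 5, "l o w e r </w>": 2, "n e w e s t </w>": 6, "w i d e s t </w>": 3}
for _ in range(10):                           # learn 10 merge operations
    stats = get_pair_stats(vocab)
    if not stats:
        break
    vocab = merge_pair(stats.most_common(1)[0][0], vocab)
print(vocab)                                  # frequent fragments such as "est</w>" emerge
```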
Annual Conference on Neural Information Processing Systems, (2016): 649-657
This article offers an empirical study on character-level convolutional networks for text classification
Cited by 3160
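A rough NumPy sketch of the character-level pipeline the study examines: one-hot encode characters, apply a one-dimensional convolution with ReLU, max-pool over time, and feed the result to a linear classifier. The alphabet, filter shapes, and four-class output below are simplified placeholders, not the paper's exact architecture.

```python
import numpy as np

alphabet = "abcdefghijklmnopqrstuvwxyz "
char_to_id = {c: i for i, c in enumerate(alphabet)}

def one_hot(text, max_len=64):
    x = np.zeros((max_len, len(alphabet)))
    for i, c in enumerate(text.lower()[:max_len]):
        if c in char_to_id:
            x[i, char_to_id[c]] = 1.0
    return x

def conv1d_maxpool(x, filters):
    """x: (T, in_dim); filters: (n_filters, width, in_dim) -> (n_filters,)."""
    n_filters, width, _ = filters.shape
    steps = x.shape[0] - width + 1
    feats = np.empty((steps, n_filters))
    for t in range(steps):
        feats[t] = np.maximum((filters * x[t:t + width]).sum(axis=(1, 2)), 0.0)  # ReLU
    return feats.max(axis=0)                               # max-over-time pooling

rng = np.random.default_rng(0)
filters = rng.normal(scale=0.1, size=(32, 7, len(alphabet)))
W_out = rng.normal(scale=0.1, size=(4, 32))                # 4 hypothetical classes
features = conv1d_maxpool(one_hot("character level convolutional networks"), filters)
print(W_out @ features)                                    # unnormalized class scores
```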
EMNLP, (2016): 2383-2392
Towards the end goal of natural language understanding, we introduce the Stanford Question Answering Dataset, a large reading comprehension dataset on Wikipedia articles with crowdsourced question-answer pairs
Cited by 2755
Meeting of the Association for Computational Linguistics (ACL), (2016)
According to the results in Table 3, the bi-directional LSTM obtains better performance than the BRNN on all evaluation metrics for both tasks
Cited by 1823
Thirtieth AAAI Conference on Artificial Intelligence, (2016): 2741-2749
We propose a language model that leverages subword information through a character-level convolutional neural network, whose output is used as an input to a recurrent neural network language model
Cited by 1512
Conference on Computational Natural Language Learning (CoNLL), (2016)
This paper introduces the use of a variational autoencoder for natural language sentences
Cited by 1406
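A bare-bones sketch of what the sentence VAE adds on top of an ordinary RNN language model: a reparameterized sample of a per-sentence latent code plus a KL penalty toward the prior. The encoder and decoder networks are omitted, and the Gaussian parameters below are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_latent(mu, log_var):
    # z = mu + sigma * eps keeps sampling differentiable w.r.t. mu and log_var
    return mu + np.exp(0.5 * log_var) * rng.standard_normal(mu.shape)

def kl_to_standard_normal(mu, log_var):
    # KL( N(mu, sigma^2) || N(0, I) ), summed over latent dimensions
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

mu, log_var = rng.normal(size=16), rng.normal(size=16)
z = sample_latent(mu, log_var)                 # the decoder RNN would condition on z
print(z.shape, round(kl_to_standard_normal(mu, log_var), 3))
```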
International Conference on Learning Representations (ICLR), (2016)
We propose the Mixed Incremental Cross-Entropy Reinforce (MIXER) algorithm, which enables successful training of reinforcement learning models for text generation
Cited by 1022