Must-Read NLP Papers

Natural Language Processing (NLP) is an important direction within computer science and artificial intelligence. It studies theories and methods for effective natural-language communication between humans and computers, and it blends linguistics, computer science, and mathematics into one discipline. Because it deals with natural language, the language people use every day, it is closely tied to linguistics, yet differs from it in an important way: NLP does not study natural language for its own sake, but aims to build computer systems, above all software systems, that can communicate in natural language effectively. With the rise of deep learning the field has flourished, and reading its classic papers has become an essential first step for newcomers. Each entry below lists the paper's venue, a one-sentence takeaway, and its citation count at the time of collection.
Beyond Accuracy: Behavioral Testing of NLP Models with CheckList
ACL, pp. 4902-4912, (2020)
Adopting principles from behavioral testing in software engineering, we propose CheckList, a model-agnostic and task-agnostic testing methodology that tests individual capabilities of the model using three different test types
Cited by 0 · Views 1315

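To make the idea concrete, here is a minimal sketch of one of CheckList's test types, an invariance test: predictions should not change under a perturbation that is irrelevant to the task. The predict_sentiment function is a hypothetical stand-in for whatever model is under test; the real CheckList library adds templating, many perturbation generators, and aggregate reporting.

    # Invariance test (INV) sketch: swapping a person's name should not
    # flip a sentiment prediction. `predict_sentiment` is a placeholder
    # for a real classifier.
    def predict_sentiment(text: str) -> str:
        return "positive" if "great" in text else "negative"

    def invariance_test(template: str, fillers: list) -> list:
        """Return the fillers whose substitution changes the prediction."""
        baseline = predict_sentiment(template.format(fillers[0]))
        return [f for f in fillers[1:]
                if predict_sentiment(template.format(f)) != baseline]

    failures = invariance_test("{} had a great flight.", ["Sarah", "Mohammed", "Ivan"])
    print("failing fillers:", failures)  # a robust model prints []
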
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
NAACL, (2018)
Recent empirical improvements due to transfer learning with language models have demonstrated that rich, unsupervised pre-training is an integral part of many language understanding systems
Cited by 7048 · Views 1120

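The pre-training BERT relies on is a masked language model. The sketch below follows the paper's input-masking recipe (select roughly 15% of tokens; of those, 80% become [MASK], 10% a random token, 10% stay unchanged); the tiny vocabulary and whitespace tokenization are toy stand-ins, not BERT's WordPiece.

    import random

    # BERT-style input masking: the model must reconstruct the selected
    # tokens from bidirectional context. The vocabulary is a toy stand-in.
    VOCAB = ["the", "cat", "sat", "on", "mat", "dog", "ran"]

    def mask_tokens(tokens, mask_rate=0.15, seed=0):
        rng = random.Random(seed)
        masked, labels = [], []
        for tok in tokens:
            if rng.random() < mask_rate:
                labels.append(tok)                    # included in the MLM loss
                r = rng.random()
                if r < 0.8:
                    masked.append("[MASK]")           # 80%: mask out
                elif r < 0.9:
                    masked.append(rng.choice(VOCAB))  # 10%: random token
                else:
                    masked.append(tok)                # 10%: keep as-is
            else:
                labels.append(None)                   # not predicted
                masked.append(tok)
        return masked, labels

    print(mask_tokens("the cat sat on the mat".split()))
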
Deep Contextualized Word Representations
NAACL, (2018)
We have introduced a general approach for learning high-quality deep context-dependent representations from bidirectional language models, and shown large improvements when applying ELMo to a broad range of NLP tasks
Cited by 3722 · Views 274

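What a downstream task actually consumes is a learned weighted sum of the biLM's layer activations, ELMo_k = gamma * sum_j softmax(s)_j * h_{k,j}. A numpy sketch with toy sizes and random arrays standing in for real biLM outputs:

    import numpy as np

    # ELMo's task-specific combination of bidirectional-LM layers:
    # one softmax-normalized scalar per layer plus a global scale.
    rng = np.random.default_rng(0)
    num_layers, seq_len, dim = 3, 5, 8
    h = rng.normal(size=(num_layers, seq_len, dim))  # stand-in biLM layers

    s = np.zeros(num_layers)            # learned layer logits (toy init)
    gamma = 1.0                         # learned global scale
    w = np.exp(s) / np.exp(s).sum()     # softmax over layers

    elmo = gamma * np.einsum("j,jkd->kd", w, h)
    print(elmo.shape)                   # (5, 8): one vector per token
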
Attention Is All You Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
Advances in Neural Information Processing Systems 30 (NIPS 2017), (2017): 5998-6008
We presented the Transformer, the first sequence transduction model based entirely on attention, replacing the recurrent layers most commonly used in encoder-decoder architectures with multi-headed self-attention
Cited by 9654 · Views 411

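The building block behind those multi-headed layers is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A single-head numpy sketch; multi-head attention runs several of these in parallel on learned projections and concatenates the results:

    import numpy as np

    # Scaled dot-product attention: each query attends to all keys and
    # returns a similarity-weighted sum of the values.
    def attention(Q, K, V):
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)                  # (n_q, n_k)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
        return weights @ V

    rng = np.random.default_rng(0)
    Q, K, V = (rng.normal(size=(4, 16)) for _ in range(3))
    print(attention(Q, K, V).shape)                      # (4, 16)
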
Enriching Word Vectors with Subword Information
TACL, (2017): 135-146
We show that our model outperforms baselines that do not take into account subword information, as well as methods relying on morphological analysis
Cited by 3653 · Views 282

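The subword information in question is character n-grams: a word's vector is the sum of the vectors of its n-grams (the paper uses n = 3..6, plus the whole word as an extra unit), so rare and even unseen words inherit representations from shared subwords. A sketch of the n-gram extraction:

    # Character n-grams with boundary markers, as in the paper; the word
    # vector is the sum of the embeddings of these units.
    def char_ngrams(word: str, n_min: int = 3, n_max: int = 6) -> list:
        padded = f"<{word}>"
        return [padded[i:i + n]
                for n in range(n_min, n_max + 1)
                for i in range(len(padded) - n + 1)]

    print(char_ngrams("where"))
    # ['<wh', 'whe', 'her', 'ere', 're>', '<whe', 'wher', ...]
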
Convolutional Sequence to Sequence Learning
ICML, (2017): 1243-1252
We introduce the first fully convolutional model for sequence to sequence learning that outperforms strong recurrent models on very large benchmark datasets at an order of magnitude faster speed
Cited by 1358 · Views 245

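In place of recurrence, the model stacks 1-D convolutions with gated linear units (GLU) and residual connections; because no step depends on the previous one, the whole sequence is processed in parallel. A single toy block in PyTorch, with arbitrary sizes:

    import torch
    import torch.nn as nn

    # One convolutional block: conv with doubled channels, split into a
    # signal half and a gate half (GLU), plus a residual connection.
    emb, width = 32, 3
    conv = nn.Conv1d(emb, 2 * emb, width, padding=width // 2)

    x = torch.randn(4, emb, 20)              # (batch, channels, sequence)
    a, b = conv(x).chunk(2, dim=1)           # signal and gate halves
    y = a * torch.sigmoid(b)                 # GLU output, same shape as x
    y = y + x                                # residual connection
    print(y.shape)                           # torch.Size([4, 32, 20])
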
SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient
AAAI, (2017)
We proposed a sequence generation method, SeqGAN, to effectively train generative adversarial nets for structured sequences generation via policy gradient
Cited by 1182 · Views 233

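The policy-gradient part works like this: the generator is a policy over discrete tokens, the discriminator's score on a generated sequence acts as the reward, and REINFORCE raises the log-probability of rewarded choices. A deliberately tiny sketch with a one-token "sequence" and a hand-coded reward standing in for the discriminator:

    import numpy as np

    # REINFORCE on a toy "generator": a softmax over a 4-token vocabulary.
    rng = np.random.default_rng(0)
    logits = np.zeros(4)                        # generator parameters

    def sample_token(logits):
        p = np.exp(logits) / np.exp(logits).sum()
        return rng.choice(len(p), p=p), p

    def discriminator_reward(tok):
        return 1.0 if tok == 2 else 0.0         # stand-in for D's score

    for _ in range(200):
        tok, p = sample_token(logits)
        grad_logp = -p                          # d log p(tok) / d logits ...
        grad_logp[tok] += 1.0                   # ... equals e_tok - p
        logits += 0.1 * discriminator_reward(tok) * grad_logp

    print(np.exp(logits) / np.exp(logits).sum())  # mass moves to token 2
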
Get To The Point: Summarization with Pointer-Generator Networks
ACL, (2017)
Extractive methods assemble summaries exclusively from passages taken directly from the source text, while abstractive methods may generate novel words and phrases not featured in the source text, as a human-written abstract usually does
Cited by 935 · Views 337

Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation
TACL, (2017)
We propose a simple solution to use a single Neural Machine Translation model to translate between multiple languages
Cited by 852 · Views 161

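The single-model trick is disarmingly simple: prefix every source sentence with an artificial token naming the desired target language and train one model on the pooled data. Zero-shot translation between pairs never seen together in training falls out of the same mechanism. A sketch; the <2xx> token spelling follows the paper's examples:

    # One model, many language pairs: the target language is requested
    # with an artificial token prepended to the source sentence.
    def add_target_token(source: str, target_lang: str) -> str:
        return f"<2{target_lang}> {source}"

    print(add_target_token("How are you?", "es"))  # "<2es> How are you?"
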
End-to-end Neural Coreference Resolution
EMNLP, (2017): 188-197
Our final model ensemble improves performance on the OntoNotes benchmark by over 3 F1 without external preprocessing tools used by previous systems
Cited by 290 · Views 133

Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, et al.
arXiv: Computation and Language, (2016)
We describe in detail the implementation of Google’s Neural Machine Translation system, including all the techniques that are critical to its accuracy, speed, and robustness
Cited by 3349 · Views 424

"Why Should I Trust You?": Explaining the Predictions of Any Classifier
KDD, (2016)
We argued that trust is crucial for effective human interaction with machine learning systems, and that explaining individual predictions is important in assessing trust
Cited by 3063 · Views 242

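The explanation method behind that argument is LIME: perturb the input, query the black-box model on each perturbation, weight the samples by proximity to the original, and fit a linear model whose coefficients say which features drove the prediction. A compact text-domain sketch; the classifier and the proximity kernel are hand-coded stand-ins, not the library's defaults:

    import numpy as np

    def black_box(words):                 # hypothetical model: P(positive)
        return 0.9 if "great" in words else 0.2

    def lime_explain(text, n_samples=500, seed=0):
        rng = np.random.default_rng(seed)
        words = text.split()
        masks = rng.integers(0, 2, size=(n_samples, len(words)))
        y = np.array([black_box([w for w, m in zip(words, row) if m])
                      for row in masks])
        # nearer perturbations (more words kept) get larger weight
        weights = np.exp(-(1 - masks.mean(axis=1)))
        W = np.diag(weights)
        X = np.hstack([masks, np.ones((n_samples, 1))])  # + intercept
        coef = np.linalg.lstsq(W @ X, W @ y, rcond=None)[0]
        return dict(zip(words, coef[:-1]))

    print(lime_explain("the food was great"))  # 'great' dominates
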
Neural Machine Translation of Rare Words with Subword Units
ACL, (2016)
We introduce a variant of byte pair encoding for word segmentation, which is capable of encoding open vocabularies with a compact symbol vocabulary of variable-length subword units
Cited by 2399 · Views 178

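The paper itself includes a short Python listing of the learner; the sketch below follows the same recipe: count adjacent symbol pairs over a frequency-weighted vocabulary and greedily merge the most frequent pair, for a fixed number of merge operations.

    import re
    from collections import Counter

    # Byte-pair-encoding learner over a toy vocabulary; </w> marks word
    # ends, and the lookaround anchors keep merges on symbol boundaries.
    def learn_bpe(vocab: dict, num_merges: int) -> list:
        merges = []
        for _ in range(num_merges):
            pairs = Counter()
            for word, freq in vocab.items():
                syms = word.split()
                for a, b in zip(syms, syms[1:]):
                    pairs[(a, b)] += freq
            if not pairs:
                break
            best = max(pairs, key=pairs.get)
            merges.append(best)
            pat = re.compile(r"(?<!\S)" + re.escape(" ".join(best)) + r"(?!\S)")
            vocab = {pat.sub("".join(best), w): f for w, f in vocab.items()}
        return merges

    corpus = {"l o w </w>": 5, "l o w e r </w>": 2, "n e w e s t </w>": 6}
    print(learn_bpe(corpus, 4))  # [('w', 'e'), ('l', 'o'), ...]
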
SQuAD: 100,000+ Questions for Machine Comprehension of Text
EMNLP, (2016): 2383-2392
Towards the end goal of natural language understanding, we introduce the Stanford Question Answering Dataset, a large reading comprehension dataset on Wikipedia articles with crowdsourced question-answer pairs
Cited by 1569 · Views 234

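The dataset ships as nested JSON: articles contain paragraphs, each paragraph a context passage plus question-answer pairs whose answers are character-offset spans of that context. A toy reader over a single hand-written record in that layout (the record itself is invented for illustration):

    import json

    # One record in SQuAD's JSON layout; answers point into the context
    # passage by character offset.
    sample = json.loads("""
    {"data": [{"title": "Example", "paragraphs": [{
        "context": "SQuAD was released in 2016.",
        "qas": [{"id": "q1", "question": "When was SQuAD released?",
                 "answers": [{"text": "2016", "answer_start": 22}]}]}]}]}
    """)

    for article in sample["data"]:
        for para in article["paragraphs"]:
            for qa in para["qas"]:
                ans = qa["answers"][0]
                span = para["context"][ans["answer_start"]:]
                assert span.startswith(ans["text"])
                print(qa["question"], "->", ans["text"])
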
ACL, (2016)
According to the results shown in Table 3, the bi-directional LSTM obtains better performance than the BRNN on all evaluation metrics for both tasks
Cited by 1339 · Views 257

Character-Aware Neural Language Models
AAAI, (2016): 2741-2749
We propose a language model that leverages subword information through a character-level convolutional neural network, whose output is used as an input to a recurrent neural network language model
Cited by 1270 · Views 264

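Concretely, each word is a sequence of character embeddings; a 1-D convolution with max-over-time pooling turns it into a word vector, which then feeds an ordinary LSTM language model. A PyTorch sketch with arbitrary toy sizes; the paper uses several filter widths and a highway layer, omitted here:

    import torch
    import torch.nn as nn

    # Character-level CNN front end feeding a word-level LSTM LM.
    char_emb = nn.Embedding(num_embeddings=60, embedding_dim=15)
    conv = nn.Conv1d(in_channels=15, out_channels=25, kernel_size=3)
    lstm = nn.LSTM(input_size=25, hidden_size=64, batch_first=True)

    chars = torch.randint(0, 60, (8, 5, 10))  # (batch, words, chars/word)
    x = char_emb(chars)                       # (8, 5, 10, 15)
    x = x.view(-1, 10, 15).transpose(1, 2)    # one word per conv row
    x = conv(x).max(dim=2).values             # max-over-time -> (40, 25)
    words = x.view(8, 5, 25)                  # (batch, words, features)
    out, _ = lstm(words)                      # LM over word vectors
    print(out.shape)                          # torch.Size([8, 5, 64])
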
Generating Sentences from a Continuous Space
CoNLL, (2016)
This paper introduces the use of a variational autoencoder for natural language sentences
Cited by 1232 · Views 181

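At the core is the reparameterization trick: an encoder RNN produces a mean and log-variance for the latent code z, the decoder RNN conditions on a sample of z, and a KL term regularizes the posterior toward a standard normal. Only the latent step is sketched below, with toy sizes and a random vector standing in for the encoder's final state:

    import torch
    import torch.nn as nn

    # Reparameterized latent code for a sentence VAE.
    hidden, latent = 32, 16
    to_mu = nn.Linear(hidden, latent)
    to_logvar = nn.Linear(hidden, latent)

    h = torch.randn(4, hidden)            # stand-in encoder final state
    mu, logvar = to_mu(h), to_logvar(h)
    z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # z ~ N(mu, s^2)

    # KL(q(z|x) || N(0, I)), added to the reconstruction loss:
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1)
    print(z.shape, kl.shape)              # [4, 16] and [4]
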
Deep Reinforcement Learning for Dialogue Generation
EMNLP, (2016)
Like earlier neural SEQ2SEQ models, our framework captures the compositional models of the meaning of a dialogue turn and generates semantically appropriate responses
Cited by 688 · Views 259

Globally Normalized Transition-Based Neural Networks
Daniel Andor, Chris Alberti, David Weiss, Aliaksei Severyn, Alessandro Presta, Kuzman Ganchev, Slav Petrov, Michael Collins
ACL, (2016)
We presented a simple and yet powerful model architecture that produces state-of-the-art results for part-of-speech tagging, dependency parsing and sentence compression
Cited by 417 · Views 154

Effective Approaches to Attention-based Neural Machine Translation
EMNLP, (2015)
Sample English-German translation from the paper:
src: Orlando Bloom and Miranda Kerr still love each other
ref: Orlando Bloom und Miranda Kerr lieben sich noch immer
best: Orlando Bloom und Miranda Kerr lieben einander noch immer.
base: Orlando Bloom und Lucas Miranda lieben einander noch immer.
Cited by 3968 · Views 305