Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks
International Conference on Research and Development in Information Retrieval (SIGIR), pp. 373–382, 2015
- Encoding query-document pairs into discriminative feature vectors that are input to a learning-to-rank algorithm is a critical step in building an accurate reranker.
- The most widely used approach is to encode input text pairs using many complex lexical, syntactic and semantic features and compute various similarity measures between the obtained representations.
- Prior work on answer passage reranking employs complex linguistic features, modelling syntactic and semantic information as bags of syntactic and semantic role dependencies, and builds similarity and translation models over these representations
- Adapting to new domains requires additional effort to tune feature extraction pipelines and to add new resources that may not even exist
- We describe a novel deep learning architecture for reranking short texts, where questions and documents are limited to a single sentence
- In the following we present our deep learning network for learning to match short text pairs
- We report on the official evaluation metric for the TREC 2012 Microblog track, i.e., precision at 30 (P@30), and on mean average precision (MAP)
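The two evaluation metrics named above can be sketched in a few lines; this is a generic illustration of P@k and average precision (MAP is the mean of average precision over queries), not the paper's evaluation code, and the document ids are hypothetical.

```python
# Sketch of the TREC Microblog metrics used above: P@30 (precision at rank 30)
# and average precision, whose per-query mean is MAP.

def precision_at_k(ranking, relevant, k=30):
    """Fraction of the top-k ranked documents that are relevant (P@k)."""
    top_k = ranking[:k]
    return sum(1 for d in top_k if d in relevant) / k

def average_precision(ranking, relevant):
    """Average of precision values at each rank where a relevant doc appears."""
    hits, score = 0, 0.0
    for i, d in enumerate(ranking, start=1):
        if d in relevant:
            hits += 1
            score += hits / i
    return score / len(relevant) if relevant else 0.0

# Toy example with hypothetical ids: relevant docs at ranks 1 and 3.
ranking = ["d1", "d7", "d3", "d9"]
relevant = {"d1", "d3"}
print(precision_at_k(ranking, relevant, k=2))  # 0.5
print(average_precision(ranking, relevant))    # (1/1 + 2/3) / 2 ≈ 0.833
```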
- We propose a novel deep learning architecture for reranking short texts
- Our experimental findings show that our deep learning model: (i) greatly improves on previous state-of-the-art systems and a recent deep learning approach on the answer sentence selection task, with a 3% absolute improvement in Mean Average Precision and MRR; (ii) improves even on the best system runs from the TREC Microblog 2012 challenge; and (iii) is comparable to the syntactic reranker, while requiring no external parsers or resources
- The parameters of the deep learning model were as follows: the width m of the convolution filters is set to 5 and the number of convolutional feature maps is 100.
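The hyper-parameters above (filter width m = 5, 100 feature maps) can be made concrete with a minimal narrow-convolution-plus-max-pooling sketch. This is not the authors' code; the embedding dimension and sentence length are illustrative, and the random values stand in for learned weights.

```python
# Minimal sketch of a convolutional layer with filter width m=5 and 100
# feature maps, followed by max-over-time pooling (illustrative dimensions).
import random

random.seed(0)
d, m, n_maps = 50, 5, 100          # embedding dim, filter width, feature maps
sent_len = 12                      # words in the sentence

# One embedding row per word; filters span m consecutive word embeddings.
sentence = [[random.gauss(0, 1) for _ in range(d)] for _ in range(sent_len)]
filters = [[[random.gauss(0, 1) for _ in range(d)] for _ in range(m)]
           for _ in range(n_maps)]

# Narrow convolution: slide each width-5 filter over the sentence matrix.
conv = [
    [sum(sentence[i + r][c] * f[r][c] for r in range(m) for c in range(d))
     for i in range(sent_len - m + 1)]
    for f in filters
]                                  # 100 feature maps of length sent_len-m+1
pooled = [max(row) for row in conv]  # max-over-time pooling -> 100 values
print(len(conv), len(conv[0]), len(pooled))
```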
- At test time, the authors use the network parameters that achieved the best MAP score on the development set: the dev MAP is computed after every 10 mini-batch updates, and the parameters are saved whenever a new best score is obtained.
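The checkpointing rule above (evaluate dev MAP every 10 updates, keep the best parameters) can be sketched as a small loop. `train_step` and `dev_map` are hypothetical stand-ins for the real training step and evaluation routine.

```python
# Sketch of best-on-dev checkpointing: evaluate every 10 mini-batch updates
# and retain the parameters with the best dev MAP seen so far.

def train_with_checkpointing(params, train_step, dev_map, n_updates=100):
    best_map, best_params = -1.0, dict(params)
    for step in range(1, n_updates + 1):
        params = train_step(params)
        if step % 10 == 0:                    # evaluate every 10 updates
            score = dev_map(params)
            if score > best_map:              # save only on improvement
                best_map, best_params = score, dict(params)
    return best_params, best_map

# Toy usage: "training" nudges a scalar toward 0.5, where dev MAP peaks.
step_fn = lambda p: {"w": p["w"] + 0.01}
map_fn = lambda p: 1.0 - abs(p["w"] - 0.5)
params, score = train_with_checkpointing({"w": 0.0}, step_fn, map_fn)
print(round(params["w"], 2), round(score, 2))
```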
- Results and discussion
The authors report the results of the deep learning model on the TRAIN and TRAIN-ALL sets when additional word overlap features are used.
- Along with the question-answer similarity score, the architecture includes intermediate representations of the question and the answer, which together constitute a much richer representation
- This results in a large improvement of about 8% absolute points in MAP for TRAIN and almost 10% when trained with more data from TRAIN-ALL.
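The word overlap features discussed above can be sketched as simple set-overlap statistics between a question and a candidate answer. The exact feature set in the paper may differ; the feature names, stopword list, and example sentences here are illustrative.

```python
# Hedged sketch of simple word-overlap features between a question and a
# candidate answer: raw overlap count and Jaccard overlap over word sets.

def overlap_features(question, answer, stopwords=frozenset()):
    q = {w.lower() for w in question.split()} - stopwords
    a = {w.lower() for w in answer.split()} - stopwords
    common = q & a
    jaccard = len(common) / len(q | a) if (q | a) else 0.0
    return {"overlap": len(common), "jaccard": jaccard}

feats = overlap_features("When was Mozart born",
                         "Mozart was born in 1756",
                         stopwords={"was", "in"})
print(feats)  # overlap of {"mozart", "born"} over a 4-word union
```

In the paper, features of this kind complement the network's learned representations, which helps because many of the word embeddings are randomly initialized.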
- Another important aspect is the fact that a large portion of the word embeddings used by the network are initialized at random, which has a negative impact on the accuracy of the model
- The authors propose a novel deep learning architecture for reranking short texts.
- It has the benefits of requiring no manual feature engineering or external resources, which may be expensive or not available.
- The model with the same architecture can be successfully applied to other domains and tasks.
- The authors' experimental findings show that the deep learning model: (i) greatly improves on previous state-of-the-art systems and a recent deep learning approach on the answer sentence selection task, with a 3% absolute improvement in MAP and MRR; (ii) improves even on the best system runs from the TREC Microblog 2012 challenge; and (iii) is comparable to the syntactic reranker, while requiring no external parsers or resources
- Table1: Summary of TREC QA datasets for answer reranking
- Table2: Results on TRAIN and TRAIN-ALL from TREC QA
- Table3: Results on TREC QA when augmenting the deep learning model with word overlap features
- Table4: Survey of the results on the QA answer selection task
- Table5: Summary of TREC Microblog datasets
- Table6: System performance on the top 30 runs from TMB2012, using the top 10, 20 or 30 runs from TMB2011 for training
- Table7: Comparison of the averaged relative improvements for the top, middle (mid), and bottom (btm) 30 systems from TMB2012
- Our learning to rank method is based on a deep learning model for advanced text representations using distributional word embeddings. Distributional representations have a long tradition in IR, e.g., Latent Semantic Analysis, which more recently has been complemented by studies on distributional models based on word similarities. Their main property is to alleviate the problem of data sparseness. In particular, such representations can be derived with several methods, e.g., by counting the frequencies of co-occurring words around a given token in large corpora. Distributed representations can also be obtained by applying neural language models that learn word embeddings, and more recently by using recursive autoencoders and convolutional neural networks.
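The count-based route to distributional representations mentioned above can be sketched on a toy corpus: count the words co-occurring within a small window of each token and use the counts as that word's vector. Real systems use large corpora and learned embeddings; the corpus and window size here are illustrative only.

```python
# Sketch of count-based distributional representations: each word's vector is
# its co-occurrence counts over a +/-1-word window in a toy corpus.
from collections import Counter, defaultdict

corpus = ["the cat sat", "the dog sat", "the cat ran"]
window = 1
cooc = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - window), min(len(words), i + window + 1)):
            if j != i:
                cooc[w][words[j]] += 1

# "cat" and "dog" share context words ("the", "sat"), so their count vectors
# overlap -- the sparsity-alleviating property noted above.
print(dict(cooc["cat"]))  # {'the': 2, 'sat': 1, 'ran': 1}
print(dict(cooc["dog"]))  # {'the': 1, 'sat': 1}
```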
Our application of learning to rank models concerns passage reranking. For example, [17, 24] designed classifiers of question and answer passage pairs. Several approaches were devoted to reranking passages containing definitions/descriptions, e.g., [21, 28, 31]. Other work used a cascading approach, where the ranking produced by one ranker is used as input to the next stage.
- This work has been supported by the EC project CogNet, 671625 (H2020-ICT-2014-2, Research and Innovation action)
- The first author was supported by the Google Europe Doctoral Fellowship Award 2013
- A. Agarwal, H. Raghavan, K. Subbian, P. Melville, D. Gondek, and R. Lawrence. Learning to rank for robust question answering. In CIKM, 2012.
- A. Bordes, J. Weston, and N. Usunier. Open question answering with weakly supervised embedding models. In ECML, Nancy, France, September 2014.
- Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin. A neural probabilistic language model. Journal of Machine Learning Research, 3:1137–1155, 2003.
- R. Berendsen, M. Tsagkias, W. Weerkamp, and M. de Rijke. Pseudo test collections for training and tuning microblog rankers. In SIGIR, 2013.
- A. Bordes, S. Chopra, and J. Weston. Question answering with subgraph embeddings. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 615–620, Doha, Qatar, October 2014. Association for Computational Linguistics.
- Z. Cao, T. Qin, T.-Y. Liu, M.-F. Tsai, and H. Li. Learning to rank: From pairwise approach to listwise approach. In Proceedings of the 24th International Conference on Machine Learning, ICML ’07, pages 129–136, New York, NY, USA, 2007. ACM.
- Y. Chen, M. Zhou, and S. Wang. Reranking answers from definitional QA using language models. In ACL, 2006.
- R. Collobert and J. Weston. A unified architecture for natural language processing: deep neural networks with multitask learning. In ICML, pages 160–167, 2008.
- H. Cui, M. Kan, and T. Chua. Generic soft pattern models for definitional QA. In SIGIR, Salvador, Brazil, 2005. ACM.
- S. Deerwester, S. Dumais, G. Furnas, T. Landauer, and R. Harshman. Indexing by latent semantic analysis. Journal of the American Society of Information Science, 1990.
- M. Denil, A. Demiraj, N. Kalchbrenner, P. Blunsom, and N. de Freitas. Modelling, visualising and summarising documents with a single convolutional neural network. Technical report, University of Oxford, 2014.
- J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res., 12:2121–2159, 2011.
- A. Echihabi and D. Marcu. A noisy-channel approach to question answering. In ACL, 2003.
- I. J. Goodfellow, D. Warde-Farley, M. Mirza, A. C. Courville, and Y. Bengio. Maxout networks. In ICML, pages 1319–1327, 2013.
- M. Heilman and N. A. Smith. Tree edit models for recognizing textual entailments, paraphrases, and answers to questions. In NAACL, 2010.
- M. Iyyer, J. Boyd-Graber, L. Claudino, R. Socher, and H. Daumé III. A neural network for factoid question answering over paragraphs. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 633–644, Doha, Qatar, October 2014. Association for Computational Linguistics.
- J. Jeon, W. B. Croft, and J. H. Lee. Finding similar questions in large question and answer archives. In CIKM, 2005.
- N. Kalchbrenner, E. Grefenstette, and P. Blunsom. A convolutional neural network for modelling sentences. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, June 2014.
- Y. Kim. Convolutional neural networks for sentence classification. In EMNLP, pages 1746–1751, Doha, Qatar, October 2014.
- T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems 26, pages 3111–3119, 2013.
- A. Moschitti, S. Quarteroni, R. Basili, and S. Manandhar. Exploiting syntactic and shallow semantic kernels for question/answer classification. In ACL, 2007.
- V. Nair and G. E. Hinton. Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), pages 807–814, 2010.
- I. Ounis, C. Macdonald, J. Lin, and I. Soboroff. Overview of the TREC-2011 microblog track. In TREC, 2011.
- F. Radlinski and T. Joachims. Query chains: Learning to rank from implicit feedback. CoRR, 2006.
- Y. Sasaki. Question answering as question-biased term extraction: A new approach toward multilingual qa. In ACL, 2005.
- A. Severyn and A. Moschitti. Automatic feature engineering for answer selection and extraction. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 458–467, Seattle, Washington, USA, October 2013. Association for Computational Linguistics.
- A. Severyn, A. Moschitti, M. Tsagkias, R. Berendsen, and M. de Rijke. A syntax-aware re-ranker for microblog retrieval. In SIGIR, 2014.
- D. Shen and M. Lapata. Using semantic roles to improve question answering. In EMNLP-CoNLL, 2007.
- I. Soboroff, I. Ounis, and J. Lin. Overview of the TREC-2012 microblog track. In TREC, 2012.
- N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15:1929–1958, 2014.
- M. Surdeanu, M. Ciaramita, and H. Zaragoza. Learning to rank answers to non-factoid questions from web collections. Comput. Linguist., 37(2):351–383, June 2011.
- J. Suzuki, Y. Sasaki, and E. Maeda. Svm answer selection for open-domain question answering. In COLING, 2002.
- W.-t. Yih, M.-W. Chang, C. Meek, and A. Pastusiak. Question answering using enhanced lexical semantic models. In ACL, August 2013.
- P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res., 11:3371–3408, Dec. 2010.
- M. Wang and C. D. Manning. Probabilistic tree-edit models with structured latent variables for textual entailment and question answering. In ACL, 2010.
- M. Wang, N. A. Smith, and T. Mitamura. What is the jeopardy model? A quasi-synchronous grammar for QA. In EMNLP, 2007.
- X. Yao, B. Van Durme, C. Callison-Burch, and P. Clark. Answer extraction as sequence tagging with tree edit distance. In NAACL, 2013.
- L. Yu, K. M. Hermann, P. Blunsom, and S. Pulman. Deep learning for answer sentence selection. CoRR, 2014.
- M. D. Zeiler. Adadelta: An adaptive learning rate method. CoRR, 2012.
- M. D. Zeiler and R. Fergus. Stochastic pooling for regularization of deep convolutional neural networks. CoRR, abs/1301.3557, 2013.