Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks

International Conference on Research and Development in Information Retrieval, pp. 373-382 (2015)

Abstract

Learning a similarity function between pairs of objects is at the core of learning-to-rank approaches. In information retrieval tasks we typically deal with query-document pairs; in question answering, with question-answer pairs. However, before learning can take place, such pairs need to be mapped from the original space of symbolic words …

Introduction
  • Encoding query-document pairs into discriminative feature vectors that are input to a learning-to-rank algorithm is a critical step in building an accurate reranker.
  • The most widely used approach is to encode input text pairs using many complex lexical, syntactic and semantic features and compute various similarity measures between the obtained representations.
  • In answer passage reranking, [31] employs complex linguistic features, modelling syntactic and semantic information as bags of syntactic and semantic role dependencies, and builds similarity and translation models over these representations.
  • Adapting to new domains requires additional effort to tune feature extraction pipelines and to add new resources that may not even exist.
Highlights
  • Encoding query-document pairs into discriminative feature vectors that are input to a learning-to-rank algorithm is a critical step in building an accurate reranker
  • We describe a novel deep learning architecture for reranking short texts, where questions and documents are limited to a single sentence
  • In the following we present our deep learning network for learning to match short text pairs
  • We report on the official evaluation metric for the TREC 2012 Microblog track, i.e., precision at 30 (P@30), and on mean average precision (MAP)
  • We propose a novel deep learning architecture for reranking short texts
  • Our experimental findings show that our deep learning model: (i) greatly improves on the previous state-of-the-art systems and a recent deep learning approach in [38] on the answer sentence selection task, showing a 3% absolute improvement in Mean Average Precision and MRR; (ii) is able to improve even the best system runs from the TREC Microblog 2012 challenge; and (iii) is comparable to the syntactic reranker in [27], while requiring no external parsers or resources
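The evaluation metrics used throughout (MAP and MRR) can be sketched in a few lines. This is a minimal illustrative implementation, not the paper's evaluation code; the function names are hypothetical, and each query is assumed to be a pair of a set of relevant documents and a ranked list.

```python
def average_precision(relevant, ranked):
    """Average precision for one query: mean of precision@k taken at
    each rank k where a relevant document appears."""
    hits, precisions = 0, []
    for k, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(queries):
    """MAP over a list of (relevant_set, ranked_list) pairs."""
    return sum(average_precision(r, d) for r, d in queries) / len(queries)

def mrr(queries):
    """Mean reciprocal rank of the first relevant document per query."""
    total = 0.0
    for relevant, ranked in queries:
        for k, doc in enumerate(ranked, start=1):
            if doc in relevant:
                total += 1.0 / k
                break
    return total / len(queries)
```

A "3% absolute improvement" in these metrics means, e.g., MAP moving from 0.68 to 0.71 on the same query set.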
Methods
  • The parameters of the deep learning model were as follows: the width m of the convolution filters is set to 5 and the number of convolutional feature maps is 100.
  • At test time the authors use the parameters of the network that obtained the best MAP score on the development set, i.e., they compute the MAP score after every 10 mini-batch updates and save the network parameters whenever a new best dev MAP score is obtained.
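The convolutional sentence model with the stated hyperparameters (filter width 5, 100 feature maps) can be sketched as follows. This is a simplified forward pass, not the paper's implementation: the embedding dimension, random filter initialization, ReLU activation, and zero-padding of very short sentences are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

EMB_DIM, FILTER_WIDTH, NUM_MAPS = 50, 5, 100  # paper: width m=5, 100 feature maps

# Hypothetical parameters: one convolutional filter bank over word embeddings.
filters = rng.normal(scale=0.1, size=(NUM_MAPS, FILTER_WIDTH, EMB_DIM))
bias = np.zeros(NUM_MAPS)

def sentence_vector(embeddings):
    """Map a (num_words, EMB_DIM) embedding matrix to a NUM_MAPS-dim vector
    via narrow convolution, ReLU, and max-over-time pooling."""
    n = embeddings.shape[0]
    # Zero-pad so sentences shorter than the filter width still yield a window.
    if n < FILTER_WIDTH:
        embeddings = np.vstack(
            [embeddings, np.zeros((FILTER_WIDTH - n, EMB_DIM))])
        n = FILTER_WIDTH
    # Slide the filter over every window of FILTER_WIDTH consecutive words.
    windows = np.stack([embeddings[i:i + FILTER_WIDTH]
                        for i in range(n - FILTER_WIDTH + 1)])
    feature_maps = np.einsum('wfe,mfe->wm', windows, filters) + bias
    activated = np.maximum(feature_maps, 0.0)  # ReLU non-linearity
    return activated.max(axis=0)               # max pooling over time
```

Max pooling over time makes the output size independent of sentence length, which is what lets questions and documents of different lengths be compared in a fixed-size space.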
Results
  • The authors report the results of the deep learning model on the TRAIN and TRAIN-ALL sets when additional word overlap features are used.
  • Along with the question-answer similarity score, the architecture includes intermediate representations of the question and the answer, which together constitute a much richer representation.
  • This results in a large improvement of about 8% absolute points in MAP for TRAIN and almost 10% when trained with more data from TRAIN-ALL.
  • Another important aspect is that a large portion of the word embeddings used by the network are initialized at random, which has a negative impact on the accuracy of the model.
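The combination described above, passing intermediate question and answer representations forward together with their similarity score, can be sketched as a join layer. The vector size, the bilinear similarity form, and the name `join_layer` are illustrative assumptions rather than the paper's exact definitions.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 100  # assumed size of the pooled sentence vectors

# Hypothetical similarity matrix; in a trained model this would be
# learned jointly with the rest of the network.
M = rng.normal(scale=0.1, size=(DIM, DIM))

def join_layer(x_q, x_d, overlap_feats):
    """Concatenate the query vector, a single bilinear similarity score
    sim = x_q^T M x_d, the document vector, and any extra word-overlap
    features into one joint representation for the classifier on top."""
    sim = float(x_q @ M @ x_d)
    return np.concatenate([x_q, [sim], x_d, overlap_feats])
```

The joint vector is richer than the similarity score alone, which is consistent with the reported 8-10% absolute MAP gains over using the score by itself.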
Conclusion
  • The authors propose a novel deep learning architecture for reranking short texts.
  • It has the benefits of requiring no manual feature engineering or external resources, which may be expensive or not available.
  • The same architecture can be successfully applied to other domains and tasks.
  • The authors' experimental findings show that the deep learning model: (i) greatly improves on the previous state-of-the-art systems and a recent deep learning approach in [38] on the answer sentence selection task, showing a 3% absolute improvement in MAP and MRR; (ii) is able to improve even the best system runs from the TREC Microblog 2012 challenge; and (iii) is comparable to the syntactic reranker in [27], while requiring no external parsers or resources
Tables
  • Table1: Summary of TREC QA datasets for answer reranking
  • Table2: Results on TRAIN and TRAIN-ALL from TREC QA
  • Table3: Results on TREC QA when augmenting the deep learning model with word overlap features
  • Table4: Survey of the results on the QA answer selection task
  • Table5: Summary of TREC Microblog datasets
  • Table6: System performance on the top 30 runs from TMB2012, using the top 10, 20 or 30 runs from TMB2011 for training
  • Table7: Comparison of the averaged relative improvements for the top, middle (mid), and bottom (btm) 30 systems from TMB2012
Related work
  • Our learning-to-rank method is based on a deep learning model that builds advanced text representations from distributional word embeddings. Distributional representations have a long tradition in IR, e.g., Latent Semantic Analysis [10], and have more recently been studied through distributional models based on word similarities. Their main property is to alleviate the problem of data sparseness. Such representations can be derived with several methods, e.g., by counting the frequencies of words co-occurring around a given token in large corpora. Distributed representations can also be obtained by applying neural language models that learn word embeddings, e.g., [3], and more recently using recursive autoencoders [34] and convolutional neural networks [8].
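The count-based derivation mentioned above (frequencies of words co-occurring around a given token) can be sketched as follows. The function name and the window size are hypothetical; real count-based models typically add weighting (e.g., PMI) and dimensionality reduction on top of the raw counts.

```python
from collections import Counter, defaultdict

def cooccurrence_vectors(corpus, window=2):
    """Represent each word by the counts of words co-occurring within
    `window` tokens of it: a simple count-based distributional model.
    `corpus` is a list of tokenized sentences."""
    vectors = defaultdict(Counter)
    for sentence in corpus:
        for i, word in enumerate(sentence):
            lo = max(0, i - window)
            hi = min(len(sentence), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][sentence[j]] += 1
    return vectors
```

Words that appear in similar contexts end up with similar count vectors, which is what alleviates data sparseness compared with treating each word as an atomic symbol.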

    Our application of learning to rank models concerns passage reranking. For example, [17, 24] designed classifiers of question and answer passage pairs. Several approaches were devoted to reranking passages containing definition/description, e.g., [21, 28, 31]. [1] used a cascading approach, where the ranking produced by one ranker is used as input to the next stage.
Funding
  • This work has been supported by the EC project CogNet, 671625 (H2020-ICT-2014-2, Research and Innovation action)
  • The first author was supported by the Google Europe Doctoral Fellowship Award 2013
References
  • [1] A. Agarwal, H. Raghavan, K. Subbian, P. Melville, D. Gondek, and R. Lawrence. Learning to rank for robust question answering. In CIKM, 2012.
  • [2] A. Bordes, J. Weston, and N. Usunier. Open question answering with weakly supervised embedding models. In ECML, Nancy, France, September 2014.
  • [3] Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin. A neural probabilistic language model. Journal of Machine Learning Research, 3:1137–1155, 2003.
  • [4] R. Berendsen, M. Tsagkias, W. Weerkamp, and M. de Rijke. Pseudo test collections for training and tuning microblog rankers. In SIGIR, 2013.
  • [5] A. Bordes, S. Chopra, and J. Weston. Question answering with subgraph embeddings. In EMNLP, pages 615–620, Doha, Qatar, October 2014.
  • [6] Z. Cao, T. Qin, T.-Y. Liu, M.-F. Tsai, and H. Li. Learning to rank: From pairwise approach to listwise approach. In ICML, pages 129–136, 2007.
  • [7] Y. Chen, M. Zhou, and S. Wang. Reranking answers from definitional QA using language models. In ACL, 2006.
  • [8] R. Collobert and J. Weston. A unified architecture for natural language processing: deep neural networks with multitask learning. In ICML, pages 160–167, 2008.
  • [9] H. Cui, M. Kan, and T. Chua. Generic soft pattern models for definitional QA. In SIGIR, Salvador, Brazil, 2005.
  • [10] S. Deerwester, S. Dumais, G. Furnas, T. Landauer, and R. Harshman. Indexing by latent semantic analysis. Journal of the American Society of Information Science, 1990.
  • [11] M. Denil, A. Demiraj, N. Kalchbrenner, P. Blunsom, and N. de Freitas. Modelling, visualising and summarising documents with a single convolutional neural network. Technical report, University of Oxford, 2014.
  • [12] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, 2011.
  • [13] A. Echihabi and D. Marcu. A noisy-channel approach to question answering. In ACL, 2003.
  • [14] I. J. Goodfellow, D. Warde-Farley, M. Mirza, A. C. Courville, and Y. Bengio. Maxout networks. In ICML, pages 1319–1327, 2013.
  • [15] M. Heilman and N. A. Smith. Tree edit models for recognizing textual entailments, paraphrases, and answers to questions. In NAACL, 2010.
  • [16] M. Iyyer, J. Boyd-Graber, L. Claudino, R. Socher, and H. Daumé III. A neural network for factoid question answering over paragraphs. In EMNLP, pages 633–644, Doha, Qatar, October 2014.
  • [17] J. Jeon, W. B. Croft, and J. H. Lee. Finding similar questions in large question and answer archives. In CIKM, 2005.
  • [18] N. Kalchbrenner, E. Grefenstette, and P. Blunsom. A convolutional neural network for modelling sentences. In ACL, June 2014.
  • [19] Y. Kim. Convolutional neural networks for sentence classification. In EMNLP, pages 1746–1751, Doha, Qatar, October 2014.
  • [20] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems 26, pages 3111–3119, 2013.
  • [21] A. Moschitti, S. Quarteroni, R. Basili, and S. Manandhar. Exploiting syntactic and shallow semantic kernels for question/answer classification. In ACL, 2007.
  • [22] V. Nair and G. E. Hinton. Rectified linear units improve restricted Boltzmann machines. In ICML, pages 807–814, 2010.
  • [23] I. Ounis, C. Macdonald, J. Lin, and I. Soboroff. Overview of the TREC-2011 microblog track. In TREC, 2011.
  • [24] F. Radlinski and T. Joachims. Query chains: Learning to rank from implicit feedback. CoRR, 2006.
  • [25] Y. Sasaki. Question answering as question-biased term extraction: A new approach toward multilingual QA. In ACL, 2005.
  • [26] A. Severyn and A. Moschitti. Automatic feature engineering for answer selection and extraction. In EMNLP, pages 458–467, Seattle, Washington, USA, October 2013.
  • [27] A. Severyn, A. Moschitti, M. Tsagkias, R. Berendsen, and M. de Rijke. A syntax-aware re-ranker for microblog retrieval. In SIGIR, 2014.
  • [28] D. Shen and M. Lapata. Using semantic roles to improve question answering. In EMNLP-CoNLL, 2007.
  • [29] I. Soboroff, I. Ounis, C. Macdonald, and J. Lin. Overview of the TREC-2012 microblog track. In TREC, 2012.
  • [30] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15:1929–1958, 2014.
  • [31] M. Surdeanu, M. Ciaramita, and H. Zaragoza. Learning to rank answers to non-factoid questions from web collections. Computational Linguistics, 37(2):351–383, June 2011.
  • [32] J. Suzuki, Y. Sasaki, and E. Maeda. SVM answer selection for open-domain question answering. In COLING, 2002.
  • [33] W.-t. Yih, M.-W. Chang, C. Meek, and A. Pastusiak. Question answering using enhanced lexical semantic models. In ACL, August 2013.
  • [34] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. Journal of Machine Learning Research, 11:3371–3408, December 2010.
  • [35] M. Wang and C. D. Manning. Probabilistic tree-edit models with structured latent variables for textual entailment and question answering. In ACL, 2010.
  • [36] M. Wang, N. A. Smith, and T. Mitamura. What is the Jeopardy model? A quasi-synchronous grammar for QA. In EMNLP, 2007.
  • [37] X. Yao, B. Van Durme, P. Clark, and C. Callison-Burch. Answer extraction as sequence tagging with tree edit distance. In NAACL, 2013.
  • [38] L. Yu, K. M. Hermann, P. Blunsom, and S. Pulman. Deep learning for answer sentence selection. CoRR, 2014.
  • [39] M. D. Zeiler. ADADELTA: An adaptive learning rate method. CoRR, 2012.
  • [40] M. D. Zeiler and R. Fergus. Stochastic pooling for regularization of deep convolutional neural networks. CoRR, abs/1301.3557, 2013.