Beyond [CLS] through Ranking by Generation

Conference on Empirical Methods in Natural Language Processing, (2020): 1722-1727


Abstract

Generative models for Information Retrieval, where ranking of documents is viewed as the task of generating a query from a document's language model, were very successful in various IR tasks in the past. However, with the advent of modern deep neural networks, attention has shifted to discriminative ranking functions that model the semantic…

Introduction
  • Most recent approaches for ranking tasks in Information Retrieval (IR) such as passage ranking and retrieval of semantically related questions have focused primarily on discriminative methods using neural networks that learn a similarity function to compare questions and candidate answers (Severyn and Moschitti, 2015; dos Santos et al, 2015; Tan et al, 2016; Tay et al, 2017, 2018).
  • The key idea consists of first training a unique language model lm_i for each candidate document d_i, using the likelihood of generating the input query with lm_i, denoted by P(q|lm_i), as the ranking score for document d_i.
  • The usual approach to train an LM using a neural network with parameters θ consists of performing maximum likelihood estimation (MLE) by minimizing the negative log-likelihood over a large text corpus D = {x_1, x_2, ..., x_|D|}, where each x_k is a document of length |x_k|: L(D) = −Σ_{k=1}^{|D|} Σ_{t=1}^{|x_k|} log p_θ(x_t^k | x_{<t}^k).
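The ranking-by-generation idea above can be sketched with toy numbers: each candidate passage is scored by the log-likelihood a conditional LM assigns to the query tokens given that passage. The per-token probabilities below are hypothetical, not from any real model.

```python
import math

def query_log_likelihood(token_probs):
    # Score = sum of log p(q_t | d, q_<t) over the query tokens,
    # i.e. the log-likelihood of generating the query from the passage.
    return sum(math.log(p) for p in token_probs)

# Hypothetical per-token probabilities a conditional LM might assign
# to the same query under two different candidate passages.
probs_relevant = [0.30, 0.25, 0.40]    # passage that answers the query
probs_unrelated = [0.05, 0.10, 0.08]   # off-topic passage

scores = {
    "relevant": query_log_likelihood(probs_relevant),
    "unrelated": query_log_likelihood(probs_unrelated),
}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # the relevant passage ranks first
```

Note that, unlike the classic per-document LMs, the paper uses a single global conditional LM that produces these probabilities for any (passage, query) pair.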
Highlights
  • Most recent approaches for ranking tasks in Information Retrieval (IR) such as passage ranking and retrieval of semantically related questions have focused primarily on discriminative methods using neural networks that learn a similarity function to compare questions and candidate answers (Severyn and Moschitti, 2015; dos Santos et al, 2015; Tan et al, 2016; Tay et al, 2017, 2018)
  • Classical literature on probabilistic models for IR showed that language modeling, a type of simple generative model, can be effective for document ranking (Zhai, 2008; Lafferty and Zhai, 2001; Ponte and Croft, 1998)
  • Unlike classic LM based approaches for IR that employ separate LMs for each document, our proposed method uses a single global LM that applies to all documents
  • The usual approach to train an LM using a neural network with parameters θ consists of performing maximum likelihood estimation (MLE) by minimizing the negative log-likelihood over a large text corpus D = {x_1, x_2, ..., x_|D|}, where each x_k is a document of length |x_k|: L(D) = −Σ_{k=1}^{|D|} Σ_{t=1}^{|x_k|} log p_θ(x_t^k | x_{<t}^k)
  • We have proposed a new generative approach for IR based on large pretrained neural language models, and demonstrated their effectiveness as rankers by providing robust experimental results on four different datasets
  • We demonstrated that unlikelihood-based losses are effective for allowing the use of negative examples in generative-based information retrieval
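The unlikelihood idea in the last highlight can be illustrated numerically: MLE pushes up the probability of query tokens given a relevant passage, while the unlikelihood term (as in Welleck et al., 2019) pushes it down given a negative passage. The per-token probabilities below are toy values, not model outputs.

```python
import math

def mle_loss(token_probs):
    # Maximum likelihood: penalize LOW probability of the query
    # tokens given a positive (relevant) passage.
    return -sum(math.log(p) for p in token_probs)

def unlikelihood_loss(token_probs):
    # Unlikelihood: penalize HIGH probability of the query tokens
    # given a negative (irrelevant) passage.
    return -sum(math.log(1.0 - p) for p in token_probs)

# Toy per-token probabilities (illustrative only).
pos_probs = [0.6, 0.7]   # p(q_t | positive passage, q_<t)
neg_probs = [0.6, 0.7]   # p(q_t | negative passage, q_<t)

# Combined objective over one positive and one negative example.
total = mle_loss(pos_probs) + unlikelihood_loss(neg_probs)
print(total)
```

The combined loss is minimized when the model assigns high query probability to positives and low query probability to negatives, which is exactly what a generative ranker needs.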
Results
  • The authors trained a discriminative version of BART-large where the input for the encoder and the decoder are the passage and the question, respectively
  • As is standard for classification with BART (Lewis et al., 2019), the authors take the representation generated by the decoder for the last token and apply a linear layer to it to produce a score.
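The scoring head described above can be sketched as follows; the hidden size, decoder states, and weights are made-up toy values, and in practice the decoder states would come from BART itself.

```python
def score_from_last_token(decoder_states, weight, bias):
    # Take the decoder representation of the LAST token and apply
    # a linear layer to turn it into a single relevance score.
    last = decoder_states[-1]
    return sum(w * h for w, h in zip(weight, last)) + bias

# Toy 3-dimensional decoder states for a 2-token output (illustrative).
states = [
    [0.1, -0.2, 0.3],
    [0.5, 0.4, -0.1],   # only the last token's representation is used
]
weight = [1.0, 2.0, -1.0]
bias = 0.1
print(score_from_last_token(states, weight, bias))  # 1.5
```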
Conclusion
  • Datasets: The authors use four different publicly available answer selection datasets in the experiments: WikipassageQA (Cohen et al.), WikiQA (Yang et al., 2015), InsuranceQA V2 (Feng et al., 2015), and YahooQA (Tay et al., 2017).
  • Most of the hyperparameters used for fine-tuning are the defaults from Wolf et al. (2019), except for the learning rate for BART, which the authors set to 1e-5. The authors have proposed a new generative approach for IR based on large pretrained neural language models, and demonstrated their effectiveness as rankers by providing robust experimental results on four different datasets.
  • The authors believe that the approach can be effectively used for text classification problems, where the score of a class label c is computed as the likelihood of generating the class label c given the document d, p(c|d)
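The classification idea in the last bullet, scoring a label c by the likelihood of generating its tokens conditioned on document d, can be sketched with hypothetical label-token probabilities (toy values, not from a real model).

```python
import math

def label_log_likelihood(token_probs):
    # Score for a class label: the sum of log-probabilities of the
    # label's tokens, generated conditioned on the document.
    return sum(math.log(p) for p in token_probs)

# Hypothetical per-token probabilities p(c_t | d, c_<t) for two labels.
label_probs = {
    "positive": [0.50, 0.40],
    "negative": [0.10, 0.20],
}
predicted = max(label_probs, key=lambda c: label_log_likelihood(label_probs[c]))
print(predicted)  # "positive"
```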
Tables
  • Table1: Dataset statistics. #Q stands for number of questions and #P/Q is the average number of passages per question
  • Table2: Experimental results for different passage ranking models and datasets
  • Table3: Experimental results on using passage vs. question as the conditional context. Results are computed on the WikipassageQA dataset
  • Table4: Examples of automatically generated questions using the GPT2-largeLUL model fine-tuned on the WikipassageQA dataset with likelihood pθ(q|a). The passages were extracted from the test set
  • Table5: Examples of automatically generated passages using the GPT2-largeLUL model fine-tuned on the WikipassageQA dataset with likelihood pθ(a|q). The question was extracted from the test set
Study subjects and analysis
datasets: 4
Statistics about the datasets are shown in Table 1. The four datasets also provide validation sets, whose sizes are similar to those of the respective test sets.

datasets: 3
The subscript LUL denotes models fine-tuned using maximum likelihood plus unlikelihood estimation (Eq. 4), while RLL denotes models fine-tuned using the ranking loss in Eq. 5. For MLE and LUL, we use a mini-batch size of 64 for InsuranceQA and 32 for the other 3 datasets. The number of negative examples per positive example is set to 5 in the case of LUL.
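The ranking loss referred to as Eq. 5 is not reproduced in this summary. As an assumed illustration only, a common pairwise hinge formulation over generation log-likelihood scores looks like the sketch below; the margin value and scores are hypothetical.

```python
def pairwise_hinge_loss(pos_score, neg_score, margin=1.0):
    # Encourage the positive passage's generation log-likelihood to
    # exceed the negative passage's score by at least `margin`.
    return max(0.0, margin - pos_score + neg_score)

# Toy generation log-likelihood scores (illustrative values).
print(pairwise_hinge_loss(-2.0, -5.0))  # 0.0: already separated by the margin
print(pairwise_hinge_loss(-2.0, -1.5))  # 1.5: negative passage scores too high
```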

tested datasets: 4
Comparing BART-largeRLL with the discriminative BART-large (row 4), we can see that BART-largeRLL produces better results for InsuranceQA, while achieving similar performance on YahooQA, WikiQA, and WikipassageQA. Overall, our proposed generative approach produces state-of-the-art results on the four tested datasets in all metrics.


Reference
  • Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Janvin. 2003. A neural probabilistic language model. J. Mach. Learn. Res., 3:1137–1155.
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL, Minneapolis, Minnesota.
  • Angela Fan, Mike Lewis, and Yann N. Dauphin. 2018. Hierarchical neural story generation. CoRR, abs/1805.04833.
  • Minwei Feng, Bing Xiang, Michael R. Glass, Lidan Wang, and Bowen Zhou. 2015. Applying deep learning to answer selection: A study and an open task. In IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pages 813–820.
  • Ari Holtzman, Jan Buys, Maxwell Forbes, and Yejin Choi. 2019. The curious case of neural text degeneration. CoRR, abs/1904.09751.
  • Nitish Shirish Keskar, Bryan McCann, Lav R. Varshney, Caiming Xiong, and Richard Socher. 2019. CTRL: A conditional transformer language model for controllable generation.
  • John Lafferty and Chengxiang Zhai. 2001. Document language models, query models, and risk minimization for information retrieval. In 24th Annual International ACM SIGIR Conference, SIGIR '01.
  • Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov, and Luke Zettlemoyer. 2019. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension.
  • Dongfang Li, Yifei Yu, Qingcai Chen, and Xinyu Li. 2019. BERTSel: Answer selection with pre-trained models. CoRR, abs/1905.07588.
  • Rodrigo Nogueira and Kyunghyun Cho. 2019. Passage re-ranking with BERT. arXiv preprint arXiv:1901.04085.
  • Rodrigo Nogueira, Zhiying Jiang, and Jimmy Lin. 2020. Document ranking with a pretrained sequence-to-sequence model.
  • Jay M. Ponte and W. Bruce Croft. 1998. A language modeling approach to information retrieval. In 21st Annual International ACM SIGIR Conference, SIGIR '98, pages 275–281. ACM.
  • Alec Radford, Jeff Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners.
  • Jinfeng Rao, Hua He, and Jimmy Lin. 2016. Noise-contrastive estimation for answer selection with deep neural networks. In Proceedings of the 25th ACM International Conference on Information and Knowledge Management, pages 1913–1916.
  • Cícero dos Santos, Luciano Barbosa, Dasha Bogdanova, and Bianca Zadrozny. 2015. Learning hybrid representations to retrieve semantically equivalent questions. In ACL, pages 694–699.
  • Cícero dos Santos, Ming Tan, Bing Xiang, and Bowen Zhou. 2016. Attentive pooling networks.
  • Aliaksei Severyn and Alessandro Moschitti. 2015. Learning to rank short text pairs with convolutional deep neural networks. In ACM SIGIR Conference, pages 373–382.
  • Ming Tan, Cícero dos Santos, Bing Xiang, and Bowen Zhou. 2016. LSTM-based deep learning models for non-factoid answer selection. In ICLR - Workshop Track.
  • Yi Tay, Minh C. Phan, Luu Anh Tuan, and Siu Cheung Hui. 2017. Learning to rank question answer pairs with holographic dual LSTM architecture. In ACM SIGIR Conference, pages 695–704.
  • Yi Tay, Luu Anh Tuan, and Siu Cheung Hui. 2018. Hyperbolic representation learning for fast and efficient neural question answering. In International Conference on Web Search and Data Mining, pages 583–591.
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems 30, pages 5998–6008.
  • Zhiguo Wang, Wael Hamza, and Radu Florian. 2017. Bilateral multi-perspective matching for natural language sentences. CoRR, abs/1702.03814.
  • Sean Welleck, Ilia Kulikov, Stephen Roller, Emily Dinan, Kyunghyun Cho, and Jason Weston. 2019. Neural text generation with unlikelihood training.
  • Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, and Jamie Brew. 2019. HuggingFace's Transformers: State-of-the-art natural language processing. ArXiv, abs/1910.03771.
  • Peng Xu, Xiaofei Ma, Ramesh Nallapati, and Bing Xiang. 2019. Passage ranking with weak supervision. CoRR, abs/1905.05910.
  • Yi Yang, Wen-tau Yih, and Christopher Meek. 2015. WikiQA: A challenge dataset for open-domain question answering. In EMNLP, pages 2013–2018.
  • ChengXiang Zhai. 2008. Statistical language models for information retrieval: a critical review. Found. Trends Inf. Retr., 2(3):137–213.