Transformer-Based Language Models for Similar Text Retrieval and Ranking

We introduce novel approaches for effectively applying neural transformer models to similar text retrieval and ranking without an initial bag-of-words-based step.

Abstract

Most approaches for similar text retrieval and ranking with long natural language queries rely at some level on queries and responses having words in common with each other. Recent applications of transformer-based neural language models to text retrieval and ranking problems have been very promising, but still involve a two-step process [...]
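
To make that contrast concrete, here is a minimal sketch of the conventional two-step pipeline: bag-of-words candidate retrieval followed by transformer re-ranking. The libraries (rank_bm25, sentence-transformers), the cross-encoder checkpoint, and the toy corpus are illustrative assumptions, not the authors' actual setup.

```python
# Illustrative two-step retrieve-then-rerank pipeline (not the paper's code).
# Assumed dependencies: rank_bm25 and sentence-transformers.
from rank_bm25 import BM25Okapi
from sentence_transformers import CrossEncoder

corpus = [
    "The contract may be terminated with thirty days written notice.",
    "Either party can end the agreement by giving notice in writing.",
    "Payment is due within sixty days of the invoice date.",
]
query = "How can the agreement be cancelled?"

# Step 1: bag-of-words candidate retrieval with BM25.
bm25 = BM25Okapi([doc.lower().split() for doc in corpus])
bow_scores = bm25.get_scores(query.lower().split())
candidates = sorted(range(len(corpus)), key=lambda i: bow_scores[i], reverse=True)[:2]

# Step 2: transformer cross-encoder re-ranking of those candidates only.
# Note: anything BM25 misses in step 1 can never be recovered in step 2 --
# the limitation the paper addresses.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # assumed checkpoint
rerank_scores = reranker.predict([(query, corpus[i]) for i in candidates])
for score, i in sorted(zip(rerank_scores, candidates), reverse=True):
    print(f"{score:.3f}  {corpus[i]}")
```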

Introduction
  • Most existing approaches for retrieving similar text rely in some way on word matching.
  • Machine learning has been used for similar text retrieval by employing models trained to re-rank results originally retrieved by word-match based information retrieval techniques.
  • Such systems often outperform word-match based retrieval alone, but still cannot return results that have no non-stopwords in common with the query text.
  • The authors have chosen to focus on contextual neural information retrieval (CNIR) models instead, in the hope that they will yield greater gains than non-contextual pretrained models
Highlights
  • Most existing approaches for retrieving similar text rely in some way on word matching
  • More recent contextual neural information retrieval (CNIR) models have shown some promise, such as the bidirectional encoder representations from transformers (BERT)-based re-rankers proposed by Nogueira & Cho [17] and Yang et al. [23]
  • We posited that the Sequential Dependence Model (SDM) would improve performance over simple BM25, which does not account for term proximity
  • We showed that we can obtain substantial gains in ranking effectiveness for long natural language queries by making modifications to a contextual neural language model, BERT
  • We showed that directly using a fine-tuned BERT model with a "siamese network" architecture to rank sentences outperforms using BERT to re-rank an initial list of sentences retrieved by BM25 (a sketch of this setup follows this list)
  • Interesting future work includes investigating and comparing these BERT-based ranking models not just on long natural language queries but also on queries of varied lengths. The fact that these BERT-based methods work well suggests that more recent neural transformer-based language models, which have been shown to outperform BERT across a range of natural language tasks [2], [7], [12], [19], [24], may yield further gains for similar sentence retrieval and ranking
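
As referenced above, here is a minimal sketch of the single-step alternative: encode query and candidate sentences with a siamese/bi-encoder BERT-style model and rank by vector similarity via a nearest-neighbor index. The Sentence-BERT checkpoint and the FAISS flat index are assumptions standing in for the authors' fine-tuned model and index, not their exact configuration.

```python
# Illustrative bi-encoder ("siamese") retrieval without any bag-of-words step.
# Assumed dependencies: sentence-transformers and faiss.
import faiss
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed stand-in checkpoint

corpus = [
    "The contract may be terminated with thirty days written notice.",
    "Either party can end the agreement by giving notice in writing.",
    "Payment is due within sixty days of the invoice date.",
]

# Encode the corpus once; L2-normalize so inner product equals cosine similarity.
emb = model.encode(corpus, convert_to_numpy=True).astype("float32")
faiss.normalize_L2(emb)
index = faiss.IndexFlatIP(emb.shape[1])
index.add(emb)

# Retrieve directly from the query embedding -- a result needs no
# non-stopwords in common with the query.
query = "How can this deal be cancelled early?"
q = model.encode([query], convert_to_numpy=True).astype("float32")
faiss.normalize_L2(q)
scores, ids = index.search(q, 2)
for score, i in zip(scores[0], ids[0]):
    print(f"{score:.3f}  {corpus[i]}")
```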
Results
  • Results and Discussion: Table 1 shows the results of applying various implementations of context-sensitive neural language models to long natural language queries against the corpus containing 8 million sentences.
  • The authors' results show that BM25 performs better than SDM on long natural language queries
  • This could suggest that, for queries of this length, accounting for term proximity the way SDM does could be the wrong way of capturing context (a toy illustration of this distinction follows this list).
  • For a better understanding of this result, further investigation would be needed, comparing the effects of SDM on long natural language queries with its effects on short natural language queries and bag-of-words queries
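
As a toy illustration of the proximity point above: BM25 scores a document from per-term statistics alone, while an SDM-style score adds query-term-proximity features such as ordered-bigram matches. This is a simplified sketch under those assumptions, not the Markov random field implementation of Metzler & Croft.

```python
# Toy comparison: plain BM25 vs. an SDM-style ordered-bigram proximity feature.
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    """Plain BM25 over a tokenized corpus (list of token lists)."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    tf = Counter(doc_terms)
    score = 0.0
    for t in set(query_terms):
        df = sum(1 for d in corpus if t in d)
        if df == 0 or tf[t] == 0:
            continue
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1.0)
        denom = tf[t] + k1 * (1 - b + b * len(doc_terms) / avgdl)
        score += idf * tf[t] * (k1 + 1) / denom
    return score

def ordered_bigram_matches(query_terms, doc_terms):
    """SDM-style proximity feature: count adjacent query bigrams that also
    appear adjacently, in order, in the document."""
    bigrams = set(zip(query_terms, query_terms[1:]))
    return sum(
        1
        for i in range(len(doc_terms) - 1)
        if (doc_terms[i], doc_terms[i + 1]) in bigrams
    )

corpus = [
    "the agreement can be terminated by either party".split(),
    "terminated agreement party either the be can by".split(),  # same terms, scrambled
]
query = "agreement can be terminated".split()

for doc in corpus:
    print(bm25_score(query, doc, corpus), ordered_bigram_matches(query, doc))
# BM25 gives both documents the same score; only the proximity feature
# distinguishes the fluent document from the scrambled one.
```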
Conclusion
  • The authors showed that they can obtain substantial gains in ranking effectiveness for long natural language queries by making modifications to a contextual neural language model, BERT.
  • The fact that these BERT-based methods work well suggests that more recent neural transformer-based language models, which have been shown to outperform BERT across a range of natural language tasks [2], [7], [12], [19], [24], may yield further gains for similar sentence retrieval and ranking
Tables
  • Table 1: Experimental results applying context-sensitive neural language models to long natural language queries
Background
  • BERT employs a masked language modeling approach whereby the model is trained to reconstruct an output sequence from an input sequence in which a set fraction (15%) of tokens is corrupted and/or masked (sketched below)
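
For illustration only, a hedged sketch of that masking step: select roughly 15% of token positions, replace them with a mask token, and keep the originals as reconstruction targets. BERT's full recipe additionally swaps some selected tokens for random tokens or leaves them unchanged; this sketch shows only the basic masking.

```python
# Simplified masked-language-modeling corruption (illustrative, not BERT's full recipe).
import random

def mask_tokens(tokens, mask_token="[MASK]", mask_fraction=0.15, seed=0):
    """Return (corrupted_tokens, labels): labels hold the original token at
    masked positions and None elsewhere."""
    rng = random.Random(seed)
    n_mask = max(1, round(mask_fraction * len(tokens)))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    corrupted = [mask_token if i in positions else t for i, t in enumerate(tokens)]
    labels = [t if i in positions else None for i, t in enumerate(tokens)]
    return corrupted, labels

tokens = "the contract may be terminated with thirty days notice".split()
corrupted, labels = mask_tokens(tokens)
print(corrupted)  # one of the nine tokens replaced by "[MASK]"
print(labels)     # the original token at the masked position, None elsewhere
```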
References
  • Cer, D., Diab, M., Agirre, E., Lopez-Gazpio, I., & Specia, L. (2017). SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation. In SemEval.
  • Clark, K., Luong, M., Le, Q., & Manning, C. (2020, March). ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. arXiv preprint arXiv:2003.10555.
  • Dai, Z., & Callan, J. (2019, July). Deeper text understanding for IR with contextual neural language modeling. In SIGIR.
  • Devlin, J., Chang, M., Lee, K., & Toutanova, K. (2018, November). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proc. of NAACL-HLT.
  • Fan, Y., Pang, L., Hou, J., Guo, J., Lan, Y., & Cheng, X. (2017). MatchZoo: A toolkit for deep text matching. arXiv preprint arXiv:1707.07270.
  • Gupta, M., & Bendersky, M. (2015). Information retrieval with verbose queries. Foundations and Trends in Information Retrieval, 9(3-4), 209-354.
  • Howard, J., & Ruder, S. (2018, May). Universal Language Model Fine-tuning for Text Classification. In Proc. of ACL.
  • Järvelin, K., & Kekäläinen, J. (2002). Cumulated gain-based evaluation of IR techniques. ACM TOIS, 20(4), 422-446.
  • Pennington, J., Socher, R., & Manning, C. D. (2014). GloVe: Global Vectors for Word Representation. In EMNLP.
  • Johnson, J., Douze, M., & Jégou, H. (2017, February). Billion-scale similarity search with GPUs. arXiv preprint arXiv:1702.08734.
  • Lin, J., Efron, M., Wang, Y., & Sherman, G. (2014). Overview of the TREC-2014 Microblog Track. In TREC.
  • Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., & Stoyanov, V. (2019, July). RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv preprint arXiv:1907.11692.
  • MacAvaney, S., Yates, A., Cohan, A., & Goharian, N. (2019). Contextualized Word Representations for Document Re-Ranking. arXiv preprint arXiv:1904.07094.
  • Metzler, D., & Croft, W. B. (2005, August). A Markov random field model for term dependencies. In SIGIR.
  • Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In NIPS.
  • Mu, J., & Viswanath, P. (2018, March). All-but-the-Top: Simple and Effective Post-processing for Word Representations. In ICLR.
  • Nogueira, R., & Cho, K. (2019). Passage Re-ranking with BERT. arXiv preprint arXiv:1901.04085.
  • Padigela, H., Zamani, H., & Croft, W. B. (2019). Investigating the successes and failures of BERT for passage re-ranking. arXiv preprint arXiv:1905.01758.
  • Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., & Liu, P. (2019, October). Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. arXiv preprint arXiv:1910.10683.
  • Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. In EMNLP-IJCNLP.
  • Robertson, S. E., Walker, S., Jones, S., Hancock-Beaulieu, M. M., & Gatford, M. (1995). Okapi at TREC-3. In NIST Special Publication.
  • Yang, W., Lu, K., Yang, P., & Lin, J. (2019, July). Critically Examining the "Neural Hype": Weak Baselines and the Additivity of Effectiveness Gains from Neural Ranking Models. In SIGIR.
  • Yang, W., Zhang, H., & Lin, J. (2019). Simple applications of BERT for ad hoc document retrieval. arXiv preprint arXiv:1903.10972.
  • Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R., & Le, Q. (2019, June). XLNet: Generalized Autoregressive Pretraining for Language Understanding. arXiv preprint arXiv:1906.08237.
  • Yilmaz, Z. A., Yang, W., Zhang, H., & Lin, J. (2019, November). Cross-domain modeling of sentence-level evidence for document retrieval. In EMNLP-IJCNLP.
Authors
Javed Qadrud-Din
Ashraf Bah Rabiou
Ryan Walker
Ravi Soni
Martin Gajek
Gabriel Pack
Akhil Rangaraj