Ranking-Incentivized Quality Preserving Content Modification

SIGIR '20: The 43rd International ACM SIGIR conference on research and development in Information Retrieval Virtual Event China July, 2020, pp. 259-268, 2020.

DOI: https://doi.org/10.1145/3397271.3401058

Abstract:

The Web is a canonical example of a competitive retrieval setting where many documents' authors consistently modify their documents to promote them in rankings. We present an automatic method for quality-preserving modification of document content --- i.e., maintaining content quality --- so that the document is ranked higher for a query ...

Introduction and Motivation
  • Several research communities nurture work on adversarial attacks on algorithms.
  • There has recently been much work on devising adversarial attacks on machine learning algorithms, specifically neural networks, spanning different tasks: general machine learning challenges [19, 34, 39], reading comprehension [21], speaker identification [26], object detection [42], face recognition [13], and more.
  • This line of work has driven forward the development of algorithms that are more robust to adversarial examples; e.g., Jia et al. [22], He et al. [18], Zhang et al. [46].
Highlights
  • Several research communities nurture work on adversarial attacks on algorithms
  • The “attacked” algorithms are often used in real-life systems
  • We evaluated our approach by using it as a bot in content-based ranking competitions we organized between students
  • We presented a novel method of modifying a document so as to promote it in rankings induced by a non-disclosed ranking function for a given query
  • Our method replaces a passage of the document with another passage — a challenge we address as a learning-to-rank task over passage pairs with a dual objective: rank promotion and content-quality maintenance
  • Our method served as a bot in content-based ranking competitions between students
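The dual-objective label mentioned above — a harmonic mean of a coherence label and a rank-promotion label with β = 1 — can be sketched as a β-weighted harmonic mean in the F_β style. The function name and the assumption of non-negative label scores are illustrative assumptions, not the paper's actual code:

```python
def combined_label(r: float, c: float, beta: float = 1.0) -> float:
    """beta-weighted harmonic mean of a rank-promotion label r and a
    coherence label c (F_beta style). beta = 1 weighs the two equally,
    as in the online evaluation; beta > 1 favors r, beta < 1 favors c.
    Labels are assumed to be non-negative scores."""
    if r == 0 or c == 0:
        # Harmonic mean collapses to 0 when either objective is 0.
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * r * c / (b2 * c + r)

# With beta = 1 this reduces to the plain harmonic mean 2rc / (r + c).
print(combined_label(0.8, 0.4))
```

Setting β = 1 means a passage pair scores well only if it is good on both objectives: a passage that promotes rank but wrecks coherence (or vice versa) gets a low combined label.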
Results
  • There are five “players” per query in each live competition round: two students from the competitions, two students from Raifer et al.’s competitions whose documents were planted, and the bot.

    The authors analyzed the competitions using five evaluation measures.
  • The raw promotion and scaled promotion measures quantify the change of a document’s rank between rounds 1 and 2.
  • The approach is trained with three types of labels, resulting in three bots: one trained, as in the online evaluation, for both rank-promotion and coherence (l labels) with β = 1 in the harmonic mean; the other two are trained either only for coherence (c labels) or only for rank-promotion (r labels).
  • As in the online evaluation, the authors use for reference comparison a static bot which keeps the student document as is.
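The raw and scaled promotion measures above can be illustrated as follows. This is one plausible formalization, assuming rank 1 is the highest rank and five players per query as in the competitions; the exact scaling the paper uses is not given in this summary, and the function names are hypothetical:

```python
def raw_promotion(prev_rank: int, curr_rank: int) -> int:
    """Rank change between rounds; rank 1 is the highest.
    Positive values mean promotion, negative values demotion."""
    return prev_rank - curr_rank

def scaled_promotion(prev_rank: int, curr_rank: int, n_players: int = 5) -> float:
    """Raw promotion scaled by the maximal possible change from prev_rank:
    a document at prev_rank can climb at most (prev_rank - 1) positions
    and fall at most (n_players - prev_rank) positions."""
    raw = prev_rank - curr_rank
    if raw > 0:
        return raw / (prev_rank - 1)
    if raw < 0:
        return raw / (n_players - prev_rank)
    return 0.0

# A document promoted from rank 4 to rank 2 realizes 2 of the 3
# positions it could have climbed.
print(raw_promotion(4, 2), scaled_promotion(4, 2))
```

Scaling makes moves comparable across starting positions: climbing from rank 2 to rank 1 is a "full" promotion (1.0), while the same one-position climb from rank 5 is only a quarter of what was possible.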
Conclusion
  • The authors presented a novel method of modifying a document so as to promote it in rankings induced by a non-disclosed ranking function for a given query.
  • The only information about the function is past rankings it induced for the query.
  • The authors' method is designed to maintain the content quality of the document it modifies.
  • The authors' method served as a bot in content-based ranking competitions between students.
  • The bot produced documents that were of high quality, and better promoted in rankings than the students’ documents.
  • Additional offline evaluation further demonstrated the merits of the bot.
Tables
  • Table 1: Online evaluation: main result. The best result in a block for each round is boldfaced. Promotion is with respect to the previous round; hence, there are no promotion numbers for the first round. Recall that positive values for raw and scaled promotion attest to actual promotion while negative values attest to demotion. The lower the values of average rank the better. (The highest rank is 1.)
  • Table 2: Offline evaluation. Our bot was trained for coherence (c), rank-promotion (r) and both (l); l is the harmonic mean of c and r using β = 1. Statistically significant differences with the students and the static bot are marked with 's' and 'b', respectively. The best result in a column is boldfaced.
  • Table 3: Feature weights of the passage-pair ranker.
Related work
  • There is a body of work on identifying/fighting black hat SEO; specifically, spam [1, 7]. Our approach is essentially a content-based white hat SEO method intended to rank-promote legitimate documents via legitimate content modifications. We are not aware of past work on devising such automatic content modification procedures.

    Our approach might seem at first glance conceptually similar to the black hat stitching technique [17]: authors of low-quality Web pages manually "glue" to their documents unrelated phrases from other documents. In contrast, our approach operates on decent-quality documents and is optimized to maintain document quality.

    Our approach can conceptually be viewed as ranking-incentivized paraphrasing: modifying the document to promote it in rankings. Passage pool creation can alternatively rely on automatic passage paraphrasing or generation. We leave these challenges for future work.
Funding
  • The work by Moshe Tennenholtz and Gregory Goren was supported by funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement 740435)
Reference
  • 2005–2009. AIRWeb — International Workshop on Adversarial Information Retrieval on the Web.
  • Ion Androutsopoulos and Prodromos Malakasiotis. 2010. A Survey of Paraphrasing and Textual Entailment Methods. Journal of Artificial Intelligence Research 38 (2010), 135–187.
  • Regina Barzilay and Lillian Lee. 2003. Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment. In Proc. of HLT-NAACL.
  • Ran Ben-Basat, Moshe Tennenholtz, and Oren Kurland. 2017. A Game Theoretic Analysis of the Adversarial Retrieval Setting. J. Artif. Intell. Res. 60 (2017), 1127–1164.
  • Michael Bendersky, W. Bruce Croft, and Yanlei Diao. 2011. Quality-biased ranking of Web documents. In Proc. of WSDM. 95–104.
  • Dan Boneh. 1999. Twenty years of attacks on the RSA cryptosystem. Notices of the American Mathematical Society (AMS) 46, 2 (1999), 203–213.
  • Carlos Castillo and Brian D. Davison. 2010. Adversarial Web Search. Foundations and Trends in Information Retrieval 4, 5 (2010), 377–486.
  • Na Dai, Milad Shokouhi, and Brian D. Davison. 2011. Learning to rank for freshness and relevance. In Proc. of SIGIR. 95–104.
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. CoRR abs/1810.04805 (2018).
  • Fernando Diaz, Bhaskar Mitra, and Nick Craswell. 2016. Query Expansion with Locally-Trained Word Embeddings. In Proc. of ACL.
  • Anlei Dong, Yi Chang, Zhaohui Zheng, Gilad Mishne, Jing Bai, Ruiqiang Zhang, Karolina Buchner, Ciya Liao, and Fernando Diaz. 2010. Towards recency ranking in web search. In Proc. of WSDM. 11–20.
  • Anlei Dong, Ruiqiang Zhang, Pranam Kolari, Jing Bai, Fernando Diaz, Yi Chang, Zhaohui Zheng, and Hongyuan Zha. 2010. Time is of the essence: improving recency ranking using Twitter data. In Proc. of WWW. 331–340.
  • Yinpeng Dong, Hang Su, Baoyuan Wu, Zhifeng Li, Wei Liu, Tong Zhang, and Jun Zhu. 2019. Efficient Decision-Based Black-Box Adversarial Attacks on Face Recognition. In Proc. of CVPR. 7714–7722.
  • Hui Fang and ChengXiang Zhai. 2005. An exploration of axiomatic approaches to information retrieval. In Proc. of SIGIR. 480–487.
  • Arthur C. Graesser, Danielle S. McNamara, Max M. Louwerse, and Zhiqiang Cai. 2004. Coh-Metrix: Analysis of text on cohesion and language. Behavior Research Methods, Instruments, & Computers 36, 2 (2004), 193–202.
  • Jiaxian Guo, Sidi Lu, Han Cai, Weinan Zhang, Yong Yu, and Jun Wang. 2018. Long text generation via adversarial training with leaked information. In Proc. of AAAI.
  • Zoltán Gyöngyi and Hector Garcia-Molina. 2005. Web Spam Taxonomy. In Proc. of AIRWeb 2005. 39–47.
  • Warren He, James Wei, Xinyun Chen, Nicholas Carlini, and Dawn Song. 2017. Adversarial Example Defenses: Ensembles of Weak Defenses are not Strong. CoRR abs/1706.04701 (2017).
  • Sandy Huang, Nicolas Papernot, Ian J. Goodfellow, Yan Duan, and Pieter Abbeel. 2017. Adversarial Attacks on Neural Network Policies. In Proc. of ICLR.
  • Yangfeng Ji, Gholamreza Haffari, and Jacob Eisenstein. 2016. A latent variable recurrent neural network for discourse relation language models. arXiv preprint arXiv:1603.01913 (2016).
  • Robin Jia and Percy Liang. 2017. Adversarial Examples for Evaluating Reading Comprehension Systems. In Proc. of EMNLP. 2021–2031.
  • Robin Jia, Aditi Raghunathan, Kerem Göksel, and Percy Liang. 2019. Certified Robustness to Adversarial Word Substitutions. In Proc. of EMNLP-IJCNLP. 4127–4140.
  • Thorsten Joachims. 2006. Training linear SVMs in linear time. In Proc. of KDD. 217–226.
  • Thorsten Joachims, Laura Granka, Bing Pan, Helene Hembrooke, and Geri Gay. 2005. Accurately interpreting clickthrough data as implicit feedback. In Proc. of SIGIR. 154–161.
  • Chloé Kiddon, Luke Zettlemoyer, and Yejin Choi. 2016. Globally coherent text generation with neural checklist models. In Proc. of EMNLP. 329–339.
  • Felix Kreuk, Yossi Adi, Moustapha Cissé, and Joseph Keshet. 2018. Fooling End-To-End Speaker Verification With Adversarial Examples. In Proc. of ICASSP. 1962–1966.
  • Mirella Lapata and Regina Barzilay. 2005. Automatic evaluation of text coherence: Models and representations. In Proc. of IJCAI, Vol. 5. 1085–1090.
  • Jiwei Li and Dan Jurafsky. 2016. Neural net models for open-domain discourse coherence. arXiv preprint arXiv:1606.01545 (2016).
  • Xiaoyan Li and W. Bruce Croft. 2003. Time-Based Language Models. In Proc. of CIKM. 469–475.
  • Ziheng Lin, Hwee Tou Ng, and Min-Yen Kan. 2011. Automatically evaluating text coherence using discourse relations. In Proc. of ACL. 997–1006.
  • Tie-Yan Liu. 2011. Learning to Rank for Information Retrieval. Springer. I–XVII, 1–285 pages.
  • Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient Estimation of Word Representations in Vector Space. CoRR abs/1301.3781 (2013).
  • Bhaskar Mitra and Nick Craswell. 2018. An Introduction to Neural Information Retrieval. Foundations and Trends in Information Retrieval 13, 1 (2018), 1–126.
  • Nicolas Papernot, Patrick D. McDaniel, Ian J. Goodfellow, Somesh Jha, Z. Berkay Celik, and Ananthram Swami. 2017. Practical Black-Box Attacks against Machine Learning. In Proc. of AsiaCCS. 506–519.
  • Emily Pitler and Ani Nenkova. 2008. Revisiting readability: A unified framework for predicting text quality. In Proc. of EMNLP. 186–195.
  • Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving language understanding by generative pre-training. Technical Report. OpenAI.
  • Nimrod Raifer, Fiana Raiber, Moshe Tennenholtz, and Oren Kurland. 2017. Information Retrieval Meets Game Theory: The Ranking Competition Between Documents' Authors. In Proc. of SIGIR. 465–474.
  • Fei Song and W. Bruce Croft. 1999. A general language model for information retrieval. In Proc. of SIGIR. 279–280.
  • Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian J. Goodfellow, and Rob Fergus. 2014. Intriguing properties of neural networks. In Proc. of ICLR.
  • Moshe Tennenholtz and Oren Kurland. 2019. Rethinking search engines and recommendation systems: a game theoretic perspective. Commun. ACM 62, 12 (2019), 66–75.
  • Qiang Wu, Christopher J. C. Burges, Krysta Marie Svore, and Jianfeng Gao. 2010. Adapting boosting for information retrieval measures. Information Retrieval 13, 3 (2010), 254–270.
  • Cihang Xie, Jianyu Wang, Zhishuai Zhang, Yuyin Zhou, Lingxi Xie, and Alan L. Yuille. 2017. Adversarial Examples for Semantic Segmentation and Object Detection. In Proc. of ICCV. 1378–1387.
  • Zhilin Yang, Zihang Dai, Yiming Yang, Jaime G. Carbonell, Ruslan Salakhutdinov, and Quoc V. Le. 2019. XLNet: Generalized Autoregressive Pretraining for Language Understanding. In Proc. of NeurIPS. 5754–5764.
  • Rowan Zellers, Ari Holtzman, Hannah Rashkin, Yonatan Bisk, Ali Farhadi, Franziska Roesner, and Yejin Choi. 2019. Defending Against Neural Fake News. In Proc. of NeurIPS. 9051–9062.
  • Chengxiang Zhai and John D. Lafferty. 2001. A Study of Smoothing Methods for Language Models Applied to Ad Hoc Information Retrieval. In Proc. of SIGIR. 334–342.
  • Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric P. Xing, Laurent El Ghaoui, and Michael I. Jordan. 2019. Theoretically Principled Trade-off between Robustness and Accuracy. In Proc. of ICML. 7472–7482.