We study the single-target Personalized PageRank query, which measures the importance of a given target node t to every node s in the graph
Personalized PageRank to a Target Node, Revisited
KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining Virtual Event..., pp.657-667, (2020)
Personalized PageRank (PPR) is a widely used node proximity measure in graph mining and network analysis. Given a source node s and a target node t, the PPR value π(s,t) represents the probability that a random walk from s terminates at t, and thus indicates the bidirectional importance between s and t. The majority of the existing work f...
- Personalized PageRank (PPR), as a variant of PageRank, focuses on the relative significance of a target node with respect to a source node in a graph.
- Given a directed graph G = (V, E) with n nodes and m edges, the PPR value π(s, t) of a target node t with respect to a source node s is defined as the probability that an α-discounted random walk from node s terminates at t.
- We demonstrate that the Randomized Backward Search algorithm improves the complexity of single-source SimRank computation, heavy hitters Personalized PageRank query, and Personalized PageRank-related graph neural networks in Section 5
- We present an algorithm, Randomized Backward Search, that answers approximate single-target Personalized PageRank queries with optimal computational complexity
- We show that Randomized Backward Search improves three concrete applications in graph mining: heavy hitters Personalized PageRank query, single-source SimRank computation, and scalable graph neural networks
- An interesting open problem is whether we can replace the Backward Search algorithm with Randomized Backward Search to further improve the complexity of these algorithms
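To make the definition of π(s, t) concrete: conditioning on the first step of an α-discounted walk gives the recurrence π(s, t) = α·1[s = t] + (1 − α)/d_out(s) · Σ π(v, t), summed over the out-neighbours v of s. The sketch below solves this recurrence for one target by plain fixed-point iteration. It is a reference computation only, not the Randomized Backward Search algorithm; the function name, the adjacency-dict input format, the value α = 0.2, and the treatment of dead-end nodes are all assumptions made for illustration.

```python
def single_target_ppr(out_edges, target, alpha=0.2, iters=200):
    """Return {s: pi(s, target)} for every node s by fixed-point iteration.

    out_edges maps each node to the list of its out-neighbours (assumed format).
    Assumption: a walk that continues from a node with no out-neighbours
    simply disappears and contributes nothing to any target.
    """
    nodes = list(out_edges)
    pi = {v: 0.0 for v in nodes}
    for _ in range(iters):
        new_pi = {}
        for s in nodes:
            nbrs = out_edges[s]
            # With probability alpha the walk terminates at s.
            stop = alpha if s == target else 0.0
            # Otherwise it moves to a uniformly random out-neighbour of s.
            move = sum(pi[v] for v in nbrs) / len(nbrs) if nbrs else 0.0
            new_pi[s] = stop + (1.0 - alpha) * move
        pi = new_pi
    return pi


# Toy graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
graph = {0: [1, 2], 1: [2], 2: [0]}
ppr_to_2 = single_target_ppr(graph, target=2)
for s in sorted(ppr_to_2):
    print(s, round(ppr_to_2[s], 4))
```

Each iteration touches every edge once, so this costs O(m) per round regardless of how local the answer is. That cost is what local methods such as Backward Search and RBS aim to avoid.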
- This section experimentally evaluates the performance of RBS against state-of-the-art methods.
- Section 6.1 presents the empirical study for single-target PPR queries.
- Section 6.2 applies RBS to three concrete applications to show its effectiveness.
- Information on the datasets the authors used is listed in Table 3.
- All datasets are obtained from [1, 2].
- All experiments are conducted on a machine with an Intel(R) Xeon(R) E7-4809 @2.10GHz CPU and 196GB memory.
- The authors evaluate the performance of RBS against Backward Search for the single-target PPR query.
- For Backward Search (BS), the authors set ε = δ for relative error.
- Figure 2 shows the tradeoffs between the MaxAdditiveErr and the query time for the additive error experiments.
- Figure 3 presents the tradeoffs between Precision@k and the query time for the relative error experiments.
- To obtain an additive error of 10⁻⁶ on IT, the authors observe a 100x query time speedup for RBS.
- From Figure 3, the authors observe that the precision of RBS with relative error approaches 1 more rapidly, which concurs with the theoretical analysis
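The two accuracy measures named above are not defined in this summary. The sketch below uses their conventional meanings as an assumption: MaxAdditiveErr is taken as the largest absolute difference between estimated and true PPR values over all nodes, and Precision@k as the fraction of the estimated top-k nodes that also appear in the true top-k. The function names and the toy scores are hypothetical.

```python
def max_additive_err(estimate, truth):
    """Largest |estimate[v] - truth[v]| over all nodes (missing entries count as 0)."""
    nodes = set(estimate) | set(truth)
    return max(abs(estimate.get(v, 0.0) - truth.get(v, 0.0)) for v in nodes)


def top_k(scores, k):
    """Set of the k nodes with the highest scores."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def precision_at_k(estimate, truth, k):
    """Fraction of the estimated top-k nodes that are in the true top-k."""
    return len(top_k(estimate, k) & top_k(truth, k)) / k


# Hypothetical PPR values to a fixed target, keyed by source node.
truth = {"a": 0.50, "b": 0.30, "c": 0.15, "d": 0.05}
estimate = {"a": 0.48, "b": 0.31, "c": 0.09, "d": 0.12}

print(max_additive_err(estimate, truth))
print(precision_at_k(estimate, truth, k=2))
print(precision_at_k(estimate, truth, k=3))
```

In this toy case the estimate swaps the third- and fourth-ranked nodes, so precision is perfect at k = 2 and drops at k = 3, while the additive error is dominated by node d.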
- The authors study the single-target PPR query, which measures the importance of a given target node t to every node s in the graph.
- The authors present an algorithm, RBS, that answers approximate single-target PPR queries with optimal computational complexity.
- The authors note that a few works combine the Backward Search algorithm with the Monte-Carlo algorithm to obtain near-optimal query cost for single-pair queries [36, 39].
- An interesting open problem is whether the authors can replace the Backward Search algorithm with RBS to further improve the complexity of these algorithms
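For context on the baseline mentioned above, here is a minimal sketch of the deterministic backward push procedure commonly referred to as Backward Search. It is written from the standard textbook description, not from this paper's pseudocode, and it is not RBS. The function name, the residue threshold `r_max`, the value α = 0.2, and the adjacency-dict input format are assumptions for illustration.

```python
from collections import defaultdict


def backward_search(out_edges, target, alpha=0.2, r_max=1e-6):
    """Approximate pi(s, target) for all s by deterministic backward pushes.

    Invariant maintained: pi(s, t) = reserve[s] + sum_v pi(s, v) * residue[v].
    On exit every residue is at most r_max, so each estimate underestimates
    the true value by at most r_max.
    """
    # Build in-neighbour lists and out-degrees from the out-edge lists.
    in_edges = defaultdict(list)
    for u, nbrs in out_edges.items():
        for v in nbrs:
            in_edges[v].append(u)
    out_deg = {u: len(nbrs) for u, nbrs in out_edges.items()}

    reserve = defaultdict(float)
    residue = defaultdict(float)
    residue[target] = 1.0
    stack = [target]
    while stack:
        v = stack.pop()
        r = residue[v]
        if r <= r_max:
            continue  # stale entry: this node was already pushed
        residue[v] = 0.0
        reserve[v] += alpha * r  # mass of walks that terminate at v
        for u in in_edges[v]:
            # u steps to v with probability (1 - alpha) / d_out(u).
            residue[u] += (1.0 - alpha) * r / out_deg[u]
            if residue[u] > r_max:
                stack.append(u)
    return dict(reserve)


# Toy graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
graph = {0: [1, 2], 1: [2], 2: [0]}
estimate = backward_search(graph, target=2)
for s in sorted(estimate):
    print(s, round(estimate[s], 4))
```

Every push at v visits all in-neighbours of v, which is expensive when v has high in-degree. According to the summary above, RBS improves on this baseline's complexity.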
- Table 1: Complexity of single-source and single-target PPR queries
- Table 2: Table of notations
- Table 3: Datasets
- This research is supported by the National Natural Science Foundation of China (No. 61832017, No. 61972401, No. 61932001, No. U1936205), by the Beijing Outstanding Young Scientist Program No. BJJWZYJH012019100020098, and by the Fundamental Research Funds for the Central Universities and the Research Funds of Renmin University of China under Grant 18XNLG21
- Junhao Gan is supported by Australian Research Council (ARC) DECRA DE190101118
- Sibo Wang is also supported by Hong Kong RGC ECS Grant No 24203419
- Zengfeng Huang is supported by Shanghai Science and Technology Commission Grant No 17JC1420200, and by Shanghai Sailing Program Grant No 18YF1401200
- http://snap.stanford.edu/data. http://law.di.unimi.it/datasets.php.
-  Reid Andersen, Christian Borgs, Jennifer Chayes, John Hopcroft, Kamal Jain, Vahab Mirrokni, and Shanghua Teng. Robust pagerank and locally computable spam detection features. In Proceedings of the 4th international workshop on Adversarial information retrieval on the web, pages 69–76, 2008.
-  Reid Andersen, Christian Borgs, Jennifer T. Chayes, John E. Hopcroft, Vahab S. Mirrokni, and Shang-Hua Teng. Local computation of pagerank contributions. In WAW, pages 150–165, 2007.
-  Reid Andersen, Fan R. K. Chung, and Kevin J. Lang. Local graph partitioning using pagerank vectors. In FOCS, pages 475–486, 2006.
-  Lars Backstrom and Jure Leskovec. Supervised random walks: predicting and recommending links in social networks. In WSDM, pages 635–644, 2011.
-  Bahman Bahmani, Kaushik Chakrabarti, and Dong Xin. Fast personalized pagerank on mapreduce. In SIGMOD, pages 973–984, 2011.
-  Bahman Bahmani, Abdur Chowdhury, and Ashish Goel. Fast incremental and personalized pagerank. VLDB, 4(3):173–184, 2010.
-  Marco Bressan, Enoch Peserico, and Luca Pretto. Sublinear algorithms for local graph centrality estimation. In 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pages 709–718. IEEE, 2018.
-  Soumen Chakrabarti. Dynamic personalized pagerank in entity-relation graphs. In WWW, pages 571–580, 2007.
-  Moses Charikar, Kevin Chen, and Martin Farach-Colton. Finding frequent items in data streams. In ICALP, pages 693–703.
-  Mustafa Coskun, Ananth Grama, and Mehmet Koyuturk. Efficient processing of network proximity queries via chebyshev acceleration. In KDD, pages 1515–1524, 2016.
-  Dániel Fogaras, Balázs Rácz, Károly Csalogány, and Tamás Sarlós. Towards scaling fully personalized pagerank: Algorithms, lower bounds, and experiments. Internet Mathematics, 2(3):333–358, 2005.
-  Yasuhiro Fujiwara, Makoto Nakatsuji, Makoto Onizuka, and Masaru Kitsuregawa. Fast and exact top-k search for random walk with restart. PVLDB, 5(5):442–453, 2012.
-  Yasuhiro Fujiwara, Makoto Nakatsuji, Hiroaki Shiokawa, Takeshi Mishima, and Makoto Onizuka. Efficient ad-hoc search for personalized pagerank. In SIGMOD, pages 445–456, 2013.
-  Yasuhiro Fujiwara, Makoto Nakatsuji, Hiroaki Shiokawa, Takeshi Mishima, and Makoto Onizuka. Fast and exact top-k algorithm for pagerank. In AAAI, 2013.
-  Yasuhiro Fujiwara, Makoto Nakatsuji, Takeshi Yamamuro, Hiroaki Shiokawa, and Makoto Onizuka. Efficient personalized pagerank with accuracy assurance. In KDD, pages 15–23, 2012.
-  Tao Guo, Xin Cao, Gao Cong, Jiaheng Lu, and Xuemin Lin. Distributed algorithms on exact personalized pagerank. In SIGMOD, pages 479–494, 2017.
-  Manish S. Gupta, Amit Pathak, and Soumen Chakrabarti. Fast algorithms for top-k personalized pagerank queries. In WWW, pages 1225–1226, 2008.
-  Pankaj Gupta, Ashish Goel, Jimmy Lin, Aneesh Sharma, Dong Wang, and Reza Zadeh. Wtf: The who to follow service at twitter. In WWW, pages 505–514, 2013.
-  Glen Jeh and Jennifer Widom. Simrank: a measure of structural-context similarity. In SIGKDD, pages 538–543, 2002.
-  Glen Jeh and Jennifer Widom. Scaling personalized web search. In WWW, pages 271–279, 2003.
-  Minhao Jiang, Ada Wai-Chee Fu, and Raymond Chi-Wing Wong. Reads: a random walk approach for efficient and accurate dynamic simrank. PVLDB, 10(9):937–948, 2017.
-  Ruoming Jin, Victor E Lee, and Hui Hong. Axiomatic ranking of network role similarity. In KDD, pages 922–930, 2011.
-  Jinhong Jung, Namyong Park, Sael Lee, and U Kang. Bepi: Fast and memory-efficient method for billion-scale random walk with restart. In SIGMOD, pages 789–804, 2017.
-  Thomas N Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. ICLR, 2017.
-  Johannes Klicpera, Aleksandar Bojchevski, and Stephan Günnemann. Personalized embedding propagation: Combining neural networks on graphs with personalized pagerank. CoRR, abs/1810.05997, 2018.
-  Johannes Klicpera, Stefan Weißenberger, and Stephan Günnemann. Diffusion improves graph learning, 2019.
-  Mitsuru Kusumoto, Takanori Maehara, and Ken-ichi Kawarabayashi. Scalable similarity search for simrank. In SIGMOD, pages 325–336, 2014.
-  Pei Lee, Laks V. S. Lakshmanan, and Jeffrey Xu Yu. On top-k structural similarity search. In ICDE, pages 774–785, 2012.
-  Lina Li, Cuiping Li, Chen Hong, and Xiaoyong Du. Mapreduce-based simrank computation and its application in social recommender system. In Big Data (BigData Congress), 2013 IEEE International Congress on, 2013.
-  Zhenguo Li, Yixiang Fang, Qin Liu, Jiefeng Cheng, Reynold Cheng, and John Lui. Walking in the cloud: Parallel simrank at scale. PVLDB, 9(1):24–35, 2015.
-  David Liben-Nowell and Jon M. Kleinberg. The link prediction problem for social networks. In CIKM, pages 556–559, 2003.
-  Yu Liu, Bolong Zheng, Xiaodong He, Zhewei Wei, Xiaokui Xiao, Kai Zheng, and Jiaheng Lu. Probesim: scalable single-source and top-k simrank computations on dynamic graphs. PVLDB, 11(1):14–26, 2017.
-  Peter Lofgren, Siddhartha Banerjee, and Ashish Goel. Bidirectional pagerank estimation: From average-case to worst-case. In WAW, pages 164–176, 2015.
-  Peter Lofgren, Siddhartha Banerjee, and Ashish Goel. Personalized pagerank estimation and search: A bidirectional approach. In WSDM, pages 163–172, 2016.
-  Peter Lofgren and Ashish Goel. Personalized pagerank to a target node. arXiv preprint arXiv:1304.4658, 2013.
-  Peter A. Lofgren, Siddhartha Banerjee, Ashish Goel, and C. Seshadhri. Fast-ppr: Scaling personalized pagerank estimation for large graphs. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’14, pages 1436–1445, New York, NY, USA, 2014. ACM.
-  Peter A Lofgren, Siddhartha Banerjee, Ashish Goel, and C Seshadhri. Fast-ppr: Scaling personalized pagerank estimation for large graphs. In KDD, pages 1436–1445, 2014.
-  Linyuan Lü and Tao Zhou. Link prediction in complex networks: A survey. Physica A: statistical mechanics and its applications, 390(6):1150–1170, 2011.
-  Takanori Maehara, Takuya Akiba, Yoichi Iwata, and Ken-ichi Kawarabayashi. Computing personalized pagerank quickly by exploiting graph structures. PVLDB, 7(12):1023–1034, 2014.
-  Takanori Maehara, Mitsuru Kusumoto, and Ken-ichi Kawarabayashi. Efficient simrank computation via linearization. CoRR, abs/1411.7228, 2014.
-  Naoto Ohsaka, Takanori Maehara, and Ken-ichi Kawarabayashi. Efficient pagerank tracking in evolving networks. In KDD, pages 875–884, 2015.
-  Mingdong Ou, Peng Cui, Jian Pei, Ziwei Zhang, and Wenwu Zhu. Asymmetric transitivity preserving graph embedding. In SIGKDD, pages 1105–1114. ACM, 2016.
-  Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. The pagerank citation ranking: bringing order to the web. 1999.
-  Hannu Reittu, Ilkka Norros, Tomi Räty, Marianna Bolla, and Fülöp Bazsó. Regular decomposition of large graphs: Foundation of a sampling approach to stochastic block model fitting. Data Science and Engineering, 4(1):44–60, 2019.
-  CH Ren, Luyi Mo, CM Kao, CK Cheng, and DWL Cheung. Clude: An efficient algorithm for lu decomposition over a sequence of evolving graphs. In EDBT, 2014.
-  Atish Das Sarma, Anisur Rahaman Molla, Gopal Pandurangan, and Eli Upfal. Fast distributed pagerank computation. Theoretical Computer Science, 561:113–121, 2015.
-  Yingxia Shao, Bin Cui, Lei Chen, Mingming Liu, and Xing Xie. An efficient similarity search framework for simrank over large dynamic graphs. PVLDB, 8(8):838–849, 2015.
-  Kijung Shin, Jinhong Jung, Lee Sael, and U. Kang. BEAR: block elimination approach for random walk with restart on large graphs. In SIGMOD, pages 1571–1585, 2015.
-  Nikita Spirin and Jiawei Han. Survey on web spam detection: principles and algorithms. SIGKDD Explorations, 13(2):50–64, 2011.
-  Boyu Tian and Xiaokui Xiao. SLING: A near-optimal index structure for simrank. In SIGMOD, pages 1859–1874, 2016.
-  Anton Tsitsulin, Davide Mottin, Panagiotis Karras, and Emmanuel Müller. Verse: Versatile graph embeddings from similarity measures. In WWW, pages 539–548. International World Wide Web Conferences Steering Committee, 2018.
-  Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. Graph attention networks, 2017.
-  Sibo Wang, Youze Tang, Xiaokui Xiao, Yin Yang, and Zengxiang Li. Hubppr: Effective indexing for approximate personalized pagerank. PVLDB, 10(3):205–216, 2016.
-  Sibo Wang, Youze Tang, Xiaokui Xiao, Yang Yin, and Zengxiang Li. Hubppr: Effective indexing for approximate personalized pagerank. In PVLDB, 2016.
-  Sibo Wang and Yufei Tao. Efficient algorithms for finding approximate heavy hitters in personalized pageranks. In Proceedings of the 2018 International Conference on Management of Data, pages 1113–1127, 2018.
-  Sibo Wang, Renchi Yang, Xiaokui Xiao, Zhewei Wei, and Yin Yang. FORA: simple and effective approximate single-source personalized pagerank. In KDD, pages 505–514, 2017.
-  Zhewei Wei, Xiaodong He, Xiaokui Xiao, Sibo Wang, Yu Liu, Xiaoyong Du, and Ji-Rong Wen. Prsim: Sublinear time simrank computation on large power-law graphs. In Proceedings of the 2019 International Conference on Management of Data, pages 1042–1059, 2019.
-  Zhewei Wei, Xiaodong He, Xiaokui Xiao, Sibo Wang, Shuo Shang, and Ji-Rong Wen. Topppr: top-k personalized pagerank queries with precision guarantees on large graphs. In SIGMOD, pages 441–456. ACM, 2018.
-  Yubao Wu, Ruoming Jin, and Xiang Zhang. Fast and unified local search for random walk based k-nearest-neighbor query in large graphs. In SIGMOD 2014, pages 1139–1150, 2014.
-  Keyulu Xu, Chengtao Li, Yonglong Tian, Tomohiro Sonobe, Ken-ichi Kawarabayashi, and Stefanie Jegelka. Representation learning on graphs with jumping knowledge networks. CoRR, abs/1806.03536, 2018.
-  Yuan Yin and Zhewei Wei. Scalable graph embeddings via sparse transpose proximities. CoRR, abs/1905.07245, 2019.
-  Weiren Yu and Xuemin Lin. IRWR: incremental random walk with restart. In SIGIR, pages 1017–1020, 2013.
-  Weiren Yu and Julie A. McCann. Efficient partial-pairs simrank search on large networks. Proceedings of the VLDB Endowment, 8(5):569–580.
-  Weiren Yu and Julie A. McCann. Random walk with restart over dynamic graphs. In ICDM, pages 589–598, 2016.
-  Hongyang Zhang, Peter Lofgren, and Ashish Goel. Approximate personalized pagerank on dynamic graphs. In KDD, pages 1315–1324, 2016.
-  Fanwei Zhu, Yuan Fang, Kevin Chen-Chuan Chang, and Jing Ying. Incremental and accuracy-aware personalized pagerank through scheduled approximation. PVLDB, 6(6):481–492, 2013.