Parallelizing Sequential Graph Computations.

SIGMOD Conference, no. 4 (2018): 495-510

Abstract

This paper presents GRAPE, a parallel system for graph computations. GRAPE differs from prior systems in its ability to parallelize existing sequential graph algorithms as a whole. Underlying GRAPE are a simple programming model and a principled approach, based on partial evaluation and incremental computation. We show that sequential gra...

Introduction
  • Several parallel systems have been developed for graph computations, e.g., Pregel [35], GraphLab [34], Giraph++ [44] and Blogel [50]
  • These systems require users to recast graph algorithms into their models.
  • The recasting is nontrivial for people who are not familiar with the parallel models.
  • This makes these systems a privilege for experienced users only.
  • Is it possible to have a system such that the authors can “plug” sequential graph algorithms into it as a whole, and it parallelizes the computation across multiple processors, without drastic degradation in performance or functionality of existing systems?
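The "plug in sequential algorithms as a whole" idea amounts to supplying three sequential pieces that the system drives in parallel. A minimal sketch of such a PIE (PEval, IncEval, Assemble) interface follows; the class and method names are illustrative assumptions, not GRAPE's actual API.

```python
from abc import ABC, abstractmethod

class PIEProgram(ABC):
    """Illustrative PIE interface: the user supplies three sequential
    functions and the system handles partitioning, messaging and
    synchronization. (Hypothetical names, not GRAPE's real API.)"""

    @abstractmethod
    def peval(self, fragment, query):
        """Partial evaluation: run the sequential algorithm on one
        fragment, returning partial results plus update parameters."""

    @abstractmethod
    def inc_eval(self, fragment, query, messages):
        """Incremental step: refine the partial results given messages
        (changed update parameters) from other fragments."""

    @abstractmethod
    def assemble(self, partial_results):
        """Combine per-fragment partial results into Q(G)."""
```

A user would subclass this with, say, a sequential SSSP or connected-components algorithm for `peval` and its incremental counterpart for `inc_eval`.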
Highlights
  • (2) We prove two fundamental results (Section 4): (a) Assurance Theorem guarantees GRAPE to terminate with correct answers under a monotonic condition when its input sequential algorithms are correct; and (b) Simulation Theorem shows that MapReduce [17], BSP (Bulk Synchronous Parallel) [46] and PRAM (Parallel Random Access Machine) [47] can be optimally simulated by GRAPE
  • For Sim and SubIso, we evaluated the queries over LiveJournal and DBpedia, since these queries are meaningful on labeled graphs only, while traffic carries no labels
  • We evaluated the compatibility of optimization strategies developed for sequential graph algorithms with GRAPE parallelization
  • We have proposed an approach to parallelizing sequential graph algorithms
  • GRAPE parallelization is guaranteed to terminate with correct answers, under a monotonic condition, if the sequential algorithms are correct
Results
  • Function Assemble takes partial results Q(Fi ⊕ Mi) and fragmentation graph GP as input, and combines Q(Fi ⊕ Mi) to get Q(G).
  • It is triggered when no more changes can be made to the update parameters Ci.x for any i ∈ [1, m].
  • The authors first evaluated the efficiency of GRAPE over real-life graphs by varying the number n of processors used, from 4 to 24.
  • For Sim and SubIso, the authors evaluated the queries over LiveJournal and DBpedia, since these queries are meaningful on labeled graphs only, while traffic carries no labels
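The "no more changes" trigger described above is a fixpoint loop over the fragments' update parameters. A minimal single-machine sketch, with SSSP standing in for the plugged-in sequential algorithm (the fragment format, border handling and message routing are simplifying assumptions, not GRAPE's implementation):

```python
import math

def relax_to_fixpoint(fragment, dist):
    # Bellman-Ford-style relaxation restricted to the fragment's edges.
    changed = True
    while changed:
        changed = False
        for (u, v, w) in fragment["edges"]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
    return dist

def sssp_peval(fragment, source):
    """PEval: run the sequential algorithm on one fragment."""
    dist = {v: math.inf for v in fragment["nodes"]}
    if source in dist:
        dist[source] = 0.0
    return relax_to_fixpoint(fragment, dist)

def sssp_inceval(fragment, dist, messages):
    """IncEval: absorb smaller border distances, then re-relax."""
    for v, d in messages.items():
        if d < dist[v]:
            dist[v] = d
    return relax_to_fixpoint(fragment, dist)

def grape_run(fragments, border, source):
    """Drive PEval once, iterate IncEval until no update parameter
    changes, then Assemble the partial results."""
    dists = [sssp_peval(f, source) for f in fragments]
    while True:
        # exchange update parameters: best known distance per border node
        msgs = {}
        for dist in dists:
            for v in border:
                if v in dist:
                    msgs[v] = min(msgs.get(v, math.inf), dist[v])
        changed = False
        for i, f in enumerate(fragments):
            local = {v: d for v, d in msgs.items() if v in dists[i]}
            new = sssp_inceval(f, dict(dists[i]), local)
            if new != dists[i]:
                dists[i] = new
                changed = True
        if not changed:  # fixpoint reached: trigger Assemble
            break
    # Assemble: combine partial results, taking the min per node
    out = {}
    for dist in dists:
        for v, d in dist.items():
            out[v] = min(out.get(v, math.inf), d)
    return out
```

Incremental evaluation pays off because each round re-relaxes only around the border nodes whose distances improved, rather than recomputing from scratch.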
Conclusion
  • For a class of graph queries, users can plug in existing sequential algorithms with minor changes.
  • GRAPE parallelization is guaranteed to terminate with correct answers, under a monotonic condition, if the sequential algorithms are correct.
  • Graph algorithms for existing parallel graph systems can be migrated to GRAPE, without incurring extra cost.
  • The authors have verified that GRAPE achieves comparable performance to the state-of-the-art graph systems for various query classes, and that IncEval reduces the cost of iterative graph computations.
Tables
  • Table1: Graph traversal on parallel systems
  • Table2: Notations
Related Work
  • The related work is categorized as follows.

    Parallel models and systems. Several parallel models have been studied for graphs, e.g., PRAM [47], BSP [46] and MapReduce [17]. PRAM abstracts parallel RAM access over shared memory. BSP models parallel computations in supersteps (including local computation, communication and a synchronization barrier) to synchronize communication among workers. Pregel [35] (Giraph [3]) implements BSP with vertex-centric programming, where a superstep executes a user-defined function at each vertex in parallel. GraphLab [34] revises BSP to pass messages asynchronously. Block-centric models [44,50] extend vertex-centric programming to blocks, to exchange messages among blocks.
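The superstep structure described above (local computation, communication, synchronization barrier) can be sketched with a toy, synchronous Pregel-style SSSP vertex program; the function names and graph encoding are illustrative assumptions, not Pregel's or Giraph's actual API:

```python
import math

def superstep(graph, state, inbox):
    """One BSP superstep of a toy vertex-centric SSSP program.
    graph: {v: [(neighbor, weight), ...]}
    state: {v: current shortest distance}
    inbox: {v: [candidate distances received last superstep]}"""
    outbox = {v: [] for v in graph}
    for v, msgs in inbox.items():
        best = min(msgs, default=math.inf)
        if best < state[v]:            # vertex "wakes up" on improvement
            state[v] = best
            for u, w in graph[v]:      # send relaxed distances to neighbors
                outbox[u].append(best + w)
    return outbox                      # barrier: delivered next superstep

def pregel_sssp(graph, source):
    state = {v: math.inf for v in graph}
    inbox = {v: [] for v in graph}
    inbox[source] = [0.0]
    while any(inbox.values()):         # halt when no messages are in flight
        inbox = superstep(graph, state, inbox)
    return state
```

Note how the sequential algorithm's logic is dissolved into a per-vertex function plus message handling; this is exactly the recasting that GRAPE's whole-algorithm parallelization avoids.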
Funding
  • Fan, Xu, Yu, Cao and Tian are supported in part by ERC 652976, NSFC 61421003, 973 Program 2014CB340302, EPSRC EP/M025268/1, Shenzhen Peacock Program 1105100030834361, Guangdong Innovative Research Team Program 2011D005, the Foundation for Innovative Research Groups of NSFC, and Beijing Advanced Innovation Center for Big Data and Brain Computing
  • Wu is supported by NSF BIGDATA 1633629 and Google Faculty Research Award
  • Cao is also supported by NSFC 61602023
  • Jiang is supported by HKRGC GRF HKBU12232716
References
  • [1] Aliyun. https://intl.aliyun.com.
  • [2] DBpedia. http://wiki.dbpedia.org/Datasets.
  • [3] Giraph. http://giraph.apache.org/.
  • [4] GRAPE. http://grapedb.io/.
  • [5] Movielens. http://grouplens.org/datasets/movielens/.
  • [6] MPICH. https://www.mpich.org/.
  • [7] Snap. http://snap.stanford.edu/data/index.html.
  • [8] http://www.dis.uniroma1.it/challenge9/download.shtml.
  • [9] U. A. Acar. Self-Adjusting Computation. PhD thesis, CMU, 2005.
  • [10] J. Bang-Jensen and G. Z. Gutin. Digraphs: Theory, Algorithms and Applications. Springer, 2008.
  • [11] P. A. Bernstein and N. Goodman. Concurrency control in distributed database systems. ACM Comput. Surv., 13(2):185–221, 1981.
  • [12] E. G. Boman, K. D. Devine, and S. Rajamanickam. Scalable matrix computations on large scale-free graphs using 2D graph partitioning. In SC, 2013.
  • [13] F. Bourse, M. Lelarge, and M. Vojnovic. Balanced graph edge partition. In SIGKDD, pages 1456–1465, 2014.
  • [14] P. Buneman, G. Cong, W. Fan, and A. Kementsietsidis. Using partial evaluation in distributed query evaluation. In VLDB, 2006.
  • [15] E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels. SICOMP, 32(5):1338–1355, 2003.
  • [16] L. P. Cordella, P. Foggia, C. Sansone, and M. Vento. A (sub)graph isomorphism algorithm for matching large graphs. TPAMI, 26(10):1367–1372, 2004.
  • [17] J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. Commun. ACM, 51(1), 2008.
  • [18] W. Fan, C. Hu, and C. Tian. Incremental graph computations: Doable and undoable. In SIGMOD, 2017.
  • [19] W. Fan, J. Li, S. Ma, N. Tang, Y. Wu, and Y. Wu. Graph pattern matching: From intractability to polynomial time. PVLDB, 2010.
  • [20] W. Fan, J. Li, X. Wang, and Y. Wu. Query preserving graph compression. In SIGMOD, 2012.
  • [21] W. Fan, X. Wang, and Y. Wu. Incremental graph pattern matching. TODS, 38(3), 2013.
  • [22] W. Fan, X. Wang, and Y. Wu. Distributed graph simulation: Impossibility and possibility. PVLDB, 7(12), 2014.
  • [23] M. L. Fredman and R. E. Tarjan. Fibonacci heaps and their uses in improved network optimization algorithms. JACM, 34(3):596–615, 1987.
  • [24] J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin. PowerGraph: Distributed graph-parallel computation on natural graphs. In OSDI, 2012.
  • [25] J. E. Gonzalez, R. S. Xin, A. Dave, D. Crankshaw, M. J. Franklin, and I. Stoica. GraphX: Graph processing in a distributed dataflow framework. In OSDI, 2014.
  • [26] T. J. Harris. A survey of PRAM simulation techniques. ACM Comput. Surv., 26(2):187–206, 1994.
  • [27] M. R. Henzinger, T. Henzinger, and P. Kopke. Computing simulations on finite and infinite graphs. In FOCS, 1995.
  • [28] N. D. Jones. An introduction to partial evaluation. ACM Computing Surveys, 28(3), 1996.
  • [29] H. J. Karloff, S. Suri, and S. Vassilvitskii. A model of computation for MapReduce. In SODA, 2010.
  • [30] G. Karypis and V. Kumar. METIS – unstructured graph partitioning and sparse matrix ordering system, version 2.0. Technical report, 1995.
  • [31] A. Khan, Y. Wu, C. C. Aggarwal, and X. Yan. NeMa: Fast graph search with label similarity. PVLDB, 6(3), 2013.
  • [32] M. Kim and K. S. Candan. SBV-Cut: Vertex-cut based graph partitioning using structural balance vertices. Data & Knowledge Engineering, 72:285–303, 2012.
  • [33] Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. Computer, 42(8):30–37, 2009.
  • [34] Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, and J. M. Hellerstein. Distributed GraphLab: A framework for machine learning in the cloud. PVLDB, 5(8), 2012.
  • [35] G. Malewicz, M. H. Austern, A. J. C. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: A system for large-scale graph processing. In SIGMOD, 2010.
  • [36] T. Mytkowicz, M. Musuvathi, and W. Schulte. Data-parallel finite-state machines. In ASPLOS, 2014.
  • [37] K. Pingali, D. Nguyen, M. Kulkarni, M. Burtscher, M. A. Hassaan, R. Kaleem, T.-H. Lee, A. Lenharth, R. Manevich, M. Mendez-Lojo, et al. The tao of parallelism in algorithms. In PLDI, pages 12–25, 2011.
  • [38] C. Radoi, S. J. Fink, R. M. Rabbah, and M. Sridharan. Translating imperative code to MapReduce. In OOPSLA, 2014.
  • [39] G. Ramalingam and T. Reps. An incremental algorithm for a generalization of the shortest-path problem. J. Algorithms, 21(2):267–305, 1996.
  • [40] G. Ramalingam and T. Reps. On the computational complexity of dynamic graph problems. TCS, 158(1-2), 1996.
  • [41] V. Raychev, M. Musuvathi, and T. Mytkowicz. Parallelizing user-defined aggregations using symbolic execution. In SOSP, 2015.
  • [42] S. Salihoglu and J. Widom. GPS: A graph processing system. In SSDBM, 2013.
  • [43] I. Stanton and G. Kliot. Streaming graph partitioning for large distributed graphs. In KDD, pages 1222–1230, 2012.
  • [44] Y. Tian, A. Balmin, S. A. Corsten, S. Tatikonda, and J. McPherson. From "think like a vertex" to "think like a graph". PVLDB, 7(3):193–204, 2013.
  • [45] P. Trinder. A Functional Database. PhD thesis, University of Oxford, 1989.
  • [46] L. G. Valiant. A bridging model for parallel computation. Commun. ACM, 33(8):103–111, 1990.
  • [47] L. G. Valiant. General purpose parallel architectures. In Handbook of Theoretical Computer Science, Vol. A, 1990.
  • [48] J. Vinagre, A. M. Jorge, and J. Gama. Fast incremental matrix factorization for recommendation with positive-only feedback. In UMAP, 2014.
  • [49] G. Wang, W. Xie, A. J. Demers, and J. Gehrke. Asynchronous large-scale graph processing made easy. In CIDR, 2013.
  • [50] D. Yan, J. Cheng, Y. Lu, and W. Ng. Blogel: A block-centric framework for distributed computation on real-world graphs. PVLDB, 7(14):1981–1992, 2014.
  • [51] D. Yan, J. Cheng, K. Xing, Y. Lu, W. Ng, and Y. Bu. Pregel algorithms for graph connectivity problems with performance guarantees. PVLDB, 7(14):1821–1832, 2014.
  • [52] Y. Zhou, L. Liu, K. Lee, C. Pu, and Q. Zhang. Fast iterative graph computation with resource aware graph parallel abstractions. In HPDC, 2015.
Proof Sketches
  • (1) We first show that GRAPE terminates. Assume by contradiction that there exist Q and G such that GRAPE does not terminate, and consider the values of the update parameters in the fragments of G during the run. At least one update parameter must change in each superstep of incremental computation (except the last), yet by the monotonic condition (a) given in Section 4.1, the total number of distinct values the update parameters can take is bounded for given Q and G. Hence there must exist supersteps p and q such that for each i ∈ [1, m], the value of xi at superstep p equals its value at superstep q, i.e., the values of all the update parameters at supersteps p and q are the same.
  • (2) We prove that for any Q and G, at any superstep r of the run of GRAPE for Q and G with PEval, IncEval and Assemble, the partial answers Q(Fi[xi]) (i ∈ [1, m]) at superstep r are computed on all fragments Fi of G. We show this by induction on r.
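The termination argument rests on monotone updates over a bounded set of values. A toy fixpoint driver makes the pigeonhole reasoning concrete (an illustration only, not GRAPE code):

```python
def monotone_fixpoint(step, params):
    """Iterate a step function to a fixpoint, checking monotonicity.
    Termination: if step only lowers each parameter within a finite
    value domain (the monotonic condition), each parameter can change
    only finitely many times, so the loop must halt."""
    rounds = 0
    while True:
        new = step(params)
        # the monotonic condition: no parameter ever increases
        assert all(n <= p for n, p in zip(new, params)), "monotonicity"
        rounds += 1
        if new == params:       # no change: fixpoint reached
            return params, rounds
        params = new
```

Usage: with a step that decrements each positive parameter, the loop halts after at most (max initial value + 1) rounds, mirroring the bound on GRAPE's supersteps.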
  • (1) Since BSP and GRAPE have the same number of physical workers, each worker of BSP is simulated by a worker in GRAPE.
  • (2) We use two supersteps in GRAPE to simulate one map-shuffle-reduce round of a MapReduce algorithm, consisting of a map phase and a reduce phase, in the key-value message mode (see Section 3.5). More specifically, for a MapReduce algorithm A that has R rounds, we implement each round r ∈ [1, R] of A in GRAPE as follows.
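The two-superstep simulation of one map-shuffle-reduce round can be sketched as follows; routing key-value messages by hashing is an assumption for the sketch, not necessarily GRAPE's actual scheme:

```python
def mr_round_via_two_supersteps(partitions, mapper, reducer, n_workers):
    """Simulate one MapReduce round with two BSP-style supersteps.
    Superstep 1: each worker runs the map phase and addresses every
    (key, value) pair to worker hash(key) % n_workers (the shuffle).
    Superstep 2: each worker groups its received pairs by key and
    runs the reduce phase on each group."""
    # superstep 1: map, then route messages by key
    mailboxes = [[] for _ in range(n_workers)]
    for part in partitions:
        for record in part:
            for k, v in mapper(record):
                mailboxes[hash(k) % n_workers].append((k, v))
    # barrier, then superstep 2: group by key and reduce
    results = []
    for mailbox in mailboxes:
        groups = {}
        for k, v in mailbox:
            groups.setdefault(k, []).append(v)
        for k, vs in groups.items():
            results.extend(reducer(k, vs))
    return results
```

An R-round MapReduce algorithm then takes 2R supersteps, which is what underlies the optimal-simulation claim of the Simulation Theorem.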
  • (2) CC. Figures 8(d)-8(f) demonstrate similar improvement of GRAPE over GraphLab and Giraph for CC; e.g., on average GRAPE ships 5.4% of the data shipped by Giraph and GraphLab. Blogel is slightly better than GRAPE. As remarked in Section 7 for Exp-1(2), this is because Blogel precomputes the CCs of graphs when partitioning and loading them, and thus already recognizes connected components via an internal partition strategy. While a fair comparison should include the time for precomputing CCs in Blogel's evaluation time for CC, we cannot isolate the communication cost saved by its preprocessing; thus the reported communication cost of Blogel is almost 0 in all cases. Nonetheless, GRAPE incurs communication cost comparable to the near-"optimal" case reported by Blogel, when Blogel operates on a graph that is already partitioned into CCs.
  • (1) The vertex program for Giraph requires substantial changes to its corresponding sequential algorithm. As shown in Fig. 10, the logic flow of a Giraph program for SSSP is quite different from that of a sequential SSSP algorithm. Writing such programs requires users to have prior knowledge of the query classes and the design principles of the vertex-centric model. Moreover, it is challenging to integrate graph-level optimizations, e.g., incremental evaluation, into the vertex programming model. In contrast, the logic flow of PIE algorithms (GRAPE) remains the same as that of the sequential algorithms adapted for PEval and IncEval.
  • (2) While Blogel supports block-centric computation, it also requires recasting of sequential algorithms, as shown in Fig. 11.