GRAPE parallelization is guaranteed to terminate with correct answers under a monotonic condition if the input sequential algorithms are correct
Parallelizing Sequential Graph Computations.
SIGMOD Conference (2017), pages 495-510
This paper presents GRAPE, a parallel system for graph computations. GRAPE differs from prior systems in its ability to parallelize existing sequential graph algorithms as a whole. Underlying GRAPE are a simple programming model and a principled approach, based on partial evaluation and incremental computation. We show that sequential gra…
- Several parallel systems have been developed for graph computations, e.g., Pregel, GraphLab, Giraph++ and Blogel
- These systems require users to recast graph algorithms into their models.
- The recasting is nontrivial for people who are not familiar with the parallel models.
- This makes these systems a privilege for experienced users only.
- Is it possible to have a system such that users can "plug" sequential graph algorithms into it as a whole, and it parallelizes the computation across multiple processors, without a drastic degradation in performance or functionality compared with existing systems?
- (2) We prove two fundamental results (Section 4): (a) the Assurance Theorem guarantees that GRAPE terminates with correct answers under a monotonic condition when its input sequential algorithms are correct; and (b) the Simulation Theorem shows that MapReduce, BSP (Bulk Synchronous Parallel) and PRAM (Parallel Random Access Machine) can be optimally simulated by GRAPE
- For Sim and SubIso, we evaluated the queries over LiveJournal and DBpedia, since these queries are meaningful only on labeled graphs, while traffic does not carry labels
- We evaluated the compatibility of optimization strategies developed for sequential graph algorithms with GRAPE parallelization
- We have proposed an approach to parallelizing sequential graph algorithms
- Function Assemble takes the partial results Q(Fi ⊕ Mi) and the fragmentation graph GP as input, and combines the Q(Fi ⊕ Mi) to obtain Q(G).
- It is triggered when no more changes can be made to the update parameters Ci.x for any i ∈ [1, m].
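The PEval-IncEval-Assemble loop described above can be sketched in code. The following is a minimal illustration of the PIE model, not GRAPE's actual API: all names (fragments, relax, sssp) and the two-fragment graph are ours, with single-source shortest paths as the plugged-in sequential algorithm.

```python
# A minimal sketch of GRAPE's PIE loop (PEval, IncEval, Assemble),
# illustrated with single-source shortest paths on two fragments.
# All names here are illustrative, not GRAPE's actual API.
import math

# Each fragment: weighted edges plus "border" vertices shared with other
# fragments; their distance estimates play the role of update parameters Ci.x.
fragments = {
    0: {"edges": {("s", "a", 1), ("a", "b", 2)}, "border": {"b"}},
    1: {"edges": {("b", "c", 1), ("c", "d", 3)}, "border": {"b"}},
}

def vertices(frag):
    return {v for e in frag["edges"] for v in e[:2]}

def relax(frag, dist):
    """Sequential Bellman-Ford pass, reused as-is for PEval and IncEval."""
    changed = True
    while changed:
        changed = False
        for u, v, w in frag["edges"]:
            if dist.get(u, math.inf) + w < dist.get(v, math.inf):
                dist[v] = dist[u] + w
                changed = True
    return dist

def sssp(source):
    # PEval: run the sequential algorithm on each fragment in isolation.
    partial = {}
    for i, frag in fragments.items():
        dist = {v: math.inf for v in vertices(frag)}
        if source in dist:
            dist[source] = 0
        partial[i] = relax(frag, dist)
    # Supersteps: exchange border values; IncEval (here, another relax pass)
    # runs on fragments whose border estimates improved. Updates are
    # monotonic (min), so the loop terminates.
    while True:
        msgs = {}
        for i, frag in fragments.items():
            for b in frag["border"]:
                msgs[b] = min(msgs.get(b, math.inf), partial[i].get(b, math.inf))
        changed = False
        for i, frag in fragments.items():
            for b in frag["border"]:
                if msgs[b] < partial[i].get(b, math.inf):
                    partial[i][b] = msgs[b]
                    changed = True
        if not changed:
            break
        for i, frag in fragments.items():
            relax(frag, partial[i])
    # Assemble: combine the partial results into Q(G).
    result = {}
    for dist in partial.values():
        for v, d in dist.items():
            result[v] = min(result.get(v, math.inf), d)
    return result
```

The point of the sketch is that relax is an unmodified sequential routine; only the border-value exchange and the Assemble step are parallelization glue.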
- The authors first evaluated the efficiency of GRAPE over real-life graphs by varying the number n of processors used, from 4 to 24.
- For a class of graph queries, users can plug in existing sequential algorithms with minor changes.
- GRAPE parallelization is guaranteed to terminate with correct answers under a monotonic condition if the sequential algorithms are correct.
- Graph algorithms for existing parallel graph systems can be migrated to GRAPE, without incurring extra cost.
- The authors have verified that GRAPE achieves comparable performance to the state-of-the-art graph systems for various query classes, and that IncEval reduces the cost of iterative graph computations.
- Table1: Graph traversal on parallel systems
- Table2: Notations
- The related work is categorized as follows.
Parallel models and systems. Several parallel models have been studied for graphs, e.g., PRAM, BSP and MapReduce. PRAM abstracts parallel RAM access over shared memory. BSP models parallel computations in supersteps (local computation, communication and a synchronization barrier) to synchronize communication among workers. Pregel (Giraph) implements BSP with vertex-centric programming, where a superstep executes a user-defined function at each vertex in parallel. GraphLab revises BSP to pass messages asynchronously. Block-centric models [44,50] extend vertex-centric programming to blocks, to exchange messages among blocks.
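The vertex-centric BSP pattern above can be made concrete with a toy superstep loop. This is a self-contained sketch, not Pregel's or Giraph's API: each superstep applies per-vertex logic and a barrier swaps the message buffers, using the classic maximum-value propagation example.

```python
# A toy Pregel-style BSP loop: each superstep runs per-vertex logic at every
# active vertex (simulated sequentially here), then a synchronization barrier
# delivers the messages sent in that superstep. Names are illustrative only.
def pregel_max(values, edges):
    """Propagate the maximum vertex value along edges until a fixpoint."""
    # superstep 0: every vertex sends its value to its out-neighbours
    inbox = {v: [] for v in values}
    for v in values:
        for u in edges.get(v, []):
            inbox[u].append(values[v])
    # later supersteps: a vertex updates and re-sends only when it grows;
    # the computation halts when no messages are in flight
    while any(inbox.values()):
        outbox = {v: [] for v in values}
        for v, msgs in inbox.items():
            if msgs and max(msgs) > values[v]:
                values[v] = max(msgs)
                for u in edges.get(v, []):
                    outbox[u].append(values[v])
        inbox = outbox  # barrier: swap message buffers
    return values
```

Note how the sequential algorithm's control flow disappears into per-vertex message handlers, which is precisely the recasting that vertex-centric systems require.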
- Fan, Xu, Yu, Cao and Tian are supported in part by ERC 652976, NSFC 61421003, 973 Program 2014CB340302, EPSRC EP/M025268/1, Shenzhen Peacock Program 1105100030834361, Guangdong Innovative Research Team Program 2011D005, the Foundation for Innovative Research Groups of NSFC, and Beijing Advanced Innovation Center for Big Data and Brain Computing
- Wu is supported by NSF BIGDATA 1633629 and Google Faculty Research Award
- Cao is also supported by NSFC 61602023
- Jiang is supported by HKRGC GRF HKBU12232716
- Aliyun. https://intl.aliyun.com.
- DBpedia. http://wiki.dbpedia.org/Datasets.
- Giraph. http://giraph.apache.org/.
- GRAPE. http://grapedb.io/.
- Movielens. http://grouplens.org/datasets/movielens/.
- MPICH. https://www.mpich.org/.
- Snap. http://snap.stanford.edu/data/index.html.
- 9th DIMACS Implementation Challenge. http://www.dis.uniroma1.it/challenge9/download.shtml.
-  U. A. Acar. Self-Adjusting Computation. PhD thesis, CMU, 2005.
-  J. Bang-Jensen and G. Z. Gutin. Digraphs: Theory, Algorithms and Applications. Springer, 2008.
-  P. A. Bernstein and N. Goodman. Concurrency control in distributed database systems. ACM Comput. Surv., 13(2):185–221, 1981.
-  E. G. Boman, K. D. Devine, and S. Rajamanickam. Scalable matrix computations on large scale-free graphs using 2D graph partitioning. In SC, 2013.
-  F. Bourse, M. Lelarge, and M. Vojnovic. Balanced graph edge partition. In SIGKDD, pages 1456–1465, 2014.
-  P. Buneman, G. Cong, W. Fan, and A. Kementsietsidis. Using partial evaluation in distributed query evaluation. In VLDB, 2006.
-  E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels. SICOMP, 32(5):1338–1355, 2003.
-  L. P. Cordella, P. Foggia, C. Sansone, and M. Vento. A (sub) graph isomorphism algorithm for matching large graphs. TPAMI, 26(10):1367–1372, 2004.
-  J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. Commun. ACM, 51(1), 2008.
-  W. Fan, C. Hu, and C. Tian. Incremental graph computations: Doable and undoable. In SIGMOD, 2017.
-  W. Fan, J. Li, S. Ma, N. Tang, Y. Wu, and Y. Wu. Graph pattern matching: From intractability to polynomial time. In PVLDB, 2010.
-  W. Fan, J. Li, X. Wang, and Y. Wu. Query preserving graph compression. In SIGMOD, 2012.
-  W. Fan, X. Wang, and Y. Wu. Incremental graph pattern matching. TODS, 38(3), 2013.
-  W. Fan, X. Wang, and Y. Wu. Distributed graph simulation: Impossibility and possibility. PVLDB, 7(12), 2014.
-  M. L. Fredman and R. E. Tarjan. Fibonacci heaps and their uses in improved network optimization algorithms. JACM, 34(3):596–615, 1987.
-  J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin. Powergraph: Distributed graph-parallel computation on natural graphs. In USENIX, 2012.
-  J. E. Gonzalez, R. S. Xin, A. Dave, D. Crankshaw, M. J. Franklin, and I. Stoica. GraphX: Graph processing in a distributed dataflow framework. In OSDI, 2014.
-  T. J. Harris. A survey of PRAM simulation techniques. ACM Comput. Surv., 26(2):187–206, 1994.
-  M. R. Henzinger, T. Henzinger, and P. Kopke. Computing simulations on finite and infinite graphs. In FOCS, 1995.
-  N. D. Jones. An introduction to partial evaluation. ACM Computing Surveys, 28(3), 1996.
-  H. J. Karloff, S. Suri, and S. Vassilvitskii. A model of computation for MapReduce. In SODA, 2010.
-  G. Karypis and V. Kumar. METIS–unstructured graph partitioning and sparse matrix ordering system, version 2.0. Technical report, 1995.
-  A. Khan, Y. Wu, C. C. Aggarwal, and X. Yan. NeMa: Fast graph search with label similarity. PVLDB, 6(3), 2013.
-  M. Kim and K. S. Candan. SBV-Cut: Vertex-cut based graph partitioning using structural balance vertices. Data & Knowledge Engineering, 72:285–303, 2012.
-  Y. Koren, R. Bell, C. Volinsky, et al. Matrix factorization techniques for recommender systems. Computer, 42(8):30–37, 2009.
-  Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, and J. M. Hellerstein. Distributed GraphLab: A framework for machine learning in the cloud. PVLDB, 5(8), 2012.
-  G. Malewicz, M. H. Austern, A. J. C. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In SIGMOD, 2010.
-  T. Mytkowicz, M. Musuvathi, and W. Schulte. Data-parallel finite-state machines. In ASPLOS, 2014.
-  K. Pingali, D. Nguyen, M. Kulkarni, M. Burtscher, M. A. Hassaan, R. Kaleem, T.-H. Lee, A. Lenharth, R. Manevich, M. Mendez-Lojo, et al. The tao of parallelism in algorithms. In ACM Sigplan Notices, volume 46, pages 12–25, 2011.
-  C. Radoi, S. J. Fink, R. M. Rabbah, and M. Sridharan. Translating imperative code to MapReduce. In OOPSLA, 2014.
-  G. Ramalingam and T. Reps. An incremental algorithm for a generalization of the shortest-path problem. J. Algorithms, 21(2):267–305, 1996.
-  G. Ramalingam and T. Reps. On the computational complexity of dynamic graph problems. TCS, 158(1-2), 1996.
-  V. Raychev, M. Musuvathi, and T. Mytkowicz. Parallelizing user-defined aggregations using symbolic execution. In SOSP, 2015.
-  S. Salihoglu and J. Widom. GPS: a graph processing system. In SSDBM, 2013.
-  I. Stanton and G. Kliot. Streaming graph partitioning for large distributed graphs. In KDD, pages 1222–1230, 2012.
-  Y. Tian, A. Balmin, S. A. Corsten, S. Tatikonda, and J. McPherson. From "think like a vertex" to "think like a graph". PVLDB, 7(3):193–204, 2013.
-  P. Trinder. A Functional Database. PhD thesis, University of Oxford, 1989.
-  L. G. Valiant. A bridging model for parallel computation. Commun. ACM, 33(8):103–111, 1990.
-  L. G. Valiant. General purpose parallel architectures. In Handbook of Theoretical Computer Science, Vol A. 1990.
-  J. Vinagre, A. M. Jorge, and J. Gama. Fast incremental matrix factorization for recommendation with positive-only feedback. In International Conference on User Modeling, Adaptation, and Personalization, 2014.
-  G. Wang, W. Xie, A. J. Demers, and J. Gehrke. Asynchronous large-scale graph processing made easy. In CIDR, 2013.
-  D. Yan, J. Cheng, Y. Lu, and W. Ng. Blogel: A block-centric framework for distributed computation on real-world graphs. PVLDB, 7(14):1981–1992, 2014.
-  D. Yan, J. Cheng, K. Xing, Y. Lu, W. Ng, and Y. Bu. Pregel algorithms for graph connectivity problems with performance guarantees. PVLDB, 7(14):1821–1832, 2014.
-  Y. Zhou, L. Liu, K. Lee, C. Pu, and Q. Zhang. Fast iterative graph computation with resource aware graph parallel abstractions. In HPDC, 2015.
- (1) We first show that GRAPE terminates. Assume by contradiction that there exist Q and G such that GRAPE does not terminate. Consider the values of the update parameters in the fragments of G during the run. At least one update parameter has to change in each superstep of the incremental computation (except the last), and the total number of distinct values the update parameters can take is bounded for Q and G by the monotonic condition (a) given in Section 4.1. Hence there must exist supersteps p and q such that for each i ∈ [1, m], x_i^p = x_i^q, i.e., the values of all the update parameters at supersteps p and q are the same.
- (2) We prove that for any Q and G, at any superstep r of the run of GRAPE for Q and G with PEval, IncEval and Assemble, partial answers Q(Fi[x_i^r]) (i ∈ [1, m]) are computed on all fragments Fi of G. We show this by induction on r.
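The termination argument can be illustrated with a small driver. This is our own toy model, not the paper's formalism: supersteps are applications of a step function over a finite value set, and the asserted strict decrease is a stand-in for the monotonic condition, which bounds the run length and rules out repeated parameter vectors.

```python
# A toy illustration of the termination argument: when every non-final
# superstep strictly decreases at least one update parameter and parameters
# range over a finite ordered set, the run length is bounded, so no two
# distinct supersteps can carry identical parameter vectors.
def run_supersteps(params, step):
    """Iterate `step` to a fixpoint; return the history of parameter vectors."""
    history = [tuple(params)]
    while True:
        new = step(list(params))
        if new == params:
            return history
        # stand-in for the monotonic condition of Section 4.1
        assert any(n < p for n, p in zip(new, params)), "step must decrease"
        params = new
        history.append(tuple(params))

# min-style updates over a finite domain: each step caps values at a bound
hist = run_supersteps([5, 7, 9], lambda xs: [min(x, 6) for x in xs])
# every vector in the history is distinct, as the Assurance proof requires
assert len(hist) == len(set(hist))
```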
- (1) Since BSP and GRAPE have the same number of physical workers, each worker of BSP is simulated by a worker in GRAPE.
- (2) We use two supersteps in GRAPE to simulate one map-shuffle-reduce round of a MapReduce algorithm, including a map phase and a reduce phase, in the key-value message mode (see Section 3.5) for messages. More specifically, for a MapReduce algorithm A that has R rounds, we implement each round r ∈ [1, R] of A in GRAPE as follows.
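The two-superstep simulation of one round can be sketched as follows. This is our own illustration, not the paper's construction: simulate_round, its hash-based routing, and the word-count example are assumptions made for the sketch, with key-value messages standing in for GRAPE's message mode.

```python
# A sketch of simulating one map-shuffle-reduce round with two supersteps
# over key-value messages (illustrative names; word count as the example).
from collections import defaultdict

def simulate_round(partitions, map_fn, reduce_fn):
    # Superstep 1 ("map"): each worker runs map_fn on its partition and
    # emits key-value messages; keys are routed to workers by hash, which
    # plays the role of the shuffle.
    n = len(partitions)
    mailbox = [defaultdict(list) for _ in range(n)]
    for part in partitions:
        for record in part:
            for k, v in map_fn(record):
                mailbox[hash(k) % n][k].append(v)
    # Superstep 2 ("reduce"): each worker reduces the key groups it received.
    out = []
    for box in mailbox:
        for k, vs in box.items():
            out.extend(reduce_fn(k, vs))
    return out

# word count in a single round, over two worker partitions
parts = [["a b", "b"], ["a"]]
counts = dict(simulate_round(
    parts,
    map_fn=lambda line: [(w, 1) for w in line.split()],
    reduce_fn=lambda k, vs: [(k, sum(vs))],
))
```

An R-round algorithm would simply call simulate_round R times, feeding each round's output back in as the next round's partitions.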
- (2) CC. Figures 8(d)-8(f) demonstrate similar improvement of GRAPE over GraphLab and Giraph for CC, e.g., on average GRAPE ships 5.4% of the data shipped by Giraph and GraphLab. Blogel is slightly better than GRAPE. As remarked in Section 7 for Exp-1(2), this is because Blogel precomputes CCs of graphs when partitioning and loading the graphs, and thus already recognizes connected components by using an internal partition strategy. While a fair comparison should include the time for precomputing CCs in the evaluation time of CC by Blogel, we cannot identify the communication cost saved by its preprocessing. Thus, the reported communication cost of Blogel is almost 0 in all cases. Nonetheless, GRAPE incurs communication cost comparable to the near “optimal” case reported by Blogel, when Blogel operates on a graph that is already partitioned as CCs.
- (1) The vertex program for Giraph requires substantial changes to its corresponding sequential algorithm. As shown in Fig. 10, the logic flow of a Giraph program for SSSP is quite different from that of a sequential SSSP algorithm. Writing such programs requires users to have prior knowledge about the query classes and the design principles of the vertex-centric model. Moreover, it is challenging to integrate graph-level optimizations, e.g., incremental evaluation, into the vertex programming model. In contrast, the logic flow of PIE algorithms (GRAPE) remains the same as that of the sequential algorithms adapted for PEval and IncEval.
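To make the contrast concrete, here is what a vertex-centric SSSP handler looks like in the Pregel/Giraph style. This is an illustrative Python rendering, not Giraph's actual Java API: the compute signature, the mutable vertex dict and the send callback are all our own assumptions.

```python
# A minimal vertex-centric SSSP compute(), in the style a Pregel/Giraph
# program requires (illustrative only, not Giraph's actual API): the
# sequential "relax until fixpoint" loop is inverted into a per-vertex
# message handler, which is the recasting the text refers to.
def compute(vertex, messages, out_edges, send):
    # vertex: mutable state {"dist": float}; messages: distances received
    new_dist = min([vertex["dist"]] + messages)
    if new_dist < vertex["dist"]:
        vertex["dist"] = new_dist
        for target, weight in out_edges:
            send(target, new_dist + weight)  # relax along out-edges
    # otherwise the vertex votes to halt (omitted) until messaged again
```

There is no loop over the graph here: iteration, termination and the visiting order all live in the framework, which is why graph-level optimizations such as incremental evaluation are hard to express in this model.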
- (2) While Blogel supports block-centric computation, it also requires recasting of sequential algorithms, as shown in Fig. 11.