Graphalytics: A Big Data Benchmark for Graph-Processing Platforms

GRADES@SIGMOD/PODS (2015)

Cited by 78 | Views 83

Abstract

Graphs are increasingly used in industry, governance, and science. This has stimulated the appearance of many and diverse graph-processing platforms. Although platform diversity is beneficial, it also makes it very challenging to select the best platform for an application domain or one of its important applications, and to design new and...

Introduction
  • Generic big data processing platforms, such as Hadoop, can process graphs, but are generally slow for challenging graph-processing algorithms [3, 4] or graph datasets [4, 7].
  • Several studies have compared the performance of graph processing platforms [3, 4, 7] using multiple algorithms and/or datasets, but the de facto benchmarking standard is currently Graph500, which is limited to a single algorithm applied to a synthetic graph model.
  • The authors present the vision for Graphalytics, a big data
Highlights
  • Graph data is increasingly used in industry, governance, and science
  • We propose using the data generator [16] used in the Linked Data Benchmark Council (LDBC) Social Network Benchmark
  • The Report Generator produces the main outcome of Graphalytics, a detailed report on the performance of the SUT during the benchmark, which includes all relevant configuration information
  • Graphalytics has a database for Datasets, which includes preconfigured graphs ready to be used with Graphalytics
  • Graphalytics focuses on diverse datasets and algorithms, and methodologically it goes well beyond related work, addressing its shortcomings
  • Unlike previous work, including our own, Graphalytics focuses on a fundamental understanding of choke points, extensions to the dataset generation, and an advanced benchmarking harness that will evolve into a public database of useful results
Results
  • [Figure: Graphalytics architecture. Legible component labels: System Monitor, Dataset Generator, platform-specific algorithm implementation, graph-processing platform (each supported platform).]
  • Although Graphalytics is still in an early phase of development, it has already enabled the authors to enrich their previous graph benchmarking results with new datasets and platforms.
  • The single machine is faster than the cluster for smaller graphs, where computation is mostly CPU bound.
  • It can generate a graph with 1.3 billion edges in about 3 hours.
  • The authors note that GraphX is significantly slower than Giraph for the CONN algorithm (∼ 3×), al-
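
For readers unfamiliar with CONN (connected components), the following is a minimal single-machine sketch of what such an algorithm computes. It is not the Giraph or GraphX implementation measured in the paper: those are distributed, whereas this sketch swaps in a plain union-find over an edge list. The function name `connected_components` and the choice of labeling each component by its smallest vertex id are assumptions made for illustration.

```python
def connected_components(edges):
    """Map every vertex to the smallest vertex id in its component.

    `edges` is an iterable of (u, v) pairs of an undirected graph.
    Illustrative union-find sketch, not a platform implementation.
    """
    parent = {}

    def find(v):
        # Register unseen vertices as their own root.
        parent.setdefault(v, v)
        # Path halving: point each visited vertex at its grandparent.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, w in edges:
        ru, rw = find(u), find(w)
        if ru != rw:
            # Keep the smaller id as root so labels are deterministic.
            if ru < rw:
                parent[rw] = ru
            else:
                parent[ru] = rw

    # Resolve every vertex to its final root.
    return {v: find(v) for v in list(parent)}


if __name__ == "__main__":
    labels = connected_components([(1, 2), (2, 3), (4, 5)])
    print(labels)
```

Distributed platforms typically reach the same labeling by iteratively propagating the minimum label between neighbors, which is one reason their running times can differ widely on the same graph.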
Conclusion
  • Benchmarking graph-processing platforms enables system comparison, tuning, and design for an increasing number of domains.
  • Responding to a dearth of comprehensive benchmarking approaches for graph-processing platforms, in this work the authors have proposed the vision: Graphalytics.
  • Unlike previous work, including the authors' own, Graphalytics focuses on a fundamental understanding of choke points, extensions to the dataset generation, and an advanced benchmarking harness that will evolve into a public database of useful results.
  • Graphalytics aims to become an accepted benchmarking standard by both the LDBC and the SPEC Research Group communities, and attract further implementations from the creators of graph-processing platforms themselves.
Tables
  • Table 1: Characteristics of real graphs
Related work
  • We have already compared, throughout this work, the Graphalytics benchmark with other benchmarks proposed for graph-processing [7, 13, 22]. In summary, Graphalytics is much more comprehensive and ambitious than previous work: it supports more diverse and realistic datasets [4, 16], more diverse and realistic algorithms [4], and reference implementations for more platforms (preliminary results obtained for 10 platforms [4, 5]). Moreover, Graphalytics includes in its vision a fundamental understanding of choke points, extensions to the dataset generation, and an advanced benchmarking harness that will evolve into a public database of useful results.
Funding
  • This research was supported by Oracle Labs, LDBC (ldbcouncil.org, originally funded by EU project FP7-317548), Dutch NWO KIEM project KIESA, COMMIT project COMMIssioner, Ministry of Science and Innovation of Spain (TIN2013-47008-R), and Generalitat de Catalunya (SGR2014-890)
References
  • [1] M. Dayarathna and T. Suzumura. Graph database benchmarking on cloud environments with XGDBench. Autom. Softw. Eng., 21(4):509–533, 2014.
  • [2] J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. In USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2004.
  • [3] B. Elser and A. Montresor. An evaluation study of BigData frameworks for graph processing. In IEEE International Conference on Big Data, pages 60–67. IEEE, Oct. 2013.
  • [4] Y. Guo, M. Biczak, A. L. Varbanescu, A. Iosup, C. Martella, and T. L. Willke. How Well Do Graph-Processing Platforms Perform? An Empirical Performance Evaluation and Analysis. In IEEE IPDPS, pages 395–404. IEEE, May 2014.
  • [5] Y. Guo, A. L. Varbanescu, A. Iosup, and D. H. J. Epema. An Empirical Performance Evaluation of GPU-Enabled Graph-Processing Systems. In CCGRID, pages 927–932, 2015 (in print, available online: http://www.pds.ewi.tudelft.nl/~iosup/perf-eval-gpu-graph-processing15ccgrid.pdf).
  • [6] Y. Guo, A. L. Varbanescu, A. Iosup, C. Martella, and T. L. Willke. Benchmarking Graph-Processing Platforms: A Vision. In ACM/SPEC International Conference on Performance Engineering (ICPE), pages 289–292. ACM Press, 2014.
  • [7] M. Han, K. Daudjee, K. Ammar, M. T. Ozsu, X. Wang, and T. Jin. An Experimental Comparison of Pregel-like Graph Processing Systems. In VLDB, 2014.
  • [8] C. Herrera and P. J. Zufiria. Generating scale-free networks with adjustable clustering coefficient via random walks. arXiv preprint arXiv:1105.3347, 2011.
  • [9] A. Iosup, A. L. Varbanescu, M. Capota, T. Hegeman, Y. Guo, W. L. Ngai, and M. Verstraaten. Towards Benchmarking IaaS and PaaS Clouds for Graph Analytics. In Workshop on Big Data Benchmarking (WBDB), Potsdam, Germany, 2014.
  • [10] J. Leskovec, D. Chakrabarti, J. M. Kleinberg, C. Faloutsos, and Z. Ghahramani. Kronecker graphs: An approach to modeling networks. J Mach Learn Res, 11:985–1042, 2010.
  • [11] J. Leskovec, J. Kleinberg, and C. Faloutsos. Graphs over time: Densification laws, shrinking diameters and possible explanations. In ACM SIGKDD, 2005.
  • [12] I. X. Y. Leung, P. Hui, P. Lio, and J. Crowcroft. Towards real-time community detection in large networks. Phys. Rev. E, 79:066107, Jun 2009.
  • [13] Y. Lu, J. Cheng, D. Yan, and H. Wu. Large-Scale Distributed Graph Computing Systems: An Experimental Evaluation. In VLDB, 2014.
  • [14] G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: A System for Large-Scale Graph Processing. In ACM International Conference on Management of Data (SIGMOD), page 135. ACM Press, 2010.
  • [15] M. Pham, P. A. Boncz, and O. Erling. S3G2: A scalable structure-correlated social graph generator. In TPCTC, 2012.
  • [16] A. Prat-Perez and A. Averbuch. Benchmark design for navigational pattern matching benchmarking. Deliverable 3.3.34, LDBC, October 2014. [Online] Available: http://ldbc.eu/sites/default/files/LDBC_D3.3.34.pdf.
  • [17] A. Prat-Perez and D. Dominguez-Sal. How community-like is the structure of synthetically generated graphs? In GRADES, pages 7:1–7:9. ACM, 2014.
  • [18] A. Prat-Perez, D. Dominguez-Sal, and J. Larriba-Pey. Social based layouts for the increase of locality in graph operations. In International Conference on Database Systems for Advanced Applications (DASFAA), 2011.
  • [19] J. Ugander, B. Karrer, L. Backstrom, and C. Marlow. The anatomy of the Facebook social graph. arXiv preprint arXiv:1111.4503, 2011.
  • [20] E. Volz. Random networks with tunable degree distribution and clustering. Physical Review E, 70(5):056115, 2004.
  • [21] R. S. Xin, J. E. Gonzalez, M. J. Franklin, and I. Stoica. GraphX: A Resilient Distributed Graph System on Spark. In GRADES, pages 1–6. ACM Press, 2013.
  • [22] Y. Zhao, K. Yoshigoe, M. Xie, and S. Zhou. Evaluation and Analysis of Distributed Graph-Parallel Processing Frameworks. Journal of Cyber Security and Mobility, 3:289–316, 2014.