Optimize the FP-Tree Based Graph Edge Weight Computation on Multi-core MapReduce Clusters

2017 IEEE 23rd International Conference on Parallel and Distributed Systems (ICPADS) (2017)

Abstract
FP-tree based edge weight computation (EWC for short) with MapReduce has demonstrated remarkable performance in extracting weighted graphs from big data for analysis. However, our investigation finds that the existing algorithm performs an unnecessary scan of the dataset and keeps unnecessary information during FP-tree construction, both of which prolong execution. In addition, an inappropriate Reducers-to-cores mapping strategy may exhaust resources and cause the job to fail. This paper designs, implements, and evaluates an optimized FP-tree based graph EWC algorithm with MapReduce on multi-core clusters. First, we design a more compact FP-tree based EWC with 2-phase MapReduce, eliminating one scan phase over the dataset. Second, we propose a reduced FP-tree data structure to lower the FP-tree construction cost. Third, we examine two strategies for mapping Reducers to cores for EWC on each multi-core computer: one-Reducer-one-core and one-Reducer-multiple-cores. Finally, we carry out an empirical performance comparison of the optimized EWC algorithm against the existing one on a massive application dataset generated by a real social network. The results demonstrate that the optimized FP-tree based EWC algorithm improves execution time by about 39% to 55% while also achieving better scale-out and scale-up speedup. The findings can also be applied to improve the scalability and efficiency of parallel and distributed applications that involve large-scale all-pairs set intersection computation over multi-core MapReduce clusters.
Keywords
Weighted Graph Extraction,Edge Weight Computation,FP-tree,MapReduce,Big Data,Multi-core Clusters
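
To make the all-pairs set intersection view of EWC mentioned in the abstract concrete, the following is a minimal single-machine sketch. It assumes that the weight of the edge between two items is the number of records in which both appear; that definition, the function name `edge_weights`, and the toy data are illustrative assumptions and are not taken from the paper. It is a brute-force baseline, not the FP-tree or MapReduce algorithm the paper describes.

```python
from collections import Counter
from itertools import combinations


def edge_weights(records):
    """Count, for every pair of items, how many records contain both.

    Brute-force baseline (assumed definition of edge weight), not the
    paper's FP-tree based method.
    """
    weights = Counter()
    for items in records:
        # De-duplicate and sort so each pair has one canonical key (u, v).
        for u, v in combinations(sorted(set(items)), 2):
            weights[(u, v)] += 1
    return weights


# Hypothetical toy data: each row lists the users seen in one record.
records = [
    ["alice", "bob", "carol"],
    ["alice", "bob"],
    ["bob", "carol", "dave"],
]
weights = edge_weights(records)
```

Each record with k distinct items contributes k(k-1)/2 pair updates, which is why long records make the naive approach expensive and why a shared-prefix structure such as an FP-tree is attractive.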