Distributed data placement to minimize communication costs via graph partitioning

SSDBM (2014)

Abstract
With the widespread use of shared-nothing clusters of servers, there has been a proliferation of distributed object stores that offer high availability, reliability and enhanced performance for MapReduce-style workloads. However, data-intensive scientific workflows and join-intensive queries cannot always be evaluated efficiently using MapReduce-style processing without extensive data migrations, which cause network congestion and reduced query throughput. In this paper, we study the problem of computing data placement strategies that minimize the data communication costs incurred by such workloads in a distributed setting. Our main contribution is a reduction of the data placement problem to the well-studied problem of Graph Partitioning, which is NP-Hard but for which efficient approximation algorithms exist. The novelty and significance of this result lie in representing the communication cost exactly and using standard graphs instead of hypergraphs, which were used in prior work on data placement that optimized for different objectives. We study several practical extensions of the problem: with load balancing, with replication, and with complex workflows consisting of multiple steps that may be computed on different servers. We provide integer linear programs (IPs) that may be used with any IP solver to find an optimal data placement. For the no-replication case, we use publicly available graph partitioning libraries (e.g., METIS) to efficiently compute nearly-optimal solutions. For the versions with replication, we introduce two heuristics that utilize the Graph Partitioning solution of the no-replication case. Using a workload based on TPC-DS, it may take an IP solver weeks to compute an optimal data placement, whereas our reduction produces nearly-optimal solutions in seconds.
Keywords
design, distributed databases, experimentation, measurement, performance, statistical databases
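
The idea in the abstract of reducing data placement to Graph Partitioning can be illustrated with a small sketch. It is not the paper's construction. It assumes a simplified workload in which every query reads exactly two data items and costs its weight in transfer whenever those items sit on different servers. Under that assumption the communication cost of a placement equals the weight of the cut edges in a plain (non-hyper) graph. The function names, the toy workload, and the brute-force search over balanced two-server placements are all hypothetical; at realistic sizes a partitioning library such as METIS would take the place of the exhaustive search.

```python
from itertools import combinations


def build_coaccess_graph(workload):
    """Map each unordered item pair to the total weight of queries reading both.

    Assumption (not from the paper): each query touches exactly two items.
    """
    edges = {}
    for (a, b), weight in workload:
        key = (min(a, b), max(a, b))
        edges[key] = edges.get(key, 0) + weight
    return edges


def cut_cost(edges, placement):
    """Total weight of edges whose endpoints are placed on different servers."""
    return sum(w for (a, b), w in edges.items() if placement[a] != placement[b])


def best_balanced_bipartition(items, edges):
    """Exhaustively search two-server placements with equal-sized halves.

    Only feasible for tiny inputs; a stand-in for a real graph partitioner.
    """
    items = sorted(items)
    half = len(items) // 2
    best = None
    for group in combinations(items, half):
        placement = {x: (0 if x in group else 1) for x in items}
        cost = cut_cost(edges, placement)
        if best is None or cost < best[0]:
            best = (cost, placement)
    return best


# Hypothetical workload: ((item, item), query weight).
workload = [
    (("A", "B"), 5),
    (("C", "D"), 4),
    (("B", "C"), 1),
    (("B", "A"), 2),
]

edges = build_coaccess_graph(workload)
cost, placement = best_balanced_bipartition({"A", "B", "C", "D"}, edges)
print(cost, placement)
```

In this toy case the cheapest balanced placement keeps A with B and C with D, so only the weight-1 query spanning B and C moves data across servers. Load balancing appears here only as the equal-halves constraint; replication and multi-step workflows, which the abstract lists as extensions, are not modeled.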