Meta-X - A Technique for Reducing Communication in Geographically Distributed Computations.

CSCML(2021)

Abstract
Computations such as syndromic surveillance and e-commerce are executed over datasets collected from different geographical locations. Modern data processing systems, such as MapReduce/Hadoop or Spark, also require the data to be collected from these locations into a single global location before an application is executed, which results in a significant communication cost. While MapReduce/Hadoop and Spark have proven to be the most useful paradigms in the revolution of distributed computing, the federation of cloud and big-data activities remains a challenge, in which data processing should be modified to avoid (big) data migration across remote (cloud) sites. This is exactly the scope of our work: only the data essential for obtaining the final result is transmitted, which reduces communication and processing and preserves data privacy as much as possible. In this work, we propose an algorithmic technique for geographically distributed computations, called Meta-X, that decreases the communication cost by allowing us to process and move metadata among different locations instead of the entire datasets. We illustrate the usefulness of Meta-X in terms of MapReduce computations for different operations, such as equijoin, k-nearest-neighbors finding, and shortest path finding.
Keywords
MapReduce, Hadoop, Spark
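
The abstract's idea of moving metadata rather than whole datasets can be illustrated for the equijoin case with a minimal Python sketch. This is not the authors' Meta-X algorithm, whose details the abstract does not give. It assumes that the "metadata" is simply the set of join keys held at each site, that a single coordinating location compares those key sets, and that only tuples with a matching key are then shipped. The names `Relation`, `metadata`, and `meta_equijoin` are hypothetical.

```python
from typing import List, Set, Tuple

# A relation is a list of (join_key, payload) tuples held at one site.
Relation = List[Tuple[str, str]]


def metadata(relation: Relation) -> Set[str]:
    """Assumed metadata: only the join keys present at a site."""
    return {key for key, _ in relation}


def meta_equijoin(site_a: Relation, site_b: Relation):
    """Join two remote relations while shipping only matching tuples.

    Returns the joined rows and the number of full tuples moved.
    """
    # Step 1: each site sends only its key set to the coordinator.
    keys_a, keys_b = metadata(site_a), metadata(site_b)

    # Step 2: the coordinator finds keys that occur at both sites.
    matching = keys_a & keys_b

    # Step 3: only tuples whose key matched are requested from each site.
    moved_a = [t for t in site_a if t[0] in matching]
    moved_b = [t for t in site_b if t[0] in matching]

    # Step 4: the join runs over the shipped tuples only.
    joined = [
        (key_a, payload_a, payload_b)
        for key_a, payload_a in moved_a
        for key_b, payload_b in moved_b
        if key_a == key_b
    ]
    return joined, len(moved_a) + len(moved_b)


if __name__ == "__main__":
    # Illustrative data, not taken from the paper.
    a = [("k1", "a1"), ("k2", "a2"), ("k3", "a3")]
    b = [("k2", "b2"), ("k4", "b4")]
    rows, moved = meta_equijoin(a, b)
    print(rows, "tuples moved:", moved, "of", len(a) + len(b))
```

In this sketch the saving comes from tuples whose keys never match staying at their home site; shipping the key sets themselves still costs something, so the benefit depends on keys being small relative to payloads.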