HDM-MC in-Action: A Framework for Big Data Analytics across Multiple Clusters

2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS), 2018

Abstract
Big data are increasingly collected and stored in highly distributed infrastructures, driven by emerging technologies such as sensor networks, cloud computing, IoT, and mobile computing. In practice, most existing big data processing frameworks (e.g., Hadoop, Spark, Flink) are designed for a single-cluster setup and assume centralized management and homogeneous connectivity. These assumptions make them sub-optimal, and sometimes infeasible, for scenarios that require running data analytics jobs on highly distributed data sets (across racks, data centers, or multiple organizations). We demonstrate HDM-MC, a big data processing framework designed to perform large-scale data analytics across multiple clusters with minimal extra overhead from the additional scheduling requirements. We describe the architecture and realization of the system using a step-by-step example scenario.
Keywords
Big Data, Distributed Systems, Workflows