HEBR : A High Efficiency Block Reporting Scheme for HDFS

semanticscholar(2017)

引用 0|浏览1
暂无评分
摘要
Hadoop platform is widely being used for managing, analyzing and transforming large data sets in various systems. Two basic components of Hadoop are: 1) a distributed file system (HDFS) 2) a computation framework (MapReduce). HDFS stores data on simple commodity machines that run DataNode processes (DataNodes). A commodity machine running NameNode process (NameNode) maintains meta data information of the file system. Every DataNode sends lists of all files currently stored on it, known as block report to the NameNode periodically. NameNode processes block reports to build a mapping between files and their locations on various DataNodes. The block reports form a heavy internal load of a Hadoop cluster as they consume computation resources of DataNodes and the NameNode as well as network bandwidth of the cluster. With extensive supporting experiment results, this paper proposes a new block report protocol, HEBR, for Hadoop to significantly reduce both computational and communication overhead, and thus improving overall Hadoop system’s performance greatly.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要