SHAstor: A Scalable HDFS-Based Storage Framework for Small-Write Efficiency in Pervasive Computing

2018 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), 2018

Abstract
It is well known that small files are frequently created and accessed in pervasive computing, where information is processed with limited resources by linking with objects as they are encountered. However, the Hadoop framework, despite being the de facto big data processing platform, cannot process small files effectively. In this paper, we propose a scalable HDFS-based storage framework, named SHAstor, to improve small-write throughput for the pervasive computing paradigm. Compared to classic HDFS, the essence of this approach is to merge incoming small writes into a large chunk of data, either at the client side or at the server side, and then store the chunk as a single big file in the framework. This substantially reduces the number of small files needed to process the pervasively gathered information. To reach this goal, the framework takes HDFS as its basis and adds three extra modules that merge and index the small files while the read/write operations of pervasive applications are performed. To further facilitate this process, a new ancillary namenode can optionally be installed to store the index table. With this optimization, SHAstor not only speeds up small writes but also scales out with the number of datanodes to improve the performance of pervasive applications.
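The abstract describes merging many small writes into one large HDFS file and tracking each logical small file through an index table of offsets. Below is a minimal client-side sketch of that idea; the class name SmallWriteMerger, the in-memory index, and the batching interface are illustrative assumptions rather than the paper's actual modules, and only the standard Hadoop FileSystem API calls are real.

```java
// Sketch of the client-side merge-and-index idea from the abstract.
// Assumption: a batch of small writes is appended into one big HDFS file,
// and an index maps each logical name to its (offset, length) in that file.
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SmallWriteMerger {
    private final FileSystem fs;
    private final Path chunkPath;                              // the merged "big file"
    private final Map<String, long[]> index = new HashMap<>(); // name -> {offset, length}

    public SmallWriteMerger(Configuration conf, Path chunkPath) throws IOException {
        this.fs = FileSystem.get(conf);
        this.chunkPath = chunkPath;
    }

    /** Append a batch of small writes as one large chunk, recording offsets. */
    public void mergeAndStore(Map<String, byte[]> smallFiles) throws IOException {
        try (FSDataOutputStream out = fs.create(chunkPath, true)) {
            for (Map.Entry<String, byte[]> e : smallFiles.entrySet()) {
                long offset = out.getPos();                    // position inside the big file
                out.write(e.getValue());
                index.put(e.getKey(), new long[] { offset, e.getValue().length });
            }
        } // one HDFS file (and one namenode entry) instead of many small ones
    }

    /** Read one logical small file back through the index table. */
    public byte[] read(String name) throws IOException {
        long[] entry = index.get(name);
        byte[] buf = new byte[(int) entry[1]];
        try (FSDataInputStream in = fs.open(chunkPath)) {
            in.readFully(entry[0], buf);                       // positioned read at the offset
        }
        return buf;
    }
}
```

In SHAstor the index would live in a dedicated module (optionally on an ancillary namenode) rather than in client memory, but the mapping it maintains is the same offset/length lookup shown here.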
Keywords
pervasive computing, HDFS-based storage, small write, Hadoop framework