Storage And Recreation Trade-Off For Multi-Version Data Management

Web and Big Data (APWeb-WAIM 2018), Part II (2018)

Abstract
With the rapid development of data acquisition technology, massive amounts of observation data have accumulated in scientific disciplines. Because successive observations differ only slightly, it is critical to use multi-version data management technology to compress the data and reduce both storage and recreation costs. However, existing work in this field optimizes only the total storage and total recreation costs and ignores the recreation cost of individual versions. Consequently, in this paper, we investigate the trade-off among three metrics: total storage cost, total recreation cost, and the maximum recreation cost of an individual version. We formulate two problems: (1) finding a storage plan that lowers the total recreation cost and the individual recreation cost when the total storage is limited; and (2) finding a storage plan that minimizes the total storage when the total recreation cost and the individual recreation cost are restricted. To solve these problems, we model all versions as a directed graph and then devise two efficient algorithms based on spanning trees. A series of experiments indicates that our proposals are effective and efficient in dealing with these problems.
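The three metrics named in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it only evaluates a given storage plan, and it assumes a common way of modeling the problem in which a hypothetical dummy node `ROOT` stands for "store this version in full", every other edge stands for "store a delta from the parent version", and a storage plan is a spanning tree rooted at `ROOT`. The names `ROOT`, `COSTS`, and `evaluate_plan`, as well as all cost values, are made up for illustration.

```python
from typing import Dict, Tuple

ROOT = 0  # hypothetical dummy node: an edge from ROOT means "store in full"

Edge = Tuple[int, int]

# (parent, child) -> (storage cost, recreation cost); illustrative values only
COSTS: Dict[Edge, Tuple[int, int]] = {
    (ROOT, 1): (100, 100),
    (ROOT, 2): (105, 105),
    (ROOT, 3): (110, 110),
    (1, 2): (10, 20),   # delta from version 1 to version 2
    (2, 3): (12, 25),   # delta from version 2 to version 3
}


def evaluate_plan(
    parent: Dict[int, int], costs: Dict[Edge, Tuple[int, int]]
) -> Tuple[int, int, int]:
    """Return (total storage, total recreation, max recreation) for a plan.

    `parent` maps each version to the node it is stored against and is
    assumed to describe a tree rooted at ROOT (no cycles).
    """
    recreation: Dict[int, int] = {ROOT: 0}

    def recreate(v: int) -> int:
        # Recreating v means recreating its parent, then applying the edge.
        if v not in recreation:
            p = parent[v]
            recreation[v] = recreate(p) + costs[(p, v)][1]
        return recreation[v]

    total_storage = sum(costs[(p, v)][0] for v, p in parent.items())
    per_version = [recreate(v) for v in parent]
    return total_storage, sum(per_version), max(per_version)


# Plan A: store version 1 in full, then chain deltas (cheap to store).
chain_plan = {1: ROOT, 2: 1, 3: 2}
# Plan B: store every version in full (cheap to recreate).
full_plan = {1: ROOT, 2: ROOT, 3: ROOT}

print(evaluate_plan(chain_plan, COSTS))
print(evaluate_plan(full_plan, COSTS))
```

With these made-up costs, the delta chain needs far less storage but raises both the total recreation cost and the worst-case recreation cost of a single version, which is the tension the two formulated problems constrain from opposite sides.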
Keywords
Multi-version data management, Storage and recreation trade-off, Scientific data management