Utilizing SSD to Alleviate Chunk Fragmentation in De-Duplicated Backup Systems

Longxin Lin, Kun Xiao, Wenjie Liu

2016 IEEE 22nd International Conference on Parallel and Distributed Systems (ICPADS) (2016)

Abstract
Data deduplication, which removes redundant data so that only one copy of duplicate blocks needs to be stored, has been implemented in almost all storage appliances, including archival and backup systems, primary data storage, and SSD devices, to save storage space. However, as time passes and more duplicate blocks are ingested into the system, the fragmentation problem emerges: logically contiguous data blocks of later datasets are dispersed across a large storage space, so restoring them requires many extra disk accesses, which significantly degrades restore performance and garbage collection efficiency. Existing approaches to the fragmentation problem sacrifice space savings for performance by selectively rewriting trouble-causing duplicate blocks during deduplication, even though those blocks have already been stored elsewhere. However, rewriting chunks slows the backup process and reduces deduplication efficiency, because many duplicate chunks are then allowed in the system. In this work, we propose to deploy flash-based SSDs in the system to overcome the limitations of rewriting algorithms by taking advantage of the high performance provided by SSDs. Specifically, instead of rewriting, we migrate the trouble-causing blocks into SSD storage in the background when encountering duplicate blocks. The idea is motivated mainly by two observations. First, a separate migration process leverages the computing power of modern multi-core architectures. Second, restores are typically not performed immediately after backups, so there is no need to rewrite blocks on the critical path, where doing so hurts backup performance. We apply our proposal to two rewriting schemes and conduct comprehensive evaluations of its efficacy. Our results show that, by provisioning a reasonable amount of SSD, backup performance and deduplication efficiency can be significantly improved, at the cost of a slight increase in the number of container reads associated with restore operations.
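
The migrate-instead-of-rewrite idea can be illustrated with a toy model. The sketch below is not the authors' implementation: every name in it (`DedupStore`, `min_refs`, `hdd_container_reads`) is hypothetical, and the rule used to flag a "trouble-causing" duplicate (its container serves fewer than `min_refs` chunks of the current backup stream) is an assumption standing in for whichever rewriting scheme is actually used. Migration is also run as an explicit call here, whereas the abstract describes it as a background process.

```python
from collections import Counter, deque


class DedupStore:
    """Toy container-based deduplication store with an SSD tier (hypothetical)."""

    def __init__(self, container_size=4, min_refs=2):
        self.container_size = container_size
        self.min_refs = min_refs          # assumed fragmentation threshold
        self.index = {}                   # fingerprint -> HDD container id
        self.containers = {}              # container id -> list of fingerprints
        self.ssd = set()                  # fingerprints already copied to SSD
        self.migration_queue = deque()    # duplicates waiting to be migrated
        self._queued = set()
        self._open = None                 # container currently being filled
        self._next_cid = 0

    def _write_new(self, fp):
        # Append a unique chunk to the open container, opening a new one if needed.
        if self._open is None or len(self.containers[self._open]) >= self.container_size:
            self._open = self._next_cid
            self._next_cid += 1
            self.containers[self._open] = []
        self.containers[self._open].append(fp)
        self.index[fp] = self._open

    def backup(self, fingerprints):
        """Deduplicate one backup stream and return its recipe (chunk order)."""
        self._open = None  # each backup starts a fresh container
        # How many chunks of this stream does each pre-existing container serve?
        refs = Counter(self.index[fp] for fp in fingerprints if fp in self.index)
        for fp in fingerprints:
            if fp not in self.index:
                self._write_new(fp)
                continue
            cid = self.index[fp]
            fragmented = cid in refs and refs[cid] < self.min_refs
            if fragmented and fp not in self.ssd and fp not in self._queued:
                # Defer the work: queue for migration instead of rewriting
                # the chunk on the backup critical path.
                self.migration_queue.append(fp)
                self._queued.add(fp)
        return list(fingerprints)

    def migrate(self):
        """Drain the queue into the SSD tier (stand-in for the background process)."""
        while self.migration_queue:
            fp = self.migration_queue.popleft()
            self._queued.discard(fp)
            self.ssd.add(fp)

    def hdd_container_reads(self, recipe):
        """Number of distinct HDD containers a restore of this recipe must read."""
        return len({self.index[fp] for fp in recipe if fp not in self.ssd})


store = DedupStore()
store.backup(["a", "b", "c", "d"])
recipe = store.backup(["e", "f", "a", "g", "h"])
reads_before = store.hdd_container_reads(recipe)
store.migrate()
reads_after = store.hdd_container_reads(recipe)
```

In this toy run the second backup shares only chunk `a` with the first, so restoring it would touch the old container for a single chunk. After migration that chunk is served from the SSD tier, and no duplicate copy was written during the backup itself.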
Keywords
Storage, Backup, Data Deduplication, Fragmentation, Restore Performance