GreDedup: A Greedy-Based Application-Aware Data Routing Strategy for Distributed Deduplication.

International Conference on Parallel and Distributed Systems (2023)

Abstract
We propose GreDedup, a greedy-algorithm-based, application-aware data routing strategy for distributed deduplication. It achieves a good tradeoff between a high global deduplication ratio and scalable performance by reducing communication overhead and avoiding the disk bottleneck. We extract semantic information to classify backup files, and use a greedy algorithm, with the help of application tables, to route files of the same type to as few storage servers as possible. For intra-node deduplication, we maintain a separate unique-chunk fingerprint index for each file type to reduce the number of disk accesses. We run experiments on public datasets to compare GreDedup with state-of-the-art alternatives. The results show that GreDedup achieves a global deduplication ratio almost as high as that of the high-overhead scheme, while its write performance exceeds even that of the low-overhead method and it maintains good load balancing.
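
The abstract does not give the routing procedure in detail, so the following Python is only a minimal sketch of the two ideas it names, not the authors' implementation. The class names `GreedyRouter` and `StorageNode`, the fixed per-server capacity, the least-loaded tie-break, the fixed-size chunking, and the use of SHA-1 fingerprints are all assumptions made for illustration. The sketch keeps an application table mapping each file type to the servers that already hold it, greedily reuses those servers before opening a new one, and keeps one fingerprint index per file type inside a node.

```python
import hashlib
from collections import defaultdict


class GreedyRouter:
    """Hypothetical router: packs each file type onto as few servers as possible."""

    def __init__(self, num_servers, capacity):
        self.capacity = capacity              # assumed uniform capacity per server
        self.load = [0] * num_servers         # bytes routed to each server so far
        self.app_table = defaultdict(list)    # file type -> servers holding that type

    def route(self, file_type, size):
        # Greedy step 1: reuse a server that already stores this type, if it has room.
        for server in self.app_table[file_type]:
            if self.load[server] + size <= self.capacity:
                self.load[server] += size
                return server
        # Greedy step 2: otherwise open the least-loaded server not yet used for
        # this type (assumed tie-break: lowest server id).
        used = set(self.app_table[file_type])
        candidates = [s for s in range(len(self.load)) if s not in used]
        if not candidates:
            raise RuntimeError("no server has room for this file")
        server = min(candidates, key=lambda s: self.load[s])
        if self.load[server] + size > self.capacity:
            raise RuntimeError("no server has room for this file")
        self.app_table[file_type].append(server)
        self.load[server] += size
        return server


class StorageNode:
    """Hypothetical node: one unique-chunk fingerprint index per file type."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size          # assumed fixed-size chunking
        self.index = defaultdict(set)         # file type -> set of chunk fingerprints

    def store(self, file_type, data):
        """Deduplicate data against the index of its type; return new chunk count."""
        new_chunks = 0
        type_index = self.index[file_type]    # only this type's index is consulted
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            fingerprint = hashlib.sha1(chunk).hexdigest()
            if fingerprint not in type_index:
                type_index.add(fingerprint)
                new_chunks += 1
        return new_chunks


router = GreedyRouter(num_servers=3, capacity=100)
placements = [router.route("doc", 60), router.route("doc", 30),
              router.route("doc", 20), router.route("vm", 50)]

node = StorageNode(chunk_size=4)
first = node.store("doc", b"aaaabbbbaaaa")   # chunks: aaaa, bbbb, aaaa
second = node.store("doc", b"aaaa")          # already indexed for "doc"
third = node.store("vm", b"aaaa")            # "vm" has its own, empty index
```

In this sketch the third "doc" file spills to a second server only because the first one is full, which is the sense in which files of one type are kept on as few servers as possible. Keeping a separate index per type means a lookup never scans fingerprints of unrelated types, at the cost of missing duplicates shared across types.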