Modern backup systems exploit data deduplication to save storage space, but they suffer from the fragmentation problem that deduplication causes. Fragmentation degrades restore performance because the chunks to be restored are scattered across many different containers. To improve restore performance, the state-of-the-art History Aware Rewriting Algorithm (HAR) was proposed to collect fragmented chunks in the last backup and rewrite them in the next backup. However, because it rewrites fragmented chunks only in the next backup, HAR fails to eliminate the internal fragmentation caused by self-referenced chunks (chunks that exist more than two times in a backup) in the current backup, which degrades restore performance. In this paper, we propose Selectively Rewriting Self-Referenced Chunks (SRSC), a scheme that designs a buffer to simulate a restore cache, identifies internal fragmentation in the cache, and selectively rewrites the fragmented chunks. Our experimental results on two real-world datasets show that SRSC improves restore performance by 45% with an acceptable sacrifice of the deduplication ratio.
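The idea of simulating a restore cache at backup time can be illustrated with a small sketch. It is not the SRSC implementation: the abstract gives no details of the buffer, so the LRU eviction policy, the fixed-size containers, and every name here (`plan_rewrites`, `container_size`, `cache_containers`) are assumptions made for illustration. The sketch also tracks only chunks of the current backup and ignores chunks stored by earlier backups. A repeated chunk is deduplicated when its container would still be in the simulated cache at restore time, and is rewritten otherwise.

```python
from collections import OrderedDict


def plan_rewrites(stream, container_size=4, cache_containers=2):
    """Decide, per chunk of one backup stream, whether to rewrite it.

    stream: chunk fingerprints in backup order (hypothetical input format).
    Returns a list of (fingerprint, container_id, rewritten) tuples.
    Assumption: the restore cache is an LRU cache of whole containers.
    """
    location = {}          # fingerprint -> container holding its latest copy
    cache = OrderedDict()  # simulated restore cache: container ids in LRU order
    plan = []
    current = 0            # id of the container currently being filled
    filled = 0             # number of chunks already in that container

    def touch(cid):
        # Mark a container as most recently used and evict the oldest ones.
        cache[cid] = True
        cache.move_to_end(cid)
        while len(cache) > cache_containers:
            cache.popitem(last=False)

    def store(fp):
        # Append a chunk to the open container, sealing it when it is full.
        nonlocal current, filled
        if filled == container_size:
            current += 1
            filled = 0
        location[fp] = current
        filled += 1
        return current

    for fp in stream:
        if fp not in location:
            # First occurrence in this backup: store it as a unique chunk.
            cid, rewritten = store(fp), False
        elif location[fp] in cache:
            # Self-referenced chunk whose container is still cached:
            # a restore would hit the cache, so deduplicate as usual.
            cid, rewritten = location[fp], False
        else:
            # Self-referenced chunk whose container was evicted: a restore
            # would re-read that container, so store a fresh copy instead.
            cid, rewritten = store(fp), True
        touch(cid)
        plan.append((fp, cid, rewritten))
    return plan
```

With `container_size=2` and `cache_containers=2`, the stream `a b c d e f a` rewrites the second `a`, because its original container has been evicted from the simulated cache by the time it recurs. The stream `a b a` deduplicates it, because the container is still cached. Each rewrite stores a duplicate copy, which is the deduplication-ratio cost the abstract mentions.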