A Study of Practical Deduplication

We collected file system content data from 857 desktop computers at Microsoft over a span of 4 weeks. We analyzed the data to determine the relative efficacy of data deduplication, particularly considering whole-file versus block-level elimination of redundancy. We found that whole-file deduplication achieves about three quarters of the space savings of the most aggressive block-level deduplication for storage of live file systems, and 87% of the savings for backup images. We also studied file fragmentation, finding that it is not prevalent, and updated prior file system metadata studies, finding that the distribution of file sizes continues to skew toward very large unstructured files.
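To make the whole-file versus block-level comparison concrete, the following is a minimal sketch of how the two space savings could be measured over a set of in-memory files. It is an illustration only, not the authors' measurement tool: the function names, the use of SHA-256, and the fixed 4096-byte block size are all assumptions made here for brevity, and the abstract does not state which hashing or chunking method the study used.

```python
import hashlib


def whole_file_savings(files):
    """Fraction of bytes saved when each distinct file is stored only once.

    `files` is a list of bytes objects (hypothetical input format).
    """
    total = sum(len(data) for data in files)
    unique = {}
    for data in files:
        # Identical files hash to the same key, so they are counted once.
        unique[hashlib.sha256(data).digest()] = len(data)
    stored = sum(unique.values())
    return (total - stored) / total if total else 0.0


def block_savings(files, block_size=4096):
    """Fraction of bytes saved when each distinct fixed-size block is stored once.

    Fixed-size blocking is an assumption of this sketch; other chunking
    schemes would give different numbers.
    """
    total = sum(len(data) for data in files)
    unique = {}
    for data in files:
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            unique[hashlib.sha256(block).digest()] = len(block)
    stored = sum(unique.values())
    return (total - stored) / total if total else 0.0


if __name__ == "__main__":
    # Toy data: two identical files plus one that shares only its first block.
    files = [b"A" * 8192, b"A" * 8192, b"A" * 4096 + b"B" * 4096]
    whole = whole_file_savings(files)
    block = block_savings(files)
    print(f"whole-file savings: {whole:.3f}")
    print(f"block-level savings: {block:.3f}")
    print(f"whole-file share of block-level savings: {whole / block:.2f}")
```

On this toy input, whole-file deduplication only removes the exact duplicate, while block-level deduplication also removes the repeated blocks inside and across files. The ratio of the two savings figures is the kind of quantity the abstract reports as "about three quarters" for live file systems; the toy numbers here say nothing about real data.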
Dutch T. Meyer, William J. Bolosky
Added 28 Aug 2011
Updated 28 Aug 2011
Type Journal
Year 2011
Where FAST