Conferences in Research and Practice in Information Technology
  

Online Version - Last Updated - 20 Jan 2012

 

 
Scalability in Recursively Stored Delta Compressed Collections of Files

Molfetas, A., Wirth, A. and Zobel, J.

The archiving and maintenance of vast quantities of data is a key challenge for the current use of information technology. When storing large repositories, possibly mirrored at multiple sites, an archiving system aims to reduce both storage and transmission costs. Delta compression is a key component of many archiving and backup systems. A file may be stored succinctly as a sequence of references to other files in the collection, establishing a dependency relationship between files. On the one hand, exploiting long dependency chains provides excellent compression. On the other hand, if a file is stored so compactly that it depends on hundreds of other files, then retrieving it from the archive may be very time- and resource-consuming. This paper assesses the scalability of delta compression of typical data collections. We use experiments to model and examine the dependency relationship, and quantify the cost of full use of dependencies. We propose strategies to reduce dependencies and yet retain highly effective compression.
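The dependency relationship described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the archive layout (a dictionary mapping each file to the files its delta references directly), the file names, and the helper names `retrieval_closure`, `chain_depth`, and `may_reference` are all hypothetical, and the depth cap is only one plausible way to limit dependencies, not necessarily a strategy proposed in the paper.

```python
from typing import Dict, Optional, Set

# Hypothetical toy archive: file name -> files its delta references directly.
Refs = Dict[str, Set[str]]


def retrieval_closure(refs: Refs, name: str) -> Set[str]:
    """Return every other file that must be read to reconstruct `name`."""
    needed: Set[str] = set()
    stack = [name]
    while stack:
        current = stack.pop()
        for dep in refs.get(current, ()):
            if dep not in needed:
                needed.add(dep)
                stack.append(dep)
    return needed


def chain_depth(refs: Refs, name: str,
                memo: Optional[Dict[str, int]] = None) -> int:
    """Length of the longest reference chain below `name`.

    Assumes the references are acyclic (a file never depends on itself).
    """
    if memo is None:
        memo = {}
    if name not in memo:
        deps = refs.get(name, ())
        memo[name] = max((1 + chain_depth(refs, d, memo) for d in deps),
                         default=0)
    return memo[name]


def may_reference(refs: Refs, candidate: str, max_depth: int) -> bool:
    """Assumed policy: allow a new delta to reference `candidate` only if
    the resulting chain would not exceed `max_depth`."""
    return chain_depth(refs, candidate) + 1 <= max_depth


# Four versions of a file; each later version is stored as a delta.
archive: Refs = {
    "v1": set(),          # stored in full
    "v2": {"v1"},
    "v3": {"v2"},
    "v4": {"v3", "v1"},
}

print(sorted(retrieval_closure(archive, "v4")))
print(chain_depth(archive, "v4"))
print(may_reference(archive, "v4", max_depth=3))
```

In this toy archive, reconstructing `v4` requires reading all three earlier versions, which is the retrieval cost the abstract warns about. Under the assumed depth cap of 3, a new file could not be encoded against `v4` and would have to reference a shallower file or be stored in full, trading some compression for cheaper retrieval.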
Cite as: Molfetas, A., Wirth, A. and Zobel, J. (2014). Scalability in Recursively Stored Delta Compressed Collections of Files. In Proc. Australasian Web Conference (AWC 2014), Auckland, New Zealand. CRPIT, 155. Trotman, A., Cranefield, S. and Yang, J. Eds., ACS. 21-30.