Bita - Tool for differential file updates over http, written in Rust



I’m working on this project for doing differential updates of files over HTTP.

Besides learning Rust, my main goal is to create a tool for efficient full-filesystem updates (think embedded systems/IoT devices) over HTTP. I’m hoping the tool will be generic enough for any kind of file update where one expects to have a local copy that might contain data similar to the remote one.

On compression, bita chunks the input file using a rolling hash, generates a strong hash (blake2) for each chunk, removes duplicate chunks and then compresses each chunk individually. Finally, it generates a dictionary from which the source file can be rebuilt, and places the dictionary and the compressed chunk data into a single archive.
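To make the compression steps a bit more concrete, here is a minimal sketch of chunking and deduplication using only the standard library. It is not bita's actual code. The rolling hash is a plain windowed byte sum, `DefaultHasher` stands in for blake2, per-chunk compression is left out, and all names and constants (`WINDOW`, `MASK`, `MIN_CHUNK`, `MAX_CHUNK`, `Archive`, `build_archive`, `rebuild`) are made up for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hypothetical parameters, not bita's defaults.
const WINDOW: usize = 16; // bytes covered by the rolling sum
const MASK: u32 = (1 << 6) - 1; // boundary when the low 6 bits are all set
const MIN_CHUNK: usize = 16;
const MAX_CHUNK: usize = 256;

/// Split `data` into (start, end) ranges at content-defined boundaries.
/// The rolling "hash" is a simple sum over the last WINDOW bytes of the chunk.
fn chunk_boundaries(data: &[u8]) -> Vec<(usize, usize)> {
    let mut chunks = Vec::new();
    let mut start = 0;
    let mut sum: u32 = 0;
    for i in 0..data.len() {
        sum = sum.wrapping_add(data[i] as u32);
        if i >= start + WINDOW {
            // Drop the byte that just left the window.
            sum = sum.wrapping_sub(data[i - WINDOW] as u32);
        }
        let len = i + 1 - start;
        if (len >= MIN_CHUNK && (sum & MASK) == MASK) || len >= MAX_CHUNK {
            chunks.push((start, i + 1));
            start = i + 1;
            sum = 0;
        }
    }
    if start < data.len() {
        chunks.push((start, data.len()));
    }
    chunks
}

/// Stand-in for a strong hash such as blake2 (assumption: 64 bits is
/// enough for a demo; a real tool would use a cryptographic hash).
fn strong_hash(chunk: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    chunk.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical in-memory archive: an ordered dictionary of chunk hashes
/// plus one stored copy of each unique chunk.
struct Archive {
    dictionary: Vec<u64>,
    store: HashMap<u64, Vec<u8>>,
}

fn build_archive(data: &[u8]) -> Archive {
    let mut dictionary = Vec::new();
    let mut store = HashMap::new();
    for (start, end) in chunk_boundaries(data) {
        let chunk = &data[start..end];
        let hash = strong_hash(chunk);
        dictionary.push(hash);
        // Duplicate chunks are stored only once.
        store.entry(hash).or_insert_with(|| chunk.to_vec());
    }
    Archive { dictionary, store }
}

/// Rebuild the source by walking the dictionary in order.
fn rebuild(archive: &Archive) -> Vec<u8> {
    let mut out = Vec::new();
    for hash in &archive.dictionary {
        out.extend_from_slice(&archive.store[hash]);
    }
    out
}
```

With these toy parameters, 1024 zero bytes end up as four identical 256-byte chunks, so the dictionary has four entries while only one chunk is stored.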

On decompression/unpack, bita downloads the dictionary from the remote archive (using an HTTP range request), then scans any given local seeds for chunks matching the dictionary. Chunks that are not found in the seeds are fetched from the remote archive and inserted into the output file.
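As an illustration of the fetch step, here is a rough sketch of how the chunks missing from the seeds could be turned into HTTP byte ranges. It is not bita's implementation. The `ChunkEntry` layout and the `missing_ranges` and `range_header` helpers are assumptions made for this example. The only fixed part is the `Range: bytes=start-end` syntax, where both offsets are inclusive.

```rust
use std::collections::HashSet;

/// Hypothetical dictionary entry: where a chunk's compressed data
/// lives inside the remote archive.
#[derive(Debug, Clone, PartialEq)]
struct ChunkEntry {
    hash: u64,
    archive_offset: u64,
    compressed_size: u64,
}

/// Return merged (start, end_inclusive) byte ranges covering every chunk
/// whose hash was not found in a local seed.
fn missing_ranges(dictionary: &[ChunkEntry], found: &HashSet<u64>) -> Vec<(u64, u64)> {
    let mut ranges: Vec<(u64, u64)> = dictionary
        .iter()
        .filter(|c| !found.contains(&c.hash) && c.compressed_size > 0)
        .map(|c| (c.archive_offset, c.archive_offset + c.compressed_size - 1))
        .collect();
    ranges.sort();

    let mut merged: Vec<(u64, u64)> = Vec::new();
    for (start, end) in ranges {
        if let Some(last) = merged.last_mut() {
            // Adjacent or overlapping ranges collapse into one request span.
            if start <= last.1 + 1 {
                last.1 = last.1.max(end);
                continue;
            }
        }
        merged.push((start, end));
    }
    merged
}

/// Format ranges as the value of an HTTP `Range` header.
fn range_header(ranges: &[(u64, u64)]) -> String {
    let parts: Vec<String> = ranges
        .iter()
        .map(|(start, end)| format!("{}-{}", start, end))
        .collect();
    format!("bytes={}", parts.join(","))
}
```

Merging adjacent ranges keeps the number of requested spans down when several neighbouring chunks are missing. Whether a server accepts multiple ranges in one request varies, so a real client might issue one request per merged range instead.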

In its current shape the tool is fully functional in that it can compress a file and unpack from a remote archive using local seeds.
Still missing are the ability to download/rebuild an archive locally from a remote one, and the ability to use bita archives as seed input.

I would like anyone who finds this interesting to give (constructive) criticism on what could be improved, application-wise or code-wise, what might be missing, ideas for optimizations, etc. Or just try it out. 🙂

/Olle