orz is an optimized ROLZ/Huffman-based data compressor. On most benchmark data, it now compresses better than bzip2 and faster than gzip.
orz is under active development, and we are looking for ways to improve its performance. Suggestions and contributions are very welcome.
Thanks for making this!
Can it be used as a library, and for stream compression like gzip? E.g., could it wrap an AsyncRead/AsyncWrite byte stream so that the data is compressed/decompressed on the fly?
From the table on the repo, though, it looks like decompression speed is where the competition is for orz...
What about pbzip2? As far as I know, it uses the same algorithm as the ordinary bzip2 utility, but because it runs in parallel it can utilize all CPU cores, so it is up to n times faster than bzip2. So is orz a single-threaded or a multi-threaded program?
orz is currently single-threaded. It can be parallelized simply by splitting the input data into blocks. As @najamelan said, I will try to split it into a compression library and a CLI app in the future, and apply the parallelization in the CLI.
It might be useful to implement the parallelization in the library. Clients might be interested in that. You can provide an API where the user can choose whether to use multiple threads or not. I think that with rayon it should be straightforward to have a reliable implementation (I mean cross platform out of the box etc).
You can always hide features like this behind a feature flag to avoid imposing the dependencies on users that don't need them.
I generally put everything in a lib, except main and clap (command line argument parsing) specific code.
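To make the block-splitting idea concrete, here is a minimal sketch of a library-style function that lets the caller choose between single-threaded and multi-threaded operation. Everything in it is an assumption for illustration: `compress_block` is a toy run-length encoder standing in for a real compressor (it is not orz's API), and `compress_blocks` and its `parallel` parameter are hypothetical names. It uses `std::thread::scope` rather than rayon so that it has no dependencies.

```rust
use std::thread;

/// Hypothetical stand-in for a real block compressor (NOT orz's API):
/// a toy run-length encoder that emits (run length, byte) pairs.
fn compress_block(block: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < block.len() {
        let byte = block[i];
        let mut run = 1usize;
        // Extend the run while the byte repeats, capped at 255 so it fits in a u8.
        while i + run < block.len() && block[i + run] == byte && run < 255 {
            run += 1;
        }
        out.push(run as u8);
        out.push(byte);
        i += run;
    }
    out
}

/// Compress `input` as independent blocks of at most `block_size` bytes.
/// With `parallel == true`, each block is compressed on its own scoped thread.
/// The output blocks are in input order in both modes.
fn compress_blocks(input: &[u8], block_size: usize, parallel: bool) -> Vec<Vec<u8>> {
    assert!(block_size > 0, "block_size must be non-zero");
    if !parallel {
        return input.chunks(block_size).map(compress_block).collect();
    }
    thread::scope(|s| {
        // Spawn all workers first, then join them in order.
        let handles: Vec<_> = input
            .chunks(block_size)
            .map(|block| s.spawn(move || compress_block(block)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Calling `compress_blocks(&data, 1 << 20, true)` would compress 1 MiB blocks concurrently. Spawning one thread per block is only reasonable for a handful of blocks; a real implementation would more likely use a thread pool such as rayon's, which could sit behind a Cargo feature flag as suggested above.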
I think you should mention the --silent command line argument in the --help message.
Also, what extension do you suggest for the encoded result? .orz?
Quick test:
encode
======
time bzip2 for_bzip2.pbrt
84.173u 0.255s 1:24.45 99.9% 0+0k 0+99000io 0pf+0w
time gzip for_gzip.pbrt
18.442u 0.235s 0:18.68 99.9% 0+0k 0+127896io 0pf+0w
time orz encode --silent for_orz.pbrt for_orz.pbrt.orz
12.138u 0.194s 0:12.33 99.9% 0+0k 0+111872io 0pf+0w
total 484424
-rw-r--r-- 1 jan users 50687667 Apr 17 11:26 for_bzip2.pbrt.bz2
-rw-r--r-- 1 jan users 65482538 Apr 17 11:25 for_gzip.pbrt.gz
-rw-r--r-- 1 jan users 322597185 Apr 17 11:26 for_orz.pbrt
-rw-rw-rw- 1 jan users 57271395 Apr 17 11:44 for_orz.pbrt.orz
decode
======
time bunzip2 for_bzip2.pbrt.bz2
12.664u 0.652s 0:13.32 99.9% 0+0k 0+630080io 0pf+0w
time gunzip for_gzip.pbrt.gz
2.583u 0.312s 0:02.89 100.0% 0+0k 0+630080io 0pf+0w
time orz decode --silent for_orz.pbrt.orz from_orz.pbrt
2.986u 0.294s 0:03.28 99.6% 0+0k 0+630080io 0pf+0w
total 1316096
-rw-r--r-- 1 jan users 322597185 Apr 17 11:26 for_bzip2.pbrt
-rw-r--r-- 1 jan users 322597185 Apr 17 11:25 for_gzip.pbrt
-rw-r--r-- 1 jan users 322597185 Apr 17 11:26 for_orz.pbrt
-rw-rw-rw- 1 jan users 57271395 Apr 17 11:44 for_orz.pbrt.orz
-rw-rw-rw- 1 jan users 322597185 Apr 17 11:47 from_orz.pbrt
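For anyone who wants the numbers above as ratios, here is a small sketch that derives them. The sizes and encode times are copied from the listings above (the first column of each `time` line is taken as user CPU seconds); the function and constant names are just illustrative.

```rust
/// Compressed size as a fraction of the original size.
fn ratio(compressed: u64, original: u64) -> f64 {
    compressed as f64 / original as f64
}

/// Size of the uncompressed .pbrt file in bytes, from the listing above.
const ORIGINAL: u64 = 322_597_185;

/// (tool, compressed bytes, encode user CPU seconds), from the listing above.
const RESULTS: [(&str, u64, f64); 3] = [
    ("bzip2", 50_687_667, 84.173),
    ("gzip", 65_482_538, 18.442),
    ("orz", 57_271_395, 12.138),
];

/// One summary line per tool: compressed size in percent and encode throughput.
fn report() -> Vec<String> {
    RESULTS
        .iter()
        .map(|(tool, size, secs)| {
            let percent = ratio(*size, ORIGINAL) * 100.0;
            let mb_per_s = ORIGINAL as f64 / 1e6 / *secs;
            format!("{tool}: {percent:.2}% of original, {mb_per_s:.1} MB/s")
        })
        .collect()
}
```

Printing each line of `report()` gives the size and speed trade-off of the three tools on this one file; a single input is of course not a benchmark.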