String compression: Looking for advice on how to improve efficiency

Hi,
What I have is a simple string compressor that encodes each byte-sized character of a string into 2 bits and decodes it back again. Example:

extern crate itertools;
use itertools::Itertools;

fn main() {
    let myab = "aabbababbaBAs".to_string();

    let bitstr = &myab
        .chars()
        .into_iter()
        .map(|c| match c {
            'A' | 'a' => "00",
            'B' | 'b' => "01",
            _ => "11",
        })
        .collect::<String>();

    println!("{:?}", bitstr);

    let bitvec = &bitstr
        .chars()
        .chunks(8)
        .into_iter()
        .map(|chunk| {
            let mut c = chunk.collect::<String>();
            if c.len() < 8 {
                let l = format!("{:1<1$}", "", (8 - c.len()));
                c.push_str(&l);
                c
            } else {
                c
            }
        })
        .collect::<Vec<_>>();

    println!("{:?}", bitvec);

    let cc: Vec<u8> = bitvec
        .into_iter()
        .map(|x| u8::from_str_radix(&x, 2).unwrap())
        .collect();

    println!("{:?}", cc);

    let str = cc
        .into_iter()
        .map(|x| format!("{:08b}", x))
        .collect::<String>();

    println!("{:?}", str);

    let back = str
        .chars()
        .chunks(2)
        .into_iter()
        .map(|chunk| {
            let c = chunk.collect::<String>();
            match &c[..] {
                "00" => 'A',
                "01" => 'B',
                _ => 'X',
            }
        })
        .collect::<String>();

    println!("{:?}", back);
}

Short description:

  1. take a string
  2. break it into characters
  3. encode characters (00, 01, 11)
  4. collect it back to string
  5. split "01010101" string into 8 char substrings
  6. treat the substrings as bits and convert them into u8
  7. print array
  8. do it in reverse to get back to step 1.

What I am looking for is a more efficient (in speed and memory) implementation of the above procedure. Any advice?

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=0303d332164f7961d0ce5e05d3020f25

  • You could create a type that represents a compressed string and do all the compression work in a single Iterator::fold
  • You can avoid itertools because that will allocate when you use chunks
  • You can avoid allocating strings with format, and use bitwise math instead
  • You can pre-allocate your compressed vector and your uncompressed string when you go to make them, to avoid reallocations when you build them up
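To make the last three points concrete, here is a minimal sketch that packs four 2-bit codes per byte with shifts and pre-allocates both buffers. The function names `encode_char`, `compress` and `decompress` are made up for illustration, and the sketch assumes the same mapping and the same pad-with-ones behaviour as the code in the question.

```rust
/// Map a character to its 2-bit code, mirroring the match in the question:
/// 'A'/'a' -> 00, 'B'/'b' -> 01, anything else -> 11.
fn encode_char(c: char) -> u8 {
    match c {
        'A' | 'a' => 0b00,
        'B' | 'b' => 0b01,
        _ => 0b11,
    }
}

/// Pack four 2-bit codes into each byte, first character in the highest bits.
/// Unused slots in the final byte are filled with 1 bits, as in the question.
fn compress(input: &str) -> Vec<u8> {
    // `input.len()` is the byte length, an upper bound on the character count.
    let mut out = Vec::with_capacity((input.len() + 3) / 4);
    let mut byte = 0u8;
    let mut filled = 0u32; // number of 2-bit slots already used in `byte`
    for c in input.chars() {
        byte = (byte << 2) | encode_char(c);
        filled += 1;
        if filled == 4 {
            out.push(byte);
            byte = 0;
            filled = 0;
        }
    }
    if filled > 0 {
        // Move the codes to the top of the byte and set the padding bits to 1.
        let pad = 2 * (4 - filled);
        out.push((byte << pad) | ((1u8 << pad) - 1));
    }
    out
}

/// Unpack every byte into four characters, highest bits first.
fn decompress(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len() * 4);
    for &b in bytes {
        for slot in (0..4).rev() {
            out.push(match (b >> (2 * slot)) & 0b11 {
                0b00 => 'A',
                0b01 => 'B',
                _ => 'X',
            });
        }
    }
    out
}
```

With the input from the question, `compress("aabbababbaBAs")` should give `[5, 17, 68, 255]`, and `decompress` on that gives `"AABBABABBABAXXXX"`, the same as the original program. No intermediate strings are built, so the only allocations are the two output buffers.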

While I agree my compress function would be cleaner using a more procedural style, I stand by my decompress method, because Compressed::iter is a useful function even outside of decompress, as seen in my Display implementation.
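The code being discussed is not reproduced in this thread, so the following is only a rough sketch of what a type along those lines might look like, not the poster's actual implementation. The struct layout, the `new` constructor and the stored `len` field are assumptions. Storing the length is also a deliberate difference from the question's code: it lets `iter` stop at the last real symbol instead of decoding padding.

```rust
use std::fmt;

/// Hypothetical wrapper: packed 2-bit codes plus the number of symbols stored.
struct Compressed {
    bytes: Vec<u8>,
    len: usize,
}

impl Compressed {
    /// Procedural-style compression, four symbols per byte, highest bits first.
    fn new(input: &str) -> Self {
        let mut bytes = Vec::with_capacity((input.len() + 3) / 4);
        let mut len = 0usize;
        for c in input.chars() {
            let code: u8 = match c {
                'A' | 'a' => 0b00,
                'B' | 'b' => 0b01,
                _ => 0b11,
            };
            if len % 4 == 0 {
                bytes.push(0); // start a new byte every four symbols
            }
            let shift = 6 - 2 * (len % 4);
            *bytes.last_mut().unwrap() |= code << shift;
            len += 1;
        }
        Compressed { bytes, len }
    }

    /// Lazily yields the decoded characters without allocating.
    fn iter(&self) -> impl Iterator<Item = char> + '_ {
        (0..self.len).map(move |i| {
            let shift = 6 - 2 * (i % 4);
            match (self.bytes[i / 4] >> shift) & 0b11 {
                0b00 => 'A',
                0b01 => 'B',
                _ => 'X',
            }
        })
    }

    /// Collects `iter` into a pre-allocated String.
    fn decompress(&self) -> String {
        let mut s = String::with_capacity(self.len);
        s.extend(self.iter());
        s
    }
}

impl fmt::Display for Compressed {
    /// Writes the decoded characters straight to the formatter via `iter`.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for c in self.iter() {
            write!(f, "{}", c)?;
        }
        Ok(())
    }
}
```

Under these assumptions, `Compressed::new("aabbababbaBAs").to_string()` yields `"AABBABABBABAX"`: 13 characters, with no trailing padding symbols. Both `decompress` and `Display` reuse the same iterator, which is the reuse argument made above.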


Don't get me wrong. I'm sure there is a lot of room for making what I wrote more useful by wrapping it up nicely in some type(s) and so on.

I was just having a dig at what I see as OFPD around here quite a lot.

I made the claim that my suggested code was easier to comprehend and would perform better.

That may not be true of course. I know many who find all that FP noise quite easy on the mind. If it's performance one is after, well, one just has to measure it and find out.
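For anyone who wants to do that measuring, a crude timing harness built on `std::time::Instant` could look like the sketch below. `time_it` and `placeholder_compress` are hypothetical names; the placeholder only exists so the sketch compiles on its own and should be swapped for the real candidates. Numbers from a loop like this are only a rough guide, and only in a release build (`cargo run --release`); a dedicated benchmarking crate such as Criterion gives more trustworthy results.

```rust
use std::time::{Duration, Instant};

/// Run `f` on `input` `runs` times and return the last result together
/// with the total elapsed wall-clock time.
fn time_it<F>(f: F, input: &str, runs: u32) -> (Vec<u8>, Duration)
where
    F: Fn(&str) -> Vec<u8>,
{
    let start = Instant::now();
    let mut last = Vec::new();
    for _ in 0..runs {
        // Keeping the result makes it harder for the optimizer to drop the call.
        last = f(input);
    }
    (last, start.elapsed())
}

/// Stand-in workload (NOT a real compressor): keeps the low two bits of
/// every input byte. Replace it with the implementations being compared.
fn placeholder_compress(input: &str) -> Vec<u8> {
    input.bytes().map(|b| b & 0b11).collect()
}
```

A call such as `let (out, elapsed) = time_it(placeholder_compress, "abab", 1_000);` returns the output of the last run plus the total time, so two implementations can be checked for equal output and compared on the same input.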


What is going on here?

Once again I find a post of mine hidden and a message telling me it has been flagged and "the community feels it is offensive, abusive, or a violation of..."

Clearly this is not the "community" feeling anything; it is one anonymous user who for some odd reason thinks it was offensive or abusive. I point to the fact that another community member hit the like button on that post.

I always hope that the tone of my writing makes it clear that I mean no offence. Quite the contrary, my intention is to be humorous and hopefully even informative and helpful. To that end I put a smiley in there to make that fact extra clear. Apparently emoji carry no meaning for some people.

Frankly I have to say that I find having the finger pointed at me out of the dark by some anonymous person for an unspecified crime somewhat insulting and offensive in itself.


@ZiCog, it is not okay to call people out as having a disorder, even if you weren't serious about it or just wrote it as a joke. It is neither.
You could have written your post without the insult and everything would have been fine. Please take that into account next time.

