String compression: looking for advice on how to improve efficiency

Hi,
What I have is a simple string compressor that encodes each byte-sized character of a string as a 2-bit code and decodes it again. Example:

extern crate itertools;
use itertools::Itertools;

fn main() {
    let myab = "aabbababbaBAs".to_string();

    // Encode each character as a two-character "bit" string.
    let bitstr = &myab
        .chars()
        .map(|c| match c {
            'A' | 'a' => "00",
            'B' | 'b' => "01",
            _ => "11",
        })
        .collect::<String>();

    println!("{:?}", bitstr);

    // Split into 8-character chunks, padding the last chunk with '1's.
    let bitvec = &bitstr
        .chars()
        .chunks(8)
        .into_iter()
        .map(|chunk| {
            let mut c = chunk.collect::<String>();
            if c.len() < 8 {
                let l = format!("{:1<1$}", "", (8 - c.len()));
                c.push_str(&l);
                c
            } else {
                c
            }
        })
        .collect::<Vec<_>>();

    println!("{:?}", bitvec);

    let cc: Vec<u8> = bitvec
        .into_iter()
        .map(|x| u8::from_str_radix(&x, 2).unwrap())
        .collect();

    println!("{:?}", cc);

    let str = cc
        .into_iter()
        .map(|x| format!("{:08b}", x))
        .collect::<String>();

    println!("{:?}", str);

    let back = str
        .chars()
        .chunks(2)
        .into_iter()
        .map(|chunk| {
            let c = chunk.collect::<String>();
            match &c[..] {
                "00" => 'A',
                "01" => 'B',
                // Also matches the "11" padding, so padded input decodes with trailing 'X's.
                _ => 'X',
            }
        })
        .collect::<String>();

    println!("{:?}", back);
}

Short description:

  1. take a string
  2. break it into characters
  3. encode characters (00, 01, 11)
  4. collect it back to string
  5. split the "01010101" string into 8-character substrings
  6. treat each substring as bits and convert it into a u8
  7. print the array
  8. do it in reverse to get back to step 1

What I am looking for is a more efficient (in speed and memory) implementation of the above procedure. Any advice?

  • You could create a type that represents a compressed string and do all the compression work in a single Iterator::fold
  • You can avoid itertools because that will allocate when you use chunks
  • You can avoid allocating strings with format, and use bitwise math instead
  • You can pre-allocate your compressed vector and your uncompressed string when you go to make them, to avoid reallocations when you build them up
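To make these suggestions concrete, here is a minimal sketch, assuming the same code table as the question (a/A is 00, b/B is 01, anything else is 11) and the same padding with 1 bits. The names `encode_symbol`, `compress`, and `decompress` are made up for illustration, and it uses a plain loop rather than `Iterator::fold`. Passing the symbol count to `decompress` is also an assumption, added so the padding is not decoded as extra characters.

```rust
/// Map one character to its 2-bit code (same table as the question).
fn encode_symbol(c: char) -> u8 {
    match c {
        'A' | 'a' => 0b00,
        'B' | 'b' => 0b01,
        _ => 0b11,
    }
}

/// Pack four 2-bit codes into each byte, most significant bits first.
/// Unused bits in the last byte are set to 1, as in the question.
fn compress(input: &str) -> Vec<u8> {
    let symbols = input.chars().count();
    // Pre-allocate: one byte per four symbols, rounded up.
    let mut out = Vec::with_capacity((symbols + 3) / 4);
    let mut byte = 0u8;
    let mut filled = 0u32; // number of codes currently held in `byte`
    for c in input.chars() {
        byte = (byte << 2) | encode_symbol(c);
        filled += 1;
        if filled == 4 {
            out.push(byte);
            byte = 0;
            filled = 0;
        }
    }
    if filled > 0 {
        // Move the codes to the top of the byte and fill the rest with 1s.
        let pad = 8 - 2 * filled;
        out.push((byte << pad) | ((1u8 << pad) - 1));
    }
    out
}

/// Decode the first `symbols` codes back into characters.
fn decompress(bytes: &[u8], symbols: usize) -> String {
    let mut out = String::with_capacity(symbols);
    for i in 0..symbols.min(bytes.len() * 4) {
        let shift = 6 - 2 * (i % 4);
        let code = (bytes[i / 4] >> shift) & 0b11;
        out.push(match code {
            0b00 => 'A',
            0b01 => 'B',
            _ => 'X',
        });
    }
    out
}

fn main() {
    let input = "aabbababbaBAs";
    let packed = compress(input);
    println!("{:?}", packed); // [5, 17, 68, 255]
    println!("{}", decompress(&packed, input.chars().count())); // AABBABABBABAX
}
```

No intermediate strings are built, and each output buffer is allocated once up front.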

While I agree my compress function would be cleaner using a more procedural style, I stand by my decompress method, because Compressed::iter is a useful function even outside of decompress, as seen in my Display implementation.
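For readers following along, a type along these lines might look roughly like the sketch below. This is a guess at the general shape, not the code being discussed: the field names, the stored symbol count, and the decoding table ('A', 'B', 'X') are all assumptions.

```rust
use std::fmt;

/// Hypothetical wrapper: packed 2-bit codes plus the number of symbols stored.
struct Compressed {
    bytes: Vec<u8>,
    len: usize,
}

impl Compressed {
    /// Yield the decoded characters one at a time, without allocating.
    /// Assumes `len` is at most `4 * bytes.len()`; otherwise indexing panics.
    fn iter(&self) -> impl Iterator<Item = char> + '_ {
        (0..self.len).map(move |i| {
            let shift = 6 - 2 * (i % 4);
            match (self.bytes[i / 4] >> shift) & 0b11 {
                0b00 => 'A',
                0b01 => 'B',
                _ => 'X',
            }
        })
    }

    /// Collect the decoded characters into a pre-allocated String.
    fn decompress(&self) -> String {
        let mut s = String::with_capacity(self.len);
        s.extend(self.iter());
        s
    }
}

impl fmt::Display for Compressed {
    /// Reuses `iter`, so printing needs no intermediate String.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for c in self.iter() {
            write!(f, "{}", c)?;
        }
        Ok(())
    }
}

fn main() {
    let c = Compressed { bytes: vec![5, 17, 68, 255], len: 13 };
    println!("{}", c); // AABBABABBABAX
    assert_eq!(c.decompress(), c.to_string());
}
```

The point of the design is that `decompress` and `Display` both sit on top of the same iterator, so other consumers (counting, searching, comparing) can use it too.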

Don't get me wrong. I'm sure there is a lot of room for making what I wrote more useful by wrapping it up nicely in some type(s) and so on.

I was just having a dig at what I see as OFPD around here quite a lot.

I made the claim that my suggested code was easier to comprehend and would perform better.

That may not be true of course. I know many who find all that FP noise quite easy on the mind. If it's performance one is after, well, one just has to measure it and find out.
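As a rough illustration of "measure it", here is a small timing sketch using only the standard library. The helper `time_it` and the two functions `pack_via_strings` and `pack_bitwise` are hypothetical stand-ins for a string-based and a bitwise encoder; they are not anyone's code from this thread. Wall-clock timing like this is crude, and results depend heavily on optimization level, so build with `--release`; a proper benchmark harness would give more trustworthy numbers.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Run `f` `iterations` times and return the total elapsed time.
fn time_it<F: FnMut()>(iterations: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        f();
    }
    start.elapsed()
}

/// String-based variant: build a "0101..." string, then parse 8 characters at a time.
fn pack_via_strings(input: &str) -> Vec<u8> {
    let mut bits: String = input
        .chars()
        .map(|c| match c {
            'A' | 'a' => "00",
            'B' | 'b' => "01",
            _ => "11",
        })
        .collect();
    // Pad the final group with '1's up to a multiple of 8.
    while bits.len() % 8 != 0 {
        bits.push('1');
    }
    bits.as_bytes()
        .chunks(8)
        .map(|chunk| u8::from_str_radix(std::str::from_utf8(chunk).unwrap(), 2).unwrap())
        .collect()
}

/// Bitwise variant: write each 2-bit code straight into its slot in the byte.
fn pack_bitwise(input: &str) -> Vec<u8> {
    // `input.len()` counts bytes, which is an upper bound on the number of chars.
    let mut out = Vec::with_capacity((input.len() + 3) / 4);
    let mut byte = 0xFFu8; // unused slots stay 1, matching the padding above
    let mut slot = 0u32;
    for c in input.chars() {
        let code: u8 = match c {
            'A' | 'a' => 0b00,
            'B' | 'b' => 0b01,
            _ => 0b11,
        };
        let shift = 6 - 2 * slot;
        byte = (byte & !(0b11u8 << shift)) | (code << shift);
        slot += 1;
        if slot == 4 {
            out.push(byte);
            byte = 0xFF;
            slot = 0;
        }
    }
    if slot > 0 {
        out.push(byte);
    }
    out
}

fn main() {
    let input = "aabbababbaBAs".repeat(1_000);
    // Check both variants agree before comparing their speed.
    assert_eq!(pack_via_strings(&input), pack_bitwise(&input));

    let strings = time_it(1_000, || {
        black_box(pack_via_strings(black_box(&input)));
    });
    let bitwise = time_it(1_000, || {
        black_box(pack_bitwise(black_box(&input)));
    });
    println!("string-based: {:?}", strings);
    println!("bitwise:      {:?}", bitwise);
}
```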

What is going on here?

Once again I find a post of mine hidden and a message telling me it has been flagged and "the community feels it is offensive, abusive, or a violation of..."

Clearly this is not the "community" feeling anything, it is one anonymous user that for some odd reason thinks it was offensive or abusive. I point to the fact that another community member hit the like button on that post.

I always hope the tone of my writing makes clear that I mean no offence. Quite the contrary, my intention is to be humorous and hopefully even informative and helpful. To that end I put a smiley in there to make that fact extra clear. Apparently emoji carry no meaning for some people.

Frankly I have to say that I find having the finger pointed at me out of the dark by some anonymous person for an unspecified crime somewhat insulting and offensive in itself.

@ZiCog it is not okay to call people out as having a disorder, even if you weren't serious about it or just wrote it as a joke. It is neither.
You could have written your post without the insult and everything would have been fine. Please take that into account next time.