Using knowledge of the code to avoid having to use Option


#1

I have this function. It might not be the best code, but I'm using it as an example to ask a question.

use std::collections::HashMap;

fn impl1(values: &[Vec<&str>]) {
  let mut seen = HashMap::new();
  // 1. assign bits to all unique strings
  for strings in values {
    for string in strings {
      let next_avail = seen.len();
      seen.entry(string).or_insert(next_avail);
    }
  }
  // 2. collect bits
  let maxbits = seen.len() - 1;
  for strings in values {
    let mut collected = 0;
    for string in strings {
      let bit = *seen.get(string).unwrap();
      let bit_reversed = seen.len() - bit - 1;
      collected |= 1 << bit_reversed;
    }
    //println!("{:?}", (strings, collected));
  }
}

On line 17 (the unwrap) an Option is used. But I think I can prove this will always succeed, because I loop over the same strings in the first loop (lines 6-11). Is there any way to write this so that I don't have to use Option or anything else that can panic?


#2

You could combine the two loops into one. The only information that you use in the second loop that you don’t know in the first loop is seen.len().
If you really need the bit flags in reversed order (why?), you can reverse them afterwards.

In the first loop you can just use:

let bit = *seen.entry(string).or_insert(next_avail);
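Put together, the single-loop version might look like the sketch below. The function name collect_bits, the u64 bitset width, and the returned tuple are assumptions made for illustration and are not part of the original code. The bits are left in first-seen order here (bit 0 is the first string encountered).

```rust
use std::collections::HashMap;

/// Hypothetical helper: returns one bitset per inner array, plus the number
/// of distinct strings seen. Assumes fewer than 64 distinct strings, since
/// a larger bit index would overflow the shift on a u64.
fn collect_bits(values: &[Vec<&str>]) -> (Vec<u64>, usize) {
    let mut seen: HashMap<&str, usize> = HashMap::new();
    let mut result = Vec::with_capacity(values.len());
    for strings in values {
        let mut collected = 0u64;
        for &string in strings {
            let next_avail = seen.len();
            // or_insert returns a reference to the stored value, whether it
            // was just inserted or already present, so no get/unwrap is needed.
            let bit = *seen.entry(string).or_insert(next_avail);
            collected |= 1 << bit;
        }
        result.push(collected);
    }
    (result, seen.len())
}
```

Because the lookup and the insertion happen in the same call, there is no Option left to unwrap.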

#3

More generally, you could use iterators (iter and values), e.g.

for bit in seen.values() {
  let bit_reversed = maxbits - bit;
  collected |= 1 << bit_reversed;
}


#4

This would just set all relevant bits to 1.
If there are n different strings, seen.values() will yield the numbers 0 through n-1. Probably not in that order, but each of them exactly once.

What @flip101 wants is a separate bitset for every string array, which only contains the bits of the strings in that array.
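To illustrate the point, here is a small sketch of the loop from #3 wrapped in a function. The name mask_from_values and the u64 width are assumptions for the example. Whatever the input arrays were, the result only depends on how many distinct strings are in the map.

```rust
use std::collections::HashMap;

/// Hypothetical helper: ORs together one bit for every value in `seen`.
/// Assumes the values are exactly 0..n for n = seen.len(), with n <= 64.
fn mask_from_values(seen: &HashMap<&str, usize>) -> u64 {
    let maxbits = seen.len().saturating_sub(1);
    let mut collected = 0u64;
    for bit in seen.values() {
        let bit_reversed = maxbits - bit;
        collected |= 1 << bit_reversed;
    }
    // Every index from 0 to n-1 was visited, so the lowest n bits are all set.
    collected
}
```

With three distinct strings this gives 0b111, which is the same for every array and so carries no per-array information.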


#5

Because it is specified that the first encountered value is to be encoded as 1 and the second as 2. When I start shifting the first bit to the left, it becomes 2 and the second one will be 1.

How can they be reversed afterwards?
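One possible way to do the reversal after the loop, once the total number of distinct strings n is known, is sketched below. The function name reverse_low_bits and the u64 width are assumptions. It relies on the standard reverse_bits method on integers, which mirrors all 64 bits, and then shifts the result back down so that only the lowest n bits are swapped around.

```rust
/// Hypothetical helper: reverses the order of the lowest `n` bits of `bits`.
/// Assumes n <= 64 and that no bit at position n or higher is set.
fn reverse_low_bits(bits: u64, n: u32) -> u64 {
    if n == 0 {
        // Nothing to reverse, and a shift by 64 would overflow.
        return 0;
    }
    // Mirror all 64 bits, then move the top n bits back to the bottom.
    bits.reverse_bits() >> (u64::BITS - n)
}
```

For n = 3 this maps bit 0 to bit 2, bit 1 to bit 1, and bit 2 to bit 0, which matches seen.len() - bit - 1 in the original code.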