I was trying to find a way to get the index of an element in a slice at compile time. My first attempt:
pointer arithmetic
```rust
const ANIMALS: &[&str] = [ "cat", "dog", "shark", "squirrel" ].as_slice();

const CAT_INDEX: usize = get_index(ANIMALS, "cat").unwrap();
const SHARK_INDEX: usize = get_index(ANIMALS, "shark").unwrap();
const SQUIRREL_INDEX: usize = get_index(ANIMALS, "squirrel").unwrap();

const fn get_index(arr: &[&str], of: &str) -> Option<usize> {
    let of_ptr = of.as_ptr();
    let of_len = of.len();
    let mut str_index = 0;
    let ptr = arr.as_ptr();
    let len = arr.len();
    while str_index < len {
        let curr = unsafe { *ptr.add(str_index) };
        let curr_ptr = curr.as_ptr();
        let curr_len = curr.len();
        if curr_len != of_len {
            continue;
        }
        let mut byte_index = 0;
        while byte_index < curr_len {
            let of_byte = unsafe { *of_ptr.add(byte_index) };
            let curr_byte = unsafe { *curr_ptr.add(byte_index) };
            if curr_byte != of_byte {
                break;
            }
            byte_index += 1;
        }
        if byte_index == curr_len {
            return Some(str_index);
        }
        str_index = str_index.checked_add(1).unwrap();
    }
    None
}
```
Building this on both my own machine and in the cloud hangs for a ridiculously long time:
Compiling playground v0.0.1 (/playground)
error: constant evaluation is taking a long time
Researching a bit further brought me to the message by @H2CO3 as well as str::eq_ignore_ascii_case, which was recently stabilized for use in const contexts in rustc 1.89. Combining the two:
pattern matching
```rust
const fn get_str_array_index(arr: &[&str], of: &str) -> Option<usize> {
    let bytes = of.as_bytes();

    const fn index_of(arr: &[&str], bytes: &[u8], index: usize) -> Option<usize> {
        // pattern matching as an iterator
        match arr {
            [head, rest @ ..] => {
                if head.as_bytes().eq_ignore_ascii_case(bytes) {
                    return Some(index);
                }
                index_of(rest, bytes, index + 1)
            }
            _ => None,
        }
    }

    index_of(arr, bytes, 0)
}
```
Compilation is back to instant. Yet I can't wrap my head around why the first version is so slow in the first place. How is the pattern matching version so much faster than plain old pointer arithmetic?
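One thing that stands out on re-reading the first attempt: the `continue` on a length mismatch jumps straight back to the loop condition without advancing `str_index`. The lookup for "shark" would then keep comparing against "cat" forever, which would explain the long constant evaluation better than pointer arithmetic being slow. Below is a minimal sketch under that assumption. The name `get_index_fixed` is made up for illustration, the raw pointers are swapped for safe indexing, and the comparison is exact rather than ASCII-case-insensitive.

```rust
/// Hypothetical variant of `get_index` in which every non-matching path
/// reaches the increment. Uses safe slice indexing instead of raw pointers.
const fn get_index_fixed(arr: &[&str], of: &str) -> Option<usize> {
    let of_bytes = of.as_bytes();
    let mut str_index = 0;
    while str_index < arr.len() {
        let curr = arr[str_index].as_bytes();
        // Only compare byte by byte when the lengths agree.
        if curr.len() == of_bytes.len() {
            let mut byte_index = 0;
            while byte_index < curr.len() && curr[byte_index] == of_bytes[byte_index] {
                byte_index += 1;
            }
            // All bytes matched: this is the element we were looking for.
            if byte_index == curr.len() {
                return Some(str_index);
            }
        }
        // Reached on a length mismatch as well as a byte mismatch,
        // so the outer loop always advances.
        str_index += 1;
    }
    None
}

const ANIMALS: &[&str] = &["cat", "dog", "shark", "squirrel"];

// A `match` instead of `unwrap()` keeps the sketch usable on compilers
// where `Option::unwrap` is not yet callable in const contexts.
const SHARK_INDEX: usize = match get_index_fixed(ANIMALS, "shark") {
    Some(index) => index,
    None => panic!("element not found"),
};
```

If this reading is right, the "cat" lookup in the original succeeds only because the match is at index 0, before any length mismatch can trigger the `continue`.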
Digging some more led me to const_str as well. Good to know, for future reference.