So I was chatting with my AI friend about determining if two English words were anagrams of each other. Along the way it came up with the usual solutions of comparing the sorted letters of each word, counting the frequencies of letters in each word, removing matching letters until there is something or nothing left. Eventually I got it to use the fundamental theorem of arithmetic to do the job. All good idiomatic Rust solutions but dog slow compare to mine:
pub fn are_anagrams_crust(word_1: &str, word_2: &str) -> Result<bool, &'static str> {
fn word_product(word: &str) -> Result<u128, &'static str> {
const PRIMES: [u128; 26] = [
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83,
89, 97, 101,
];
let mut pw1 = &raw const *word as *const u8;
let mut product: u128 = 1;
for _ in 0..word.len() {
let mut c = unsafe { *(pw1) };
if c.is_ascii_lowercase() {
c -= 32;
}
if c.is_ascii_uppercase() {
let prime = PRIMES[(c - b'A') as usize];
product = product.checked_mul(prime).ok_or("First word too long")?;
} else {
return Err("Non alphabetic character");
}
pw1 = pw1.wrapping_add(1);
}
Ok(product)
}
Ok(word_product(word_1) == word_product(word_2))
}
Which has a smdgen of unsafe
and is written in a more like C fashion (Crust).
But can you spot the error above? It all compiles so it must work, right? Clippy does not complain, Miri does not complain. What could it be?
Well I missed a couple of ?
in returning the final result. So we can end up comparing an error with an error and producing OK even if there is errors. Which Clippy thinks is checking the errors I guess.
Took me a while to spot that.
Anyway, I'd be interested if anyone can do this as fast or faster in safe and idiomatic Rust? Just now Cargo bench puts it well in the lead.