Motivated by the question in [Solved] Best way to sort a string of key value pairs - #9 by vitalyd about the best way to sort key/value pairs, I thought about sorting with a known distribution of keys. If each key is unique and a bound is known then it can be faster to just insert each value in its position. This way the typical logarithmic factor of the sorting algorithm is avoided.
For that I wrote the following code:
fn main() {
let input = "Key0=value,Key3=vvalue,Key2=vealue,Key10=vralue,Key4=vtalue,Key1=vyalue,Key17=vualue".to_owned();
let mut kvs=vec![None;100];
for item in input.split(",").map(str::trim)
{
let mut item_s=item.split("=");
let key=item_s.next().unwrap();
let value=item_s.next().unwrap();
let keyid:usize=key["Key".len()..].parse().unwrap();
kvs[keyid]=Some(value);
}
println!("{:?}", itertools::join(kvs.iter().enumerate().filter_map(|(keyid,opt)|match opt{
None => None,
Some(value) => Some(format!("Key{}={}",keyid,value))
}),","));
}
I feel this code considerably dirty and I thought that maybe someone can make it more elegant. More concretely: it is possible to rewrite the loop in a functional way?
Yeah. I use tuple patterns all the time as closure args, but it's never occurred to me that this implies all the pattern syntax is available there. As I say, obvious in retrospect, but a nice realisation.
Thank you. That code is a lot nicer. I forgot about the existence of the for_each method. Extracting the pair from the str within an impl str::FromStr is also very nice touch.
Matter of taste I suppose, but I would prefer to use a Regex for the string parsing.
let rx = regex::Regex::new(r"Key(\d+)=(\w+)").unwrap();
rx.captures_iter(input).for_each(|cap| {
let id = cap.get(1).unwrap().as_str().parse::<usize>().unwrap();
let val = cap.get(2).unwrap().as_str();
// ...
});
(Don't let the unwraps fool you, none of these steps can fail)
Unless your regex was statically precompiled your solution will most probably not be as performant as manually parsing.
Since the post already tries to optimize the complexity of a sorting algorithm from O(n · log(n)) to O(n + M) (where M is the upper bound), performance seems to be important, here
It sets the length of the ArrayVec accordingly (so if there’s no “=“, say, it’ll set the length of the ArrayVec to 1). Trying to index beyond that length will panic (in this example, inside the fold() closure).
There’s a try_fold that can be likely used to make this handle errors/edge cases, but this was just demo code .
Oh yes sure, it's not an array but a vec. I was thinking more of something akin to my ::stackvec::TryFromIterator (a crate that ended up being too similar to ::arrayvec despite its different initial purpose).