I want to do a simple thing: take as input a list of almost-unique strings (all strings should be unique, except "_") and output a list of unique Strings.
The problem is that the language forces me to do unnecessary allocations.
I cannot just use HashSet<&str>: the code below does not compile,
so I have to use HashSet<String> to check that each string is unique.
Is there any way to avoid allocating memory for String duplicates (taking into account that I do not want to change the generate_names signature)?
use std::collections::HashSet;

fn main() {
    let mut names: Vec<String> = vec!["_".into(), "second".into(), "_".into()];
    generate_names(&mut names).unwrap();
}

fn generate_names(names: &mut [String]) -> Result<(), String> {
    // First pass: record every non-placeholder name, rejecting duplicates.
    let mut known_names = HashSet::<&str>::with_capacity(names.len());
    for name in names.iter() {
        if name != "_" {
            if known_names.contains(name.as_str()) {
                return Err(format!("duplicate name {}", name));
            }
            known_names.insert(name.as_str());
        }
    }
    // Second pass: replace each "_" placeholder with a fresh unique name.
    // This is where compilation fails: `iter_mut` needs a mutable borrow of
    // `names` while `known_names` still holds `&str` borrows into it.
    for (i, name) in names.iter_mut().enumerate() {
        if name == "_" {
            let templ = format!("x{}", i);
            let new_name = new_unique_name(&known_names, &templ);
            *name = new_name;
            known_names.insert(name.as_str());
        }
    }
    Ok(())
}

fn new_unique_name(names: &HashSet<&str>, templ: &str) -> String {
    unimplemented!();
}
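For reference, here is a minimal sketch of the HashSet<String> workaround mentioned above. It compiles, but it clones every non-placeholder name, which is exactly the allocation in question. The body of new_unique_name is a hypothetical stand-in (the original leaves it unimplemented); the "_<n>" suffix scheme is only an assumption for illustration.

```rust
use std::collections::HashSet;

// Hypothetical stand-in: returns `templ` if it is free, otherwise
// appends "_0", "_1", ... until an unused name is found.
fn new_unique_name(names: &HashSet<String>, templ: &str) -> String {
    if !names.contains(templ) {
        return templ.to_string();
    }
    let mut n = 0usize;
    loop {
        let candidate = format!("{}_{}", templ, n);
        if !names.contains(candidate.as_str()) {
            return candidate;
        }
        n += 1;
    }
}

fn generate_names(names: &mut [String]) -> Result<(), String> {
    let mut known_names = HashSet::<String>::with_capacity(names.len());
    for name in names.iter() {
        if name != "_" {
            // `insert` returns false if the value was already present.
            // The `clone` here is the extra allocation per name.
            if !known_names.insert(name.clone()) {
                return Err(format!("duplicate name {}", name));
            }
        }
    }
    for (i, name) in names.iter_mut().enumerate() {
        if name.as_str() == "_" {
            let templ = format!("x{}", i);
            let new_name = new_unique_name(&known_names, &templ);
            // The set owns its own copy, so no borrow of `names` is kept.
            known_names.insert(new_name.clone());
            *name = new_name;
        }
    }
    Ok(())
}
```

Because the set owns its strings, it never borrows from names, so the mutable second loop is accepted; the price is one extra String per entry.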
No, I need to preserve the position of the unique names in the input list.
In other words, I need almost exactly what the Rust code does:
scan the input array and replace placeholders with unique names, keeping the other strings in their places.
Yes, it allocates just one vector of pointers, which is still quite cheap compared to the hashset and the format! calls. But I agree that the vec should not be strictly necessary, algorithm-wise: it is a trick to circumvent the sometimes coarse granularity of borrows. If you use two loops, Rust does not see that the items you work on are disjoint.
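The vector-of-pointers trick described above can be sketched roughly as follows. This is a minimal sketch under assumptions, not necessarily the exact code from the thread: the `placeholders` name and the body of new_unique_name (a hypothetical stand-in that appends "_<n>" on collision) are introduced here for illustration only.

```rust
use std::collections::HashSet;

// Hypothetical stand-in: returns `templ` if it is free, otherwise
// appends "_0", "_1", ... until an unused name is found.
fn new_unique_name(names: &HashSet<&str>, templ: &str) -> String {
    if !names.contains(templ) {
        return templ.to_string();
    }
    let mut n = 0usize;
    loop {
        let candidate = format!("{}_{}", templ, n);
        if !names.contains(candidate.as_str()) {
            return candidate;
        }
        n += 1;
    }
}

fn generate_names(names: &mut [String]) -> Result<(), String> {
    let mut known_names = HashSet::<&str>::with_capacity(names.len());
    // The one extra allocation: mutable references to the placeholders.
    let mut placeholders: Vec<(usize, &mut String)> = Vec::new();

    // A single `iter_mut` pass hands out one disjoint `&mut String` per
    // element, so each element is either stored mutably or borrowed as `&str`.
    for (i, name) in names.iter_mut().enumerate() {
        if name.as_str() == "_" {
            placeholders.push((i, name));
        } else if !known_names.insert(name.as_str()) {
            // `insert` returns false if the name was already in the set.
            return Err(format!("duplicate name {}", name));
        }
    }

    // Fill in the placeholders; `names` itself is not touched again.
    for (i, name) in placeholders {
        let templ = format!("x{}", i);
        *name = new_unique_name(&known_names, &templ);
        known_names.insert(name.as_str());
    }
    Ok(())
}
```

The signature of generate_names is unchanged, and no String is cloned: the set only holds `&str` views into the slice, while the vec holds the mutable references that are filled in afterwards.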
With quite a bit of effort to do it soundly (Miri does not complain), I have managed to no longer need the extra vec of mutable references. However, it involves complex (and thus possibly unsound) unsafe code, just to save one vector allocation; use at your own risk: