I have seen some topics similar to this, but I think they're not much help.
I want to implement a feature that consumes an iterator and returns the corresponding value of the item of the iterator according to the map. But the following code doesn't compile. For the sake of brevity, I omitted some code and used concrete types, and assumed String cannot be cloned.
use std::{collections::HashMap, slice::Iter};
struct MyIter<'a> {
keys: Iter<'a, u32>,
map: &'a mut HashMap<u32, String>,
}
impl<'a> Iterator for MyIter<'a> {
type Item = &'a String;
fn next(&mut self) -> Option<Self::Item> {
let key = self.keys.next()?;
let string = match self.map.get(&key) {
Some(s) => s,
None => {
todo!("Insert an empty string and return it");
}
};
Some(string)
}
}
error: lifetime may not live long enough
--> src/lib.rs:18:9
|
8 | impl<'a> Iterator for MyIter<'a> {
| -- lifetime `'a` defined here
9 | type Item = &'a String;
10 | fn next(&mut self) -> Option<Self::Item> {
| - let's call the lifetime of this reference `'1`
...
18 | Some(string)
| ^^^^^^^^^^^^ associated function was supposed to return data with lifetime `'a` but it is returning data with lifetime `'1`
I don't know why I am returning data with lifetime '1 but not 'a, and how to fix this? Is it safe if I return string forcibly via unsafe code like follow? Thanks.
// Change `Some(string)` to this
unsafe { Some(&*(string as *const String)) }
You’re probably running into the problem that &mut has an invariant lifetime, but I’m not sure exactly how. If you’re ok with not inserting into the map for missing keys, you can write something like this instead:
use std::{collections::HashMap};
struct MyIter<'a, I> {
keys: I,
map: &'a HashMap<u32, String>,
}
impl<'a, I:Iterator<Item=u32>> Iterator for MyIter<'a, I> {
type Item = &'a str;
fn next(&mut self) -> Option<Self::Item> {
let key = self.keys.next()?;
let string = match self.map.get(&key) {
Some(s) => s.as_str(),
None => &""
};
Some(string)
}
}
Note that filling in your todo! will be impossible: Inserting a value into the HashMap may move other values, which can break references returned from previous calls to next— If you want the HashMap to be modified like this, you’ll need to use some other strategy to make it happen.
This also makes your unsafe workaround unsound: After the internal structure of the map is changed, some of the references will be left dangling.
Thanks for your explaination and workaround, which is sound and enlightening.
In your second demonstration, you choose return dummy instead of creating an empty string immediately. But actually, I forgot to metion, in my project, creating a string is not a thing simple as String::default but through a slightly time-consuming function, and the string changes as the key changes. For the sake of brevity, you can recognize creating a string is not by String::default but u32::to_string().
Do you have a workaround for this case? Thanks.
Hmmm... one is tempted to use some sort of buffer, but the difficulty is that you can't actually mutate such a buffer on each next call -- that would be a lending iterator (where you can only hold on to each Item until the subsequent next, as opposed to be able to collect them all and access them all at the same time). Making a faux-iterator (a LendingIterator, but there is no such standard trait yet) instead of an Iterator is an option.
The pre-populate approach would work with Iterator, but you do pay the cost up-front.
You could perhaps (if the keys iterator is Clone) pre-scan to allocate enough uninitialized or Default storage, and then populate it as you go. If the values have a reasonable default, this could avoid unsafe; if not, you might need to use MaybeUninit or the like. POC.
If the keys iterator isn't clone... you could still pre-scan but collect into a Vec<Either<(key, (key, String))>> or something like that. Even sloppier POC.
But... you should probably keep track of what you consumed in this case, or you'll insert your dummy default/uninitialized values. Increasingly messy POC.
Thanks again for giving such a complex workaround and detailed explanation, which solved my problem perfectly (although understanding them takes me some time).
/// A streaming iterator of mutable references, where `advance` consumes the iterator.
pub trait MutStreamingIterator: Sized {
type Item;
fn advance(self) -> Option<Self>;
fn get(&mut self) -> Option<&mut Self::Item>;
fn size_hint(&self) -> (usize, Option<usize>);
}
// Clone is not required for the iterator, but needed for the collect that takes by value
fn collect<T: Clone, I: MutStreamingIterator<Item=T>>(mut iter: I) -> Vec<T> {
let mut values = Vec::with_capacity(iter.size_hint().0);
loop {
match iter.advance() {
Some(mut new_iter) => {
if let Some(item) = new_iter.get() {
// item is a mutable reference here!
values.push(item.clone())
}
iter = new_iter;
},
None => break
}
}
values
}