IterMut and lifetime issue

Hi, I'm trying to write a map-like collection but that allows the item to be looked up by two keys, name or uuid. Internally using a map from name->uuid and then one from uuid->T.

This is mostly working except for implementing iter_mut. Big lifetime issues that I haven't been able to figure out even after reading the Advanced Lifetimes page.

the code in question is here at line 78. Iter works, but IterMut needs to call get_mut() and then everything blows up :frowning: Help?

Also, if a crate already exists for this, I'd be happy to use that but couldn't find one.

Thanks! -- Andy

You won't be able to write that mutable iterator without unsafe code (or some other contortion). Mutable iterators, in general, are virtually impossible to do without unsafe code. We can go into some details on why that is, if you're interested, but for now I'm wondering whether you can use a HashMap<(Option<String>, Option<UUID>), T> (or custom key struct that's essentially the same thing) internally instead? The public API would require that at least one of String/UUID are specified, but internally you can use Options.

I don't see how that would make it possible to do a lookup on just name or just uuid without iterating all keys.

Could you talk a little bit more about how using unsafe might allow this to work? Unsafe doesn't turn off the borrow checker... the only thing I can think of is casting to ptrs would get it to ignore us? :slight_smile:

If you also had a uuid->name map, then you could iter_mut() the uuid->T map and just get the name.

Or maybe a little less wasteful, and closer to what @vitalyd said, something like:

struct UuidKey(Uuid, String);
// impl Hash and Eq by Uuid only, and impl Borrow<Uuid> for lookups

pub struct Table<T> {
    name_to_uuid: HashMap<String, Uuid>,
    items: HashMap<UuidKey, T>,
}

// now IterMut can just iterate items
pub struct IterMut<'a, T: 'a> {
    iter: hash_map::Iter<'a, UuidKey, T>,
}

impl<'a, T> Iterator for IterMut<'a, T> {
    type Item = (&'a str, &'a Uuid, &'a mut T);

    #[inline]
    fn next(&mut self) -> Option<(&'a str, &'a Uuid, &'a mut T)> {
        self.iter
            .next()
            .map(|(key, item)| (&key.1, &key.0, item))
    }
}

So lookup by name still requires two lookups, but iteration can just use the one hash table.

2 Likes

Well, that UuidKey might be overkill. You could just as well have:

pub struct Table<T> {
    name_to_uuid: HashMap<String, Uuid>,
    items: HashMap<Uuid, (String, T)>,
}

I think @cuviper fleshed out sort of what I had in mind (maybe with Rc<str> instead of String to avoid copies). Is there a particular use you'd like to optimize for?

Borrow checker is worried about your iterator returning a mutable reference to the same value across next() calls. As far as it's concerned, nothing prevents that based on the types involved and next() method signature. More specifically, next() borrows the iterator mutably - while next is executing, you have unique/exclusive access to the &'a mut HashMap<Uuid, T> - all good so far. The problem arises when you try to return the mutable reference from next(). The lifetime of that mutable borrow is not at all related to the lifetime of the &mut self that was used to call next(). So once next() returns, the caller has a mutable reference to T. Since the mutable borrow of the iterator has ended, you can call next again. At this point, the borrow checker has no way to ensure that the next T reference you return does not alias with the one you returned on the previous iteration. And this is the crux of the problem. This is a common issue with implementing mutable iterators, and some other techniques exist (notably, streaming iterators) although they won't lend themselves to for loop desugaring.

Unsafe code would allow you to deliberately return a mutable reference by going through a raw pointer first; once you get a raw pointer, borrow checker does not track it anymore. This, however, opens you up to UB if you do end up returning mutably aliased data (and hence why it's unsafe :slight_smile:). But if your data structure and/or iterator internally ensure (some way) that they will not yield the same element more than once, then you can drop down to unsafe code and guarantee that invariant yourself.

1 Like

Good point. I'd even consider Arc<str> so the table can impl Send and Sync. The overhead of atomic ref-counting shouldn't matter much if insertions are relatively rare, and anyway we know the count will be stable at just 2, never actually concurrently updated. :slight_smile:

1 Like

Thanks! I'll try that!