Creating an iterator over Mutex contents: cannot infer an appropriate lifetime

#1

I’m trying to implement an iterator for a type that contains data within a mutex. It seems like it should work if the struct that implements the Iterator trait contains the appropriate MutexGuard, so that the lock is retained for the lifetime of the iterator instance, but I’m running into some trouble making the lifetimes work out. I have the full code in a playground, but here’s the core of the code:

pub struct Iter<'mutex, T: 'mutex> {
    guard: MutexGuard<'mutex, T>,
}

impl<'mutex, T> Iterator for Iter<'mutex, T> {
    type Item = &'mutex T;

    fn next(&mut self) -> Option<Self::Item> {
        let x: &'mutex T = &self.guard;
        Some(x)
    }
}

The error I’m getting is

error[E0495]: cannot infer an appropriate lifetime for borrow expression due to conflicting requirements
  --> src/main.rs:21:28
   |
21 |         let x: &'mutex T = &self.guard;
   |                            ^^^^^^^^^^^
   |

And the following additional note:

First, the lifetime cannot outlive the anonymous lifetime #1 defined on the method body at 20:5, so that reference does not outlive borrowed content. But, the lifetime must be valid for the lifetime 'mutex as defined on the impl at 17:6, so that reference does not outlive borrowed content.

If I’m understanding this correctly, it’s saying that the &T returned by next can’t outlive the scope of the method because the borrow of self.guard that provides access to it ends when next does. But I also understand that the &T returned by next needs a lifetime specified, and since the references need to be valid for the lifetime of the iterator, 'mutex, the lifetime of the MutexGuard seems like the appropriate choice.

It seems like the &mut self input to next should be valid for 'mutex, but I don’t see any way to express that given the requirements of the Iterator trait.

Is there a way to make this approach work, or is my approach fatally flawed? And if it is flawed, is there an safe way to implement an iterator over data protected by a Mutex?

0 Likes

#2

What you are trying to obtain is a bit annoying at the state of the art.

To recap the problem (in practice you understood the issue), the return value of next is not bound to the lifetime of Iter, but on the other hand you are trying to return a reference to the guard (yes, in this case it is a smart reference, but it still needs the MutexGuard itself to exist). So you could drop Iter while having a reference to &T, and that is not allowed.

The real problem is that Item should have a lifetime associated with the object itself, but we cannot express that without GATs.
What you can do is using a StreamingIterator instead of a normal Iterator, which won’t allow to call next() without releasing the object for the current iteration. This approach has some downsides, but if you are lucky enough you should be able to use it without any problem.

1 Like

#3

Hey, @dodomorandi, that’s very helpful! With StreamingIterator, I’m able to get the trivial case of iterating over a single value behind a mutex to work:

impl<'mutex, T> StreamingIterator for Iter<'mutex, T> {
    type Item = T;

    fn advance(&mut self) {}
    fn get(&self) -> Option<&Self::Item> {
        let x = &self.guard;
        Some(x)
    }
}

But when I try to extend this to something more like my practical application—iterating over a collection behind a Mutex—I’m still struggling. I was able to specifically get a Vec working if I keep track of the index in my struct that implements StreamingIterator:

pub struct Iter<'mutex, T: 'mutex> {
    guard: MutexGuard<'mutex, Vec<T>>,
    idx: Option<usize>,
}

impl<'mutex, T> Iter<'mutex, T> {
    fn new(guard: MutexGuard<'mutex, Vec<T>>) -> Self {
        Self { guard, idx: None }
    }
}

impl<'mutex, T> StreamingIterator for Iter<'mutex, T> {
    type Item = T;

    fn advance(&mut self) {
        self.idx = Some(self.idx.map_or(0, |i| i + 1));
    }

    fn get(&self) -> Option<&Self::Item> {
        self.idx.and_then(|i| self.guard.get(i))
    }
}

But that won’t be generalizable to collections that aren’t linearly indexed like HashMap. My first thought was to make use of StreamingIterator::convert, but I can’t seem to make that work. Alternatively, if I try to manage the underlying Iterator as a member of my struct which implements StreamingIterator, I run afoul of lifetime conflicts again:

pub struct Iter<'mutex, T: 'mutex> {
    guard: MutexGuard<'mutex, Vec<T>>,
    iter: std::slice::Iter<'mutex, T>,
}

impl<'mutex, T> Iter<'mutex, T> {
    fn new(guard: MutexGuard<'mutex, Vec<T>>) -> Self {
        let iter = guard.iter();
        Self { guard, iter }
    }
}
error[E0597]: `guard` does not live long enough
   |
19 | impl<'mutex, T> Iter<'mutex, T> {
   |      ------ lifetime `'mutex` defined here
20 |     fn new(guard: MutexGuard<'mutex, Vec<T>>) -> Self {
21 |         let iter = guard.iter();
   |                    ^^^^^-------
   |                    |
   |                    borrowed value does not live long enough
   |                    argument requires that `guard` is borrowed for `'mutex`
22 |         Self { guard, iter }
23 |     }
   |     - `guard` dropped here while still borrowed

error[E0505]: cannot move out of `guard` because it is borrowed
   |
19 | impl<'mutex, T> Iter<'mutex, T> {
   |      ------ lifetime `'mutex` defined here
20 |     fn new(guard: MutexGuard<'mutex, Vec<T>>) -> Self {
21 |         let iter = guard.iter();
   |                    ------------
   |                    |
   |                    borrow of `guard` occurs here
   |                    argument requires that `guard` is borrowed for `'mutex`
22 |         Self { guard, iter }
   |                ^^^^^ move out of `guard` occurs here

And if I try to work around that issue by changing my storage of the iterator to an Option:

pub struct Iter<'mutex, T: 'mutex> {
    guard: MutexGuard<'mutex, Vec<T>>,
    iter: Option<std::slice::Iter<'mutex, T>>,
}

I end up with a similar problem as before implementing advance for StreamingIterator because calling iter() on the Vec behind the guard requires a borrow of the guard through &mut self which can’t outlive the function:

impl<'mutex, T> StreamingIterator for Iter<'mutex, T> {
    type Item = T;

    fn advance(&mut self) {
        if let Some(i) = self.iter {
            i.next();
        } else {
            self.iter = Some(&self.guard.iter())
        }
    }
    ...
}
error[E0495]: cannot infer an appropriate lifetime for lifetime parameter in function call due to conflicting requirements
   |
32 |             self.iter = Some(self.guard.iter())
   |                                         ^^^^
   |
note: first, the lifetime cannot outlive the anonymous lifetime #1 defined on the method body at 28:5...
   |
28 | /     fn advance(&mut self) {
29 | |         if let Some(i) = self.iter {
30 | |             i.next();
31 | |         } else {
32 | |             self.iter = Some(self.guard.iter())
33 | |         }
34 | |     }
   | |_____^
note: ...so that reference does not outlive borrowed content
   |
32 |             self.iter = Some(self.guard.iter())
   |                              ^^^^^^^^^^
note: but, the lifetime must be valid for the lifetime 'mutex as defined on the impl at 25:6...
   |
25 | impl<'mutex, T> StreamingIterator for Iter<'mutex, T> {
   |      ^^^^^^
   = note: ...so that the expression is assignable:
           expected std::option::Option<std::slice::Iter<'mutex, _>>
              found std::option::Option<std::slice::Iter<'_, _>>

Is there a way to get more general implementation for StreamingIterator over iterable types behind Mutex, or is my approach flawed?

1 Like

#4

Very nice evaluation!

I think that for general case we are in the self-referential realm, because:

  1. we got a MutexGuard which can be deferenced respect to its lifetime. In other words, the reference to the underlying element cannot outlive MutexGuard
  2. we need that, given MutexGuard<'m, T>, &'_ T: IntoIterator, in order to use into_iter() and get an iterator over references
  3. we need to store the iterator in the same struct as MutexGuard

At the end, we need that the iterator has a reference to MutexGuard but, at the same time, has the same lifetime, creating a self-referential struct.

As you probably know, you can try to use rental to create this kind of design. Maybe it is possible to create a better implementation with Pin, but I don’t have any experience with it so I am not sure.

0 Likes

#5

Thanks again, @dodomorandi!

I had heard of rental, but hadn’t used it before, so the suggestion that it’s appropriate for this self-referential situation was really helpful. I think you’re also probably right that Pin would also be applicable, but since that’s still experimental, and I’d like something that works on stable, I opted to look into what I could do with rental.

I’m getting pretty close! Here’s what I have now:

pub struct MutexVec<T>(Mutex<Vec<T>>);

impl<T> MutexVec<T> {
    pub fn iter(&self) -> rented::IterGuard<T> {
        rented::IterGuard::new(self.0.lock().unwrap(), |mutex_guard| mutex_guard.iter())
    }
}

rental! {
    pub mod rented {
        use super::*;

        #[rental_mut]
        pub struct IterGuard<'mutex, T: 'static> {
            guard: MutexGuard<'mutex, Vec<T>>,
            iter: std::slice::Iter<'guard, T>,
        }
    }
}

This allows for usage like this:

let mv = MutexVec(Mutex::new(vec![5, 6]));

mv.iter().rent_mut(|x| {
    for n in x {
        println!("{}", n);
    }
});

Which is pretty good, but what I’m really looking for is being able to implement the Iterator or StreamingIterator trait, so I can have ergonomics like:

for n in mv.iter() {
    println!("{}", n);
}

If we try the naïve approach, we’re back to the same conflicting lifetime requirements:

impl<'mutex, T> Iterator for Iter<'mutex, T> {
    type Item = &'mutex T;

    fn next(&mut self) -> Option<Self::Item> {
        self.iter_guard.rent_mut(|x| x.next())
    }
}
error[E0495]: cannot infer an appropriate lifetime for lifetime parameter `'a` due to conflicting requirements
  --> src/eleventh.rs:19:40
   |
19 |         self.iter_guard.rent_mut(|x| x.next())
   |                                        ^^^^
   |
note: first, the lifetime cannot outlive the anonymous lifetime #3 defined on the body at 19:34...
  --> src/eleventh.rs:19:34
   |
19 |         self.iter_guard.rent_mut(|x| x.next())
   |                                  ^^^^^^^^^^^^
note: ...but the lifetime must also be valid for the lifetime 'mutex as defined on the impl at 15:6...
  --> src/eleventh.rs:15:6
   |
15 | impl<'mutex, T> Iterator for Iter<'mutex, T> {
   |      ^^^^^^

Is this the limit of what can be done with rental? We can return a shared reference from the shared suffix of the rental struct with ref_rent, but that still only gets us a lifetime of 'iter, when we need 'mutex.

1 Like

#6

Sorry, I completely forgot that rental cannot give back a transparent access to the data. AFAIK, it is probably not possible to impl Iterator when using rental… or if it is possible, it could be pretty difficult to achieve.

Nevertheless, I tried to give you a different solution, splitting the guard from the iter and using the streaming iterators.

use std::ops::Deref;
use std::sync;
use streaming_iterator::StreamingIterator;

pub struct MutexGuard<'m, T>(sync::MutexGuard<'m, T>);

impl<'m, T> MutexGuard<'m, T> {
    pub fn from(
        mutex: &'m mut sync::Mutex<T>,
    ) -> Result<Self, sync::PoisonError<sync::MutexGuard<'m, T>>> {
        mutex.lock().map(|guard| Self(guard))
    }
}

impl<'g, 'm: 'g, T: 'm> MutexGuard<'m, T>
where
    &'g T: IntoIterator,
{
    pub fn iter(&'g self) -> MutexGuardIter<'g, T> {
        let iter = self.0.deref().into_iter();

        MutexGuardIter { iter, elem: None }
    }
}

pub struct MutexGuardIter<'g, T>
where
    &'g T: IntoIterator,
{
    iter: <&'g T as IntoIterator>::IntoIter,
    elem: Option<<&'g T as IntoIterator>::Item>,
}

impl<'g, T> StreamingIterator for MutexGuardIter<'g, T>
where
    &'g T: IntoIterator,
{
    type Item = <&'g T as IntoIterator>::Item;

    fn advance(&mut self) {
        self.elem = self.iter.next()
    }

    fn get(&self) -> Option<&Self::Item> {
        self.elem.as_ref()
    }
}

A couple of things:

  1. maybe the lifetimes for impl MutexGuard where &T: IntoIterator could be written in a simpler way. However it is explicit that the lifetime 'g represents the lifetime of the MutexGuard, which is used for the MutexGuardIter and is bound to the lifetime of the mutex 'm ('g cannot outlive 'm).
  2. we smash against the limitations of StreamingIterator trait: the get function must return a constant reference to Item, which means that we cannot write MutexGuardIterMut with the streaming_iterator crate. You can try to write down your StreamingIterator trait (and get some missing features like zip), in order to obtain both const and mut iterators.

As you can see, with this examples we are in that grey area or Rust in which we can write it but there is a good risk of creating something that does not fit into the rest of the ecosystem :frowning_face:

1 Like

#7

This is great, I think I’ve got my head around the general problem much more fully thanks to all your help, @dodomorandi.

I worked through your code and here’s how it works out in terms of ergonomics:

let mut mx = sync::Mutex::new(vec![5, 6]);
let mg = MutexGuard::from(&mut mx).unwrap();
let mut iter = mg.iter();
while let Some(x) = iter.next() {
    println!("{}", x);
}

I don’t think any of that can be simplified since you need the mx binding to live long enough for the call to MutexGuard::from and you need the mg binding to live long enough for the call to MutexGuard::iter.

To better understand why this approach worked, I tried writing some functions to see if I could eliminate the need for the MutexGuard struct, since it just wraps std::sync::MutexGuard. My first try was:

fn mutex_iter<'g, 'm: 'g, T: 'm>(mutex: &'m mut sync::Mutex<T>) -> MutexGuardIter<'g, T>
where
    &'g T: IntoIterator,
{
    let iter = mutex.lock().unwrap().deref().into_iter();

    MutexGuardIter { iter, elem: None }
}

But that fails for basically the same reason you can’t eliminate the mg binding above:

error[E0716]: temporary value dropped while borrowed
   |
25 | fn mutex_iter<'g, 'm: 'g, T: 'm>(mutex: &'m mut sync::Mutex<T>) -> MutexGuardIter<'g, T>
   |               -- lifetime `'g` defined here
...
29 |     let iter = mutex.lock().unwrap().deref().into_iter();
   |                ^^^^^^^^^^^^^^^^^^^^^--------------------- temporary value is freed at the end of this statement
   |                |
   |                creates a temporary which is freed while still in use
   |                argument requires that borrow lasts for `'g`

So, instead I tried this:

fn mutex_guard_iter<'g, 'm: 'g, T: 'm>(
    mutex_guard: &'g sync::MutexGuard<'m, T>,
) -> MutexGuardIter<'g, T>
where
    &'g T: IntoIterator,
{
    MutexGuardIter { iter: mutex_guard.deref().into_iter(), elem: None }
}

Which does eliminate the need for the extra wrapper type, but the usage ergonomics are basically the same:

let mut mx = sync::Mutex::new(vec![5, 6]);
let mg = mx.lock().unwrap();
let mut iter = mutex_guard_iter(&mg);
while let Some(x) = iter.next() {
    println!("{}", x);
}

At this point it occurred to me: all the code we’ve worked through sill relies on the same constraint: the guard, the (potentially streaming) iterator and the current value all need to have lifetimes dependent upon one another, but due to the way lifetimes work, we can’t package up all that complexity behind a nice interface while still allowing for the nice ergonomics we want.

If we’re willing to just return the std::sync::MutexGuard guard to the consumer:

fn iterable_mutex_guard<T>(mutex: &mut sync::Mutex<T>) -> sync::MutexGuard<T>
where
    T: IntoIterator,
{
    mutex.lock().unwrap()
}

the ergonomics are great:

let mut mx = sync::Mutex::new(vec![5, 6]);
for x in iterable_mutex_guard(&mut mx).iter() {
    println!("{}", x);
}

But we just improved ergonomics at the expense of worse encapsulation. Now we can also do this:

let mut mx = sync::Mutex::new(vec![5, 6]);
let mut mg = iterable_mutex_guard(&mut mx);
*mg = vec![7, 8];
for x in mg.iter() {
    println!("{}", x);
}

At that point, we might as well just be returning the &mut std::sync::Mutex itself.

Perhaps the thing that gives best balance between ergonomics and encapsulation would be a wrapper around the guard that gives easy access to the Iterator interface, but without totally exposing the value the guard owns:

pub struct IterableGuard<'m, T>(sync::MutexGuard<'m, T>);

impl<'m, I> IterableGuard<'m, Vec<I>> {
    pub fn from(mutex: &'m mut sync::Mutex<Vec<I>>) -> Self {
        mutex.lock().map(Self).unwrap()
    }

    pub fn iter(&self) -> impl Iterator<Item = &I> {
        self.0.iter()
    }
}

I implement this specifically for Vec because I need to call slice::iter, which can return an owned slice::Iter from a shared reference to self rather than IntoIterator that takes ownership of self. (There may be a way to do it with using a IntoIterator bound on &T rather than T, but I got into the conflicting lifetimes trap again when I tried.)

This results in nice ergonomics:

let mut mx = sync::Mutex::new(vec![5, 6]);
let ig = IterableGuard::from(&mut mx);
for x in ig.iter() {
    println!("{}", x);
}

But it doesn’t expose the guarded value, since IterableGuard doesn’t implement Deref. And it can be expanded to support container types other than Vec with a pretty modest amount of code.

It’s even possible to make things a bit more generic and generalize separately over the guard and collection type:

pub struct IterableGuard<T, G: Deref<Target = T>>(G);

impl<'m, T> IterableGuard<T, sync::MutexGuard<'m, T>> {
    pub fn from_mutex(mutex: &'m mut sync::Mutex<T>) -> Self {
        mutex.lock().map(Self).unwrap()
    }
}

impl<'a, T> IterableGuard<T, sync::RwLockReadGuard<'a, T>> {
    pub fn from_read_lock(mutex: &'a mut sync::RwLock<T>) -> Self {
        mutex.read().map(Self).unwrap()
    }
}

impl<I, G: Deref<Target = Vec<I>>> IterableGuard<Vec<I>, G> {
    pub fn iter(&self) -> impl Iterator<Item = &I> {
        self.0.iter()
    }
}

impl<K, V, G: Deref<Target = HashMap<K, V>>> IterableGuard<HashMap<K, V>, G>
where
    K: std::cmp::Eq + std::hash::Hash,
{
    pub fn iter(&self) -> impl Iterator<Item = (&K, &V)> {
        self.0.iter()
    }
}

Though I wonder if that’s sacrificing too much in terms of readability.

1 Like

#8

I discovered something interesting, so I have another update! I was thinking about making a crate for the code above, and while searching crates.io to see if there was already anything similar, I came across the guardian crate which describes itself this way:

Guardian provides owned mutex guards for refcounted mutexes.

If the mutex is refcounted using an Rc or an Arc , it is not necessary for the guard to be scoped in this way – it could instead carry with it a ref to the mutex in question, which allows the guard to be held for as long as is necessary. This is particularly useful for writing iterators where it is advantageous to hold a read lock for the duration of the iteration.

Since in my intended use cases, the locks would be inside Rcs or Arcs anyway, this is ideal. With it, I’m able to get almost exactly what I want:

use std::sync;
use guardian::ArcMutexGuardian;

struct ConcurrentIterableVec<T> {
    items: sync::Arc<sync::Mutex<T>>,
}

impl<I> ConcurrentIterableVec<Vec<I>> {
    fn new(val: Vec<I>) -> Self {
        Self { items: sync::Arc::new(sync::Mutex::new(val)) }
    }

    fn iterable_lock(&self) -> IterableGuardian<I> {
        IterableGuardian(ArcMutexGuardian::take(sync::Arc::clone(&self.items)).unwrap())
    }
}

struct IterableGuardian<I: 'static>(ArcMutexGuardian<Vec<I>>);

impl<I> IterableGuardian<I> {
    fn iter(&self) -> std::slice::Iter<I> {
        self.0.iter()
    }
}

And that small amount of code gives nearly ideal ergonomics:

let rs = ConcurrentIterableVec::new(vec![1, 2, 3]);
for r in rs.iterable_lock().iter() {
    println!("{}", r);
}
1 Like

#9

+1 for the guardian crate. It’s exactly what I needed so thanks for posting. :smiley: For what it’s worth, I didn’t get any value out of wrapping it in a tuple struct since the Guardian is convenient as a transparent accessor, but I can see that would be nice if you wanted a locked iterator specifically.

0 Likes