Iterate references to global collection

I have a global collection that I often need to iterate over. I am using tokio, so I wrapped it in a RwLock:

lazy_static::lazy_static! {
    static ref CONNECTIONS: tokio::sync::RwLock<Vec<Connection>> = tokio::sync::RwLock::new(Vec::new());
}

My goal is to have an iterator that keeps the collection locked for as long as it exists, and that I can use like this:

for c in read_iter().await {
    println!("{}", c.name);
}

I am struggling hard with the lifetimes. This is what I came up with by just adding one wherever the compiler complained:

pub struct ConnectionIterator<'a> {
    guard: tokio::sync::RwLockReadGuard<'a, std::vec::Vec<Connection>>,
    pos: usize,
}

impl<'a> Iterator for ConnectionIterator<'a> {
    type Item = &'a Connection;

    fn next(&mut self) -> std::option::Option<Self::Item> {
        if self.pos >= self.guard.len() {
            return None;
        } else {
            let elem = &self.guard[self.pos];
            self.pos += 1;
            return Some(elem);
        }
    }
}

pub async fn read_iter<'a>() -> ConnectionIterator<'a> {
    let guard = CONNECTIONS.read().await;
    return ConnectionIterator { guard, pos: 0 };
}

The compiler really tries to tell me something in detail but I just do not understand it.

error[E0495]: cannot infer an appropriate lifetime for lifetime parameter in function call due to conflicting requirements
   --> src\containers.rs:168:25
    |
168 |             let elem = &self.guard[self.pos];
    |                         ^^^^^^^^^^
    |
note: first, the lifetime cannot outlive the anonymous lifetime #1 defined on the method body at 164:5...
   --> src\containers.rs:164:5
    |
164 | /     fn next(&mut self) -> std::option::Option<Self::Item> {
165 | |         if self.pos >= self.guard.len() {
166 | |             return None;
167 | |         } else {
...   |
171 | |         }
172 | |     }
    | |_____^
note: ...so that reference does not outlive borrowed content
   --> src\containers.rs:168:25
    |
168 |             let elem = &self.guard[self.pos];
    |                         ^^^^^^^^^^
note: but, the lifetime must be valid for the lifetime `'a` as defined on the impl at 161:6...
   --> src\containers.rs:161:6
    |
161 | impl<'a> Iterator for ConnectionIterator<'a> {
    |      ^^
note: ...so that the types are compatible
   --> src\containers.rs:164:59
    |
164 |       fn next(&mut self) -> std::option::Option<Self::Item> {
    |  ___________________________________________________________^
165 | |         if self.pos >= self.guard.len() {
166 | |             return None;
167 | |         } else {
...   |
171 | |         }
172 | |     }
    | |_____^
    = note: expected  `std::iter::Iterator`
               found  `std::iter::Iterator`

First, using an explicit return in those positions is unidiomatic; the last expression of a block is already its value.

Anyway, the lifetime annotation on RwLockReadGuard<'a, Vec<Connection>> is only an upper bound on how long the read guard can live. The guard is allowed to be dropped earlier than that. This means that you cannot produce a borrow into the insides of the guard annotated with that lifetime: such a borrow could outlive the guard, even though it cannot outlive 'a, and that would be unsound. Consider reading this. To get around this, your iterator would have to be a borrow of the read guard.
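To make "a borrow of the read guard" concrete, here is a minimal sketch. It uses std::sync::RwLock as a stand-in so it has no dependencies (an assumption on my part; tokio's guard carries a lifetime parameter in the same way), and the names Connection, ConnectionIter and collect_names are hypothetical, not from the original code. The point is that the item lifetime 'g is the lifetime of the borrow of the guard, not the guard's own lifetime parameter.

```rust
use std::sync::{RwLock, RwLockReadGuard};

// Hypothetical stand-in for the Connection type from the question.
pub struct Connection {
    pub name: String,
}

// The iterator holds a slice borrowed from the guard for 'g,
// so the borrow checker will not let it outlive the guard.
pub struct ConnectionIter<'g> {
    conns: &'g [Connection],
    pos: usize,
}

impl<'g> ConnectionIter<'g> {
    // 'lock is the guard's own lifetime parameter (the upper bound);
    // 'g is how long the guard itself is borrowed.
    pub fn new<'lock>(guard: &'g RwLockReadGuard<'lock, Vec<Connection>>) -> Self {
        ConnectionIter { conns: guard.as_slice(), pos: 0 }
    }
}

impl<'g> Iterator for ConnectionIter<'g> {
    // Items are tied to the borrow of the guard, not to 'lock.
    type Item = &'g Connection;

    fn next(&mut self) -> Option<Self::Item> {
        let elem = self.conns.get(self.pos)?;
        self.pos += 1;
        Some(elem)
    }
}

// Usage: the guard is a local that outlives the iterator.
pub fn collect_names(lock: &RwLock<Vec<Connection>>) -> Vec<String> {
    let guard = lock.read().unwrap();
    let names: Vec<String> = ConnectionIter::new(&guard)
        .map(|c| c.name.clone())
        .collect();
    names
}
```

In practice the hand-written iterator is not needed, since guard.iter() already yields a std::slice::Iter with exactly this borrowing behaviour; the sketch only spells out where the lifetime has to come from.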

That said, does your Connection type happen to contain IO primitives such as a TcpStream? In that case, please be aware that letting IO primitives be owned by a single task and using channels for communication is almost always better than putting them behind some kind of lock.

OK, thanks a lot. I think I understand the problem, but I am still not sure how to proceed in my case.

My goal is to have something like this, where all the implementation details like the RwLock are encapsulated in one module:

for c in read_iter().await {
    println!("{}", c.name);
}

When I write it like the following code, everything works, but I have to make the global variable accessible to everyone:

pub async fn print_names() {
    for c in CONNECTIONS.read().await.iter() {
        println!("{}", c.container.name);
    }
}

Is something like this even possible?

pub async fn read_iter() -> ??? {
    CONNECTIONS.read().await.iter()
}

On the other hand, I am thinking about other designs. I could have a function that takes a closure: for_each<F>(mut f: F) where F: FnMut(&Connection). I probably cannot execute async operations in the closure, though? But I am more and more convinced that doing slow IO while holding a lock would not make much sense anyway.
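For reference, the closure-based design could look roughly like the sketch below. It again uses std::sync::RwLock as a dependency-free stand-in for the tokio lock (an assumption; a const RwLock::new in a static needs Rust 1.63 or later), and Connection, add_connection and the exact for_each signature are hypothetical. The closure here is synchronous, so nothing is awaited while the lock is held.

```rust
use std::sync::RwLock;

// Hypothetical stand-in for the Connection type from the question.
pub struct Connection {
    pub name: String,
}

// Private to the module: callers never see the lock itself.
static CONNECTIONS: RwLock<Vec<Connection>> = RwLock::new(Vec::new());

// Hypothetical helper so the sketch has some data to iterate over.
pub fn add_connection(name: &str) {
    CONNECTIONS
        .write()
        .unwrap()
        .push(Connection { name: name.to_string() });
}

// Calls `f` once per connection while the read lock is held.
// The guard is a local here, so borrowing from it is unproblematic.
pub fn for_each<F>(mut f: F)
where
    F: FnMut(&Connection),
{
    let guard = CONNECTIONS.read().unwrap();
    for conn in guard.iter() {
        f(conn);
    }
}
```

A caller would then write something like for_each(|c| println!("{}", c.name)); and never touch the global directly.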

To your last point: no, there are no IO primitives behind the lock in this case, but thanks for the info. I will keep that in mind.

Unfortunately, that is not possible with the exact API you want. The iterator must be a borrow of the guard, and you can't return a value together with a reference to that same value. That said, it is possible to return something you can turn into an iterator.

pub struct IterGuard<'a> {
    guard: tokio::sync::RwLockReadGuard<'a, std::vec::Vec<Connection>>,
}

impl<'a, 'b> IntoIterator for &'b IterGuard<'a> {
    type IntoIter = std::slice::Iter<'b, Connection>;
    type Item = &'b Connection;

    fn into_iter(self) -> Self::IntoIter {
        self.guard.iter()
    }
}
impl<'a> IterGuard<'a> {
    pub fn iter(&self) -> std::slice::Iter<'_, Connection> {
        self.guard.iter()
    }
}

// CONNECTIONS is a static, so the guard can be given the 'static lifetime.
pub async fn read_iter() -> IterGuard<'static> {
    let guard = CONNECTIONS.read().await;
    IterGuard { guard }
}

Using this abstraction requires either an extra & like this:

for conn in &read_iter().await {
    println!("{}", conn.name);
}

or an extra function call:

for conn in read_iter().await.iter() {
    println!("{}", conn.name);
}

Thanks a lot for all the trouble! This really makes sense now.
