Help with Iterator

Hi,

I'm trying to create an in-memory DB that can take advantage of the RUST Iterator API, the DB would out the data, this would be just some indexes stored as HashMap<String, Vec<String>>. See code below. How can I implement something like get_filtered to return an iterator, my objective is to add some methods that allow me to query the vector inside the hash map an apply filter/map operations to them.

use std::collections::HashMap;

struct DB {
    index: HashMap<String, Vec<String>>,
}

impl DB {
    fn new() -> Self {
        DB {
            index: HashMap::new(),
        }
    }

    fn get_entity(&self, entity: &str) -> std::slice::Iter<'_, String> {
        match self.index.get(entity) {
            Some(vec) => vec.iter(),
            _ => [].iter(),
        }
    }

    fn get_filtered(&self, entity: &str, filter: &str) -> std::slice::Iter<'_, String> {
        self.get_entity(entity).filter(|v| v.starts_with(filter))
    }
}

fn main() {
    println!("Hello, world!");
}

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
error[E0308]: mismatched types
  --> src/main.rs:22:9
   |
21 |     fn get_filtered(&self, entity: &str, filter: &str) -> std::slice::Iter<'_, String> {
   |                                                           ---------------------------- expected `std::slice::Iter<'_, String>` because of return type
22 |         self.get_entity(entity).filter(|v| v.starts_with(filter))
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected struct `std::slice::Iter`, found struct `Filter`
   |
   = note: expected struct `std::slice::Iter<'_, _>`
              found struct `Filter<std::slice::Iter<'_, _>, [closure@src/main.rs:22:40: 22:65]>`

For more information about this error, try `rustc --explain E0308`.
error: could not compile `playground` due to previous error

Well, the compiler told you exactly what the error is. .filter() returns an iterator adaptor which performs the filtering; it doesn't return the input iterator directly.

As long as you are writing inherent impls (and not trait methods), you can just hide the specific iterator type by changing the return type to impl Iterator<Item = String>.

1 Like

You may try this with what @H2CO3 says and the suggestions about lifetime from the compiler.

struct DB {
    index: HashMap<String, Vec<String>>,
}

impl DB {
    fn new() -> Self {
        DB {
            index: HashMap::new(),
        }
    }

    fn get_entity(&self, entity: &str) -> impl Iterator<Item = &String> {
        match self.index.get(entity) {
            Some(vec) => vec.iter(),
            _ => [].iter(),
        }
    }

    fn get_filtered<'a>(
        &'a self,
        entity: &'a str,
        filter: &'a str,
    ) -> impl Iterator<Item = &String> + 'a {
        self.get_entity(entity)
            .filter(move |v| v.starts_with(filter))
    }
}

playground

The return type you are looking for is impl Iterator<Item = ...> as in the following:

fn get_entity(&self, entity: &str) -> impl Iterator<Item = &String> {
    match self.index.get(entity) {
        Some(vec) => vec.iter(),
        _ => [].iter(),
    }
}

You can read about this syntax in this chapter of the Rust book. Basically, what impl Trait means for a return type is "this function returns some type that implements the given trait, but I'm not telling you which one". It's very useful for things like iterators.


It does have one limitation, which is that the actual underlying type of the iterator has to be a single specific type. For example, you cannot have a match that returns different iterators in each case. As an example of this, I note that it is more idiomatic to use &str instead of &String. Unfortunately the naive attempt to do this doesn't work:

fn get_entity(&self, entity: &str) -> impl Iterator<Item = &str> {
    match self.index.get(entity) {
        Some(vec) => vec.iter().map(|s| s.as_str()),
        _ => [].iter(),
    }
}
error[E0308]: `match` arms have incompatible types
  --> src/main.rs:17:18
   |
15 | /         match self.index.get(entity) {
16 | |             Some(vec) => vec.iter().map(|s| s.as_str()),
   | |                          ------------------------------ this is found to be of type `Map<std::slice::Iter<'_, String>, [closure@src/main.rs:16:41: 16:55]>`
17 | |             _ => [].iter(),
   | |                  ^^^^^^^^^ expected struct `Map`, found struct `std::slice::Iter`
18 | |         }
   | |_________- `match` arms have incompatible types
   |
   = note: expected type `Map<std::slice::Iter<'_, String>, [closure@src/main.rs:16:41: 16:55]>`
            found struct `std::slice::Iter<'_, _>`

As you can see it complains that each branch returns a different type of iterator.

There's an alternative that works:

fn get_entity(&self, entity: &str) -> impl Iterator<Item = &str> {
    let slice = match self.index.get(entity) {
        Some(vec) => vec.as_slice(),
        _ => &[],
    };
    slice.iter().map(|s| s.as_str())
}

Here it returns the same kind of iterator no matter what.


Doing this for the get_filtered combinator is a bit more difficult.

fn get_filtered(&self, entity: &str, filter: &str) -> impl Iterator<Item = &str> {
    self.get_entity(entity).filter(|v| v.starts_with(filter))
}
error[E0700]: hidden type for `impl Trait` captures lifetime that does not appear in bounds
  --> src/main.rs:22:59
   |
22 |     fn get_filtered(&self, entity: &str, filter: &str) -> impl Iterator<Item = &str> {
   |                                                  ----     ^^^^^^^^^^^^^^^^^^^^^^^^^^
   |                                                  |
   |                                                  hidden type `Filter<impl Iterator, [closure@src/main.rs:23:40: 23:65]>` captures the anonymous lifetime defined here
   |
help: to declare that the `impl Trait` captures '_, you can add an explicit `'_` lifetime bound
   |
22 |     fn get_filtered(&self, entity: &str, filter: &str) -> impl Iterator<Item = &str> + '_ {
   |                                                                                      ++++

error[E0597]: `filter` does not live long enough
  --> src/main.rs:23:58
   |
23 |         self.get_entity(entity).filter(|v| v.starts_with(filter))
   |                                        ---               ^^^^^^ borrowed value does not live long enough
   |                                        |
   |                                        value captured here
24 |     }
   |      -
   |      |
   |      `filter` dropped here while still borrowed
   |      borrow later used here

You get a bunch of lifetime errors. If we take a step back, the first function also had some lifetimes - we just had not been specified because the lifetime elision rules gave us what we wanted in this specific case. The expanded lifetimes are:

fn get_entity<'me, 'entity>(&'me self, entity: &'entity str) -> impl Iterator<Item = &'me str> + 'me {
    let slice = match self.index.get(entity) {
        Some(vec) => vec.as_slice(),
        _ => &[],
    };
    slice.iter().map(|s| s.as_str())
}

The above says that the returned iterator contains references into self (that's the + 'me part), and that the string slices returned by the iterator are also references into self (that's the &'me str part).

The reason that the default elided lifetimes don't work for get_filtered is that the returned iterator has reference to both self and filter, but the default lifetimes we get by the elision rules say that the returned iterator has references to self only. (Note: The elision rules don't look at the contents of the function.)

One fix is the following:

fn get_filtered<'entity, 'filter_and_me>(
    &'filter_and_me self,
    entity: &'entity str,
    filter: &'filter_and_me str,
) -> impl Iterator<Item = &'filter_and_me str> + 'filter_and_me {
    self.get_entity(entity).filter(move |v| v.starts_with(filter))
}

Here we tell the compiler that the returned iterator has references to self and filter, but not to entity. However note that the lifetimes above are actually stricter than necessary. This is because the lifetime on the &str return type allows us to return references into the filter variable, but we don't actually do that - once the iterator is destroyed, the returned strings don't actually require that the filter argument stays alive.

A more relaxed lifetime specification is the following:

fn get_filtered<'me, 'entity, 'filter_and_me>(
    &'me self,
    entity: &'entity str,
    filter: &'filter_and_me str,
) -> impl Iterator<Item = &'me str> + 'filter_and_me
where
    'me: 'filter_and_me,
{
    self.get_entity(entity).filter(move |v| v.starts_with(filter))
}

Here we say that the returned iterator has references to both self and filter, but the strings that the iterator creates has references only to self and not to filter. Note that the where 'me: 'filter_and_me part says "The lifetime 'me entirely contains the lifetime 'filter_and_me". It doesn't compile without it, because otherwise we say that the returned iterator doesn't contain references to into self, which is untrue.

Note that the 'entity lifetime can be removed in all the examples. The lifetime elision rules will re-insert an unused lifetime.

If you want to try out the lifetimes, you can use the following example:

fn main() {
    let mut db = DB::new();
    db.index.insert("foo".to_string(), vec!["bar".to_string(), "baz".to_string()]);
    
    let entity = "foo".to_string();
    let filter = "bar".to_string();
    
    // all three lifetimes start here
    let mut iterator = db.get_filtered(&entity, &filter);
    // 'entity ends here
    drop(entity);
    
    let item = iterator.next().unwrap();
    drop(iterator);
    // 'filter_and_me ends here
    drop(filter);
    
    println!("{}", item);
    // 'me ends here
}

This will only compile if the lifetimes are as relaxed as they can be. Any of the other correct but more strict lifetime specifications will fail to compile the above main function.

8 Likes

Actually the one you posted is more restrictive than the one in my post. If you try it with the main function I gave as an example, then only my version compiles.

Thanks for the wonderful explanation

1 Like

You are right. (I was replying to @vague, so I didn't see your post yet.)

Seriously! Helped me a lot too

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.