Iterators in traits and their generalised usage

Hi there,

I'm about to define a trait that contains a method that shall return an Iterator of some type like so:

pub trait Trait<'a> {
  type Iter: Iterator<Item = &'a&'a str>;

  fn iter(&self) -> Self::Iter;
}

However, when implementing the trait for a specific type, the data that iter() walks over does not always have the requested item type (&&str in my case). So I'd like to use map in the implementation for the concrete type.

impl<'a> Trait<'a> for Foo {
    type Iter = std::slice::Iter<'a, &'a str>;

    fn iter(&self) -> Self::Iter {
        self.data
            .iter()
            .map(|item| item.as_str())
    }
}

Well - this fails to compile as the iterator provided by the implementation is actually a different type:

expected struct `std::slice::Iter<'_, &str>`
   found struct `std::iter::Map<std::slice::Iter<'_, String>, [closure@src/test.rs:60:12: 60:43]>`

The question now is how to declare the iterator in the trait so that it accepts any kind of Iterator whose items have a specific capability, such as being convertible to &str. Code that consumes the returned iterator could then work with the items without knowing the exact iterator type:

fn foo<'a, T: Trait<'a>>(value: T) {
  for item in value.iter() {
     dbg!(item); // here the item should be a &str, or convertible into &str with as_str()
  }
}

Any hints are much appreciated.

I assume your current problem is that you can't write out Map<Iter<'_, String>, [closure]>, since the [closure] type cannot be named.

There are a few ways to resolve it.

  1. Erase the type and return Box<dyn Iterator<Item = &'a &'a str>>. This obviously solves the problem, but comes with some runtime overhead.

  2. Create your own type and mimic the behavior of Map. This has no runtime overhead, but requires some boilerplate code to be written. If you really need that performance I would recommend this route.

  3. Define the trait as

pub trait Trait<'a> {
  type Item: Deref<Target = str>;  // or any other trait that fits your need, e.g. AsRef<str>
  type Iter: Iterator<Item = Self::Item>;

  fn iter(&self) -> Self::Iter;
}

This solves your problem as stated, but I wouldn't call it the best solution, since you might still have trouble naming Self::Iter. But if it covers most cases, you might just go with it and create a new type when it doesn't fit.

  4. If you are on a nightly version of Rust, you could write
#![feature(type_alias_impl_trait)]
impl<'a> Trait<'a> for Foo {
    type Iter = impl Iterator<Item = &'a &'a str>;

    fn iter(&self) -> Self::Iter {
        self.data
            .iter()
            .map(|item| item.as_str())
    }
}

which solves the problem exactly. But it's nightly-only, and there may be some issues around it.
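For readers on a newer stable toolchain: return-position impl Trait in traits has been stable since Rust 1.75 and covers a similar need without nightly. The following is only a minimal sketch, assuming the trait can drop both its lifetime parameter and the associated Iter type. Foo's data field and the join_all helper are hypothetical names used for illustration.

```rust
pub struct Foo {
    pub data: Vec<String>,
}

pub trait Trait {
    // The concrete iterator type stays hidden behind `impl Iterator`.
    fn iter(&self) -> impl Iterator<Item = &str>;
}

impl Trait for Foo {
    fn iter(&self) -> impl Iterator<Item = &str> {
        // The closure type never has to be written out.
        self.data.iter().map(|item| item.as_str())
    }
}

// Hypothetical consumer: works with any implementor without knowing its iterator type.
pub fn join_all<T: Trait>(value: &T) -> String {
    value.iter().collect::<Vec<_>>().join(",")
}
```

The trade-off is that callers can no longer name the iterator type, which matters if it has to be stored in a struct field.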

P.S. Returning &'a &'a str seems fishy. Why not just return &'a str?


Hi,

thanks for your suggestions.

  1. Using the Box approach is kind of off the table, as this is performance-critical.
  2. Using a wrapper mimicking Map might also not fit, as it's not guaranteed that the trait implementor only uses the map combinator to create the final iterator :thinking:
  3. Well, nightly is also not an option. So I'm hoping it gets stabilized soon :wink:

This means I'll try suggestion 3 and see how far I get :nerd_face: ..thanks again.

Regarding 2.

You don't actually create a MyMap that accepts a closure as an argument. Instead you create a MyIter that implements Iterator, inline the closure's logic into it, and forget about Map entirely. So it's not restricted to Map.

On mobile now, will provide an example later.

Tangential to the question, but the lifetimes don't work out with your sketches. You might need a GAT or to implement on references to nominal types.
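One way to read the "implement on references" hint, sketched under assumptions: the trait from option 3 is kept as written (with AsRef<str> as the item bound), and it is implemented for &'a Foo rather than for Foo, so that 'a is tied to the borrow of the data. Foo, its data field, and the lengths helper are illustrative names, not code from this thread.

```rust
use std::slice;

pub trait Trait<'a> {
    type Item: AsRef<str>;
    type Iter: Iterator<Item = Self::Item>;

    fn iter(&self) -> Self::Iter;
}

pub struct Foo {
    pub data: Vec<String>,
}

// Implemented for the reference type, so `'a` is the lifetime of the borrow of `Foo`.
impl<'a> Trait<'a> for &'a Foo {
    type Item = &'a String;
    type Iter = slice::Iter<'a, String>;

    fn iter(&self) -> Self::Iter {
        // `self` is `&&'a Foo`; copying out the inner reference keeps the full `'a`.
        let this: &'a Foo = *self;
        this.data.iter()
    }
}

// Hypothetical consumer that only relies on `AsRef<str>`.
pub fn lengths<'a, T: Trait<'a>>(value: T) -> Vec<usize> {
    value.iter().map(|item| item.as_ref().len()).collect()
}
```

A caller passes a reference, for example lengths(&foo), which selects the impl for &Foo.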

Hi,

As I could not get option 3 working, I went the Box route:

use std::fmt::Debug;

trait Trait<'a> {
    type Item: AsRef<str> + Debug;
    //type Iter: Iterator<Item = &'a str>;
    fn iter(&self) -> Box<dyn Iterator<Item = Self::Item> + '_>;
}

struct Foo {
    values: Vec<String>,
}

impl<'a> Trait<'a> for Foo {
    type Item = String;
    //type Iter = std::iter::Map<std::slice::Iter<'a, String>, dyn FnMut(String) -> &'a str>;
    
    fn iter(&self) -> Box<dyn Iterator<Item = Self::Item> + '_> {
        Box::new(
            self.values
                .iter()
                .map(|v| v.to_owned())
        )
    }
}

It works, but it does have the heap allocation overhead. I would appreciate it if you could sketch the MyIter proposal, as I'm having a hard time imagining what you have in mind with this approach :wink:

Thx. in advance.

Single lifetime version (on stable).
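The code behind this label is not part of the thread text. As a stand-in, here is a minimal sketch of what a single-lifetime version could look like, assuming the trait takes &'a self and fixes the item type to &'a str. FooIter, the data field, and the first helper are hypothetical names.

```rust
use std::slice;

pub struct Foo {
    pub data: Vec<String>,
}

// A named iterator type: it wraps the slice iterator and does the
// String -> &str conversion inside `next`, so no closure type leaks out.
pub struct FooIter<'a> {
    inner: slice::Iter<'a, String>,
}

impl<'a> Iterator for FooIter<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<Self::Item> {
        self.inner.next().map(|item| item.as_str())
    }
}

pub trait Trait<'a> {
    type Iter: Iterator<Item = &'a str>;

    // Assumption: borrowing `self` for `'a` ties the items to the container.
    fn iter(&'a self) -> Self::Iter;
}

impl<'a> Trait<'a> for Foo {
    type Iter = FooIter<'a>;

    fn iter(&'a self) -> Self::Iter {
        FooIter { inner: self.data.iter() }
    }
}

// Hypothetical consumer that does not know the concrete iterator type.
pub fn first<'a, T: Trait<'a>>(value: &'a T) -> Option<&'a str> {
    value.iter().next()
}
```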

GAT version (on beta).

Basically you move the unnameable type (the closure) to the Iterator implementation of your (named) custom iterator.


Hi,
thanks a ton - with the first example I could rebuild my Iterator usage without the allocation and it compiles just fine. However, down the road I'm having trouble using it due to some lifetime issues.

I tried to boil down my code to a reproducible example.
It might look complicated at first but I try to generalise some processing logic with traits.
So the bare minimum looks like this:

trait Processor<'a, T>
where
    T: Trait<'a>
{
    fn process(&'a self, data: T);
}

struct MyProcessor;

impl<'a, T> Processor<'a, T> for MyProcessor
where
    T: Trait<'a> + 'static,
{
    fn process(&self, data: T) {
        let value = self.extract_data(&data);
    }
}

impl MyProcessor {
    fn extract_data<'a, T>(&self, data: &'a T) -> Option<String>
    where
        T: Trait<'a>,
    {
        data.iter().find(|itm| itm.as_ref() == "dummy").map(|itm| itm.as_ref().to_string())
    }
}

However, this gives the compiler error:

error[E0597]: `data` does not live long enough
  --> src/main.rs:48:39
   |
43 | impl<'a, T> Processor<'a, T> for MyProcessor
   |      -- lifetime `'a` defined here
...
48 |         let value = self.extract_data(&data);
   |                     ------------------^^^^^-
   |                     |                 |
   |                     |                 borrowed value does not live long enough
   |                     argument requires that `data` is borrowed for `'a`
49 |     }
   |     - `data` dropped here while still borrowed

From my point of view this does not make any sense. The borrow of data only needs to live until the result of the method call is assigned to value. Furthermore, data is owned by the process method and would only be dropped at the end of this method, together with value. So why does the compiler think that data is dropped too early and the borrow does not live long enough? Any insights would be much appreciated to untangle the knot in my head :smiley:

Within process, the lifetime 'a could be anything -- that's what it means when you make an implementation generic over a lifetime with no other bounds. For a concrete example, maybe T only implements Trait<'static>.

Then the signature of extract_data says you must make a borrow of length 'a. For the example case, that would mean you need to be able to borrow data forever ('static), which clearly you can't do since you drop it at the end of process. That's what the API requires. For all the compiler knows [1], maybe extract_data is actually going to store the argument for 'a via interior mutability or such.

This is a slightly more complicated version of a common error, where you do something like:

fn foo<'generic>(data: Data) {
    let borrow: &'generic Data = &data;
}

In this simpler version, the caller chooses the lifetime 'generic, and the only thing you know as the function writer is that it must be longer than your function body. Thus you can never use it as the lifetime of a borrow of a local variable that drops at the end of the function.


There's some other disconnect here where trait Processor takes &'a self but your implementation does not, and then your extract_data takes &'a data instead, but I'm not seeing the intention so I'm going to ignore it. Depending on what you're trying to do, this might need to be revisited.


So anyway, how can you work with lifetimes of local borrows generically? You need a Higher-Ranked Trait Bound. HRTBs say that you can work with lifetimes of any length -- including those shorter than function or method bodies, which you otherwise can't name. They look like:

T: for<'any> Trait<'any>

In the example so far, Processor doesn't actually need a lifetime [2], or the trait bound, so first we simplify that to

trait Processor<T> {
    fn process(&self, data: T);
}

And then we change your implementation to use a HRTB:

impl<T> Processor<T> for MyProcessor
where
    T: for<'any> Trait<'any>,
{
    fn process(&self, data: T) {
        let value = self.extract_data(&data);
    }
}

And now you can borrow data for some lifetime shorter than the method body, because T supports Trait<'_> for any lifetime -- including lifetimes shorter than your method body like you require.
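Putting the pieces together, here is a self-contained sketch of the HRTB version that should compile on stable. Several parts are assumptions rather than code from this thread: Trait::iter takes &'a self, Foo uses &'a String as its item type with the plain slice iterator, and process returns the Option<String> so the result can be observed.

```rust
use std::slice;

pub trait Trait<'a> {
    type Item: AsRef<str>;
    type Iter: Iterator<Item = Self::Item>;

    fn iter(&'a self) -> Self::Iter;
}

pub struct Foo {
    pub values: Vec<String>,
}

// Generic over every lifetime, so `Foo: for<'any> Trait<'any>` holds.
impl<'a> Trait<'a> for Foo {
    type Item = &'a String;
    type Iter = slice::Iter<'a, String>;

    fn iter(&'a self) -> Self::Iter {
        self.values.iter()
    }
}

pub trait Processor<T> {
    fn process(&self, data: T) -> Option<String>;
}

pub struct MyProcessor;

impl<T> Processor<T> for MyProcessor
where
    // Higher-ranked bound: T implements Trait for any lifetime,
    // including one that ends inside `process`.
    T: for<'any> Trait<'any>,
{
    fn process(&self, data: T) -> Option<String> {
        // The borrow of `data` only lasts for this call.
        let value = self.extract_data(&data);
        value
    }
}

impl MyProcessor {
    fn extract_data<'a, T>(&self, data: &'a T) -> Option<String>
    where
        T: Trait<'a>,
    {
        data.iter()
            .find(|itm| itm.as_ref() == "dummy")
            .map(|itm| itm.as_ref().to_string())
    }
}
```

Note that an implementation written only for one specific lifetime, or only for a reference type such as &'a Foo, would not satisfy the for<'any> bound.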


  1. or more accurately, is "allowed" to reason about -- there is huge value in enforcing the API contract of your method signatures

  2. given the disconnect mentioned above


Hi,
thanks for the detailed explanation and the help sorting this out. With the HRTB on the lifetime it's working like a charm :wink: ...

The lifetime disconnects were mainly due to some back-and-forth trial and error. The Processor trait had a lifetime parameter because I thought I might want to add a bound requiring the generic parameter to implement Trait<'a>, so the lifetime needed to bubble up to this trait as well.

However, I might reduce complexity in the trait definition itself and put the proper bounds on the implementation of this trait for a specific type only.

Thanks again!
