Trait that requires implementing an iterator over another trait

Hi all. I hope I'll be able to explain myself.

I'm trying to define a trait that requires a method that returns an iterator over objects implementing a specific trait.

My current achievement is here (it is a distilled part of a bigger application):

Let me explain the code.

ThingHolder1 and ThingHolder2 both contain some objects implementing the ThingBase trait. These objects are instances of Thing1 and Thing2 structs.

ThingHolder1 and ThingHolder2 implement the HolderBase trait, and this trait requires them to implement a function iterate_over_content() that returns an iterator over their content, and this iterator must yield references to objects implementing the ThingBase trait (see line 7).
So I defined a trait for this iterator, HolderIterator, line 4.

Moving on to the concrete implementations:

ThingHolder1::iterate_over_content() returns an instance of ThingHolder1Iterator (that implements the HolderIterator trait), that knows how to extract the content of a ThingHolder1 instance and yields references to the ThingBase instances it contains.
Similarly, ThingHolder2::iterate_over_content() returns an instance of ThingHolder2Iterator that does the same thing for its "parent" struct.

Finally, in main(), I build a vector of mixed ThingHolder1 and ThingHolder2 instances and proceed to iterate over all their content.

But just before the finish line the compiler trips me up:

error[E0282]: type annotations needed
   --> src/main.rs:163:38
    |
163 |             println!("output: {:?}", content.do_something());
    |                                      ^^^^^^^ cannot infer type

For more information about this error, try `rustc --explain E0282`.
error: could not compile `playground` due to previous error

I really stretched my Rust-fu to get this far, and now I'm stuck. I read the documentation about this error but I don't see where the mistake is in my code.
So, what am I doing wrong? Maybe my solution to this problem is too convoluted? Is there a better way to do the same thing?

I'm open to any advice. Thanks in advance.

(Sorry for the lack of proper formatting and detailed explanation; on mobile.)

I think the compiler is trying too hard to soldier on after some errors and ultimately showing an unrelated problem it had.

Respecifying the Item type in HolderBase seemed off, so I changed the signature. The lifetimes are suspect (I would expect lifetime elision to line up with expectation instead), and sure enough this is the error that results.

Adjusting the lifetimes to address this obviates the need for a lifetime on HolderBase, but more lifetime errors remain.

Following the advice results in a working example. The borrowing semantics are different than your original though, so I don't know that it will work for you in the wider context.

Overall this seems a little trait heavy to me. I can return with more comments later if you'd like (and feel free to ask questions).


I'm honestly confused how it's even valid to specify it. By my reasoning, HolderBase has no associated type Item, so trying to specify it should cause an error, and furthermore it's specified as &'a Box<dyn ThingBase> so I would definitely expect Box<dyn HolderIterator<Item = ()>> to be invalid, but the compiler accepts it as a valid type as well. Is it related to rust-lang/rust issue #20671 ("where clauses are only elaborated for supertraits, and not other things")? If anyone has any idea, I'd love to hear about it.

Wow, I didn't expect a working solution, thanks a lot!
I'm still pretty new to the lifetime concept, as you can probably imagine.

You changed my

Box<dyn HolderIterator<Item = &'a Box<dyn ThingBase>>>

in

Box<dyn HolderIterator<'_> + '_>

What does this tell the compiler? I thought I was supposed to declare what type the iterator yields. How can this work?

What would be a more idiomatic solution in Rust?

You can specify associated items of the supertrait on the subtrait. It seemed natural to me (if not the most reader-friendly) in the general case, as you can put bounds on implementing types without specifying the trait, too (when unambiguous). And indeed, if you take my working playground and switch the implementations and/or the declaration between each of

fn iterate_over_content(&self) -> Box<dyn HolderIterator<Item = &Box<dyn ThingBase>> + '_>;
fn iterate_over_content(&self) -> Box<dyn HolderIterator<'_> + '_>;

...everything still compiles, demonstrating that the compiler considers them equivalent in the coherent case.


The fact that you can declare the Item = () version, even though it is impossible to implement, probably is because associated type equality is where-clause-esque (even though still not accepted in actual where-clauses). I don't think it's 20671 ([lack of] implied bounds) -- it's more that the implementer has the burden of proving they've met the bounds, in combination with the ability to state unmeetable bounds in the declaration. There are probably more reasonable cases that come up (macros, dyn-vs-!Sized workarounds...).


OK, let's start with some concepts that apply outside of the example.

Supertraits

When you have a

trait SubTrait: SuperTrait { /* ... */ }
// same thing:
trait SubTrait where Self: SuperTrait { /* ... */ }

You're saying that "whenever you have something that implements SubTrait, it must also implement SuperTrait". With a supertrait bound (a bound on Self) specifically, the statement is so strong that you don't even have to declare the supertrait part elsewhere. (Rust may gain further "implied bounds" in the future -- if it doesn't break inference too much.)

And this carries through to concrete types that implement the trait -- including trait object types (dyn Trait) too. So you don't have to write this:

fn foo(st: &dyn SubTrait) where dyn SubTrait: SuperTrait { /* ... */ }

Instead you write:

fn foo(st: &dyn SubTrait) { /* ... */ }

and you can still utilize SuperTrait.
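To make that concrete, here is a minimal sketch. The `Widget` type and the `name`, `shout`, and `describe` functions are made up for illustration; only the `SubTrait: SuperTrait` relationship comes from the explanation above.

```rust
trait SuperTrait {
    fn name(&self) -> String;
}

trait SubTrait: SuperTrait {
    fn shout(&self) -> String {
        // `name` comes from the supertrait; no extra bound is needed here.
        self.name().to_uppercase()
    }
}

// Hypothetical implementor, for illustration only.
struct Widget;

impl SuperTrait for Widget {
    fn name(&self) -> String {
        "widget".to_string()
    }
}

impl SubTrait for Widget {}

// Note there is no `where dyn SubTrait: SuperTrait` clause, yet the
// supertrait method `name` is callable through the trait object.
fn describe(st: &dyn SubTrait) -> String {
    format!("{} / {}", st.name(), st.shout())
}

fn main() {
    println!("{}", describe(&Widget));
}
```

The same holds for generic code: a `T: SubTrait` bound is enough to call `name` on a `T`.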

Lifetime Elision

Rust lets you not write out explicit lifetimes sometimes -- lifetime elision -- but this doesn't mean the lifetimes aren't actually there. When you elide the lifetimes, it just means that they have some sort of contextual default, or that you're asking the compiler to infer the lifetime for you (again depending on context). The difference between these two methods:

trait SomeLifetimeTrait<'lt> {
    fn foo(&self) -> Thing<'lt>;
    fn bar(&self) -> Thing<'_>;
    // Same as:
    // fn bar(&self) -> Thing;
    // ...but don't write that, it hides the fact that a borrow (lifetime)
    //    is involved with `Thing`, which is useful information
}

is that no matter the lifetime on &self, foo returns a Thing<'lt> -- some lifetime specific to the implementation, not the method call. While in contrast, as per the function lifetime elision rules, bar returns a Thing<'_> with a lifetime that's the same as the lifetime on &self. The implication is that the returned Thing is a sub-borrow of &self.

Being more explicit:

trait SomeLifetimeTrait<'lt> {
    fn foo<'a>(&'a self) -> Thing<'lt>;
    fn bar<'b>(&'b self) -> Thing<'b>;
}
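As a rough illustration of the difference, here is a sketch with one possible implementor. `Thing` is assumed to be a simple wrapper around a `&str`, and `Holder` and `via_foo` are hypothetical names; none of them come from the original code.

```rust
// Assumed shape of `Thing`: it just wraps a borrowed string.
struct Thing<'a>(&'a str);

trait SomeLifetimeTrait<'lt> {
    fn foo(&self) -> Thing<'lt>;
    fn bar(&self) -> Thing<'_>;
}

// Hypothetical implementor that holds a borrow of some outside text.
struct Holder<'lt> {
    text: &'lt str,
}

impl<'lt> SomeLifetimeTrait<'lt> for Holder<'lt> {
    fn foo(&self) -> Thing<'lt> {
        // Copies the `&'lt str` out; the result does not borrow `self`.
        Thing(self.text)
    }

    fn bar(&self) -> Thing<'_> {
        // Same data, but the signature ties the result to the borrow of `self`.
        Thing(self.text)
    }
}

// Works: the `Thing<'lt>` from `foo` may outlive the local `holder`.
fn via_foo<'lt>(text: &'lt str) -> Thing<'lt> {
    let holder = Holder { text };
    holder.foo()
}

// The same function written with `holder.bar()` is rejected, because the
// returned `Thing` would be borrowing the local variable `holder`.

fn main() {
    let text = String::from("hello");
    let from_foo = via_foo(&text);

    let holder = Holder { text: &text };
    let from_bar = holder.bar(); // only usable while `holder` is alive
    println!("{} {}", from_foo.0, from_bar.0);
}
```

So "specific to the implementation" here means the lifetime is chosen by whoever writes `impl SomeLifetimeTrait<'lt> for ...`, independently of how long any particular `&self` borrow lasts.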

Trait object lifetimes

Every dyn Trait has a lifetime parameter -- it's actually a dyn Trait + '_. Why? Well, you might type erase a reference with a non-'static lifetime into a dyn Trait for example. The compiler still needs to be able to track when the type-erased object is valid.

There are more contextual rules for this lifetime than others -- complete elision can act differently than writing out '_. In particular, Box<dyn Trait> usually means Box<dyn Trait + 'static>, whereas Box<dyn Trait + '_> asks the compiler to use the less specific [1] inference or elision rules that depend on the context.
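A small sketch of the two spellings in function return position. The `Speak` trait and the `Owned` and `Borrowed` types are hypothetical, chosen only to show one type-erased value that owns its data and one that holds a reference.

```rust
trait Speak {
    fn speak(&self) -> String;
}

// Owns its data, so it satisfies a `'static` trait object lifetime.
struct Owned(String);

// Holds a reference, so it is only valid for `'a`.
struct Borrowed<'a>(&'a str);

impl Speak for Owned {
    fn speak(&self) -> String {
        self.0.clone()
    }
}

impl Speak for Borrowed<'_> {
    fn speak(&self) -> String {
        self.0.to_string()
    }
}

// Here `Box<dyn Speak>` means `Box<dyn Speak + 'static>`.
fn boxed_owned(s: &str) -> Box<dyn Speak> {
    Box::new(Owned(s.to_string()))
}

// `+ '_` ties the trait object lifetime to the borrow of `s`.
// Without it, the compiler demands `'static` and rejects this function.
fn boxed_borrowed(s: &str) -> Box<dyn Speak + '_> {
    Box::new(Borrowed(s))
}

fn main() {
    let text = String::from("hi");
    println!("{}", boxed_owned(&text).speak());
    println!("{}", boxed_borrowed(&text).speak());
}
```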


Applying everything

OK, whew! I recognize that's a lot. But now we have enough background that we can apply it to your question.

You had this:

trait HolderBase<'a> {
    fn iterate_over_content(&self) -> Box<dyn HolderIterator<Item = &'a Box<dyn ThingBase>>>;
}

But as per the supertrait bound, once you have a dyn HolderIterator<'a>, it's already implied that you implement Iterator<Item = &'a Box<dyn ThingBase>>. This is why I initially changed it to

trait HolderBase<'a> {
    fn iterate_over_content(&self) -> Box<dyn HolderIterator<'a>>;
}

As per the side-conversation with @Heliozoa, this didn't change the meaning of anything, but let some different errors shine through.


Next, just by experience, I thought you probably meant to tie the borrows of &self and of the iterator together. That's how borrowing iterators generally work after all. So this was the change to

    fn iterate_over_content(&self) -> Box<dyn HolderIterator<'_>>;

as per the function lifetime elision rules. The 'a parameter on HolderBase wasn't used for anything else, so I got rid of it -- if you can avoid lifetimes on traits, it is best to do so.


Finally, the remaining errors were:

error: lifetime may not live long enough
  --> src/main.rs:62:16
   |
61 |     fn iterate_over_content(&self) -> Box<dyn HolderIterator<'_>> {
   |                             - let's call the lifetime of this reference `'1`
62 |         return Box::new(ThingHolder1Iterator::new(self));
   |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ returning this value requires that `'1` must outlive `'static`
   |
help: to declare that the trait object captures data from argument `self`, you can add an explicit `'_` lifetime bound
   |
61 |     fn iterate_over_content(&self) -> Box<dyn HolderIterator<'_> + '_> {

And I recognized this as "oh they're probably type-erasing (something that contains) a reference" -- this means you were going to have to override the default 'static trait object lifetime within the Box and use the same lifetime as is on &self again. That's what happens when you apply the hint -- so I just did so mechanically.


Summing everything up, there's no semantic difference between these:

    fn iterate_over_content(&self) -> Box<dyn HolderIterator<Item = &'a Box<dyn ThingBase>>>;
    fn iterate_over_content(&self) -> Box<dyn HolderIterator<'a>>;

But the latter is idiomatic and lets better errors shine through, and the difference between these:

    fn iterate_over_content(&self) -> Box<dyn HolderIterator<'a>>;
    fn iterate_over_content(&self) -> Box<dyn HolderIterator<'_> + '_>;

is the same as the lifetime elision discussion above, with the trait object lifetime thrown into the mix as well.
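For reference, a condensed sketch of how the pieces fit together with the final signature. This is a reconstruction, not the original playground: the `id` field, the body of `do_something`, the `collect_outputs` helper, and the use of an inner slice iterator are assumptions; the trait and type names follow the thread.

```rust
trait ThingBase {
    fn do_something(&self) -> String;
}

// Assumed shape of a concrete thing; the `id` field is illustrative.
struct Thing1 {
    id: u32,
}

impl ThingBase for Thing1 {
    fn do_something(&self) -> String {
        format!("Thing1 #{}", self.id)
    }
}

// The supertrait bound already fixes the `Item` type.
trait HolderIterator<'a>: Iterator<Item = &'a Box<dyn ThingBase>> {}

trait HolderBase {
    // First `'_`: the items borrow from `self`.
    // Second `'_`: the boxed trait object itself borrows from `self`.
    fn iterate_over_content(&self) -> Box<dyn HolderIterator<'_> + '_>;
}

struct ThingHolder1 {
    content_list: Vec<Box<dyn ThingBase>>,
}

struct ThingHolder1Iterator<'a> {
    inner: std::slice::Iter<'a, Box<dyn ThingBase>>,
}

impl<'a> Iterator for ThingHolder1Iterator<'a> {
    type Item = &'a Box<dyn ThingBase>;

    fn next(&mut self) -> Option<Self::Item> {
        self.inner.next()
    }
}

impl<'a> HolderIterator<'a> for ThingHolder1Iterator<'a> {}

impl HolderBase for ThingHolder1 {
    fn iterate_over_content(&self) -> Box<dyn HolderIterator<'_> + '_> {
        Box::new(ThingHolder1Iterator {
            inner: self.content_list.iter(),
        })
    }
}

// Hypothetical helper standing in for the loop in `main`.
fn collect_outputs(holders: &[Box<dyn HolderBase>]) -> Vec<String> {
    let mut outputs = Vec::new();
    for holder in holders {
        for content in holder.iterate_over_content() {
            outputs.push(content.do_something());
        }
    }
    outputs
}

fn main() {
    let holders: Vec<Box<dyn HolderBase>> = vec![Box::new(ThingHolder1 {
        content_list: vec![Box::new(Thing1 { id: 1 }), Box::new(Thing1 { id: 2 })],
    })];
    for output in collect_outputs(&holders) {
        println!("output: {:?}", output);
    }
}
```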


Further high-level observations

Let me start from the working example and share some more thoughts. Starting right at the top:

trait HolderIterator<'a>: Iterator<Item = &'a Box<dyn ThingBase>> {}
// Aka:
// trait HolderIterator<'a>: 
//     Iterator<Item = &'a Box<dyn ThingBase + 'static>> {}

Specifying a &Box<...> feels a bit overly specific to me -- like specifying a &Vec<T> instead of a &[T]. Although it may require some additional maneuvering elsewhere, I'd suggest

trait HolderIterator<'a>: Iterator<Item = &'a dyn ThingBase> {}
// Aka:
// trait HolderIterator<'a>: Iterator<Item = &'a (dyn ThingBase + 'a)> {}

Note that this does change what is used for the elided trait object lifetime! But I don't think it will matter.

Adjusting the iterator implementations gets us here.

The next observation is that you're using this trait + supertrait as a trait alias, really -- you're not adding anything, and there's nothing for the implementations to do:

impl<'a> HolderIterator<'a> for ThingHolder1Iterator<'a> {}

In this case [2], I would just supply a blanket implementation:

+impl<'a, Iter> HolderIterator<'a> for Iter
+where
+    Iter: Iterator<Item = &'a dyn ThingBase>
+{}
-impl<'a> HolderIterator<'a> for ThingHolder1Iterator<'a> {}
-impl<'a> HolderIterator<'a> for ThingHolder2Iterator<'a> {}

Like so. But in fact, we also only use this in a type erased way, so perhaps it would even make sense to use a type alias.

// Note I've included `+ 'a` here to limit the trait object lifetime
type HolderIterator<'a> = dyn Iterator<Item = &'a dyn ThingBase> + 'a;

trait HolderBase {
    fn iterate_over_content(&self) -> Box<HolderIterator<'_>>;
}
// [adjust the implementors...]

Like so. One could argue this isn't an improvement though -- we've lost some abstraction (and thus flexibility) by not using our own trait and trait object type. But if you really only needed a type-erased iterator, why not just use a type-erased iterator?

This has the benefit that the trait object lifetime hoops we have to jump through get encapsulated in our type alias. We've also gotten rid of any lifetime-carrying traits, which can be bothersome for other reasons.
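Put together, the type alias version might look roughly like the following sketch. The bodies of `do_something`, the `all_outputs` helper, and building the iterator directly from `content_list` with `map` are assumptions on my part; the alias and the `HolderBase` signature are as shown above.

```rust
trait ThingBase {
    fn do_something(&self) -> String;
}

struct Thing1;
struct Thing2;

impl ThingBase for Thing1 {
    fn do_something(&self) -> String {
        "Thing1".to_string()
    }
}

impl ThingBase for Thing2 {
    fn do_something(&self) -> String {
        "Thing2".to_string()
    }
}

// `+ 'a` limits the trait object lifetime, so callers never spell it out.
type HolderIterator<'a> = dyn Iterator<Item = &'a dyn ThingBase> + 'a;

trait HolderBase {
    fn iterate_over_content(&self) -> Box<HolderIterator<'_>>;
}

struct ThingHolder1 {
    content_list: Vec<Box<dyn ThingBase>>,
}

struct ThingHolder2 {
    content_list: Vec<Box<dyn ThingBase>>,
}

impl HolderBase for ThingHolder1 {
    fn iterate_over_content(&self) -> Box<HolderIterator<'_>> {
        // `&**bx` turns `&Box<dyn ThingBase>` into `&dyn ThingBase`.
        Box::new(self.content_list.iter().map(|bx| &**bx))
    }
}

impl HolderBase for ThingHolder2 {
    fn iterate_over_content(&self) -> Box<HolderIterator<'_>> {
        Box::new(self.content_list.iter().map(|bx| &**bx))
    }
}

// Hypothetical helper: visit every thing in every holder.
fn all_outputs(holders: &[Box<dyn HolderBase>]) -> Vec<String> {
    let mut outputs = Vec::new();
    for holder in holders {
        for thing in holder.iterate_over_content() {
            outputs.push(thing.do_something());
        }
    }
    outputs
}

fn main() {
    let holders: Vec<Box<dyn HolderBase>> = vec![
        Box::new(ThingHolder1 {
            content_list: vec![Box::new(Thing1), Box::new(Thing2)],
        }),
        Box::new(ThingHolder2 {
            content_list: vec![Box::new(Thing2)],
        }),
    ];
    for output in all_outputs(&holders) {
        println!("output: {:?}", output);
    }
}
```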

Side note:

It's clear in this form that HolderBase is a type-erased version of the sometimes-suggested Iterate trait (which requires the recently stabilized GAT feature):

trait Iterate {
    type Item;
    type Iter<'a>: Iterator<Item = &'a Self::Item> where Self: 'a;
    fn iter(&self) -> Self::Iter<'_>;
}
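For a feel of how such a trait would be implemented and used, here is a small sketch (it needs Rust 1.65 or later for GATs). The `Numbers` type and the `sum_all` function are hypothetical and only serve as an example implementor and consumer.

```rust
trait Iterate {
    type Item;
    type Iter<'a>: Iterator<Item = &'a Self::Item>
    where
        Self: 'a;
    fn iter(&self) -> Self::Iter<'_>;
}

// Hypothetical collection, for illustration only.
struct Numbers {
    values: Vec<i32>,
}

impl Iterate for Numbers {
    type Item = i32;
    // The concrete iterator type is named here; nothing is type-erased.
    type Iter<'a> = std::slice::Iter<'a, i32>
    where
        Self: 'a;

    fn iter(&self) -> Self::Iter<'_> {
        self.values.iter()
    }
}

// Generic code can iterate over any `Iterate` implementor without boxing.
fn sum_all<C: Iterate<Item = i32>>(collection: &C) -> i32 {
    collection.iter().copied().sum()
}

fn main() {
    let numbers = Numbers {
        values: vec![1, 2, 3],
    };
    println!("{}", sum_all(&numbers));
}
```

Compared to HolderBase, the caller here knows the exact iterator type, at the cost of the trait no longer being usable as a `dyn` trait object.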

Instead of using this proposed trait, std has things like

impl Collection<T> {
    fn iter(&self) -> Iter<'_, T> { /* ... */ }
}

impl<'a, T> Iterator for Iter<'a, T> { /* ... */ }

using concrete types.

You could write out explicit and type erased forms and tie them all together... but it's probably not worth it without a reason to do so, and I'm not going to make the effort myself.


Further code-level suggestions

Don't use return at the end of blocks; Rust is expression oriented.

     fn do_something(&self) -> String {
-         return format!("Thing1 {:?}", self);
+         format!("Thing1 {:?}", self)
     }

Don't use indexing when you can use iterators. When implementing custom iterators, this can often be accomplished by just holding onto an inner iterator.

struct ThingHolder1Iterator<'a> {
-    counter: usize,
-    holder_reference: &'a ThingHolder1
+    inner: std::slice::Iter<'a, Box<dyn ThingBase>>,
}

impl<'a> ThingHolder1Iterator<'a> {
    fn new(holder_reference: &'a ThingHolder1) -> Self {
-        return Self { counter: 0, holder_reference: holder_reference }
+        let inner = holder_reference.content_list.iter();
+        Self { inner }
    }
}

impl<'a> Iterator for ThingHolder1Iterator<'a> {
    type Item = &'a dyn ThingBase;
    fn next(&mut self) -> Option<Self::Item> {
-        if self.counter == self.holder_reference.content_list.len() {
-            None
-        } else {
-            self.counter += 1;
-            Some(&*self.holder_reference.content_list[self.counter - 1])
-        }
+        self.inner.next().map(|bx| &**bx)
    }
}

(Side note that you can use Self { field } instead of Self { field: field } too.)

(Intermediate link)

Maybe it's just an aspect of a minimized example, but note how the two iterators are exactly the same now, due to where your type erasure is (within each holder). You could use the same for both.

Maybe it's just an aspect of a minimized example, but if you never need your concrete iterators and only type-erased ones, you could get rid of the concrete ones and all their boilerplate altogether.

 impl HolderBase for ThingHolder1 {
     fn iterate_over_content(&self) -> Box<HolderIterator<'_>> {
-        Box::new(ThingHolder1Iterator::new(self))
+        Box::new(self.content_list.iter().map(|bx| &**bx))
     }
 }

-struct ThingHolder1Iterator<'a> { /* ... */ }
-impl<'a> ThingHolder1Iterator<'a> { /* ... */ }
-impl<'a> Iterator for ThingHolder1Iterator<'a> { /* ... */ }

(Another playground.)

Probably it's just an aspect of a minimized example, but at this point both ThingHolders are practically the same too.


Final vague thought

The fact that you're type erasing at every level and reaching for a new trait for every bit of functionality makes me wonder if you're trying to apply an OO paradigm to Rust overly hard. On the other hand, maybe it makes perfect sense for your use case. It's hard to be sure from the example.


  1. i.e. not dyn Trait specific ↩︎

  2. or when there are additional methods, but they all have default bodies and no reason to override them ↩︎


That's a lot to digest... I'll take my time to read, and I already know I'll have to go back and re-read a lot of documentation about lifetimes. I really appreciate your effort to write all of that. Thanks a lot again!

What I'm realizing right now is that lifetimes are really everywhere, and if I don't specify them, the compiler implies/infers a lot of things behind the scenes. I get that this makes writing code easier, but at the same time it makes it harder to understand what happens.

Regarding your final considerations about the structure of my code: yes, I'm 100% applying OO reasoning.
I gave myself the homework to build a raytracer as an exercise to learn Rust. Raytracing is a topic pretty suitable for OO, so I think what I'm trying to do is correct for my use case - avoiding this trait-y and OO-like code would require me to re-think a whole section of my application, maybe leveraging some feature of Rust that I don't know yet, or that I don't know can be used to simplify this specific part of my project.

Thanks again, bye!

Hello Quinedot, may I ask you further details?

You say that foo here returns a Thing instance with a lifetime specific to the implementation, while bar returns a Thing instance that has the same lifetime as &self. I think I get the bar case, but I cannot grasp the meaning of "specific to the implementation" relative to foo. Could you kindly explain further, possibly with some examples of both use cases?