Problem with dynamic dispatch in nested datastructure

playground
I define a nested tree using trait Node for which I have two implementations struct NodeSingle and struct NodeSet, which differ in their payload (a single usize vs. a set of usize).
The trait Node requires that an iterator over the payload is returned but the type of this iterator depends on the struct implementing the Node trait (std::iter::Once<usize> vs. std::collections::hash_set::Iter<'a, usize>). So this should lead to dynamic dispatch as far as I understand.
I am a bit lost now that the code does not compile after adding some Box<..> around dyn ... in some places as suggested by the compiler.

Also, maybe only the usage of the code is incorrect as only the compilation of the test code fails now.

I guess, to switch to static dispatch, I would need to wrap the returned iterator into an enum capturing precisely the two allowed implementations of the trait Node (I saw something like this here / Simple Iterator Delegation; where also this example is referenced).
I have not tried this but I would not really be a fan of this either as I may want to allow users of my library to eventually extend the allowed implementations.

the code from the playground

#![allow(dead_code)]
use std::collections::{HashSet};
use std::fmt::{Debug};
use std::iter;

type BoxedNode<'a> = Box<dyn Node<'a, IterElements=Box<dyn Iterator<Item=usize> + 'a>>>;

trait Node<'a>: Debug {
    type IterElements;
    fn elements(&'a self) -> Self::IterElements;
    fn nested(&'a self) -> Option<&'a BoxedNode<'a>>;
}

#[derive(Debug)]
struct NodeSingle<'a> {
    data: usize,
    nested: Option<BoxedNode<'a>>,
}

impl<'a> Node<'a> for NodeSingle<'a> {
    type IterElements = Box<std::iter::Once<usize>>;

    fn elements(&'a self) -> Self::IterElements {
        Box::new(iter::once(self.data))
    }

    fn nested(&'a self) -> Option<&'a BoxedNode<'a>> {
        match &self.nested {
            None => { None }
            Some(x) => { Some(&x) }
        }
    }
}

#[derive(Debug)]
struct NodeSet<'a> {
    data: HashSet<usize>,
    nested: Option<BoxedNode<'a>>,
}

impl<'a> Node<'a> for NodeSet<'a> {
    type IterElements = Box<std::collections::hash_set::Iter<'a, usize>>;

    fn elements(&'a self) -> Self::IterElements {
        Box::new(self.data.iter())
    }

    fn nested(&'a self) -> Option<&'a BoxedNode<'a>> {
        match &self.nested {
            None => { None }
            Some(x) => { Some(&x) }
        }
    }
}

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn test() {
        let mut node_single = NodeSingle { data: 0, nested: None };
        let mut set = HashSet::default();
        set.insert(1);
        set.insert(2);
        let node_set = NodeSet { data: set, nested: None };
        node_single.nested = Some(Box::new(node_set));
        println!("{:?}", node_single);
    }
}

compile error

   Compiling playground v0.0.1 (/playground)
error[E0271]: type mismatch resolving `<NodeSet<'_> as Node<'_>>::IterElements == Box<dyn Iterator<Item = usize>>`
  --> src/lib.rs:67:35
   |
67 |         node_single.nested = Some(Box::new(node_set));
   |                                   ^^^^^^^^^^^^^^^^^^ type mismatch resolving `<NodeSet<'_> as Node<'_>>::IterElements == Box<dyn Iterator<Item = usize>>`
   |
note: expected this to be `Box<dyn Iterator<Item = usize>>`
  --> src/lib.rs:42:25
   |
42 |     type IterElements = Box<std::collections::hash_set::Iter<'a, usize>>;
   |                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   = note: expected struct `Box<dyn Iterator<Item = usize>>`
              found struct `Box<std::collections::hash_set::Iter<'_, usize>>`
   = help: `std::collections::hash_set::Iter<'_, usize>` implements `Iterator` so you could box the found value and coerce it to the trait object `Box<dyn Iterator>`, you will have to change the expected type as well
   = note: required for the cast from `Box<NodeSet<'_>>` to `Box<dyn Node<'_, IterElements = Box<dyn Iterator<Item = usize>>>>`

For more information about this error, try `rustc --explain E0271`.
error: could not compile `playground` (lib test) due to 1 previous error

The error message is confusing to me as (in help:) it states that the concrete iterator (seems to) implement the Iterator trait as required. Also the following suggestion seems to suggest to use another Box wrapping...

your definition require the assocated type IterElements be Box<dyn Iterator>

but your actual impls have IterElements be concrete iterator types, e.g.:

  • <NodeSingle as Node>::IterElements = Box<std::iter::Once<usize>>

  • <NodeSet as Node>::ItelElements = Box<std::collections::hash_set::Iter<'a, usize>>;

but your design has other problems as well, most notably, the type &'a BoxedNode<'a>in the type signature of nested() method will cause problems:

this kind of design feels very OOP, it's much simpler (and more idiomatic) to just define node types for tree-like data structure as enum. also, you probably don't want to use borrows for the sub elements, use Box for tree-like structure, and Rc/Arc for DAG-like structure. if cycle is possible, Weak should also be used. your original nodes can be simply changed to use enum like this:

enum Node {
    Single {
        data: usize,
        nested: Option<Box<Node>>,
    },
    Set {
        data: HashSet<usize>,
        nested: Option<Box<Node>>,
    }
}

as for the iterator, either trait objects or another enum are both ok to me, although trait objects are simpler to implement.

impl Node {
    // to be compatible with generic element types, this iterator yields references
    // if `usize` is the only element type, this can be changed to yields values
    // implementation can use `.copied()` for this
    fn elements(&self) -> Box<dyn Iterator<Item=&'_ usize> + '_> {
        match self {
            Node::Single { ref data, ..} => Box::new(iter::once(data)),
            Node::Set { ref data, ..} => Box::new(data.iter()),
        }
    }
}
1 Like

When trying to type-erase complicated traits like this, I often find it useful to define two traits instead, one that isn’t object-safe and directly specifies the desired behavior, and one that is object-safe with a blanket implementation for anything that implements the first. In your case it might look something like this:

type BoxedIter<'a> = Box<dyn 'a+Iterator<Item=usize>>;
type BoxedNode = Box<dyn ErasedNode>;

trait Node: Debug {
    fn elements(&self) -> impl Iterator<Item=usize>;
    fn nested(&self) -> Option<impl Node>;
}

impl<T:Node+?Sized> Node for &'_ T {
    fn elements(&self)->impl Iterator<Item=usize> {
        (*self).elements()
    }

    fn nested(&self)->Option<impl Node> {
        (*self).nested()
    }
}

impl<T:Node+?Sized> Node for Box<T> {
    fn elements(&self)->impl Iterator<Item=usize> {
        <T as Node>::elements(&**self)
    }

    fn nested(&self)->Option<impl Node> {
        <T as Node>::nested(&**self)
    }
}

trait ErasedNode: Debug {
    fn dyn_elements(&self)->BoxedIter<'_>;
    fn dyn_nested(&self)->Option<Box<dyn ErasedNode+'_>>;
}

impl<T:Node> ErasedNode for T {
    fn dyn_elements(&self)->BoxedIter<'_> {
        Box::new(self.elements())
    }

    fn dyn_nested(&self)->Option<Box<dyn ErasedNode+'_>> {
        Some(Box::new(self.nested()?))
    }
}

impl Node for dyn ErasedNode +'_ {
    fn elements(&self)->impl Iterator<Item=usize> {
        self.dyn_elements()
    }

    fn nested(&self)->Option<Box<dyn ErasedNode+'_>> {
        self.dyn_nested()
    }
}

(I may have gone a little overboard here…)

1 Like

Thank you both for your messages. I think that I will have to learn a lot more to get my project done :melting_face:

Actually, to provide a little more context, I want to implement an algorithm on top the "tree" trait Node (or a suitable alterantive of it).
Hence, this trait acts as an interface between concrete trees (user side) and my tree-based algorithm (library side).
The considered instantiation was to check the usability of this tree trait from the user side.

solution of @nerditation

  • Pros:
    • Your proposal really makes the code a lot simpler.
    • I tried to elaborate it a bit in the playground.
    • The trait I implemented for it also appears suitable.
      trait Node: Debug {
          fn elements(&self) -> Box<dyn Iterator<Item=&'_ usize> + '_>;
          fn nested(&self) -> Option<&Box<Self>>;
          fn set_nested(&mut self, node: Self);
      }
      
  • Cons:
    • Assuming that I provide code for N kinds of nodes, then somebody needing an additional kind of nodes would have to duplicate the whole enum with its impl first to extend it. I guess this is why I believed to need to move towards the dynamic dispatch.
Comment

You stated:

Still, I do not get why Box<std::iter::Once<usize>> cannot be provided when Box<dyn Iterator> is expected. Yes, apparently they have to be equal and not just in implementation relation but isn't this strange? Presumably that's just my Java background acting up here.

solution of @2e71828

  • Pros:
    • The kind of coupling in the enum-based solution of @nerditation is not assumed, which want (and right now believe to need).
  • Cons:
    • Your proposal is a bit unclear to me even after spending a lot of time trying to figure it out :face_with_peeking_eye: so I hope that you can help me with understanding some points of it?
Q1: restrictions of the static dispatch version?

Based on the two test functions, I believe that the solution supports static as well as dynamic dispatch, right?

That a solution with static dispatch is possible surprised me.
I then figured out (Playground line 140) that the first assignment to the field ..nested using node_single.nested = Some(...) fixes the type of that field.

I conclude that the static dispatch version is more restrictive and that I therefore need to know whether reassignments using other node-kinds are required in my case [or when I store multiple nested Nodes (to actually get a tree) in e.g. a vec, whether these nested Nodes should have the same type].
I suppose that, with your solution I may not need to decide this now as it supports both cases.

Q2: static dispatch version only?

I wondered whether your solution would be simpler when only supporting static dispatch.
Since the methods in the impl blocks impl<T: Node + ?Sized> Node for Box<T> {, impl<T: Node> ErasedNode for T {, and impl Node for dyn ErasedNode + '_ { are only called when using dynamic dispatch in fn test_dyn(), I thought that these could be removed but this drastic/naive 'simplification' is inadequate resulting in compile errors. However, having such code that is never called (when only using the static dispatch version) is also kind of strange, isn't it?

Q3: unreachable code?

I wondered how the methods in impl<T: Node> Node for &'_ T { could be executed? I added panic!() lines there but it appears that they are not called although this impl block is required for fn nested(&self) -> Option<&T> { in impl<T: Node> Node for NodeSingle<T> { where &T must be a Node.

Is there an obvious example on how this code could be called? In my attempts the & before a type is always somehow ignored.

Q4: static lifetime?

About the 'static lifetime: used in the default type parameter for NodeSingle and NodeSet as well as whn erasing the type of NodeSet in the test case.
Until now I somewhat never really used 'static and thought of it as something that should not be used if possible.

Returning the iterator, which refereces data in the tree, requires that the tree remains where it is until the iterators are dropped, right?
The same applies for the returned nested nodes because they are returned using the Box pointer?

Are there obvious limitations from this 'static lifetime?
I tried to add a lifetime paramter to NodeSingle and NodeSet but without a field using it, the lifetime could not stay (unless maybe with some phantomdata field?)

Although the former can coerce to the latter, there is no trait-based sub/supertype relationship. And no dynamic typing either.

I personally found it really helps to think of dyn Trait + '_ as a concrete type, with it's own implementations and so on. Which... it is! But a lot of learning material doesn't really dwell on this or even necessarily point it out.

1 Like

if the possible "sub"-types are open, then dynamic dispatch is a better choice. but it's a orthogonal (and different) problem for the compile error as shown in OP.

because they are different types.

yes, you can coerce the former to the latter, but if the type doesn't match, then your trait constraint is not satisfied.

let's call the former Foo and latter Bar,

the compile error doesn't say Foo and Bar don't match, it actually says dyn Node<IterElements = Foo> and dyn Node<IterElements = Bar> don't match, and so you cannot give a value Some(Box::new(node_set)) (which has type Option<Box<dyn Node<IterElements = Foo>>>) to node_single.nested (which expects Option<Box<dyn Node<IterElements = Bar>>>).

the fix for this issue (type mismatch) is very straitforward: you just impl the trait with different associated IterElements, (but as I already said, your code has other issues, and this will not fix them)

 impl<'a> Node<'a> for NodeSingle<'a> {
-    type IterElements = Box<std::iter::Once<usize>>;
+    type IterElements = Box<dyn Iterator<Item = usize>>;
     fn elements(&'a self) -> Self::IterElements {
         // no change to the implementation!!!
         // because of unsize coercion for box of trait objects
         Box::new(iter::once(self.data))
     }
     //...
 }
 
 impl<'a> Node<'a> for NodeSet<'a> {
-    type IterElements = Box<std::collections::hash_set::Iter<'a, usize>>;
+    type IterElements = Box<dyn Iterator<Item = usize>>;
     //...
 }

but then you realize that, to make the trait objects the same types, the associated type is totally unnecessary since they end up being the same for all the implementations (actually it should be obvious from the beginning when you decided to use trait object, if you know how trait objects work), so you just remove it and hardcode the iterator type, then both the trait and the implementations are drastically simplified:

type BoxedNode<'a> = Box<dyn Node<'a>;
trait Node<'a>: Debug {
    fn elements(&'a self) -> Box<dyn Iterator<Item = usize>>;
    fn nested(&'a self) -> Option<&'a BoxedNode<'a>>;
}

but still, I suggest you to re-think about lifetimes and ownerships.

This is correct. If you know up front the exact types involved, you can use static dispatch. If you need to decide them at runtime or store different types in a single collection, then you’ll want to look at dynamic dispatch.

It’s also possible to mix them. For example, you may have a subtree pattern that always uses the same structure— It can use static dispatch internally but be coerced into a dyn Node to go in a Vec with other node types.

Even in the leaf nodes, you have to specify a child node type for the static analysis to work. Here, it’s BoxedNode<‘static> because of the default type parameter, and that relies on the impl Node for dyn ErasedNode block to be a valid Node.

You can get rid of the ErasedNode machinery altogether if you define a different type to indicate the bottom of the tree:

impl Node for () {
    fn elements(&self)->impl Iterator<Item=usize> {
        std::iter::empty()
    }
    fn nested(&self)->Option<()> {
        None
    }
}

#[derive(Debug)]
struct NodeSingle<T=()> {
    data: usize,
    nested: Option<T>,
}

impl<T:Node> Node for NodeSingle<T> {
    fn elements(&self) -> impl Iterator<Item=usize> {
         iter::once(self.data)
    }

    fn nested(&self) -> Option<&T> {
        self.nested.as_ref()
    }
}

There’s two things to note here. First, because all of the methods take &self, you need at least two levels of reference to naturally invoke this.

Second, I defined the nested function in the trait to return Option<impl Node> so &T needs to implement Node for Option<&T> to be a valid return type. You can therefore force this codepath to be used by working with a generic Node, e,g, going through the type erasure path.

    #[test]
    fn test_ref_forwarding() {
        let node = NodeSingle { data:0, nested:Some(NodeSingle{data:1, nested:None::<BoxedNode<'_>>})};
        dbg!((&&node).elements().collect::<Vec<_>>());
    }

    #[test]
    fn test_ref_dyn() {
        let node = NodeSingle { data:0, nested:Some(NodeSingle{data:1, nested:None::<BoxedNode<'_>>})};
        let node:BoxedNode<'_> = Box::new(node);
        dbg!(node.nested().unwrap().elements().collect::<Vec<_>>());
    }

These are both correct. The rules for impl Trait in trait methods say that they implicitly capture all lifetimes in the function signature, which is more-or-less another way of saying the same thing.

Not really; it only appears as part of a default type parameter, so downstream code can (and likely will) override it with something else as necessary. If that happens, the ’static goes away as well.