Trait objects with associated types

I just read the section in the book of the same title as this post, and I am confused. It says that when operating with Box'ed traits with associated types, the implementations of the child types should be explicitly declared like this:

let graph = MyGraph;
let obj = Box::new(graph) as Box<Graph<N=Node, E=Edge>>;

Doesn't this defeat the purpose of associated types? If the trait object "knows" the exact implementation of the trait, and that impl defines the specific associated types, than isn't it logical to assume that trait object should know about them as well? Am I missing something?

How do you type check obj.edges(x) if N isn't defined?

It may be redundant in your example, if there is only one known implementation of the trait for the object, but it's necessary when receiving the trait object in a function or a struct.

struct GraphCollection {
    graphs: Vec<Box<Graph>>,  //What is N and E here?
    ...
}

When you have a trait object Box<Graph>, it somehow contains the information of the specific implementation of Graph (I suppose, via something like a virtual method table). If the compiler knows to convert graph.add_node() to MyGraph::add_node, I don't see, why it can't also do the same to the methods of Graph::E and Graph::N.

You will never be able to use N and E if they are undefined. You can't possibly know what they can be used for and what fields they may have if they are structures at all. Let's extend my example:

impl GraphCollection {
    fn use_graphs(&self, ...) {
        for graph in &self.graphs {
            //What can I do here?
        }
    }
}

I can't use has_edge or edges because I don't know what N is. Each graph may as well have different N and E types and that will make them even more useless here.

[quote="eterevsky, post:4, topic:746"]
via something like a virtual method table
[/quote]The methods need to have compatible signatures or else how can you call them? This isn't a scripting language.

The methods need to have compatible signatures or else how can you call them? This isn't a scripting language.

I agree, but I think in most practical examples the associated types will also implement some traits. Consider this:

trait NodeTrait {}
trait EdgeTrait {}

trait Graph {
  type N: NodeTrait;
  type E: EdgeTrait;
}

Now, can I handle the boxed trait objects like

Box<Graph<N=NodeTrait, E=EdgeTrait>>

instead of the concrete types?

Come to think of it, even this explicit specification of the traits in the Box shouldn't be required. The types Graph::N and Graph::E should just silently implement all the traits that are defined for them in the parent trait.

If you're after dynamic dispatch you should probably go with trait objects all the way:

fn edges(&self, &NodeTrait) -> Vec<Box<EdgeTrait>>;

I still don't quite understand, why I can't do the following. Suppose I have a vector of boxed graphs (containing different implementations of the trait), and I want to iterate over it, applying some function to its members:

...
let graphs: Vec<Box<Graph>> = ...;
for graph in graphs.iter() {
  count_edges(graph);
}
...
fn count_edges<G: Graph>(graph: G) -> u32 {
  for node in graph.iter_nodes() {
    ...  // node is of the type &G::N
  }
}

Inside of the loop, we may call any methods of the traits defined for Graph::N.

This might be pretty useful. The count_edges function will use static dispatch, since all the types inside it are defined. But if I understand correctly, it's currently impossible to do this, because you are not allowed to create a non-uniform vector of trait objects to begin with.

Isn't this like the point of the difference between static and dynamic dispatch?
This count_edges relies on static monomorphization to find out the signatures of a node's methods (each copy of count_edges specific to its concrete G will use a different signature). After type erasure when you make a trait object, the virtual methods signatures need to be uniform, because there's no way to adapt to them at runtime.

Can I ask why dynamic dispatch variant from my previous post doesn't work for you?

Dynamic dispatch does work, of course. And, say, in Java it will be the only way. But since Rust already has associated types, why not use them here?

I do not see any fundamental problems with switching from dynamic dispatch to static dispatch as in my example. The top call is performed using a virtual method table (or its equivalent), but once we land in an instance of the count_edges method (BTW, I forgot the & in its declaration; of course, graph is passed by reference.), we have the complete static knowledge of the implementation of class G.

The benefits of such approach are clear: firstly, it many cases it will make class hierarchies and function declarations much simpler. The very idea of associated types was to remove redundant repetitions of type parameters in function declarations. Secondly, it's performance. While there is no substantial difference between static and dynamic dispatch on the top level, on the lower level the difference in performance may become significant. For example, in my very example from the previous comment if you change graph.iter_nodes() from iterating by direct node references to iterating by boxed traits, it will require: a) allocating memory for trait objects for each node, b) creating those trait objects, c) dispatching the call to the node via the trait wrapper. In case of very simple calls, you can get huge overhead from this.

The thing is that the concrete types of the graphs were erased when they were turned into Box<Graph>, as far as I know. The vtable in a Box is only a reference to the methods and not to what type it was (at least that's how I interpreted it from this post). That makes the above example tricky. It will probably try to call count_edges::<Box<Graph>> and that will make G::N very hard to determine at compile time. Everything must be known at compile time to make things work. I know too little about the internals of the compiler, but I don't think it's possible to select the correct version of count_edges in this case.

Thanks, I see the problem now. So, we can branch by the trait's methods implementation, but it's harder to do for a 3rd party function. But what if count_edges() is a method of the Graph trait? Surely, it'll be dispatched to its correct implementation, and it will know which implementations of the associated classes to use.

Speaking of vtables. They are not written directly into trait objects. A trait object contains a reference to vtable. This reference may be used as an identifier for trait implementation. For a 3rd party generic function over a trait object you could store a hashmap of implementations depending on the concrete type of the object. There's one caveat though with this approach: for a function that accepts several independent generic parameters, you'll have to compile a version for each combination of concrete types of the parameters. This will lead to an exponential growth of the number of versions of the function, which is bad. Maybe it'd be ok if such dynamic-to-static dispatch was restricted to just one type-parameter at a time?

It would definitely work if count_edges was a trait method, given that N and E are known. It's hard to utilize the vtable pointer, since it's determined at runtime, but I don't know. It may be possible somehow.

Similar to @eterevsky, I thought that associated types could be used to encode existential types (abstract types), as attempted below.

However, the only way I can get rustc to accept this code is when I say exactly what these associated types are, and expose their identities in terms of concrete types (i.e., String and u32).

This is exactly what I wanted to avoid. In particular, the tuple pattern that works below doesn't generalize to the case where I have an array or a list.

In those cases, I guess I need to abstract the associated type with another trait?
Is that right?

Details:

The trait T has associated type t. I define two implementations for this trait, with different definitions of t. Then, I try to create a pair where the first and second components of the pair look "the same". That is, I want that both just mention trait T, and each leaves t abstract, via an implicit existential quantifier. However, that version of the code is rejected by rustc. It insists that I give the concrete identity of t in both cases.

Questions:

Suppose, I want to create an array, list, or other generic structure with T objects, not just pairs of T objects. The pattern below is undesirable, since in these cases, I want the two pair components to have exactly the same type, including exactly the same associated types.

What's the canonical solution to this kind of problem?

In OCaml, I'd use first-class modules. I'm stuck since I can't find the corresponding pattern in Rust. Perhaps it requires a more involved use of trait objects?

// Traits can have associated types:
#![feature(collections)]

use std::fmt::Debug;
//use std::string;

trait T : Debug {
    type t;
    fn get (self:&Self) -> Self::t ;
    fn foo (self:&Self, Self::t ) -> String ;
}

#[derive(Debug)]
pub enum Foo { Foo }

#[derive(Debug)]
pub enum Bar { Bar }

impl T for Foo {
    type t = String;
    fn get (self:&Self) -> String { String::from_str("foo") }
    fn foo (self:&Self, s:Self::t) -> String { s }
}

impl T for Bar {
    type t = u32;
    fn get (self:&Self) -> u32 { 0 }
    fn foo (self:&Self, _s:Self::t) -> String { String::from_str("bar") }
}

#[derive(Debug)]
struct Packed {
    two : (Box<T<t=String>>,Box<T<t=u32>>)
}

pub fn main () {
    let a = Packed { two : (Box::new(Foo::Foo),
                            Box::new(Bar::Bar)) } ;
     println!("{:?}", a);
}
1 Like

I'm not sure I understand the problem correctly, but are you asking if you can "forget" the associated types in Packed? That's impossible because you can't know what kind of T it stores and thereby not use them. You can always use generics to leave the choice to the user:

struct Packed<A, B> {
    two: (Box<T<t=A>>, Box<T<t=B>>)
}

Yes, I want to "forget" the associated types.

In principle, this seems possible, since Rust has a mechanism of using objects that are trait T to forget whether the object is a Foo or a Bar, as in my example above. Forgetting this distinction is the entire point of traits, as I understand them (which is still limited).

Unfortunately, the solution that you give just pushes the problem to the user (using the generic types A and B). I don't want the user of this code to specify that type, I want the code that creates the boxes to make that choice, not the consumer of the boxed T.

Also, and perhaps more importantly, the pattern that you give doesn't work when I want a list or an array where each Box<T> in the structure makes a different choice for the associated type T::t.

An associated type can't be forgotten if you want to be able to use it. It is just like any other type parameter, with the exception that it's impossible to make multiple implementations of the same trait where only the associated type is different. If you would have something of the type Box<T>, where T::t is erased, you wouldn't be able to use get or foo because you wouldn't know what get returns or what foo takes. A way to totally erase T::t is to use an other trait as a "second layer":

trait AnyT: Debug {
    //...
}

impl<A, B: T<t=A>> AnyT for B {
    //...
}

struct Packed {
    two : (Box<AnyT>,Box<AnyT>)
}

You will not be able to use anything that depends on T::t because it is no longer known, but you can implement proxy methods for independent methods. Is this closer to what you want to do?

Yes, this is the pattern that I've been trying to find. Thanks @ogeon!

It seems that the way one leaves an associated type t abstract is by forgetting about it altogether. In particular, one can introduce a new trait that omits the associated type t entirely. Then, one can write a conversion that works universally. I use trait S below for this purpose (@ogeon's suggestion above uses trait AnyT for this purpose).

Here's that suggested pattern applied to my earlier example:

trait T : Debug {   // The same as before
    type t;
    fn get (self:&Self) -> Self::t ;
    fn doit (self:&Self, Self::t ) -> String ;
}

trait S : Debug { // S is implemented for all T below
    fn doit (self:&Self) -> String ;
}

#[derive(Debug)]
struct Packed {
    two : (Box<S>,Box<S>)  // Use trait S here, not trait T
}

impl <A, B: T<t=A>> S for B { // Convert any T object into an S object
    fn doit (self:&Self) -> String {
        self.doit ( self.get () )
    }
}

In sum, I've added trait S, which is similar to T except that associated type T::t is absent.

This pattern is satisfying since it provides three useful views of associated type t in trait T:

  1. The concrete view in Foo and in Bar, where t is known.

  2. The abstract view of t, e.g., within the implementation of S for any trait T. In this case, t is known, but it is abstract.

  3. The "forgetful" view provided by S, where T::t is absent, and thus any differences in type T::t are irrelevant. In this example, boxed versions of Foo::Foo and Bar::Bar can be given a common type Box<S>. Hence, they can be placed into data structures such as arrays and lists, etc.

This makes sense now. Thanks again!

3 Likes