Stabilized Trait Upcasting does not work with borrows

Hello,
according to the following pages, upcasting has been stabilized with Rust 1.86:

but the minimal code example below still does not compile (Stable, Edition 2024, Rust 1.92):

trait Sup {
    fn do_sup(&self);
}

trait Sub: Sup {
    fn do_sub(&self);
}

impl Sup for () {
    fn do_sup(&self) {
        println!("Doing sup things!");
    }
}

impl Sub for () {
    fn do_sub(&self) {
        println!("Doing sub things!");
    }
}

fn expecting_borrow_sup_box(a: &Box<dyn Sup>) {
    a.do_sup();
}

fn expecting_sup(a: &dyn Sup) {
    a.do_sup();
}

fn main() {
    // Stack allocated trait objects, i.e. borrowed values work
    let a: &dyn Sub = &();
    expecting_sup(a);
    
    // Coercion requires strange syntax
    // FAILS:
    // let b: &Box<dyn Sub> = &Box::new(()) as &Box<dyn Sub>;
    
    // WORKS:
    let b: &Box<dyn Sub> = &(Box::new(()) as Box<dyn Sub>);
    
    // Upcasting fails for borrowed values (owned values work in some cases)
    //expecting_borrow_sup_box(b);
}

Errors:

   Compiling playground v0.0.1 (/playground)
error[E0308]: mismatched types
  --> src/main.rs:41:30
   |
41 |     expecting_borrow_sup_box(b);
   |     ------------------------ ^ expected trait `Sup`, found trait `Sub`
   |     |
   |     arguments to this function are incorrect
   |
   = note: expected reference `&Box<(dyn Sup + 'static)>`
              found reference `&Box<dyn Sub>`
note: function defined here
  --> src/main.rs:21:4
   |
21 | fn expecting_borrow_sup_box(a: &Box<dyn Sup>) {
   |    ^^^^^^^^^^^^^^^^^^^^^^^^ ----------------

For more information about this error, try `rustc --explain E0308`.
error: could not compile `playground` (bin "playground") due to 1 previous error

It looks like upcasting works for:

  • Raw stack allocated trait objects
  • Owned smart pointers

But fails for references to smart pointers.

Is this expected, and why doesn't Rust support it for borrowed smart pointers?

This restricts you to owned values. It seems the coercion operator also has not been updated to catch trait objects inside smart pointers.

This is expected and AFAIK cannot be supported. In particular, coercion to a trait object and upcasting work only when the data being coerced/upcasted is behind exactly one level of indirection (i.e. one pointer, reference, smart pointer, etc.). In your case the () in &Box<()> is behind two pointers, so it's not supported.
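A minimal sketch of what this rule still allows, assuming Rust 1.86 or later: reborrowing through the Box yields a &dyn Sub, which is behind a single level of indirection and can therefore be upcast. The traits mirror the ones in the question, except that the methods return strings instead of printing so the result can be checked; call_through_borrowed_box is a hypothetical helper name.

```rust
trait Sup {
    fn do_sup(&self) -> &'static str;
}

trait Sub: Sup {
    fn do_sub(&self) -> &'static str;
}

impl Sup for () {
    fn do_sup(&self) -> &'static str {
        "sup"
    }
}

impl Sub for () {
    fn do_sub(&self) -> &'static str {
        "sub"
    }
}

fn expecting_sup(a: &dyn Sup) -> &'static str {
    a.do_sup()
}

// Hypothetical helper: starts from the borrowed Box that the question has.
fn call_through_borrowed_box(b: &Box<dyn Sub>) -> &'static str {
    // `**b` is the `dyn Sub` inside the Box; `&**b` is a fresh `&dyn Sub`.
    // That reference is a single fat pointer, so the upcast to `&dyn Sup`
    // can happen at this call site (requires Rust 1.86+).
    expecting_sup(&**b)
}

fn main() {
    let b: Box<dyn Sub> = Box::new(());
    println!("{}", call_through_borrowed_box(&b));
}
```

This does not make &Box<dyn Sub> convertible to &Box<dyn Sup>; it only avoids needing that conversion when the callee can accept &dyn Sup.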

To understand this, you first need to understand that the vtable pointer for a trait object is stored in the pointer pointing to it. So for example in the case of Box<dyn Sup> the Box is actually made up of two pointers: one pointing to the heap allocation and one pointing to the vtable. When you cast a Box<()> to a Box<dyn Sup> or a Box<dyn Sub> to a Box<dyn Sup>, the pointer to the vtable is either added (for Box<()>, which had none before) or possibly changed (for Box<dyn Sub>, i.e. when performing upcasting). To do this you have to change the Box value itself, which is not possible if what you actually have is a shared reference to the Box.
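The two-pointer layout described above can be observed with size_of. This is only a sketch: the exact sizes are what current rustc produces rather than something the post itself states, and words is a hypothetical helper.

```rust
use std::mem::size_of;

trait Sup {
    fn do_sup(&self);
}

// Hypothetical helper: size of `T` measured in pointer-sized words.
fn words<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}

fn main() {
    // A Box of a sized type is a single (thin) pointer.
    println!("Box<()>:       {} word(s)", words::<Box<()>>());
    // A Box or reference to a trait object carries data pointer + vtable pointer.
    println!("Box<dyn Sup>:  {} word(s)", words::<Box<dyn Sup>>());
    println!("&dyn Sup:      {} word(s)", words::<&dyn Sup>());
    // A reference *to* the Box is thin again: it only points at the fat pointer,
    // so there is no vtable slot in it that a coercion could rewrite.
    println!("&Box<dyn Sup>: {} word(s)", words::<&Box<dyn Sup>>());
}
```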

Thank you for your explanation. I think the error message is confusing in that case.

I expected "coercion" or "upcasting" not to trigger any operation, and to be something that only takes place at the type level (just changing how the compiler translates interactions with the type). Especially since the compiler knows that Sub must implement everything Sup has (as denoted by the trait bounds), it sounds quite inefficient to me to replace the vtable. The original one already contains all function pointers with exactly matching function signatures, so in theory there should be no need to replace it, at least for going from Sub to Sup (which I would expect to be the more common case).

Maybe I misunderstood you regarding indirection and modification of the Box itself, because it's not clear to me why storing the vtable pointer in the pointer implies that there can be only one level of indirection. For example, Vec<Box<dyn Sub>> has one more level of indirection, but both structures are owned. Why shouldn't the compiler replace the vtable pointers in such a case?

dyn Sup is a concrete type, not a generic placeholder. Every instance of this type has to have the exact same layout, which includes the vtable.

I understood that the compiler currently requires this. I was just curious about the reasoning behind this, since it maybe could be done more efficiently.

For static dispatch the "one concrete type" requirement works due to monomorphization, but applying the same constraint to trait objects does not seem to translate well (in the sense of ergonomics).

To wrap it up, I thought an approach like this had been implemented (or planned?):

  • Got a dyn Sub
  • This function requires a dyn Sup
  • Provide a dyn Sub as dyn Sup since it anyway is a pointer with fixed size and a vtable that at least contains the same functions as a dyn Sup would contain
  • (Constraint: Keep Layout compatible for types implementing multiple traits or add indirection to trait specific vtables so that all tables have a fixed layout)

The compiler sees that Vec<Box<dyn Sub>> contains a pointer to Box<dyn Sub>, but it doesn't know if there's actually a Box<dyn Sub> there or how many there are to update.

How would you do this in general? How can the trait vtable for Sub be compatible with the trait vtables for both Sup1 and Sup2?

You may claim this could work with only a single supertrait Sup, but that requires committing to exposing this implementation detail; it would also make a breaking change to add a new supertrait to Sub (assuming you already handled the other semver issues of course).

What do you mean by adding a new supertrait? Something like this:

trait Sup1 {
// fn a
}
trait Sup2 {
// fn b, fn c
}

trait Sub: Sup1 + Sup2 {
// fn d
}

// vtable for every type implementing Sub:
fn a
fn b
fn c
fn d

// vtable for every type implementing Sup1
fn a
empty
empty
empty

// vtable for every type implementing Sup2
empty
fn b
fn c
empty

(I see the limitation: such an approach would easily create really big tables, wasting memory.) Instead one could just store pointers to multiple vtables:

// Type implementing Sub (and therefore Sup1 and Sup2):
vtable to Sub functions for type
vtable for Sup1 functions for type
vtable for Sup2 functions for type

// Type implementing Sup1:
empty
vtable for Sup1 functions for type
empty

Maybe more sophisticated approaches are possible, involving a combination of static and dynamic dispatch based on function argument trait bounds, but that may require architectural changes in the compilation and linkage process of the Rust toolchain.

I'm not sure how this relates to server compatibility or how this on its own would be a breaking change, since:

  • It's anyway the responsibility of the developer to handle this correctly, i.e. changing major version if an API change breaks backward compatibility.
  • Rust Programs are anyway compiled from source and need re-compilation depending on what has been changed and adding a new super trait means adding a new trait bound to all types implementing Sub in this example -> Already a breaking change.

This is a major drawback that's not easy to solve.

Moreover it requires delaying the codegen of a trait object until all its usages are known, which is not ideal for compile times.

Once those more sophisticated approaches are discovered, this limitation can be fully lifted.

Sorry, that point assumed familiarity with how trait upcasting is currently implemented. Currently the subtrait's vtable layout is compatible with that of the first supertrait, while for the second supertrait onward the subtrait's vtable stores a pointer to that supertrait's vtable, and upcasting to one of those traits replaces the vtable pointer with the one stored in the subtrait's vtable.

This means that technically, with the current implementation details, upcasting to the first supertrait (and only that one) is a noop. This would allow some of the upcasts that you were originally expecting to work, but it would require exposing this rather arbitrary implementation detail. If this was done (i.e. upcasting to the first supertrait becomes a language level noop, basically making those trait objects subtypes) then adding a new supertrait as first supertrait would be a breaking change for a library.
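As a sketch of what this looks like from the language side (assuming Rust 1.86 or later): both upcasts below are written the same way and both work on an owned Box, and nothing in the source code reveals which one is a no-op. Sup1, Sup2 and Sub follow the shape of the earlier example with string-returning methods added for illustration; upcast_first and upcast_second are hypothetical helper names.

```rust
trait Sup1 {
    fn a(&self) -> &'static str;
}

trait Sup2 {
    fn b(&self) -> &'static str;
}

trait Sub: Sup1 + Sup2 {
    fn d(&self) -> &'static str;
}

impl Sup1 for () {
    fn a(&self) -> &'static str {
        "a"
    }
}

impl Sup2 for () {
    fn b(&self) -> &'static str {
        "b"
    }
}

impl Sub for () {
    fn d(&self) -> &'static str {
        "d"
    }
}

// Upcast to the first supertrait. Per the explanation above, the current
// implementation may be able to keep the vtable pointer here, but that is an
// implementation detail and not something the language promises.
fn upcast_first(x: Box<dyn Sub>) -> Box<dyn Sup1> {
    x
}

// Upcast to the second supertrait. Here the Box value itself has to be
// rebuilt with a different vtable pointer, which is why ownership of the
// fat pointer (or at least a fresh copy of it) is needed.
fn upcast_second(x: Box<dyn Sub>) -> Box<dyn Sup2> {
    x
}

fn main() {
    println!("{}", upcast_first(Box::new(())).a());
    println!("{}", upcast_second(Box::new(())).b());
}
```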

ps: it's semver (short for Semantic Versioning)

Care to explain how do you plan to do that?

Find a copy of The Annotated C++ Reference Manual and you will find a long discussion of the pluses and minuses of different approaches there.

Yes, this story is over 35 years old. Rust has just moved vtable pointers from objects to “fat pointers”; the rest of the story is exactly the same as it was with C++ back then.

Yes, but they are not compiled from source as one unit of compilation that includes all the crates in the project.

Your scheme would need exactly that… and it looks just a tiny bit impractical to provide something that not a lot of developers need or want.

Maybe, but expect these to be discovered closer to 2060. Lindy's Law says that's the realistic expectation.

No one seems to have stated this simply and directly yet:

The situations where trait upcasting can happen are the same as the situations where coercion to dyn Trait in the first place can happen.[1]
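A small sketch of that parallel, assuming Rust 1.86 or later. The traits mirror the ones in the question, with methods that return strings so the result can be checked; demo is a hypothetical helper. The two commented-out lines are the failing cases already reported in the first post.

```rust
trait Sup {
    fn do_sup(&self) -> &'static str;
}

trait Sub: Sup {
    fn do_sub(&self) -> &'static str;
}

impl Sup for () {
    fn do_sup(&self) -> &'static str {
        "sup"
    }
}

impl Sub for () {
    fn do_sub(&self) -> &'static str {
        "sub"
    }
}

// Hypothetical helper that performs both conversions on an owned Box.
fn demo() -> (&'static str, &'static str) {
    // Coercion to a trait object: Box<()> -> Box<dyn Sub>. Works.
    let owned: Box<dyn Sub> = Box::new(());
    let sub_result = owned.do_sub();

    // Upcasting at the same kind of site: Box<dyn Sub> -> Box<dyn Sup>. Works.
    let upcast: Box<dyn Sup> = owned;
    let sup_result = upcast.do_sup();

    // Behind an extra reference, neither conversion applies
    // (both lines are taken from the first post):
    // let b: &Box<dyn Sub> = &Box::new(()) as &Box<dyn Sub>; // FAILS
    // expecting_borrow_sup_box(b); // FAILS for b: &Box<dyn Sub>

    (sub_result, sup_result)
}

fn main() {
    println!("{:?}", demo());
}
```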

Incidentally the RFC went over many of the talking points.


  1. Your playground actually contained an example of this: FAILS: let b: &Box<dyn Sub> = &Box::new(()) as &Box<dyn Sub>; ↩︎

That's true… but how is that relevant? These are two different operations; there is no need for them to only be applicable in the same situations…

An explanation of where trait upcasting can apply is relevant to a thread about where trait upcasting can apply. That is how the language works today.

I mean I just took naive examples to solve it.
So is it really some unsolved issue? Other languages solved it by using object vtables instead of Fat Pointers as far as my understanding goes.

That was an autocorrect :) But I don't get how this is related to semver compatibility. Semver compatibility of the compiler, or of the source code someone writes? I think the first is solved by introducing new Rust editions. In the second case, if someone for example distributes a library and adds a new supertrait, it's a breaking change anyway: you require all users to implement the new supertrait for their types.

I gave a description in my post to which you replied.

That's why I said it would probably require architectural changes in the compilation and linkage process.

Why do you think this is something which "not a lot of developers need or want"? I'm not aware of any data backing this. Based on how many questions are asked about some form of polymorphism in Rust, I would disagree with a general statement like this.

Thanks for the hint, the RFC is very informative.

&dyn Trait is a type strictly linked to the trait.
A statement like &dyn Trait as &dyn Sup is somewhat similar to doing i32 as i8: it produces a new value of a different type.
And &(&dyn Trait) as &(&dyn Sup) is invalid because it would be kind of like doing &i32 as &i8.

Relevant conversation on zulip: #t-lang > Should we support dyn dispatch on &Adt<_>?

Nope, “other languages” (if you mean C# or Java) solved it by JIT-compiling everything and constructing these vtables at runtime, then garbage-collecting the unused ones.

Yes, but the suggestion you offered is clearly not serious. It needs to turn the whole language on its ear to implement a fringe, rarely used feature.

From experience. It's something that C++ supports but that's not used very often.

The number of people who even register on that forum is such a tiny percentage of users that “questions asked” is not representative of anything.

And ultimately the question is not “how often is this asked” but “how often would this benefit anyone”.

Given that you are proposing to impose a huge cost on everyone to benefit very few (most developers don't even know such features may exist, so they wouldn't ask about them on forums)… the onus is on you to prove that it's worth it.

Even the existing coercion is a pretty “heavy” feature and not easy to support; what you are proposing would be an order of magnitude more impressive and an order of magnitude less useful (given that what we have is already of limited utility).

I just wanted to discuss this openly. Simply saying that nobody wants or needs this, without any data backing it, is just a subjective opinion (as would be saying that it's beneficial for everyone to support this, which I did not and don't want to claim). I just argue that there are situations where this would be beneficial, and outside of Rust it's a very common pattern to have polymorphic types in programs. An example would be something like this:

trait A {
    fn do_a(&self);
}

trait B: A {
    fn do_b(&self);
}

impl A for () {
    fn do_a(&self) {
        println!("Do a");
    }
}

impl B for () {
    fn do_b(&self) {
        println!("Do b");
    }
}

fn use_a_vec(data: &[Box<dyn A>]) {
    for item in data {
        item.do_a();
    }
}

fn use_b_vec(data: &[Box<dyn B>]) {
    for item in data {
        item.do_b();
    }
    // Fails
    use_a_vec(data);
}

fn main() {
    let data = vec![Box::new(()) as Box<dyn B>, Box::new(()) as Box<dyn B>];
    use_b_vec(&data[..]);
}

(EDIT: Same issue occurs with slices of references to trait objects, i.e. when not using Box or other Pointer types)

The only way I have found so far to make this work is to create a new Vec and clone and coerce every Box element-wise, leading to a lot of additional allocations just to be able to work with the supertrait.
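One possible way to avoid cloning the boxes, sketched under the assumption of Rust 1.86 or later: borrow each element as &dyn B, upcast that reference to &dyn A, and collect the references. This still allocates one Vec of fat pointers, but it does not clone or reallocate the boxed values. The traits match the example above except that the methods return a String so the result can be checked; use_a_refs is a hypothetical replacement for use_a_vec that takes references instead of boxes.

```rust
trait A {
    fn do_a(&self) -> String;
}

trait B: A {
    fn do_b(&self) -> String;
}

impl A for () {
    fn do_a(&self) -> String {
        "Do a".to_string()
    }
}

impl B for () {
    fn do_b(&self) -> String {
        "Do b".to_string()
    }
}

// Hypothetical variant of use_a_vec: accepts borrowed trait objects.
fn use_a_refs(data: &[&dyn A]) -> Vec<String> {
    data.iter().map(|item| item.do_a()).collect()
}

fn use_b_vec(data: &[Box<dyn B>]) -> Vec<String> {
    // `&**item` reborrows the boxed value as `&dyn B`; the cast then upcasts
    // that single fat pointer to `&dyn A`. No Box is cloned or moved.
    let as_a: Vec<&dyn A> = data.iter().map(|item| &**item as &dyn A).collect();
    use_a_refs(&as_a)
}

fn main() {
    let data = vec![Box::new(()) as Box<dyn B>, Box::new(()) as Box<dyn B>];
    for line in use_b_vec(&data) {
        println!("{line}");
    }
}
```

If the callee can be changed further, taking an iterator of &dyn A instead of a slice would avoid even the intermediate Vec.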

I was referring to C++, which is not a JIT compiled language.

I would say that it's very subjective without a “business case”, at least one. Not “here is a neat trick that we may add to the compiler to make it even more complicated”, but “here is the business case that one may want to solve” (not related to Rust in any shape or form), and then the series of steps that made you arrive at your desire.

Because when dealing with Rust it's incredibly common to attempt to shove a square peg into a round hole, and very often you can avoid the corner you paint yourself into by simply making different choices on the steps that lead to that corner. Thus discussing other alternatives that could be applicable to your “business case” would be more prudent than expecting people to add a way to bring OOP into Rust “via the back door”.

From what I understand, the concern that people would try to bring OOP into Rust even via the existing coercion mechanism was quite real, but eventually it was decided that it's not too bad and that there are valid use cases for it besides emulating OOP.

C++ has the exact same problem, sorry:

#include <array>
#include <iostream>
#include <memory>
#include <span>

class A {
 public:
  virtual void do_a() {
    std::cout << "Do a\n";
  }
};

class B : public A {
 public:
  virtual void do_b() {
    std::cout << "Do b\n";
  }
};

void use_a_vec(std::span<std::unique_ptr<A>> data) {
    for (auto& item : data) {
        item->do_a();
    }
}

void use_b_vec(std::span<std::unique_ptr<B>> data) {
    for (auto& item : data) {
        item->do_b();
    }
    use_a_vec(data);
}

int main() {
    std::array<std::unique_ptr<B>, 2> array = {
        std::make_unique<B>(),  std::make_unique<B>()
    };
    use_b_vec(std::span<std::unique_ptr<B>>{begin(array), end(array)});
}

And for the exact same reasons (which were expressed very explicitly in the ARM 35 years ago; that's precisely why I was talking about 35 years, even though Rust wasn't even dreamed about back then). Rust uses the same approach as C++, just in a slightly modified version, and has the same limitations.

Yup. And it works precisely like Rust: conversion from std::unique_ptr<B> to std::unique_ptr<A> is possible, conversion from std::span<std::unique_ptr<B>> to std::span<std::unique_ptr<A>> is not possible:

clang:
error: no matching function for call to 'use_a_vec'
   30 |     use_a_vec(data);
      |     ^~~~~~~~~
note: candidate function not viable: no known conversion from 'span<unique_ptr<B>>' to 'span<unique_ptr<A>>' for 1st argument
   20 | void use_a_vec(std::span<std::unique_ptr<A>> data) {

gcc:
error: could not convert 'data' from 'span<unique_ptr<B>>' to 'span<unique_ptr<A>>'
   30 |     use_a_vec(data);
      |               ^~~~
      |               |
      |               span<unique_ptr<B>>

msvc:
error C2664: 'void use_a_vec(std::span<std::unique_ptr<A,std::default_delete<A>>,18446744073709551615>)': cannot convert argument 1 from 'std::span<std::unique_ptr<B,std::default_delete<B>>,18446744073709551615>' to 'std::span<std::unique_ptr<A,std::default_delete<A>>,18446744073709551615>'
note: No user-defined-conversion operator available that can perform this conversion, or the operator cannot be called
note: see declaration of 'use_a_vec'
note: while trying to match the argument list '(std::span<std::unique_ptr<B,std::default_delete<B>>,18446744073709551615>)'

The fact that you didn't even know that tells us something about how frequently the feature is needed, don't you think?

Note that C++ support is very limited and often requires explicit conditional casts that may or may not be safe. Rust doesn't play these games…

That approach works best if all the interfaces/classes that a class implements are known ahead of time and are finite in number. Working around that again requires delaying the codegen of such vtables until all trait object usages are known, which has its own set of challenges and downsides.

Object vtables also force the vtable to be present in every object, but having a vtable in every struct would not be acceptable in Rust.

Yes I was referring to the second case, but it was very much a nit that just adds to the other downsides. Right now it's not necessarily a breaking change if you also add a blanket implementation of the new supertrait for the subtrait, or if that supertrait was already somehow required in other ways.

Is this solution acceptable? Rust Playground
