How to have one piece of code for dyn, impl and T?

Hi all! I'm here with a question that can be summed up as "how to not repeat code".

I'll assume we all know a little about generics, impl and dyn (with Box). The main issue is that we can write the same code using any of the three. I'll continue with generics and Box because they are the easiest to use as examples:

trait A<T> {}
trait B {}

fn fn_T<T, W>(j: W)
where
        W: A<T>,
{
        todo!()
}

fn fn_Box(j: Box<dyn A<Box<dyn B>>>) {
        todo!()
}

The first thing to notice is that both functions, fn_T and fn_Box, can have exactly the same code in the body. This whole thread is about that case, where the code in both is the same.

We could also add this, much simpler, example:

trait A {}

fn fn_T<T: A>(j: Vec<T>) {
        todo!()
}

fn fn_Box(j: Vec<Box<dyn A>>) {
        todo!()
}

Well, we have several ways to define these functions. Let's start with the differences:

  • Using generics, the vector can only hold one type.
  • Using Box, the vector can hold any types that implement A, and they can all be different types.

With this in mind we could think that Box is a lot better! But there is a catch...
Box<dyn A> can't be optimized as well. A Box is a pointer to data that must already exist in memory before any operation, and the compiler does not know what that data is, so operations that go through the Box cannot be optimized as aggressively by the compiler.

With generics, on the other hand, the compiler can do a much better job optimizing the code: we get faster code and lose some flexibility on the input. If we can use types that skip the heap entirely, generics are very powerful in performance, a lot more than Box.

Let's not go down the wrong path: this is not about how good or bad these alternatives are, nor how to implement them. The key point is that each solution has its place, with its own advantages and disadvantages, and there can be other solutions. Going back a little, the issue is that we need to choose (or I don't know how to mix both) whether to declare a function with Box or with generics, and we choose based on our needs.

We could even implement the algorithm twice, with Box in one function and generics in the other, which has the problem of duplicating all the code.

From what we have seen, generics are in general faster but more limited than Box, so in terms of functionality we can think of generics as a subset of Box. Going back to the title, and trying to avoid duplicated code, it would be nice to be able to do something like this:

  • If at compile time the compiler knows we are calling the function with a vector where T does not use dynamic dispatch (the vector contains only one type), build the function's args with generics.
  • If the vector contains more than one type (dynamic dispatch), use the function with Box in the args.

This really makes me think, because I don't know of any mechanism that, starting from generics, could resolve to either generics or dynamic dispatch based on the input type. That type is known at compile time, so we could choose the best function to call with it!

The idea is to avoid duplicated code and have one piece of code that can serve the different options. I don't know if this is solved; maybe there is already something and I don't know about it.

Thx!

If you impl B for Box<dyn B + '_>, et cetera, then your fn_Box is unnecessary, as fn_T will accept the fn_Box argument type. For example.
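A minimal sketch of that suggestion, using the simpler Vec variant of the example. The `name` method, the `Cat` and `Dog` types, and the snake_case name `fn_t` are made up for illustration and are not from the original post:

```rust
// Hypothetical trait method, added only so the example has something to call.
trait B {
    fn name(&self) -> String;
}

struct Cat;
struct Dog;

impl B for Cat {
    fn name(&self) -> String {
        "cat".to_string()
    }
}

impl B for Dog {
    fn name(&self) -> String {
        "dog".to_string()
    }
}

// Forwarding impl: a boxed trait object is itself a `B`.
impl B for Box<dyn B + '_> {
    fn name(&self) -> String {
        // `**self` is the `dyn B` inside the box, so this call goes through the vtable.
        (**self).name()
    }
}

// The single generic function; a separate `fn_Box` is not needed.
fn fn_t<T: B>(items: Vec<T>) -> Vec<String> {
    items.iter().map(|item| item.name()).collect()
}

// One concrete type: `T = Cat`, calls are statically dispatched.
fn homogeneous() -> Vec<String> {
    fn_t(vec![Cat, Cat])
}

// Mixed types: `T = Box<dyn B>`, calls are dynamically dispatched inside the forwarding impl.
fn heterogeneous() -> Vec<String> {
    let items: Vec<Box<dyn B>> = vec![Box::new(Cat), Box::new(Dog)];
    fn_t(items)
}
```

The body of `fn_t` is written once; the caller decides whether it is instantiated with a concrete type or with the boxed trait object.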


I don't know that I fully follow the rest of your post, but I'll leave some passing remarks.

The Vec<_> you're referring to does contain just one type, and that type is Box<dyn A> (or whatever). dyn A is a concrete, statically known type. For example, it can have statically dispatched methods, too.

If you're thinking the compiler could automatically supply fn_Box functionality for fn_T when the Box does not implement the trait, that runs into[1] the same set of problems as automatically supplying the impl Trait for Box<dyn Trait + '_>[2] itself. It's not always possible, it would have to be conditional on the implementation not having been provided by the coder, and it would make adding a custom implementation that differed from the compiler-provided implementation a breaking change.
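As a small illustration of that last point, here is a sketch of an inherent method defined directly on the trait object type. The `id` and `describe` methods and the `One` type are hypothetical:

```rust
trait A {
    // Hypothetical trait method; calls on a `dyn A` go through the vtable.
    fn id(&self) -> u32;
}

struct One;

impl A for One {
    fn id(&self) -> u32 {
        1
    }
}

// Inherent impl on the type `dyn A` itself. `describe` is an ordinary,
// statically dispatched method; only the inner call to `id` is dynamic.
impl dyn A {
    fn describe(&self) -> String {
        format!("item #{}", self.id())
    }
}

fn describe_all(items: &[Box<dyn A>]) -> Vec<String> {
    items.iter().map(|item| item.describe()).collect()
}
```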


  1. at least

  2. or for any other pointer type


Hi, thx for the answer! I had the feeling I was not able to explain everything properly.

I would rather say, "how to have one code without losing performance".

If we check your example, yes, we are able to have only one code, but the price is high: that code still locks us into Box. As long as the function we want to use has to go through a Box, the compiler will not be able to optimize what is deeper in the function.

Some Call to F -> Box Param -> Internal function doing

While there is a Box, the compiler can't optimize it.

This can also be checked by looking at the allocations on https://rust.godbolt.org/ with -C opt-level=3 and the following example:

fn foo_b(b: Box<String>) {
        println!("{b}");
}

fn foo_n(b: String) {
        println!("{b}");
}

pub fn main() {
        let a = String::from("hola");
        // uncomment this line to see the effect of Box
        //foo_b(Box::new(a));
        // uncomment this line to see the optimizations without Box
        //foo_n(a);
}

What I'm looking for is this: in both examples the trait is always implemented, but the circumstances make it clear which one to choose. If we use only one type for a vector we can use generics; if the vector can hold different objects that implement the trait, use the Box one.

Even if dyn A is a concrete and statically known type, the only way to use it is behind a pointer, and every pointer type has to wait for an allocation, which would stop the compiler from optimizing the code...

This is hard... The more general the code is, the less efficient it is compared with more specialized code, which is intuitive. But it means that in something like Box<Box<>, Box<Box...>> each replacement of a Box with a generic can be an important improvement in performance. So it is like replacing from the inside out, and trying to optimize this case gets very messy.

Yes, that is how dyn Trait works. You have a wide pointer to the erased type and to the vtable, and the call happens through the vtable. And yes, that will inhibit some optimizations. (It can also have advantages, such as smaller code size.)
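The wide pointer can be observed from its size. This is only a sketch: the trait `A` here is an empty placeholder, and the "twice a thin pointer" size is how current compilers lay out trait object pointers, not something to rely on.

```rust
use std::mem::size_of;

// Empty placeholder trait, only needed to name the type `dyn A`.
trait A {}

// A pointer to a sized type is a single address.
fn thin_pointer_size() -> usize {
    size_of::<Box<u64>>()
}

// A pointer to a trait object carries the data address plus the vtable address.
fn wide_pointer_size() -> usize {
    size_of::<Box<dyn A>>()
}
```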

In general, there's no way for functions that do semantically different things to compile down to the same output. If you don't know which implementation(s) to call until runtime, then of course you must include some check at runtime to determine which implementation(s) to call.

Those checks may take the form of dynamic dispatch, or they could take the form of checking an enum discriminant -- there are multiple approaches that can have different performance impacts (and other tradeoffs).

A way to have one code for the different approaches -- as in, to have all the core logic in one place in your source code -- is to implement the trait for the different approaches (for Box<dyn Trait>, for an enum...). Another way is to use macros.

If you need multiple implementors in the same vector, you have no choice but to use something like an enum or type erasure (dyn Trait) -- something that encapsulates the multiple implementors in a single type. If you only need a single implementor, then yes, not using dyn Trait is generally preferred.
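A sketch of the "core logic in one place" idea with both runtime-check flavours. The `Shape` trait, its implementors, and `total_area` are all hypothetical names chosen for the example:

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// Closed set of implementors: the runtime check is a match on the discriminant.
enum AnyShape {
    Square(Square),
    Rect(Rect),
}

impl Shape for AnyShape {
    fn area(&self) -> f64 {
        match self {
            AnyShape::Square(s) => s.area(),
            AnyShape::Rect(r) => r.area(),
        }
    }
}

// Open set of implementors: the runtime check is a vtable lookup.
impl Shape for Box<dyn Shape + '_> {
    fn area(&self) -> f64 {
        (**self).area()
    }
}

// The core logic, written once and usable with all three element types.
fn total_area<T: Shape>(shapes: &[T]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}
```

With a single concrete element type such as `Square`, no runtime check is involved at all; the enum and the boxed trait object each pay for their own kind of check.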

I know your question is about how to have one solution, no matter what underlying types are involved. But the reality is that different underlying types are treated differently.

  • When a function reads the contents of something that can be cheaply converted to &str (as with the above two functions), the parameter is simply b: &str. The conversion to &str, from Box<String> (which would normally be Box<str>) and String, occurs in the caller. No generics are needed.
  • The same is true for reading and updating types like Vec<T> and array ([T; N]) that can be converted to a slice (&[T] or &mut [T]).
  • In some other cases, generic code is written for types that can be converted to a common type using AsRef, From, etc. Learning how to do this takes some experience with different traits and how they are normally used. I'm still learning this and I expect to continue learning indefinitely, especially as I encounter more such traits in std and other crates.
  • And of course, you can use generic types with trait bounds that provide all the methods you need for the implementation, without any conversion to a common/underlying type.
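The first three bullets could look roughly like this. All function names are made up for the example:

```rust
// Reads text: take `&str` and let the caller convert from `String`,
// `Box<String>`, `Box<str>`, and so on. No generics needed.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

// Reads a sequence: take a slice; callers pass `&Vec<i32>` or `&[i32; N]`.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

// Updates a sequence in place: take a mutable slice.
fn double_all(values: &mut [i32]) {
    for v in values.iter_mut() {
        *v *= 2;
    }
}

// Generic over anything that can be viewed as `&str` via `AsRef`.
fn byte_len<S: AsRef<str>>(text: S) -> usize {
    text.as_ref().len()
}
```

Here the conversion happens at the call site (deref coercion for `shout`, `sum`, and `double_all`) or through a conversion trait (`byte_len`), so each body exists only once.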

This may not answer your question directly, but it is the sort of thing people often do in practice.

Hi, thx for all the answers!

@jumpnbrownweasel After the big generics thread we had some time ago, I was able to polish a lot of things! rn I'm also getting used to the convert module and how other crates use it :slight_smile:

This thread opens the door to a rusty objective, which is to not repeat code, so it is an issue not to have a mechanism for writing code like this only once.

@quinedot Which ideas do you have to handle this? Think of our base case of having two functions, one for each case...

I don't quite get how to use enums for this. We could add one to know whether the value is a Box or something else, but that solution would be worse than just having two functions. Maybe that is not what you mean.

Using different impls for particular cases, with a trait for example, would help a lot to avoid having several f_something functions.

I would avoid using a base Box everywhere, just because then we could not specialize the code for the tradeoffs we want per use case.

Seems macros... are the way to go (? Probably a derive one: a way to read all possible types, start with everything as Box, then from the inside out replace Box with generics until all the implementations are done.

How I handle it depends on the actual case in question.


Thinking about this and the ideas, I had some other ones. T: Foo can actually accept Box as long as we also implement the trait for Box, so what is left is how to handle the implementation for pointers. Box is one example, and obviously the core logic should apply to any type of pointer, so my next idea was to implement it using Deref:

use std::ops::Deref;

struct Ele;

impl Foo for Ele {}

trait Foo {}

impl<W: Deref<Target = T>, T: Foo> Foo for W {}

fn f_box(t: Vec<Box<dyn Foo>>) {
        // Same code as fn_T
        todo!()
}

fn fn_T<T: Foo>(t: Vec<T>) {
        // Same code as f_box
        todo!()
}

trait StaticFoo: Foo + Sized {}

fn f_a() {
        let ele = Ele;
        let b: Vec<Box<dyn Foo>> = vec![Box::new(ele)];
        fn_T(b);
}

But I don't understand it very well, nor how to handle the error in this code:

error[E0277]: the size for values of type `dyn a::Foo` cannot be known at compilation time
  --> src/a.rs:27:14
   |
27 |         fn_T(b);
   |         ---- ^ doesn't have a size known at compile-time

note: required for `Box<dyn a::Foo>` to implement `a::Foo`
  --> src/a.rs:9:36
   |
9  | impl<W: Deref<Target = T>, T: Foo> Foo for W {}
   |                            -       ^^^     ^
   |                            |
   |                            unsatisfied trait bound introduced here

Those are the two main cases. If we can get a complete way to pass any pointer to generics, this would also be solved!

And sorry, it seems I had not fully understood your example here. It is actually the path I was looking for!

In your implementation, W and T have an implicit Sized bound which you can remove with ?Sized:

impl<W: ?Sized + Deref<Target = T>, T: ?Sized + Foo> Foo for W {}
//      ^^^^^^^^                       ^^^^^^^^
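For reference, a complete sketch with the relaxed bounds. The `name` method is hypothetical and only there so the forwarding has something to forward, and `fn_t` is a snake_case stand-in for the thread's fn_T:

```rust
use std::ops::Deref;

trait Foo {
    // Hypothetical method, not part of the original (empty) trait.
    fn name(&self) -> String;
}

struct Ele;

impl Foo for Ele {
    fn name(&self) -> String {
        "ele".to_string()
    }
}

// Blanket impl for every pointer-like type whose target implements `Foo`.
// `?Sized` on `T` is what lets `T = dyn Foo` through.
impl<W: ?Sized + Deref<Target = T>, T: ?Sized + Foo> Foo for W {
    fn name(&self) -> String {
        // `*self` is the pointer `W`, `**self` is the target `T`.
        // Writing `(*self).name()` would call this same impl again.
        (**self).name()
    }
}

// The single generic function from the thread.
fn fn_t<T: Foo>(items: Vec<T>) -> Vec<String> {
    items.iter().map(|item| item.name()).collect()
}
```

With this in place, `fn_t` accepts `Vec<Ele>`, `Vec<Box<dyn Foo>>`, and vectors of other pointer types such as `Rc<Ele>`.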

I forgot that! Generic parameters always have an implicit Sized bound.

Well, the main question is solved :3

We can use generics directly and the rest can be inferred or handled by the caller. If we want one or the other, we can always convert to dyn.

I'm still getting used to impl<T> Foo for Something<T>. I have the feeling that it is too easy to break things, maybe because I don't know the rules for how it works. We could mix that with a lot of things like Box<Box<T>>, which is also a Box<T>.

You will have to implement Foo twice, once for W and once for Ele. So there is still code duplication, right? (I may misunderstand the goal.)

Yes, but a lot less than before. In the first case we would need to duplicate it for each pointer type we wanted to support.

In this case, using Deref, we would only need to do it once, and in the worst case we could use macros and just implement each method by calling (**self).f(), which is a lot more trivial than mixing all this.

Obviously it would be nice to have a macro that does this by itself. I don't know if it is worth the work of learning that much, but to keep the issue on the right path, I think we should give some guidelines on how to do it.

I think the first step in creating the macro would be a derive macro for the trait Foo that implements the Deref logic. There is information available on how to write this type of macro.

The idea is for the macro to go function by function, implementing all of them by just doing (**self).f(). This would also help to always have all the functions and not miss any if we change the trait.
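A rough sketch of the forwarding idea with a declarative macro_rules! macro instead of the derive macro discussed above. `forward_foo`, the two trait methods, and `total_weight` are hypothetical; unlike a procedural macro, this version cannot read the trait definition, so the method list inside the macro is maintained by hand (the compiler still reports any method that is missing):

```rust
use std::rc::Rc;
use std::sync::Arc;

trait Foo {
    // Hypothetical methods, only here to show the forwarding.
    fn name(&self) -> String;
    fn weight(&self) -> u32;
}

struct Ele;

impl Foo for Ele {
    fn name(&self) -> String {
        "ele".to_string()
    }
    fn weight(&self) -> u32 {
        5
    }
}

// Generates one forwarding impl per listed pointer type.
macro_rules! forward_foo {
    ($($ptr:ident),* $(,)?) => {
        $(
            impl<T: ?Sized + Foo> Foo for $ptr<T> {
                fn name(&self) -> String {
                    (**self).name()
                }
                fn weight(&self) -> u32 {
                    (**self).weight()
                }
            }
        )*
    };
}

forward_foo!(Box, Rc, Arc);

fn total_weight<T: Foo>(items: &[T]) -> u32 {
    items.iter().map(|item| item.weight()).sum()
}
```

Compared with a single blanket impl over Deref, this lists the supported pointer types explicitly, which leaves room for a different hand-written impl on a type that is not listed.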

Have I missed something?

Why use macros when you can just call a shared function? Not all sharing needs to use traits or generics.

can you give me an example pls?

Sorry, I shouldn't have suggested that, as I don't know what code you're trying to share. I was just making a general comment.

I think it is nice to have more cases. I wrote the example above as a simplification of my use case, but maybe this touches other aspects or designs. Feel free to show and comment more :slight_smile:

You're right, you can simply deref and call the function implemented for T. The common code is shared.

The compiler will tell you if you don't implement all functions in a trait.

So no need for a macro, since it is only done once.

Because it is comfy! Write the macro and you will never need to touch that piece of code again. But it is true that it is not strictly needed.

Not all sharing needs to use traits or generics.

Which case would not need them?

Come to think of it, there is something we did not consider. In the Deref case, how would we make it work with return values? For a function that returns, for example, a Box, the normal return will not work, because the struct must be moved into the pointer type, but we do not have any trait for New, which means... the trick above will not work!

use std::ops::Deref;

struct Ele;

impl Foo for Ele {
  fn ff(self) -> Self {
    self
  }
}

trait Foo {
  fn ff(self) -> Self;
}

impl<W: Deref<Target = T>, T: Foo> Foo for W {
  fn ff(self) -> Self {
    todo!("What do we do here? Convert back to W and avoid writing it by hand for all pointers")
  }
}

@quinedot @jumpnbrownweasel