Statically-dispatched strategy using traits?


#1

Suppose we have a trait:

trait Rank {
   fn rank(&self) -> u32;
}

Then we implement some useful generic algorithm:

fn sum_of_ranks<T: Rank>(slice: &[T]) -> u32 {
    slice.iter().map(Rank::rank).sum()
}

Yay! Now we can implement Rank for i32 and get sum_of_ranks() support for slices of i32s:

impl Rank for i32 {
    fn rank(&self) -> u32 {
        if *self >= 0 { *self as u32 } else { 0 }
    }
}

fn main() {
    println!("{}", sum_of_ranks(&[1, 2, 3])); // ===> 6
}

However, there is only one implementation of Rank per type. One cannot use a different ranking method for i32 as there is already one defined.

One way to circumvent this seems to be adding an ad-hoc wrapper type:

#[allow(non_camel_case_types)]
struct i32_ranked_higher(i32);

impl Rank for i32_ranked_higher {
    fn rank(&self) -> u32 {
        self.0.rank() + 5
    }
}

But I do not see an obvious way to interoperate with existing i32s. Either i32_ranked_higher should be used from the start (which defeats the goal), or i32 slices should be copied into Vec<i32_ranked_higher> or something (which is hardly efficient), or a cast should be forced with an unsafe std::mem::transmute() (which is morally wrong).
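
For illustration, the copy-based option could look roughly like this (reusing the Rank, sum_of_ranks and i32_ranked_higher definitions above):

fn main() {
    let plain = [1, 2, 3];
    // Copy the plain i32s into the wrapper type just to select the other ranking:
    let wrapped: Vec<i32_ranked_higher> =
        plain.iter().map(|&x| i32_ranked_higher(x)).collect();
    println!("{}", sum_of_ranks(&wrapped)); // ===> 21, at the cost of an extra allocation and copy
}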

As a bonus, all of these are cumbersome. I’ve tried playing with From and Into to get automagical reference casts, but the resulting code that actually worked also used an unsafe transmute, not to mention that it got quite boilerplaty.

Other ways that I see would be using dynamic dispatch (fn sum_of_ranks(slice: &[Box<Rank>]) -> u32) or an explicit callback (fn sum_of_ranks<T>(slice: &[T], rank: Fn(&T) -> u32) -> u32), but the point is that I would like to have static dispatch.
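
For completeness, the dynamic-dispatch variant would look roughly like this (again reusing the Rank trait and the i32 impl from above); it works, but every rank() call goes through a vtable and the values have to be boxed:

fn sum_of_ranks_dyn(slice: &[Box<dyn Rank>]) -> u32 {
    slice.iter().map(|r| r.rank()).sum()
}

fn main() {
    let values: Vec<Box<dyn Rank>> = vec![Box::new(1), Box::new(2), Box::new(3)];
    println!("{}", sum_of_ranks_dyn(&values)); // ===> 6
}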

I believe the same issue arises with any container or algorithm that uses the standard traits like PartialEq, Ord, or Display. All of them assume that there is only one true implementation for any type. HashMap is interesting in that it does allow customization of its hash function. However, it does so by using a default type parameter to get the default hasher builder, which then constructs a hasher at run time. In my case of a standalone function, that approach would look something like this:

trait Ranker<T> {
    fn rank(v: &T) -> u32;
}

struct DefaultRanker;

impl<T> Ranker<T> for DefaultRanker where T: Rank {
    fn rank(v: &T) -> u32 {
        v.rank()
    }
}

fn sum_of_ranks2<T, R: Ranker<T> = DefaultRanker>(slice: &[T]) -> u32 {
    slice.iter().map(R::rank).sum()
}

So this becomes possible (with somewhat ugly required explicit type specification):

fn main() {
    println!("{}", sum_of_ranks2::<i32>(&[1, 2, 3])); // ===> 6
}

And it is also possible to provide an alternate implementation:

struct HigherRanker;

impl<T> Ranker<T> for HigherRanker where T: Rank {
    fn rank(v: &T) -> u32 {
        v.rank() + 5
    }
}

fn main() {
    println!("{}", sum_of_ranks2::<i32, HigherRanker>(&[1, 2, 3])); // ===> 21
}

But there is a small problem with this approach:

rustc 1.13.0 (2c6933acc 2016-11-07)
warning: defaults for type parameters are only allowed in `struct`, `enum`, `type`,
    or `trait` definitions., #[warn(invalid_type_param_default)] on by default
  --> <anon>:27:21
   |
27 | fn sum_of_ranks2<T, R: Ranker<T> = DefaultRanker>(slice: &[T]) -> u32 {
   |                     ^
   |
   = warning: this was previously accepted by the compiler but is being phased out;
     it will become a hard error in a future release!
   = note: for more information, see PR 30724
     <https://github.com/rust-lang/rust/pull/30724>

So, is it possible to cleanly implement this ‘statically dispatched strategy’ pattern in Rust, or am I too weird?


#2

Have you tried this?

trait Rank<R = DefaultRanker> {
    fn rank(&self) -> u32;

    fn sum(slice: &[Self]) -> u32 where Self: Sized {
        slice.iter().map(Self::rank).sum()
    }
}

struct DefaultRanker;
struct HigherRanker;

impl Rank<DefaultRanker> for i32 { ... }
impl Rank<HigherRanker> for i32 { ... }
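
Fleshed out a bit (filling in the elided impl bodies with the ranking logic from the original post, and spelling the strategy out explicitly at the call sites instead of relying on the default parameter), this could look like:

impl Rank<DefaultRanker> for i32 {
    fn rank(&self) -> u32 {
        if *self >= 0 { *self as u32 } else { 0 }
    }
}

impl Rank<HigherRanker> for i32 {
    fn rank(&self) -> u32 {
        <i32 as Rank<DefaultRanker>>::rank(self) + 5
    }
}

fn main() {
    println!("{}", <i32 as Rank<DefaultRanker>>::sum(&[1, 2, 3])); // ===> 6
    println!("{}", <i32 as Rank<HigherRanker>>::sum(&[1, 2, 3]));  // ===> 21
}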

#3

Past that, I want to consider this statement, which is echoed throughout your post:

    “However, there is only one implementation of Rank per type.”

This seems like a suggestion that this is a problem with the way traits operate. I think this is an idea that’s worth considering, but on reflection I don’t agree that it is a problem.

In particular, I think this is a definitional aspect of any polymorphism system. In languages that are strictly ‘monomorphic,’ each function call will jump to exactly one implementation of that function. In languages that are ‘polymorphic,’ that jump will instead be determined by the types involved in that call. But for any given tuple of types, the call must dispatch the same way. This is true not only of “generics-based” (or “parametric”) systems like Rust’s, but also of “inheritance-based” (or “subtyping”) systems like Java - and this is true whether the dispatch is static or dynamic.

The Rank trait as you defined it dispatches on the basis of one type - the Self type. By adding a second parameter, it can now dispatch on the basis of two types - Self and R. Because Rust gives the Self type some syntactic advantage, you can think of it as “two implementations for the same type,” but I think it’s useful to pull back the curtain a bit to recognize the underlying pattern.


#4

You can have static dispatch with an explicit callback thanks to monomorphization by using a trait bound (as opposed to dynamic dispatch through a trait object).

Something like this:

trait Rank {
   fn rank(&self) -> u32;
}

impl Rank for i32 {
    fn rank(&self) -> u32 {
        if *self >= 0 { *self as u32 } else { 0 }
    }
}

fn sum_of_ranks_custom<T, F>(slice: &[T], ranker: F) -> u32
where F : FnMut(&T) -> u32 {
    slice.iter().map(ranker).sum()
}

fn sum_of_ranks<T: Rank>(slice: &[T]) -> u32 {
    sum_of_ranks_custom(slice, Rank::rank)
}

fn main() {
    println!("{}", sum_of_ranks(&[1, 2, 3])); // ===> 6

    let ranked_higher = |x:&i32| { x.rank() + 5 };
    println!("{}", sum_of_ranks_custom(&[1, 2, 3], ranked_higher)); // ===> 21
}

Note that sum_of_ranks_custom doesn’t need to care whether something is Rank.

(Checking on the playground, this computes the “21” at compile time, which, while not exactly a proof of static dispatch, is probably close enough.)


#5

Well, I do not treat it as a problem with traits (like, something I’m displeased with). They are what they are, and I believe they are called traits (not interfaces) exactly because they do not allow multiple implementations. A trait is something inherent to a single type, so it does not make sense to have multiple specifications of a trait for one type.

Maybe that’s just a part of my background. I occasionally and unconsciously think of the PartialEq trait as something like the Comparable interface. With interfaces in the Java-OOP world, the abstraction they provide is more important than the actual types (you can add an implementation of an interface, but you cannot add an interface to an already defined type), while with traits the types are more important than the abstractions, as you can always implement (or ‘discover’) a trait at any point for any type.

The thing is, I just want to see if there is a way to combine traits, generics, and possibly some magic to get guaranteed static dispatch with some default behavior, preferably with minimal syntactic overhead. Sorry if this looks like I’m ranting about “traits being irreparably broken” just because I cannot do what I want with them :slight_smile:


#6

I once had a similar problem and wrote about it: http://yanns.github.io/blog/2016/05/23/rust-closures-as-input-parameters/

Maybe it can help you.


#7

Java doesn’t allow multiple implementations of the same interface for the same class, though, does it? How would it dispatch them? The difference in Rust is just that you can’t create a subclass to solve this problem; instead you have to do something like adding another type into the mix.


#8

Hm… it seems I’ve overlooked the fact that traits can contain default implementations of methods. With that I’ve got something that looks like what I wanted to achieve. Thanks for the insight :slight_smile: The point, I believe, is that I need to dispatch on two types: the one that defines the data and the one that defines the behavior.

/// A trait of rankable data.
trait Rank<R> {
    fn rank(&self) -> u32;
}

/// Marker for default rank implementations.
struct DefaultRanker;

/// A trait of slice-summable rankable data.
trait SumRankSlice<R> where Self: Rank<R> + Sized {
    fn sum(slice: &[Self]) -> u32 {
        slice.iter().map(Self::rank).sum()
    }
}

/// Tautology: slice-summable rankable data is rankable data that can be slice-summed.
impl<R, T> SumRankSlice<R> for T where T: Rank<R> {}

Now people can define additional algorithms for different data without modifying the Rank trait itself.

Library developers can provide a sensible default for their types, with all algorithms that are applicable to Something: Rank<R> automagically becoming applicable to that type:

impl Rank<DefaultRanker> for i32 {
    fn rank(&self) -> u32 {
        if *self >= 0 { *self as u32 } else { 0 }
    }
}

#[test]
fn test_i32_default() {
    assert_eq!(6, SumRankSlice::sum(&[1, 2, 3]));
}

If a user of the library wants something different then they can define their own ranking method:

struct HigherRanker;

impl Rank<HigherRanker> for i32 {
    fn rank(&self) -> u32 {
        Rank::<DefaultRanker>::rank(self) + 5
    }
}

#[test]
fn test_i32_higher() {
    assert_eq!(21, SumRankSlice::<HigherRanker>::sum(&[1, 2, 3]));
}
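
And, to illustrate the “additional algorithms” point from above, a new algorithm trait can be added alongside SumRankSlice without touching Rank at all (MaxRankSlice here is just a made-up name for illustration):

/// Another algorithm over rankable data, defined without modifying Rank.
trait MaxRankSlice<R> where Self: Rank<R> + Sized {
    fn max(slice: &[Self]) -> Option<u32> {
        slice.iter().map(Self::rank).max()
    }
}

impl<R, T> MaxRankSlice<R> for T where T: Rank<R> {}

#[test]
fn test_i32_max_higher() {
    assert_eq!(Some(8), MaxRankSlice::<HigherRanker>::max(&[1, 2, 3]));
}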

The only thing that I could not get working is the default type arguments. They are just ignored when there are multiple strategy types visible in scope, so SumRankSlice::<DefaultRanker> must be spelled out.

Well, I’d happily think of it as a feature: if there is only one strategy in scope then it is the default one. The user can opt not to import DefaultRanker and use HigherRanker instead, for example. Otherwise the compiler cannot read your mind, so please be explicit.

I guess this stems from the fact that R is actually not used anywhere in Rank or SumRankSlice, thus the compiler can infer its type in only one case: if there is exactly one suitable type.
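
Concretely, that means:

#[test]
fn test_explicit_strategy() {
    // With both the DefaultRanker and HigherRanker impls in scope, a bare
    // SumRankSlice::sum(&[1, 2, 3]) fails to infer R, so the strategy has
    // to be spelled out:
    assert_eq!(6, SumRankSlice::<DefaultRanker>::sum(&[1, 2, 3]));
}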


#9

AFAIK, Rust treats lambdas as distinct types, so I believe it should be possible for rustc to even avoid passing the function parameter as a function pointer. Like, in sum_of_ranks_custom(&[1, 2, 3], ranked_higher) it monomorphizes (gah!) sum_of_ranks_custom() into something like sum_of_ranks_custom_i32_:sparkles:|x| x.rank() + 5:sparkles: () so it does not need the ranker parameter anymore.
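
(As a small check of the “nothing to pass” part - not a proof of what codegen actually does - a non-capturing closure is indeed a zero-sized value of its own unique type:)

fn main() {
    let ranked_higher = |x: &i32| (*x as u32) + 5;
    // The closure captures nothing, so its anonymous type has size 0 and
    // monomorphized code does not need to receive it at run time.
    println!("{}", std::mem::size_of_val(&ranked_higher)); // ===> 0
}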

While Fn parameters seem to be great for these ad-hoc, mini-sized customizations, my problem with them was that they also seem to scale worse. Like, if you have only one point of customization in the algorithm then it’s fine to use an Fn. But if there are three of them (some of which you would sometimes like to keep at their defaults) then it can become a mess.

Shocking disclosure of my true intentions

I have written a simple diffing algorithm for sequences. I wanted it to work automagically for anything that implements PartialEq. So it was diff<T>(&[T]) where T: PartialEq. But sometimes I needed to compare things in a different way. So I added a diff_with<T, F>(&[T], F) where F: Fn(&T, &T) -> bool and implemented it just like in your snippet.

And then I needed a diffing algorithm for trees. Trees are harder than sequences. They need some method to get the children of nodes. And the diffing algorithm requires the comparison to be shallow in order to produce good diffs. The default #[derive(PartialEq)] is a deep comparison. Well, this could still be solved by using some new ShallowEq trait.

But then I wondered: what if I needed to customize both the strategy for retrieving child nodes and the comparison method? Will I need to add three diff_with_something() functions that take one or two Fns? Is there a better way of doing this, while still having some ‘I don’t care, just use the default one’ behavior?

So with one point of customization we have two functions, a default one and a customizable one:

  • fn foo(Container<T>) -> Result
  • fn foo_custom(Container<T>, Fn(&T) -> PartialResult) -> Result

But with three of them it blows up combinatorially:

  • fn foo(Container<T>) -> Result
  • fn foo_customA(Container<T>, Fn(&T) -> PResultA) -> Result
  • fn foo_customB(Container<T>, Fn(&T) -> PResultB) -> Result
  • fn foo_customC(Container<T>, Fn(&T) -> PResultC) -> Result
  • fn foo_customAB(Container<T>, Fn(&T) -> PResultA, Fn(&T) -> PResultB) -> Result
  • fn foo_customBC(Container<T>, Fn(&T) -> PResultB, Fn(&T) -> PResultC) -> Result
  • fn foo_customAC(Container<T>, Fn(&T) -> PResultA, Fn(&T) -> PResultC) -> Result
  • fn foo_customABC(Container<T>, Fn(&T) -> PResultA, Fn(&T) -> PResultB, Fn(&T) -> PResultC) -> Result

Traits just seem to capture the idea of multiple connected behaviors better than a collection of distinct lambdas.
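
A rough sketch of what I mean (all of the names here are made up for illustration): bundle the customization points into a single strategy trait, so one generic foo covers every combination of defaults and overrides while staying statically dispatched:

trait FooStrategy<T> {
    fn a(item: &T) -> u32;
    fn b(item: &T) -> u32;
    fn c(item: &T) -> u32;
}

struct DefaultStrategy;

impl FooStrategy<i32> for DefaultStrategy {
    fn a(item: &i32) -> u32 {
        if *item >= 0 { *item as u32 } else { 0 }
    }
    fn b(_item: &i32) -> u32 { 0 }
    fn c(_item: &i32) -> u32 { 1 }
}

// One generic function instead of foo, foo_customA, ..., foo_customABC;
// overriding any subset of A/B/C just means supplying another strategy type.
fn foo<T, S: FooStrategy<T>>(container: &[T]) -> u32 {
    container.iter().map(|x| S::a(x) + S::b(x) + S::c(x)).sum()
}

fn main() {
    println!("{}", foo::<_, DefaultStrategy>(&[1, 2, 3])); // ===> 9
}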