Why does Rust require explicit Trait specification when writing generic code?

Hi,

Just a potential suggestion - more of the beginnings of an idea really... Not sure where to post this so just posting as uncategorized. Feel free to move it.

Sorry my Rust may also be a bit "rusty". I haven't written much Rust for quite some considerable period of time. I might get some things wrong here... (Hopefully not all of it...)

The following example is taken from Programming Rust 2nd Edition (O'Reilly).

fn top_ten<T: Debug + Hash + Eq>(values: &Vec<T>) { ... }

In this example (and also in general), when writing a generic function the programmer must specify the traits which constrain a generic type parameter such as T in this example.

I wonder whether this explicit information is necessary. Could it be possible to write an implementation of the Rust compiler which figures out at compile time whether or not T implements some set of required methods?

Another way to phrase this question would be to ask: why are explicit Traits required to constrain T?

For example, could it be possible to write this:

fn top_ten<T>(values: &Vec<T>) {
    let mut h = HashMap::new();
    for value in values {
        // ignored code...
        h.insert(value.clone(), ()); // this function call implies that T: Hash + Eq because
            // internally it will explicitly call a hashing function and an equality comparison operator?
    }
}

This doesn't require T: Debug, but we could make it require this too by calling println!("{value:?}"); somewhere in the loop.

Actually, I don't think it is the Trait information which is important. What matters is (for example) "does T implement a clone() function?" My assumption is that the compiler could answer this question at compile time.
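For comparison, here is a minimal sketch of what the function above has to look like in today's Rust, assuming the body both clones and prints each value. The name count_values and its return type are made up for illustration (this is not the book's top_ten); the point is only that each operation in the body corresponds to one bound in the signature.

```rust
use std::collections::HashMap;
use std::fmt::Debug;
use std::hash::Hash;

// Hypothetical helper, not the book's `top_ten`: counts how often each value occurs.
fn count_values<T: Debug + Hash + Eq + Clone>(values: &Vec<T>) -> HashMap<T, usize> {
    let mut counts = HashMap::new();
    for value in values {
        // Printing with `{:?}` is what requires `T: Debug`.
        println!("{value:?}");
        // `.clone()` requires `T: Clone`; using T as a map key requires `T: Hash + Eq`.
        *counts.entry(value.clone()).or_insert(0) += 1;
    }
    counts
}
```

Removing any one of the four bounds makes the corresponding line in the body fail to compile, and the error points at the function itself rather than at its callers.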

Can anyone provide any insight here? I am interested to know more about why the explicit Trait specification is required.

If you want a deeper understanding of why I am asking this, the reason is due to the following hypothesis:

  • (This is my hypothesis, which may be incorrect.) If Rust did not require explicit Trait specification, then it would be easier to avoid the "type soup" which comes with writing large collections of Traits when inheritance is involved.

Similar issues arise in C++ (and other OOP languages) whenever inheritance exists. It can be difficult to avoid writing inflexible designs which involve lots of types with complex inheritance relations. Of course, there are ways to avoid this, but it requires some careful thought/design.

I have found myself also creating type soups with Traits in Rust, hence why this question came to mind.

Even if they weren't needed, they would still be tremendously helpful. Having to figure out in my head what traits a type must implement when reading a piece of code sounds like a terrible waste of time, if I instead could've just looked at the explicit list of trait bounds required. Code that is optimized for reading rather than writing it is one of the biggest perks of Rust IMO. Or put a bit more poetically: explicitness is a blessing, not a curse.

What if there are two traits implementing a method named clone for a potential T. How would the compiler know which trait to choose?

2 Likes

The same thought came to me just before I signed back in to reply.

I don't think this is any different to the current situation with explicit Traits. Two explicitly specified Traits could implement a function with the same name. My guess would be the Rust compiler complains about this and has some way to disambiguate the two calls. Correct me if that's wrong.

Well - this would no longer be a concern. You wouldn't be looking for the name of a Trait, you would be looking to see if some type has a function with some particular name in an associated impl block. Whether or not that's easier or harder to read, I'm not sure. I don't think it makes much difference.

What would be significant would be the refactoring work required. (There would be none.)

That is correct. You'd use fully qualified syntax for disambiguation. But wouldn't that be detrimental to your goal? Which I assume is saving time typing out the trait bounds explicitly. Fully qualified syntax isn't exactly what you'd call concise.
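To make the disambiguation concrete, here is a small sketch. The traits Short and Long, the type Item and the function describe_both are all hypothetical names invented for this example.

```rust
// Two hypothetical traits that both define a method called `describe`.
trait Short {
    fn describe(&self) -> String;
}

trait Long {
    fn describe(&self) -> String;
}

struct Item;

impl Short for Item {
    fn describe(&self) -> String {
        "item".to_string()
    }
}

impl Long for Item {
    fn describe(&self) -> String {
        "a long description of item".to_string()
    }
}

fn describe_both<T: Short + Long>(value: &T) -> (String, String) {
    // A plain `value.describe()` is rejected as ambiguous here, because both
    // bounds provide a method with that name. Fully qualified syntax picks one.
    (<T as Short>::describe(value), <T as Long>::describe(value))
}
```

Calling describe_both(&Item) returns both strings, each produced by the trait named in the call.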

1 Like

Well, from personal experience I'd say the former is hugely advantageous over the latter. Trait bounds are clearly defined in the function signature. Imagine having to look through a function with 100s of lines of code, trying to deduce the bounds from that. Even worse, now you want to publish your library. How am I as a user supposed to know what T I can pass to your function?

1 Like

I don't think I explained clearly. The point would be you would not have to do that - assuming this could work as I expect, you would not have to be concerned with the name of a Trait.

Take for example the Clone Trait. You would never have to worry about the Trait Clone.

You would write some code in a function, and that code would include a call to T.clone(). The compiler would then figure out if T implements a clone, somewhere. In this case, it would happen to be part of a Trait called Clone, but this is unimportant.

If there were a second Trait which defined a clone() function, then as you say you would have to be more verbose and disambiguate between the two clone() methods. But that's ok. This isn't the most common situation.

But it might be there is another, deeper reason why explicitly specifying the Trait is required. (Possibly it simplifies the compiler logic substantially, or it isn't practical to implement a compiler which goes and automatically figures out if a particular type has a particular set of methods? I don't know much about it.)

I don't see how this can work. Let's assume you build a library. It runs in the context of your users. One user implements their own trait MyClone for a T[1] they want to pass to a generic function in your library, that calls T.clone() somewhere. How is the compiler supposed to figure out whether to choose <T as Clone>::clone() or <T as MyClone>::clone(), without you having to use the fully qualified syntax (<T as Clone>::clone() or Clone::clone(&T)) in the first place?


  1. In the following, T is used to refer to both a generic type T and a variable of that type, as if you had a function fn foo<T>(T: T) { ... } ↩︎
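A rough sketch of that scenario with today's explicit bounds, using made-up names (MyClone as in the post, plus a hypothetical Token type and library_copy function). The user's trait deliberately behaves differently so the two methods can be told apart.

```rust
// Hypothetical user trait whose method name clashes with `Clone::clone`.
trait MyClone {
    fn clone(&self) -> Self;
}

#[derive(Debug, PartialEq)]
struct Token(u32);

impl Clone for Token {
    fn clone(&self) -> Self {
        Token(self.0)
    }
}

impl MyClone for Token {
    // Deliberately different behaviour, so the two methods are distinguishable.
    fn clone(&self) -> Self {
        Token(self.0 + 100)
    }
}

// Hypothetical "library" function. Inside the body the only thing known about
// T is the bound `T: Clone`, so `value.clone()` can only mean `Clone::clone`.
fn library_copy<T: Clone>(value: &T) -> T {
    value.clone()
}
```

Because the bound names the trait, the library author never has to think about MyClone at all. With inferred requirements, the meaning of the same call would depend on which traits each caller's type happens to implement.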

2 Likes

That would basically be templates like in C++. With the same advantages and drawbacks.

Which you can [basically] already do in Rust with macros.
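As a small illustration of the macro route, here is a sketch with a hypothetical duplicate! macro. It states no requirements up front; it only compiles at call sites where the expanded code happens to type-check, which is roughly the C++ template experience.

```rust
// Hypothetical macro: accepts any expression whose type has a `.clone()` method.
// Nothing is declared about the type; the check happens after expansion.
macro_rules! duplicate {
    ($value:expr) => {{
        let original = $value;
        let copy = original.clone();
        (original, copy)
    }};
}

fn duplicate_string() -> (String, String) {
    duplicate!(String::from("hi"))
}

fn duplicate_number() -> (u8, u8) {
    duplicate!(5u8)
}
```

If you pass something without a clone method, the error is reported against the expanded macro body rather than against a declared requirement.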

5 Likes

Bounds on a generic function document what types the function can be used with. If they were entirely implicit, it would be harder to read programs. It would also be easier to make a mistake. Suppose you add a .clone() call on a value of type T in a function that previously had none — then you would have changed the function to have an implicit T: Clone bound it did not have, which could break existing uses of the function.

Rust follows the general principle that function bodies (the code in the {...}) do not affect the signature of the function — they do not affect under what conditions the function can be used. Therefore, you know that changing the code in a function is never a breaking change (in the sense of causing code to stop compiling; of course you can make the function do the wrong thing).

Explicit bounds in the function signature mean that all changes to them are intentional, not accidental.
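A minimal sketch of that principle, with hypothetical function names. The first function has no bounds, so every T works; to start cloning, the author has to write the new requirement into the signature, which is visible in a diff and in the docs.

```rust
// No bounds: callers may pass a slice of any T, cloneable or not.
fn first_ref<T>(items: &[T]) -> Option<&T> {
    items.first()
}

// Returning an owned value needs a clone, and the compiler only accepts the
// body once `T: Clone` is written in the signature. Writing `.cloned()` in
// `first_ref` without changing its signature would be a compile error in the
// library itself, not a surprise for downstream users.
fn first_owned<T: Clone>(items: &[T]) -> Option<T> {
    items.first().cloned()
}

// Hypothetical caller-side type that is intentionally not `Clone`.
struct NotClone(u32);
```

A slice of NotClone can still be passed to first_ref, and it can never be broken by an edit to that function's body alone.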

8 Likes

Rust's mandatory trait bounds are a response exactly to the mess C++ templates caused.

If explicit capabilities on type parameters are not enforced, then the author of a generic function will never know they've written what they intended, and it's not possible to be sure about that assumption by trying to test the call site in every imaginable way. Which is, of course, extremely brittle or outright impossible.

If you want syntactic abstraction (that is, repeat code without types), use macros.

5 Likes

For the same reason that Rust functions require types for parameters and Rust has types in the first place. Even though duck-typed generics can feel convenient, C++ experience tells us that it leads to undecipherable error messages and many of the same issues that lack of typing causes, such as difficulty of local reasoning, poor readability and discoverability, Hyrum’s Law, and brittleness in the face of API changes and refactoring.

2 Likes

Think of it from a library consumer point of view.

  • You use a library, using your ADTs as their generics
  • They publish an update -- e.g. rewrite a method -- which requires some new capability
  • Your ADTs don't have that capability and your project breaks

In Rust, this almost never happens:[1] the library maintainer generally has to consciously add new requirements (trait bounds) to their functions to cause it. I.e. they must consciously make a breaking change. But in ecosystems/languages where you don't have to state the requirements in code, the library maintainer can very easily do things like this on accident. And any given non-expert programmer in that language/for that library is also probably more likely to make such breaking changes, because the language never made them get in the habit of thinking about such things.

Trait bounds also free the library maintainer from having to worry about breaking changes like this. You can rewrite your function body from scratch, and if you didn't have to change the trait bounds, you haven't changed the ability of downstream to call your function. Assuming you care about being a good maintainer, your code base is in some senses actually less brittle and easier to maintain. And it helps delineate when a new major version of your library is warranted -- you more easily know which of your desired changes are breaking changes.

It makes the ecosystem as a whole healthier.


  1. Opaque return types -- which silently leak auto-trait ability -- are a hole that Rust has in this model at the function level; i.e. part of the API contract which is invisible and thus easy to accidentally break. ADTs also have some leaks (and other properties like size, which downstream usually isn't expected to rely upon). ↩︎
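Here is a small sketch of the opaque-return-type leak mentioned in the footnote. Both function names are hypothetical; the first stands in for a library function, the second for downstream code.

```rust
// Hypothetical library function. The signature promises only `Iterator`,
// but the hidden concrete type also happens to be `Send`.
fn evens() -> impl Iterator<Item = u32> {
    (0..10).filter(|n| n % 2 == 0)
}

// Hypothetical downstream helper that needs `Send` to move the iterator
// onto another thread.
fn sum_on_thread<I>(iter: I) -> u32
where
    I: Iterator<Item = u32> + Send + 'static,
{
    std::thread::spawn(move || iter.sum::<u32>())
        .join()
        .unwrap()
}
```

sum_on_thread(evens()) compiles today because Send leaks through the impl Iterator. If the body of evens were later changed to capture something like an Rc, its signature would stay the same but that downstream call would stop compiling.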

6 Likes

With a small asterisk — the compilation model of Rust macros is different from that of C++ templates / Rust generics. A generic function is only compiled once[1] per set of types it's instantiated with, but a macro is compiled separately each time it's invoked. The API is functionally similar, but this is a real difference to keep in mind.

There's some light thought about introducing a kind of macro fn[2] which sits between the two and offers some of the benefits of either. It seems unlikely to actually happen, but it's an interesting thing to think about.

Basically all systems are best-effort at some level, unfortunately; concessions to reality are a necessity of practicality. unsafe has holes[3], and so does "the signature is fully explicit" as an axiomatic ideal.

But honestly, I see that as more of a strength than as a weakness. In a strong majority of applications[4], the various autotrait inferred type properties aren't super relevant to pay attention to, and when they are, it's reasonably straightforward to do so.


  1. Caveats abound, the main one being that instantiations rarely get shared between separate codegen units (for optimization reasons), so you often end up with duplication between CGUs. C++ compilers typically try very hard to reunify template function instantiations during linking; the Rust compiler makes it possible but puts less effort into it. ↩︎

  2. Essentially, macro fn would act like a normal function boundary and be invoked like a normal function, but would permit using _ as a placeholder inferred type. Type checking of inferred types (or anything derived from) would only happen once the function is invoked in a context where the types can be inferred. Type inference would likely still treat the function boundary as a boundary, i.e. as if the function were a fully unbound generic, and any instantiations would be memoized by the compiler query architecture. ↩︎

  3. Thankfully, only minor ones: open I-unsound issues with the compiler (which should all be fixable eventually) and environmental access considered out of the model (e.g. /proc/self/mem or loosely unique ownership of linearly allocated OS resource handles). ↩︎

  4. Task stealing async environments are somewhat of a unique outlier exception to prove the rule, here. I'm sympathetic to both sides of local-by-default and Send+Sync-by-default tradeoff. ↩︎

2 Likes

Even C++ realized that duck-typed generics are not always the right thing and is introducing concepts, so the Rust way seems like current state-of-the-art.

Just coming back to this and wanted to take a slightly different angle on it.

This might be too much of an open ended question, but here goes.

How do you avoid the inheritance hierarchy soup which comes with Traits?

From an OOP point of view, if you end up with a complex web of inheritance, usually you either re-design, or throw away (at least part of) your design which relies on inheritance. Both of which require a lot of refactoring work.

It might be because I have less experience working with Rust, but what I have found so far is if you end up in such a situation, it seems harder to get out of. You can get into this situation where you design interfaces which require generics constrained by Traits, and the number of Traits explodes because it becomes a game of combinatorics.

(In other words, I create a new Trait because I need behaviours A, B and C, but not X, Y and Z and I don't currently have a Trait which has this exact specification.)

Can anyone share any wisdom on this? I hope it's fairly obvious what I'm attempting to describe here... If not please let me know.

Rust doesn't have trait inheritance.

Rust has supertraits. But it's a different beast than object inheritance and everything being an object.
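A quick sketch of what a supertrait is, with hypothetical names (Named, Greeter, Robot). The declaration is a requirement on implementors, not inheritance of data or implementation.

```rust
trait Named {
    fn name(&self) -> String;
}

// `Greeter: Named` reads as "every type that implements Greeter must also
// implement Named". Greeter can therefore call `name()` in its default method.
trait Greeter: Named {
    fn greet(&self) -> String {
        format!("Hello, {}!", self.name())
    }
}

struct Robot;

impl Named for Robot {
    fn name(&self) -> String {
        "R2".to_string()
    }
}

// Both impls are written separately; nothing is inherited from a parent type.
impl Greeter for Robot {}
```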

1 Like

You don't need a new trait in this case, you can just bound your functionality on Type: A + B + C instead. Specifically hierarchy soup is avoided by having smaller, single-purpose traits.
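Something like the following, as a sketch with made-up single-purpose traits (Walk, Swim, Fly). Each function asks for exactly the combination it needs, so no combined trait ever has to be defined.

```rust
// Three small, hypothetical single-purpose traits.
trait Walk {
    fn walk(&self) -> String;
}
trait Swim {
    fn swim(&self) -> String;
}
trait Fly {
    fn fly(&self) -> String;
}

struct Duck;

impl Walk for Duck {
    fn walk(&self) -> String {
        "waddle".to_string()
    }
}
impl Swim for Duck {
    fn swim(&self) -> String {
        "paddle".to_string()
    }
}
impl Fly for Duck {
    fn fly(&self) -> String {
        "flap".to_string()
    }
}

// Needs all three behaviours: combine the bounds at the use site.
fn triathlon<T: Walk + Swim + Fly>(animal: &T) -> String {
    format!("{}, {}, {}", animal.walk(), animal.swim(), animal.fly())
}

// Needs only one behaviour: ask for only that one.
fn cross_river<T: Swim>(animal: &T) -> String {
    animal.swim()
}
```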

Trait aliases that allow you to directly give a name to more complex bounds (including where clauses) are a very useful nightly feature as well.
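On stable, a common workaround (not the nightly trait alias feature itself) is an empty trait with supertraits plus a blanket impl. A sketch, with the hypothetical names Key and distinct:

```rust
use std::collections::HashSet;
use std::fmt::Debug;
use std::hash::Hash;

// Hypothetical "alias": an empty trait whose supertraits are the real bounds.
trait Key: Debug + Hash + Eq + Clone {}

// Blanket impl: every type that meets the bounds is automatically a `Key`,
// so nobody ever has to implement it by hand.
impl<T: Debug + Hash + Eq + Clone> Key for T {}

// `T: Key` implies all of the supertrait bounds inside the body.
fn distinct<T: Key>(items: &[T]) -> usize {
    let set: HashSet<T> = items.iter().cloned().collect();
    set.len()
}
```

This only gives a name to a combination of bounds; it does not add any behaviour of its own.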

The other thing is just being less generic, and using dynamic dispatch where appropriate. E.g. how there's no trait for an associative map, only just HashMap and BTreeMap with very similar API surface.

4 Likes

Notice in particular that many traits have exactly one method (and perhaps an associated type). This is common because single purposes are often accomplished with single methods.

Any time you find yourself designing a complex trait, ask yourself whether it really needs to be that complex. For example, I sometimes see beginners designing a trait like

trait ThingDoer {
    fn new() -> Self;
    fn do_thing(&mut self);
}

This combination of two operations means that:

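  • Every ThingDoer must either be constructible from no inputs, or stub the constructor out with fn new() -> Self { unimplemented!() }.
  • It's impossible to use do_thing() with dynamic dispatch, because fn new() -> Self (with no where Self: Sized bound) keeps the trait from being used as dyn ThingDoer.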

Instead, if you design the trait with just the significant operation,

trait DoThing {
    fn do_thing(&mut self);
}

then the trait can be used as dyn DoThing, and to handle construction, in cases where you actually need a no-argument constructor, you can require the bound T: DoThing + Default, and in other cases, you usually can and should leave construction up to the caller that chooses the specific type that's implementing DoThing. So, it's more flexible all around, at the small price of having to write + Default sometimes.
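A short sketch of both uses, assuming a hypothetical Counter type and helper functions run_all and make_and_run:

```rust
trait DoThing {
    fn do_thing(&mut self);
}

// Hypothetical implementor; `Default` is derived separately from the trait.
#[derive(Default)]
struct Counter {
    count: u32,
}

impl DoThing for Counter {
    fn do_thing(&mut self) {
        self.count += 1;
    }
}

// Dynamic dispatch works because the trait has no constructor in it.
fn run_all(doers: &mut [Box<dyn DoThing>]) {
    for doer in doers.iter_mut() {
        doer.do_thing();
    }
}

// Only the code that really needs a no-argument constructor asks for `Default`.
fn make_and_run<T: DoThing + Default>() -> T {
    let mut doer = T::default();
    doer.do_thing();
    doer
}
```

make_and_run::<Counter>() builds and runs a counter, while run_all accepts any mix of boxed implementors, including ones that have no sensible default.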

5 Likes