Consequences of breaking the orphan rule

The rule is explained here, where it says:

An orphan implementation is one that implements a foreign trait for a foreign type.

And explains why:

If these were freely allowed, two crates could implement the same trait for the same type in incompatible ways, creating a situation where adding or updating a dependency could break compilation due to conflicting implementations.

I'm trying to follow the text to find the issue, i.e. how two orphan implementations are a problem.

I take two crates that implement the foreign trait on a foreign struct. The homonym structs would operate differently since the associated functions are implemented differently.

I am unsure how the update of dependencies turns it into a problem.

...and then, in your own code, create this struct and call a method from this trait. Which implementation should be used?

So then follow the :: path to that struct in that crate, what is the issue?

The method is the method associated with the struct I am calling, which is in that crate.

One could also bring up the other struct, from the other crate, and there is no conflict.

Which crate would that crate be?

You have a struct FooStruct in the crate foo.

You have a trait BarTrait in the crate bar.

You have a crate foobar which implements BarTrait for FooStruct in one way.

You have a crate barfoo which implements BarTrait for FooStruct in another one.

You have your own binary application, in which you use both foo and bar with their FooStruct and BarTrait respectively. You later import foobar and barfoo, each with its own implementation of BarTrait for FooStruct. Which one does the compiler pick?
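To make the scenario concrete, here is a minimal single-file sketch in which modules stand in for the crates. That substitution is an assumption made so the example compiles on its own: in a real multi-crate build, foobar and barfoo would each be rejected individually by the orphan rule (E0117), whereas inside a single crate the duplicate shows up as an overlap error (E0119). The `name` method and the `call_it` helper are made up for illustration.

```rust
// Modules stand in for the crates in the scenario; in a real project each
// would be a separate crate.
mod foo {
    pub struct FooStruct;
}

mod bar {
    pub trait BarTrait {
        // Hypothetical method, only here so there is something to call.
        fn name(&self) -> &'static str;
    }
}

// Stand-in for crate `foobar`.
mod foobar {
    use crate::{bar::BarTrait, foo::FooStruct};

    impl BarTrait for FooStruct {
        fn name(&self) -> &'static str {
            "foobar's implementation"
        }
    }
}

// Stand-in for crate `barfoo`. Uncommenting this module makes the build fail
// with E0119 (conflicting implementations of trait `BarTrait` for `FooStruct`).
// mod barfoo {
//     use crate::{bar::BarTrait, foo::FooStruct};
//
//     impl BarTrait for FooStruct {
//         fn name(&self) -> &'static str {
//             "barfoo's implementation"
//         }
//     }
// }

fn call_it() -> &'static str {
    use bar::BarTrait;
    // There is only one `FooStruct` type, so the call site names the type and
    // the trait, never the module that happened to contain the impl.
    let value = foo::FooStruct;
    value.name()
}

fn main() {
    println!("{}", call_it());
}
```

Note that nothing in `call_it` mentions `foobar`. There is no path that could be used to select `barfoo`'s version instead, which is why the second impl has to be an error.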


I don't see the problem as I said above. FooStruct isn't a thing on its own that is duplicated, it is a path in a module tree which directs to an unambiguous struct, with its own implementations.

There are two paths to two different structs, each is fine, and the path selects the one chosen.

It is a thing on its own, and it's never duplicated. Unless you're using several versions of the same crate across your dependencies, every struct, union, or enum declaration from a given crate is its own unique type. That doesn't change from one crate to another, no matter how many impls of how many traits exist, or along which :: path they are written.

If things worked the way you're describing, then adding any new dependency might break existing code, because you would now have to disambiguate which struct, along which :: path, from which exact crate, each impl's methods should come from.


One of the major features of Rust’s trait system is that it lets you add custom behaviors to a type that can be consumed by third-party code. For that to work, the compiler needs to be able to choose a consistent implementation for any trait-type pairing.

If there really are two different types in play, the orphan rule doesn’t apply; that’s how the newtype pattern works. But this introduces some interoperability problems, as you now can’t do things like store values of the “same” type from multiple sources in a single collection.
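As a rough sketch of the newtype pattern mentioned above (the names `Wrapper` and `render` are made up for illustration): `Vec<String>` and `fmt::Display` are both foreign to this crate, so the impl has to go on a local wrapper type instead.

```rust
use std::fmt;

// A local type wrapping the foreign `Vec<String>`. Because `Wrapper` is
// defined here, implementing the foreign trait `Display` for it is allowed.
struct Wrapper(Vec<String>);

impl fmt::Display for Wrapper {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "[{}]", self.0.join(", "))
    }
}

fn render() -> String {
    let w = Wrapper(vec![String::from("hello"), String::from("world")]);
    w.to_string()
}

fn main() {
    println!("{}", render());

    // The interoperability cost: `Wrapper` and `Vec<String>` are different
    // types, so a `Vec<Vec<String>>` can only hold the unwrapped inner value.
    let plain: Vec<Vec<String>> = vec![Wrapper(vec![String::from("x")]).0];
    println!("{}", plain.len());
}
```

The `.0` in `main` is the explicit marker: the moment the value goes back into a collection of plain `Vec<String>`, the custom `Display` behaviour is gone.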

These two approaches can’t really live together: any language design that wants both capabilities will require an explicit marker for at least one of them, and that decision will cascade downstream to lots of other language design decisions. Rust chose implicit extension traits with explicit newtypes. In an alternate history, the decision might have gone the other way, and we would have a quite different (but still viable) programming language.

that's what i meant though, didn't put it in the best words i guess. by "isn't a thing on its own" i meant it's not a single entity for both crates, there are two independent entities: one for each path (so they are not duplicated.)

In my opinion, if you want to force loading both, you can simply load both, and use as X and as Y. But that's not required either.

I can't understand it sorry; I restate what I tried to say in OP:

  1. Two crates implement the same (foreign) trait for the same (foreign) struct.
  2. The devs write different "associated methods" code, although they share signature given by the trait definition.
  3. However, these items are independent items in the module tree, and should execute independently.
  4. This isn't important, but you should be able to load each, even in the same file with the as X, as Y for each path to each struct.

This is where your misunderstanding lies. Trait implementations are global and don’t get their own place in the module tree. Regardless of where they appear in the code, they are tied to the module tree (and privacy system) only through the relevant type and trait declarations.

Because of this global nature, the orphan rules exist to ensure there is at most one implementation for any given (type, trait) pair.
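A small sketch of what "global" means in practice, using made-up names (`shapes`, `traits`, `hidden`, `unit_circle_area`): the impl below lives in a private, nested module that exports nothing, yet it applies everywhere the trait and the type are visible.

```rust
mod shapes {
    pub struct Circle {
        pub radius: f64,
    }
}

mod traits {
    pub trait Area {
        fn area(&self) -> f64;
    }
}

// A private module tree that exports no names at all.
mod hidden {
    mod deeper {
        use crate::{shapes::Circle, traits::Area};

        // The impl's location in the module tree does not matter.
        impl Area for Circle {
            fn area(&self) -> f64 {
                std::f64::consts::PI * self.radius * self.radius
            }
        }
    }
}

fn unit_circle_area() -> f64 {
    // Only the trait and the type are named; `hidden::deeper` is private and
    // could not be named from here even if we wanted to.
    use traits::Area;
    let circle = shapes::Circle { radius: 1.0 };
    circle.area()
}

fn main() {
    println!("{}", unit_circle_area());
}
```

Since the impl has no path of its own, there is nothing an `as X` / `as Y` import could attach to.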


:worried:

Why are they global? Where is this stated?

But it follows from your statement "Trait implementations are global" that implementing a foreign trait on different types in two different crates would also be a problem. I don't think this is the case.

What I mean when I say that implementations are global is that, conceptually, the compiler maintains a single lookup table of the shape (type, trait) => Option<implementation>. Whenever any code, anywhere in your program, wants to call a method of trait Tr on a value of type Ty, it goes to this table, looks up (Ty, Tr), and then uses whatever implementation it finds there.

So, there is no conflict in implementing a single trait for lots of different types as they map to different rows in that table. But multiple impl Tr for Ty blocks with the same Tr and Ty are a problem because they are both trying to write into the same slot of this global table.

(The actual implementation inside the compiler is quite different, of course).
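Purely as a toy model of that conceptual table (this is not how rustc works internally, and `ImplTable`, `register`, and `lookup` are invented names), one could picture it like this, with strings standing in for types, traits, and the crates providing the impls:

```rust
use std::collections::HashMap;

/// Toy model only: maps (type, trait) to the crate that provides the impl.
#[derive(Default)]
struct ImplTable {
    slots: HashMap<(&'static str, &'static str), &'static str>,
}

impl ImplTable {
    /// Claims the (type, trait) slot, or fails if it is already taken.
    fn register(
        &mut self,
        ty: &'static str,
        tr: &'static str,
        provider: &'static str,
    ) -> Result<(), String> {
        if let Some(existing) = self.slots.get(&(ty, tr)) {
            return Err(format!(
                "conflicting impls of {} for {}: {} and {}",
                tr, ty, existing, provider
            ));
        }
        self.slots.insert((ty, tr), provider);
        Ok(())
    }

    /// What a method call conceptually does: look up the single slot.
    fn lookup(&self, ty: &'static str, tr: &'static str) -> Option<&'static str> {
        self.slots.get(&(ty, tr)).copied()
    }
}

fn main() {
    let mut table = ImplTable::default();
    // Same trait, different types: different rows, no conflict.
    println!("{:?}", table.register("b::Ty", "a::Tr", "crate c"));
    println!("{:?}", table.register("d::Other", "a::Tr", "crate d"));
    // Same trait, same type: both want the same slot.
    println!("{:?}", table.register("b::Ty", "a::Tr", "crate d"));
    println!("{:?}", table.lookup("b::Ty", "a::Tr"));
}
```

The key has no component for "which crate wrote the impl", which is the whole point: a lookup cannot choose between two providers.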

Unfortunately, I don’t have a good citation for this behavior. It’s sort of implied by lots of documentation, and generally matches what the compiler does, but I don’t know where (or if) it’s explicitly stated anywhere.


This alone doesn’t get us all the way to the orphan rules, though: you could satisfy it by simply checking at compile time that there are no conflicts. The orphan rules exist as a coordination tool between multiple crate authors, by describing who is allowed to provide each possible trait implementation. Because the implementations can only be provided by two distinct crates (the ones that define the trait or type at issue), I don’t have to worry about a conflict coming up when I add a third, supposedly unrelated, crate into my project.


So, taking your claim that <Trait, Type> combinations are global, of course it makes sense that two identical pairs would conflict.

Although it seems to me that what they should say is exactly what you said (and what my paragraph above repeats). Isn't it simpler?

So the orphan rule ensures that at least one of those is local.

I don't see the "dependencies" issue coming here though?

So in my own code either we have <Local_Trait, Type> or <Trait, Local_Type>. Is this correct or am I going astray?

But I don't understand that:

creating a situation where adding or updating a dependency could break compilation due to conflicting implementations.

Oh, by "adding or updating" they mean adding a second crate that conflicts with a previous one on a pair, right?

But actually, you can have 0 pairs of <External_Trait, External_Type>. And that's good for working with multiple crates.


Yes, that's correct. The primary aim of the rules is to forbid <external, external> implementations, but they have been reformulated into allowing the complement <local,any> | <any,local>.
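A minimal sketch of the two allowed shapes and the forbidden one, with made-up names (`Describe`, `Meters`, `samples`); the standard library plays the role of the external crate here:

```rust
use std::fmt;

// <local, any>: a local trait may be implemented for a foreign type.
trait Describe {
    fn describe(&self) -> String;
}

impl Describe for Vec<i32> {
    fn describe(&self) -> String {
        format!("a vector of {} integers", self.len())
    }
}

// <any, local>: a foreign trait may be implemented for a local type.
struct Meters(f64);

impl fmt::Display for Meters {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} m", self.0)
    }
}

// <external, external>: rejected by the orphan rule (E0117) if uncommented,
// because neither `Display` nor `Vec<i32>` is defined in this crate.
// impl fmt::Display for Vec<i32> {
//     fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
//         write!(f, "{} items", self.len())
//     }
// }

fn samples() -> (String, String) {
    let v: Vec<i32> = vec![1, 2, 3];
    (v.describe(), Meters(2.5).to_string())
}

fn main() {
    let (a, b) = samples();
    println!("{}", a);
    println!("{}", b);
}
```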

One thing that makes the orphan rules tricky to understand is that they primarily prevent forward-compatibility problems: To really see them in action, you have to consider not just what the code looks like today, but also what it might look like tomorrow.

Let's say that we have three crates, a, b, and c and look at impl a::Tr for b::Ty:

It can appear in a, because a::Tr is local. It can also appear in b, because b::Ty is local. The two crate authors need to agree between themselves who gets to provide the implementation. If they both do, the compiler will error out.

Under the orphan rules, c is never allowed to provide this implementation because both the trait and type are foreign. Without the orphan rules, c could provide this implementation as long as neither a nor b does, but that can cause problems:

Another crate d would necessarily be able to also provide an implementation, which will also work just fine. Right up until some project wants to include both c and d as dependencies, which will then fail to compile because of the conflicting implementation.

Also, if the authors of a and b later agree that the trait implementation makes sense to provide, one of them will want to add the implementation. This, though, would break any program that relied on c, because there are now duplicate implementations in play. So we are now in an awkward impasse where the implementation in c is preventing the authors of a and b from adding functionality that they would like to add, and that might even be a critical need for some users who can't pull in c as a dependency for some reason.

The solution that Rust came up with for this problem is to forbid the implementation in c (and d and e ...) entirely, so that the authors of a and b just need to coordinate with each other instead of (potentially) every third-party crate author that happens to rely on both a and b.


I've sometimes wondered, too, if allowing another external trait on external type, but only locally, wouldn't make things easier.

It could be seen as syntactic sugar for the newtype workaround, MyWrapper(ExternalType), sparing the programmer the boilerplate code to access the inner data and not hiding the actual type under another name. The scope must of course be local only, and any function called locally on that type but defined in another crate and relying on another trait-type association mustn't be impacted. As if ExternalType was actually MyWrapper locally.

But I haven't spent a lot of time thinking about it, so I may have missed other problems.

Just a minor question, since the OP is solved: does this happen frequently enough that it becomes annoying? By which I mean that c does not implement anything, but depends on both packages a and b (and both implement the same trait for the same struct, i.e. they didn't coordinate). (Maybe I got confused and misinterpreted something.)

That problem doesn't show up very much. It's usually pretty obvious which crate the implementation belongs in, and there are some guidelines about when adding an implementation might break things (and therefore needs a major version bump). Even if they don't follow the guidelines, they'll get a compile error pretty quickly: both crates are necessarily included in every build because you have to refer to both¹ to write the implementation.

The more common miscoordination issue to run into is where a defines a trait that would be useful on one of b's types but the two crates are entirely unaware of each other, and so no implementation is provided. That's when c will need to reach for something like newtypes and is the situation where people are most likely to want the orphan rules relaxed (so that they can define the missing implementation themselves instead of getting it added to a or b).


¹ The exception to this is generic/blanket implementations, which can cover foreign types without naming them explicitly. These are the source of most of the complexity in the orphan rules and stability guidelines.
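For readers who haven't met them, a blanket implementation looks roughly like this (a sketch with made-up names `Describe` and `describe_samples`). The impl never mentions `i32` or `String`, yet it covers both, along with every other `Display` type from any crate:

```rust
use std::fmt::Display;

// A local trait, so the crate is allowed to write impls for it.
trait Describe {
    fn describe(&self) -> String;
}

// Blanket impl: applies to every `T` that implements `Display`, including
// foreign types that this crate never names.
impl<T: Display> Describe for T {
    fn describe(&self) -> String {
        format!("<{}>", self)
    }
}

fn describe_samples() -> Vec<String> {
    let n: i32 = 42;
    let s = String::from("hi");
    vec![n.describe(), s.describe()]
}

fn main() {
    for line in describe_samples() {
        println!("{}", line);
    }
}
```

Because such an impl claims the slot for types it has never heard of, a later, more specific impl of the same trait for one of those types would overlap with it. That is one reason the coherence rules around generics get complicated.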


No, this cannot happen. The only way for b to implement a::Tr for b::Ty is by depending on a, but then when compiling b the compiler would see that a implemented a::Tr for b::Ty and error on b. This still results in an error, but it's not a coordination issue: b is simply wrong.


That's reassuring..!