Deriving external traits on external structs

I am new to Rust. I read in the Rust Book that you cannot implement a trait from an external crate for a struct which is in an external crate. Unfortunately, I'm in the position of needing to do exactly this. I want to take structs from systemstat, serialize them using bincode (which requires Serialize from serde_derive) and send them over the network to be deserialized by the receiver.

I have read the responses to these StackOverflow issues:

It seems like the answer is:

  • I can't derive Serialize, Deserialize from serde_derive on types defined in systemstat.
  • I could contact the authors of either crate and ask that they depend on the other, and derive Serialize, Deserialize.
  • I could fork one of the modules and add the derive tags myself.
  • I could wrap types in systemstat with my own type. This would mean:
    • I would have to manually implement Serialize, Deserialize, deriving would not be possible.
    • To manually implement it I would have to hard code many of the fieldnames, duplicating much of the struct code. In addition, I would essentially be duplicating what the impl Serialize code from the serde_derive module does.
    • When I get stats as a vector of structs from systemstat I would have to traverse the vector and wrap the elements in my new type.
    • Any time I wanted to use the native functionality I would have to unwrap the new type first.

Is this really considered best practice? If so, are there any RFCs which propose a better solution to this?

This isn't an immediate assistance to you, but the Rust API guidelines recommend that library authors implement Serialize and Deserialize for public facing data structures for exactly this reason. If serde isn't already a dependency of the library, it's best to add it as an optional dependency.

Perhaps systemstat would accept a patch that adds the traits.

1 Like

Serde has a workaround for this case:

https://serde.rs/remote-derive.html

5 Likes

That's good to know, although I can't see this as a long-term solution for Rust. It requires that developers from crate A and crate B know about each other, and one derives implementations from traits in the other. In this particular instance of serialization I can see it being general enough such that it warrants being added to most crates, but this won't be the case with most traits.

This encourages bad coding practices for crate developers:

  • Discourages using structs: if, instead of creating a struct, a crate uses only primitive types or, at most, type aliases, then trait implementations from external traits which apply to the primitive types will naturally apply.
  • Discourages using custom traits: if you write functions which only depend on existing traits it is more likely that structs from external crates will have implemented them.

Thanks, this makes it a little easier, although it still suffers from a lot of duplication. (Basically copying all the struct definitions from the external crate(s) into my own file.) In a larger sense, it places a burden on crate developers to implement workarounds which duplicate the functionality provided by derive, as serde has done.

I've been in this situation a few times (and I'm currently in this state now). I'll fork the library for which I need serde traits (or whatever else), and then direct Cargo to use my fork like this:

# Cargo.toml
[patch.crates-io]
dependency = { git = "http://github.com/user/repo", rev="<<hash-of-my-fork>>" }

In the meantime, I'll submit a PR, or wait for an existing one to be merged, and then I can remove the above patch from my manifest.

I guess I'm just thinking that this is not a sustainable approach in general.

I don't think a crate which defines some public structs should have to be aware about all possible traits which other crates define in order to allow a new crate to combine that functionality. It makes the ecosystem quite difficult to leverage.

3 Likes

The main problem which the coherence rules aim to solve is that if you can implement an external trait on an external struct, then you can conflict with a later update from either the struct author or the trait author which adds another implementation.

This makes implementing a trait a backwards-incompatibility hazard, and therefore prevents libraries from evolving easily, ultimately causing either ecosystem stagnation or instability.

Any attempt to relax the coherence rules would need to answer this concern.

2 Likes

As it stands right now I have to copy the entire struct into my own crate. If the struct author changes their struct in any way, even superficially changing the name of a variable, my crate will break just the same. For some reason this doesn't lead to ecosystem stagnation, is implementing traits such a common occurrence that it would?

Furthermore, implementing traits is already a breaking change! For example:

Crate a

pub trait TraitA {
    fn trait_func(&self);
}
impl<T> TraitA for T {
    fn trait_func(&self) {
        println!("TraitA");
    }
}

Crate b

pub trait TraitB {
    fn trait_func(&self);
}
// Note: does not implement it for now

Crate c

extern crate a;
extern crate b;

use a::TraitA;
use b::TraitB;

fn main() {
    0.trait_func(); // Prints "TraitA"
}

This compiles and runs. Now, Crate b decides to implement their trait:

Crate b (new)

pub trait TraitB {
    fn trait_func(&self);
}
impl<T> TraitB for T {
    fn trait_func(&self) {
        println!("TraitB");
    }
}

Crate c will no longer compile, complaining about multiple trait_func found, so implementing a trait defined in your crate is already a backward-incompatible change.

It is true that in this scenario, crate c must do a bit of work, however...

  • Unrelated traits sharing common method names and adding blanket impls late in the life cycle of a crate are both uncommon, making this scenario highly artificial.
  • For crate c, the fix is as helping type inference a bit by reformulating the code as TraitA::trait_func(&0), which cargo update could even do automatically if the problem was a common occurence. Whereas with a true impl conflict there is no clean way to resolve the ambiguity.

The blanket impls are not really relevant, just used as an easy way to show the conflict. They just need to be impl over the same type, and have the same function name inside them. As the number of crates grows, naming overlaps are guaranteed to happen.

I agree, the fix in this case is easy. What's wrong with doing something similar for impl? The compiler checks to make sure there is a unique definition for the trait. (This already would be an improvement from the current system.) If there is not, it asks you to clarify which impl should be used.

For example:

Crate a

pub struct TypeA;

pub trait TraitA {
    fn trait_func(&self);
}

Crate b

extern crate a;

use a::{TypeA, TraitA};

impl TraitA for TypeA {
    fn trait_func(&self) {
        println!("impl from b");
    }
}

The compiler checks and finds that there is a unique definition, no problem. Then, the author of crate a implements the trait as well:

Crate a (new)

pub struct TypeA;

pub trait TraitA {
    fn trait_func(&self);
}

impl TraitA for TypeA {
    fn trait_func(&self) {
        println!("impl from a");
    }
}

When I go to compile crate b I receive an error, telling me that two implementations were found, and I need to either remove mine or specify which one should be used.

I guess my point is: this is just a namespace conflict. There are already namespace conflicts with regard to impl as demonstrated above, and at the level of the method names these are allowed even though they have the potential to cause breaking changes, but at the level of the trait names they are not, and I'm not sure why.

That's exactly the compatibility hazard that coherence is meant to avoid -- otherwise it's a breaking change for a crate to add any new impl at all.

Consider also that such conflicts may arise from afar. Suppose b didn't implement TraitA itself, but had a chain of dependencies c -> d -> e, where e has that impl TraitA for TypeA. Then b decides it also needs the independent crate z. Then an update to z adds the same impl, and again we have a conflict.

Yes, but as I showed above, those naming conflicts can already happen. It's already a (potentially) breaking change to add any new impl in a crate that owns the trait or type. Method names share a namespace between all traits that are implemented. These conflicts may also arise from afar, but they are allowed.

@HadrienG replied and said that the reason this is allowed is that it's not considered to happen very often (which I disagree about) and the fix is straightforward - basically dictating which function should be used by using the trait's namespace.

Why can't something similar be done for traits themselves?

With method clashes, it's possible to write a fully disambiguated form of the call like:
<my_type as Trait>::method(args...)

If the Trait itself could have multiple implementations for that type, how would you choose?

modulename::Trait?

I was thinking that if the compiler observed that there were two implementations of the trait itself, you could simply write something like:

prefer modulename impl SomeTrait for SomeType

and it would treat this as the de facto implementation.

That's incoherent. (Now where'd I put that link... ah!)

Some of the previous discussion here in this thread may have given the wrong impression that incoherence is an issue of API evolution. It isn't; coherence is an issue of type system soundness. (The orphan rules which are the solution to this problem just happen to have been designed with API evolution in mind, hence why the two are often conflated)

Coherence is the property where, given a type and a trait, all code in the compiled binary agrees on the same implementation for that trait. If incoherent impls are allowed to exist in the language, it can lead to what is known in the rust community as the hashtable problem.

4 Likes

Admittedly, I need to think about how this would interact with code in the original module, which may have expected the trait to work the way it did in the original. In particular, if there were side-effects that could be problematic, but that's always problematic.

Still, the fact is that it's already functionally possible to avoid this. Idiomatic usage suggests not to, but the original crate can do something like this:

Crate a

pub trait TraitA<T> {
    fn trait_func(&self); // note we don't actually make use of T
}

Crate b

pub struct TypeB;

and then you can come along and do
Crate c

extern crate a;
extern crate b;

use a::TraitA;
use b::TypeB;

struct Dummy;
impl TraitA<Dummy> for TypeB {
    fn trait_func(&self) {
        println!("NewTrait!");
    }
}

This compiles just fine, and if the original crate simply depended on things which had this parametric dummy type it should all work out.

My point isn't that this solves the problem, it doesn't, but my point is that the current orphan rules artificially restrict some programs which should compile just fine (like my original, where there was only one unique impl) and don't fully address the situation they claim to, as the above example shows.

I’ve also had my gripes with coherence (I think it’s a rite of passage).

By far the best explanation I’ve seen of current rules and their rationale is this recent expose by @sgrif.

3 Likes

Ok, I read through the hashtable link. What is described there is a similar solution to what I described above, except it's even more granular - when you create an instance of an object which has multiple implementations for a trait, you specify which implementation it should use. If you don't specify, the computer selects for you based on some predefined rules. In particular, if only one implementation is in scope then it uses that one. (So it is backward compatible with the language as it stands.) This sounds great, and would completely solve the problem. Why was this not implemented? The email thread is from 7 years ago, so I assume it was considered. Is there a downside? More discussion of it somewhere?

I strongly feel that this issue is going to be a significant impediment for the ecosystem.

I would suggest the RFC linked by vitalyd above, which is the most recent revision of the coherence rules, and can thus serve as a more recent reference. It also points to other texts on the topic.

"orphan impls rust" also sounds like a good search engine query, it gives plenty of interesting-looking results when I type it in google.

1 Like