[Solved] Variance of `dyn Trait + 'a`

I am still somewhat puzzled. So the conclusion was that it is sound because assignment of unsized values is impossible?

That would be a very dissatisfying answer. Subtyping should have to do with the question "is every instance of T also an instance of U", not something about assignment. Relying on unassignability of unsized value is a band-aid if our subtyping relation is broken.

But there also seems to be the conclusion that actually there is no subtyping for &mut trait objects either and something else was going on. (And indeed since &mut is invariant everything else would be really strange.) So mystery solved? Not entirely, I am still wondering if we could have subtyping on that trait object lifetime...

So, is every instance of dyn Trait + 'static also an instance of dyn Trait + 'a? I actually think so, but I am not sure -- because I am not sure if I actually properly understood the meaning of that lifetime bound. Sure, intuitively it has something to do with "the lifetime of the underlying value", but I never found that answer very satisfying -- in building a formal model of Rust lifetimes, the "lifetime of a value" is not a concept that ever came up.

To understand these lifetime bounds I started wondering what would go wrong in the proof if we just didn't have the bound, and came to a conclusion that quite surprised me: I think it has to do with the implicit "well-formed" condition on each function call. When we have a function like fn foo(self: &'a mut Struct<'b>), there are implicit requirements that 'b: 'a and that 'a: 'fn, where I use 'fn to denote the lifetime of the function call. In Rust it is impossible to write a function that does not have these bounds, but in our formal system we do not have that limitation, and you get a perfectly sane system that way -- it's just more explicit. To recover a more Rust-like feel, we added some notation where given a type we compute the WF conditions that a function taking that type would add.

But what happens now if you add type erasure (aka trait objects)? Well, we don't know what the WF conditions of the underlying type are! If we assume foo is part of Trait and we impl Trait for Struct<'_>, then Rust auto-generates a function with the signature fn foo(self: &'a dyn Trait) for us. That function then dispatches to Struct::foo. But Struct::foo requires 'b: 'a, and TraitObjShim::foo has no way to prove that condition! Nothing prevents the user from calling TraitObjShim::foo after 'b is over. Thus doing the call would be unsound.

Now I first thought "OMG we have to abstract over the well-formedness predicate" but it turns out we don't. The lifetime bound in a trait object is a rather ingenious solution to avoid having to do that. The signature of TraitObjShim::foo instead becomes fn foo(self: &'a dyn Trait + 'b), and there is an invariant that outliving 'b is good enough to satisfy all well-formedness requirements of the actual underlying type. That is how the shim can prove to Struct::foo that all well-formedness requirements are upheld.

Basically, the meaning of the lifetime 'a in dyn Trait + 'a is "'a has to still be ongoing for you to be allowed to call trait object methods on it". It has nothing to do with the "lifetime of the underlying value", it is a proxy ensuring you are upholding well-formedness conditions of the methods you are calling.

That is my theory, anyway. I haven't proven it yet, we haven't gotten around to actually formally modelling trait objects. So maybe @eddyb will take it apart immediately. I am curious what y'all think about this.


Now, with that theory in mind, what does that teach us about variance? If the point of the lifetime parameter is to ensure that trait object shims can uphold the well-formedness requirements of the functions they are dynamically dispatching to, that means:

  • If the trait has no methods, we are actually free to change the lifetime any way we want. I think. This is a bold claim and thus probably the best way to test if my theory makes any sense...
  • But generally the trait will have methods. Then the variance should be such that well-formedness can only become stricter, which corresponds to the lifetime becoming shorter. So that's the same variance as references have wrt. their lifetime.

Maybe that's not surprising to anyone involved, but I had to go through all this to make sense of such a statement. But now I am worried because @Yandros said in their original post that it would be "unsound" if Box<dyn Trait + 'long> would coerce to Box<dyn Trait + 'short>. What would be the example for that?

8 Likes