Question about dyn Trait coercions

Hi all,

I'm currently reading this part of @quinedot's very good Tour of dyn Trait. Unfortunately, I don't quite understand why an unsized coercion is happening here, which leads to a compile error because of the nested unsized coercion:

fn foo<'l: 's, 's>(v: *mut Box<dyn Trait + 'l>) -> *mut Box<dyn Trait + 's> {
    v
}


and this works on the other hand because of the covariant context according to the article:

fn foo<'l: 's, 's>(v: *const Box<dyn Trait + 'l>) -> *const Box<dyn Trait + 's> {
    v
}


Does the unsized coercion in the *mut version happen because of the invariant context, and does the dyn Trait to dyn Trait cast in the *const version work because of the covariant context? That's not 100% clear to me from the text of the article. EDIT: OK, I've read the article again carefully and it seems my assumption might be correct, but it would be nice if somebody could confirm this.^^

Regards
keks

What is it that isn’t clear? Is it what the rules are, or is it why they are the way they are?


Also, maybe we should talk with @quinedot about the choice of using raw pointers for the example; in the context of things like

fn foo<S, T>(v: *mut S) -> *mut T {
    v
}

being as easy to fix as

fn foo<S, T>(v: *mut S) -> *mut T {
    v as _
}

[1]


  1. and whilst for some reason the compiler really doesn’t like doing even

    fn foo<'l: 's, 's>(v: *mut Box<dyn Trait + 'l>) -> *mut Box<dyn Trait + 's> {
        v as _
    }
    

    (presumably if the as cast smells like a reflexive case, it ought to be treated as one)

    the function is made to compile as easily as

    fn foo<'l: 's, 's>(v: *mut Box<dyn Trait + 'l>) -> *mut Box<dyn Trait + 's> {
        v.cast()
    }
    
    ↩︎

So, the point of view regarding

fn foo<'l: 's, 's>(v: *mut Box<dyn Trait + 'l>) -> *mut Box<dyn Trait + 's> {
    v
}

seems a bit backwards.

The idea is that 'lt -> *mut Box<dyn Trait + 'lt> cannot see its 'lt parameter shrink (which is what would be needed for 'l : 's -> 's to work), because such a type is not covariant in 'lt: the 'lt sits behind a *mut, which is neither covariant nor contravariant, i.e., invariant.

  • Whereas replacing *mut with *const, which, on the other hand, is covariant, lets it compile fine[1].

The "exception" to this point of view is the very specific case of *mut or &mut to a dyn ... + 'lt directly:

fn bar<'l : 's, 's>(l: *mut (dyn Tr + 'l)) -> *mut (dyn Tr + 's) { l }

compiles fine! :open_mouth: That's the one that ought not to compile based on variance rules, and which nonetheless does. Regulars of this forum such as myself were genuinely puzzled by this at the time.
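To make the surprising case concrete, here is a minimal sketch of the same shortening behind both &mut and *mut. The trait Tr, the type Holder and the function names are hypothetical and only serve the illustration; the 's: 'r bound is spelled out explicitly so the return type is obviously well-formed.

```rust
// Hypothetical trait and type, used only for illustration.
trait Tr {
    fn name(&self) -> String;
}

struct Holder<'a>(&'a str);

impl<'a> Tr for Holder<'a> {
    fn name(&self) -> String {
        self.0.to_string()
    }
}

// `&mut T` is invariant in `T`, yet this compiles: the trait object lifetime
// sits directly behind the reference, so the unsized coercion applies.
fn shorten_ref<'r, 'l: 's, 's: 'r>(r: &'r mut (dyn Tr + 'l)) -> &'r mut (dyn Tr + 's) {
    r
}

// The raw-pointer form from the post is accepted for the same reason.
fn shorten_ptr<'l: 's, 's>(p: *mut (dyn Tr + 'l)) -> *mut (dyn Tr + 's) {
    p
}
```

Neither function needs a cast or unsafe code; the coercion happens implicitly at the return expression.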

And it turns out that this specific case works thanks to another rule allowing lifetimes to shrink, but which is way more restrictive than variance / which does not compose like covariance does, which is the rule of "(re)unsized coërcions":

  1. given a PtrTo<dyn Trait + 'usability>,

  2. the original Concrete type behind it was : 'usability, and thus, : 'shorter for any 'shorter where 'usability : 'shorter

  3. thus[2], it is sound to allow PtrTo<dyn Trait + 'usability> to be coërced to PtrTo<dyn Trait + 'shorter>.

And your initial example just does not apply here, since we don't have a PtrTo<dyn Trait ...>, but a PtrTo<PtrTo<dyn Trait ...>>
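As a sketch of that difference, using &mut instead of *mut: the nested signature is left commented out because it is rejected, while reborrowing through the Box puts the trait object directly behind the reference again. The trait, its u32 impl and the function names are made up for the example.

```rust
// Hypothetical trait, used only for illustration.
trait Trait {
    fn id(&self) -> u32;
}

impl Trait for u32 {
    fn id(&self) -> u32 {
        *self
    }
}

// Rejected: the trait object is nested inside a `Box` behind `&mut`, so only
// variance applies, and `&mut T` is invariant in `T`.
//
// fn nested<'r, 'l: 's, 's>(
//     v: &'r mut Box<dyn Trait + 'l>,
// ) -> &'r mut Box<dyn Trait + 's> {
//     v
// }

// Accepted: reborrowing through the `Box` yields `&mut (dyn Trait + 'l)`,
// where the coercion can shorten `'l` to `'s`.
fn through_box<'r, 'l: 's, 's: 'r>(v: &'r mut Box<dyn Trait + 'l>) -> &'r mut (dyn Trait + 's) {
    &mut **v
}
```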


  1. since 'lt -> dyn Trait + 'lt is covariant as well ↩︎

  2. (because nobody is supposed to be able to overwrite the pointee with another dyn Trait + 'usability, which is the mechanism which elsewhere requires &mut to be invariant) ↩︎


@Yandros Thank you very much for your explanation! Things are clearer to me now! :slight_smile:
@steffahn I'm sorry if I once again expressed myself unclearly.^^ But sometimes it's difficult to even express what the real problem is that you have. :smiley:


It was just for providing nesting with invariance; I could replace it with a Cell or such I suppose.

I always wondered why this was allowed, and concluded that it happens to line up with the fact that !Sized types can't be written by-value. But I thought that this is somewhat accidental, and surely not a strong enough guarantee to make &mut dyn Trait covariant?! After all, we may get moving-around-unsized-stuff-by-value, in which case this is going to be unsound once again. Is this the real/only reason behind this exception?

Isn't there also something wrong with it? It seems to me I should be allowed to bytewise overwrite the referent with an identically-sized (or maybe even a smaller, I'm not sure) trait object through a mutable reference using ptr::copy(). Of course I wouldn't ever want to do something as horrendous as that, but… nevertheless, isn't this a soundness hole?


No, that would be unsound regardless of whether the variance exception in question existed. A &mut dyn Trait is already potentially the result of an upcast from a concrete type &mut Foo with Foo: Trait. E.g. it can be a re-borrow of a &mut Foo reference that still exists somewhere, borrowing an owned Foo value that still exists somewhere and that will be further used, and dropped, as such a concretely typed value after the borrow has ended. Overwriting the target of the &mut dyn Trait with anything but a Foo value is thus unsound. For what it’s worth, the Trait may be empty (no methods), so there’s no way to enforce the necessary restrictions from a &mut dyn Trait value alone (i.e. you can’t tell what the underlying type Foo was), so overwriting trickery is outlawed.

The situation is different for owned trait objects, like a Box<dyn Trait> or Arc<dyn Trait>. Those can be moved around safely (after all, the conversions between Box<T> and Arc<T> work for unsized types), because there’s no other owner that knows more about the concrete type. They are owned as trait objects and will be destroyed through the destructor in the vtable.
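A small sketch of what "moved around safely" looks like for owned trait objects. The function names are made up for the example; std::mem::swap and the From<Box<T>> impl for Arc<T> are standard library items.

```rust
use std::fmt::Display;
use std::sync::Arc;

// Swapping two owned trait objects exchanges the wide pointers, so each data
// pointer keeps travelling with its own vtable.
fn swap_owned(a: &mut Box<dyn Display>, b: &mut Box<dyn Display>) {
    std::mem::swap(a, b);
}

// Moving an owned trait object into another owning pointer type.
fn share(b: Box<dyn Display>) -> Arc<dyn Display> {
    Arc::from(b)
}
```

Here the concrete types behind the two boxes may differ, which is exactly what would be unsound for two &mut dyn Trait borrows of values owned elsewhere.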

Ah, “destructor in the vtable”: perhaps a factor to make the thing I said isn’t possible in the first paragraph possible after all? E.g. if the function pointer of that destructor matches? (Not that that would be a particularly nice set of restrictions…) No, even for types without destructors, e.g. turning a &mut bool and a &mut u8 into &mut (dyn Trait + 'static) doesn’t mean you’re allowed to start copying the u8 value into the bool. The original owner cares (under threat of UB) about illegal bit patterns.

It’s perhaps unfortunate, that there isn’t a type to say &mut truly-dyn Trait that allows moving/replacing values, too (though, the abstraction might need to be a bit different anyways, since the vtables would need to move with the data… so, I guess, no re-borrows whatsoever? :thinking: [1]). Compare this to slices where through a reference &mut [T] you are allowed to move the slice by bit-copying (if the length is correct).
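For comparison, the slice case can be sketched with the standard library's swap_with_slice; the wrapper function name is only for illustration.

```rust
// Exchanges the contents of two slices element by element.
// `swap_with_slice` panics if the lengths differ.
fn swap_slices<T>(a: &mut [T], b: &mut [T]) {
    a.swap_with_slice(b);
}
```

This is fine for slices because the element type T is statically known on both sides; only the length is dynamic, and it is checked at runtime.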


  1. at which point, are we perhaps fully, or mostly, at an “owned” type already!? (owning the value, not necessarily the allocation) ↩︎


It wasn't accidental (search for "Interaction with object coercion"). The justification for soundness in that RFC is based around ?Sized and assignments, but you can read Ralf's alternative take on the matter here (he didn't buy the assignment argument on its own).

The RFC I linked to isn't the source of the covariant-like behavior of the unsizing coercion; it says the behavior is pre-existing. I believe it's been that way since object lifetimes existed,[1] but I didn't find a justification presented at that time. However, the default bounds do introduce another motivation, albeit self-inflicted: If you take a &'a mut (dyn Trait + 'a) and try to call two other &'a mut (dyn Trait + 'a) taking functions, you would get a borrow check error if the inner 'a couldn't coerce to something shorter, ala &'a mut This<'a>. And if all you had was a &'short mut Box<dyn Trait + 'static>, you couldn't call a &'a mut (dyn Trait + 'a)-taking function either.
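A minimal sketch of that motivation, with a made-up Counter trait and made-up function names. Both functions rely on the trait object lifetime being shortened at the call site; with a fully invariant inner lifetime the first one would run into the &'a mut This<'a> borrow problem.

```rust
// Hypothetical trait and type, used only for illustration.
trait Counter {
    fn bump(&mut self) -> u32;
}

struct C(u32);

impl Counter for C {
    fn bump(&mut self) -> u32 {
        self.0 += 1;
        self.0
    }
}

// A function using the `&'a mut (dyn Trait + 'a)` pattern.
fn step<'a>(c: &'a mut (dyn Counter + 'a)) -> u32 {
    c.bump()
}

// Each call reborrows `c` for a shorter lifetime and coerces the trait
// object lifetime down to match, so calling `step` twice is accepted.
fn twice<'a>(c: &'a mut (dyn Counter + 'a)) -> u32 {
    step(c);
    step(c)
}

// Starting from a short borrow of a `Box<dyn Counter + 'static>` also works.
fn via_box(b: &mut Box<dyn Counter>) -> u32 {
    step(&mut **b)
}
```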

It's unclear to me how common a &'a mut (dyn Trait + 'a) pattern (versus a &'a mut (dyn Trait + 'b) pattern) was before the defaults, so perhaps it was just amplifying an existing motivation. The former pattern is what was suggested as part of RFC 0192 (which introduced object lifetimes), so maybe it was used "everywhere". (There were only 6 months between the two landing.[2])


  1. that's in src/librustc/middle/typeck/check/regionck.rs line 812, thanks GH ↩︎

  2. pre-1.0 to boot ↩︎


Ah, it's been a while since I thought about dyn trait lifetimes. :wink:

But my argument only works when dyn Trait is used in a covariant position. &'a mut (dyn Trait + 'b) should still be invariant in 'b.


It is definitely a good and legitimate question, that's for sure!

1) It does not seem possible to exploit this with a legitimate impl


I'll start with @steffahn's points, which are the main argument about why the following function ought to be unsound no matter the trait involved:

fn somehow_swap<'usability>(
    a: &mut (dyn SomeTrait + 'usability),
    b: &mut (dyn SomeTrait + 'usability),
)

  1. First and foremost, "the vtable problem": while Rust represents these wide pointers as:

    &mut dyn Trait
    

    In practice, layout-wise, this is rather:

    Dyn<Trait, &mut ?>
    
    • In fact, if we write the + 'usability here, we end up with:

      Dyn<Trait + 'usability, &mut ?>
      

      So we can see this + 'usability is not really a property of the pointee, but rather of the pointer!

    insofar:

    • it is a special &mut type (w.r.t. the usual &mut impl Sized[1] references).

    • with the dyn Trait metadata inline rather than behind the &mut ? indirection (even if this metadata happens to be, itself, behind &'static indirection): every such &mut dyn Trait = Dyn<Trait, &mut ?> packs its own copy of this metadata!

    This, thus, already makes overwriting the pointees with different backing concrete types unsound. To illustrate, if different backing types were allowed, then already something as simple as:

    trait ToBool {
        fn to_bool(self: &Self) -> bool;
    }
    
    impl ToBool for bool { fn to_bool(self: &bool) -> bool { *self } }
    impl ToBool for u8 { fn to_bool(self: &u8) -> bool { *self != 0 } }
    

    would be problematic. Indeed, consider:

    // note: the `<bool as ToBool>::to_bool` is thus, machine-wise,
    // probably implemented as `transmute_copy::<?, bool>(self)`.
    
    let a = &mut true as &mut dyn ToBool;
    let b = &mut 42_u8 as &mut dyn ToBool;
    
    somehow_swap(a, b);
    
    a.to_bool(); // <- does `transmute_copy::<?, bool>(&42_u8)` 💥
    

    Can only swap the same type!

  2. Then, therefore, we have the problem of type identification. This leads us to Any, which in turn requires : 'static. So no problem there.

  3. But what about user-implemented dyn LtAny<'some_lifetime>? (these can be written in a sound manner, although that's a topic for another thread). Well, in that case, on top of that + 'some_lifetime "dangerously covariant" lifetime parameter, now we also have this explicit <'some_lifetime> parameter, which is invariant. Hence no problem either.
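As a sketch of the Any route from point 2: a swap that first identifies the concrete type and only then exchanges the values. The function name swap_if_same is made up; downcast_mut and std::mem::swap are standard library items.

```rust
use std::any::Any;

// Swaps the two pointees only if both are really a `T`; reports whether
// the swap happened. `dyn Any` requires `'static`, so no lifetime can be
// smuggled through the type check.
fn swap_if_same<T: Any>(a: &mut dyn Any, b: &mut dyn Any) -> bool {
    match (a.downcast_mut::<T>(), b.downcast_mut::<T>()) {
        (Some(x), Some(y)) => {
            std::mem::swap(x, y);
            true
        }
        _ => false,
    }
}
```

Note that the swap itself happens on two &mut T, i.e. after the type erasure has been undone, so nothing is ever written through the trait object directly.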

Now, this explanation seems to suggest that the "accountability for UB" seems to lie in the somehow_swap() function more than in the + 'lt "acting covariantly when directly behind a &mut", but it does so with a list of cases that we hope is exhaustive rather than with a more direct conceptual point, so we may still have a lingering doubt of "have we truly considered all cases" looming over us, as is often the case with unsafe and trying to reason about 100% air-tight soundness.


2) Back to basics to illustrate: the unsized coërcion


The observation that convinced me about all this was, rather than focusing on the re-unsizing coërcion case, to first think about the simpler unsizing coërcion case:

/// For any `dyn`-safe `Trait`:
fn demo<'usability>(
  r: &mut (impl 'usability + Trait /* + Sized */),
) -> &mut (dyn  'usability + Trait)
{
  r /* as _ */
}

Now, depending on the point of view, this should both:

  • be blatantly obviously fine and sound;
  • be screaming the "but &mut is invariant!" issue: the very issue we are ourselves wondering about.

The latter may not be obvious at first, but I hope that once I present it, and thanks to the former, we'll be convinced of the soundness of the latter, and by extension, of re-unsizing coërcions, even behind &mut, as well.

Indeed, let's pick:

  • impl 'usability + Trait = &'static str (and Trait = Display or w/e),
  • and 'usability something shorter than 'static:

fn demo<'usability>(
  r: &mut (&'static str),
) -> &mut (dyn 'usability + Display)
{
  r
}

Indeed, &'static str : 'static : 'usability, so we do have an impl 'usability + … in the input, and thus are allowed to have a dyn 'usability + … in the output, despite having had to deal with a &mut all along.
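A compilable variant of this demo, as a sketch: the function name erase and the explicit reference lifetime 'r (with 'u: 'r written out) are additions for the example, not part of the demo in the post.

```rust
use std::fmt::Display;

// The pointee `&'static str` outlives every `'u`, so it may be erased to
// `dyn Display + 'u` even though it sits behind `&mut`.
fn erase<'r, 'u: 'r>(r: &'r mut &'static str) -> &'r mut (dyn Display + 'u) {
    r
}
```

The caller can no longer name the concrete type through the returned reference, so it has no way to write a shorter-lived &str into the original slot.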

Moreover, conceptually, the implicit + Sized bound on the impl 'usability + Trait in that function was not playing a critical role, API-wise, only a technical role, implementation-wise.

So we could consider replacing it with ?Sized + TechnicallyDynCoërcible in pseudo-code / conceptually:

fn demo<'u>(
  r: &mut (impl 'u + Trait + ?Sized + TechnicallyDynCoërcible),
) -> &mut (dyn  'u + Trait)

and from there, we could then pick, for any 'big : 'u:

impl 'u + Trait + ?Sized + TechnicallyDynCoërcible = dyn 'big + Trait

The re-unsizing coërcion is thus really, conceptually, just an extended case of unsized coërcion!

  • For those considering that my TechnicallyDynCoërcible is using a circular argument, the reason I've gone with such a roundabout name rather than Upcast<dyn 'u + Trait> (an actual proper trait expressing this), was precisely to avoid the circular problem: I'm trying to argue that dyn 'big + Trait : Unsize<dyn 'u + Trait> even behind &mut, by assuming there is some technical (i.e., compiler-intrinsic and/or unsafe) operation allowing the transformation, and trying to illustrate that, conceptually, the impl 'u + … -> dyn 'u + … coërcion, even behind &mut, is fine, modulo the technical implementation, which happens to be trivial in the case of dyn 'big + Trait.

  • another way of seeing this is to consider having a Super Rust language (with perfect type introspection/reflection), which would allow perfect downcasting, even in the face of lifetimes:

    fn perfect_downcast<'u>(
      r: &mut (dyn  'u + Trait),
    ) -> &mut (impl 'u + Trait)
    {
        // would yield the original thin / "concrete" type, but which
        // happens to be existential as far as the caller is concerned.
    }
    

    Armed with such a tool, we could then implement re-unsized coërcions using plain unsized coërcions:

    fn reunsized_coercion<'b : 's, 's>(
      r: &mut (dyn 'b + Trait),
    ) -> &mut (dyn 's + Trait)
    {
        let r: &mut (impl 'b + Trait) = perfect_downcast(r);
        let r: &mut (impl 's  + Trait) = r; // `impl 'b + … : 'b : 's`
        r as _ // <- unsized coërcion with `'u = 's`
    }
    

3) Why somehow_swap() cannot be sound


The idea is to observe there is a fundamental difference between the meaning of <'lt>, and that of + 'usability, to which I hinted, regarding the LtAny<'lt> design:

  • <'lt> expresses the property of being infected by exactly 'lt. A corollary is that for Self : 'u to hold, it is necessary that 'lt : 'u hold.

  • + 'usability expresses an existential and erased lifetime property.

    It may seem surprising to talk of erasing lifetimes when dyn Traits always have, at the very least, this seemingly pesky + 'u around, but that's just because "sane"/frequent Rust types only involve types infected with one lifetime parameter. In such a case Type<'lt> : 'lt follows, in both directions, which is why it is easy to conflate both notions.

    • Box<&'a mut &'b str> -> Box<dyn 'a + Debug>, for instance, is a nice example of this: we had two infecting lifetime parameters in input, and end up with fewer lifetimes in output: effectively, we have achieved lifetime erasure!

    • Another example could be the typical BoxFuture that #[async_trait] yields:

      #[async_trait(?Send)]
      trait Demo {
          async fn foo<'a, 'b>(
              a: Arc<Mutex<&'a str>>,
              b: Arc<Mutex<&'b str>>,
          )
          {}
      }
      

      is basically:

      trait Demo {
          fn foo<'a, 'b, 'intersection>(
              a: Arc<Mutex<&'a str>>,
              b: Arc<Mutex<&'b str>>,
          ) -> Pin<Box<dyn 'intersection + Future<Output = ()>>>
          where
          //     ⊇
              'a : 'intersection,
              'b : 'intersection,
          {
              return Box::pin(async move {
                  // impl InfectedBy<'a> + InfectedBy<'b>
                  let _captures = (&a, &b);
              });
          }
      }
      

      In which case we have erased all lifetime params, both 'a and 'b, down to their "lowest common denominator": their intersection lifetime. Anything else was "superfluous lifetime info" (w.r.t. the Future API), which could thus be erased. Should the Trait API need to use any of these lifetime parameters explicitly (e.g., dyn FnOnce() -> &'a str), then it means that lifetime parameter is to be part of the trait definition itself, as one of its generic lifetime parameters: it gets infected by it! (e.g., back to dyn LtAny<'lt>).
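The Box<&'a mut &'b str> example of lifetime erasure mentioned above can be written out as follows; the function name is made up for the sketch.

```rust
use std::fmt::Debug;

// Two lifetimes go in (`'a` and `'b`), only one bound comes out. `'b: 'a`
// is implied by the argument type being well-formed, so the boxed value
// satisfies `: 'a` and may be erased to `dyn Debug + 'a`.
fn erase_lifetimes<'a, 'b>(b: Box<&'a mut &'b str>) -> Box<dyn Debug + 'a> {
    b
}
```

After the call, 'b no longer appears anywhere in the type; all that remains is the guarantee that the value is usable for 'a.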

All that to say that what

  • + 'usability
    

conveys, in fact, is the property:

  • exists<'a, 'b, … where
        'a : 'usability,
        'b : 'usability,
           ⋮
        '… : 'usability,
    > InfectedBy<'a> + InfectedBy<'b> + …
    

To illustrate, we could say there is an "electronic cloud" of 'lifetimes orbiting around our type, but all contained within the 'usability..'∞ range (using '∞ as syntax for 'static).

And knowing that:

  • it is correct to "lose information" and just know that this "electronic cloud of lifetimes" is within the 'shorter..'∞ range (where 'usability : 'shorter);

  • it has to be wildly unsound to pretend that one "electronic cloud of lifetimes" is allowed to substitute/replace another one such.

    In other words, somehow_swap() is unsound.


4) If we replace + 'usability with + Trait, reünsizing is just upcasting

Consider:

&mut dyn 'big -> &mut dyn 'small

and compare it to:

&mut dyn Fn() -> &mut dyn FnMut()

In the world of existential capabilities which + … means, these two operations are conceptually the same.

  • it just so happens that we know, implementation-wise / at runtime, that the former conversion does not involve machinery (since lifetimes don't exist at the machine code level), which is why we just talk of a coërcion in that case, versus trait_upcasting in the latter case.

  • see my following post for a short demo that ought to remove any outstanding doubt (using unsized coërcions to reïmplement reünsized coërcions).


  1. &mut impl Thin, actually ↩︎

Yeah, but I implied that the vtable should be changed, too. (I obviously recognize that different concrete types have different vtables.) AFAIK there are unstable APIs for messing around with pointer metadata, so this sounds like it's possible.


That said, the lifetime argument is of course legitimate, but it still feels ugly/weird that trait objects are special-cased and not invariant.

Okay, after that lengthy post, I've just reälized I should be able to drop-mic this question :smile::

trait Trait {
    … // <- user stuff

    /// Rust, under the hood, also adds:
    fn upcast_lt<'s>(
        //         dyn Trait
        self: &mut Self,
    ) -> &mut (dyn 's + Trait)
    where
        Self : 's,
    ;
}

/// Rust, under the hood, also adds:
partial impl<T : Trait> Trait for T {
    fn upcast_lt<'s>(
        self: &mut T,
    ) -> &mut (dyn 's + Trait)
    where
        Self : 's,
    {
        self /* as _ */ // <- UNSIZED COËRCION
    }
}

fn reunsized_coercion<'l : 's, 's>(
  r: &mut (dyn 'l + Trait),
) -> &mut (dyn 's + Trait)
{
  <dyn 'l + Trait as Trait>::upcast_lt(r)
}

Hey folks,

I'm also trying to understand the ?Sized argument for the covariance in unsized coercions.

If I understand it correctly, the point is that in &mut dyn Trait, the dyn Trait can never be assigned to directly. Furthermore, nested unsized coercions, e.g. for &mut &dyn Trait, are not possible. Therefore the covariant behaviour is allowed here, since (as far as I can tell) unsound side effects can be ruled out, right?

Regards
keks

Where the (re-)unsizing coercion cannot apply, i.e. in nested context, you only have normal variance to rely on.[1] For that example type, the trait object lifetime is in invariant position and the lifetime cannot be shortened.

             dyn Trait + 'tr  // covariant in 'tr
        &'r (dyn Trait + 'tr) // covariant in 'r and 'tr
&'m mut &'r (dyn Trait + 'tr) // covariant in 'm, *invariant* in 'r and 'tr

// As `&'r T` is covariant in T, and
// `&'m mut U` is covariant in `'m` but *invariant* in `U`

In contrast with

           dyn Trait + 'tr  // covariant in 'tr
     &'r1 (dyn Trait + 'tr) // covariant in 'r1 and 'tr
&'r2 &'r1 (dyn Trait + 'tr) // covariant in 'r2, 'r1, and 'tr

For example.


In both cases "normal variance" is allowed / can take effect; that's covariant behavior in covariant position, but invariant behavior (a no-op) in invariant position, etc. Variance never goes away; the (re-)unsizing coercion adds strictly more possible changes, when applicable.
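The all-covariant case from the second table can be checked with a small sketch. The trait, its u32 impl and the function name are hypothetical, and 's: 'r1 is spelled out so the return type is obviously well-formed.

```rust
// Hypothetical trait, used only for illustration.
trait Trait {
    fn id(&self) -> u32;
}

impl Trait for u32 {
    fn id(&self) -> u32 {
        *self
    }
}

// Every layer here is covariant, so plain subtyping shortens `'l` to `'s`
// even though the trait object is nested two references deep.
fn shrink_shared<'r2, 'r1, 'l: 's, 's: 'r1>(
    v: &'r2 &'r1 (dyn Trait + 'l),
) -> &'r2 &'r1 (dyn Trait + 's) {
    v
}
```

Replacing the outer & with &mut would put the inner reference in invariant position, and the same function body would then be rejected.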


  1. Unless perhaps you wrote some other explicit conversion somewhere. ↩︎

