Asserting covariance with a macro

Asserting covariance naively

For a crate I'm working on (a souped-up version of yoke, tentatively named attached-ref), I need to be able to assert that some higher-kinded type T<'varying> is covariant over its 'varying parameter in a macro[1].

Brushing past all the details about how I'm implementing higher-kinded types with a lifetime parameter, one might naively try the following[2]:

macro_rules! assert_covariant {
    ($type_path:path) => {
        fn __coerce_owned<'long: 'short, 'short>(
            this: $type_path<'long>
        ) -> $type_path<'short> {
            this
        }
        
        fn __coerce_ref<'r, 'long: 'short, 'short>(
            this: &'r $type_path<'long>
        ) -> &'r $type_path<'short> {
            this
        }
    };
}

If T<'varying> is covariant, then the 'long lifetime can be shortened to 'short in covariant positions, and the above compiles. The assumption is that the reverse should also hold: if assert_covariant!(T) compiles, then T<'varying> should be covariant over 'varying. The above two checks are essentially what derive(Yokeable) performs.
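To make the expansion concrete, here is a minimal sketch of the same two checks written out by hand for a single type. `Borrowed<'a>` and `shorten_demo` are hypothetical names chosen for illustration; they are not part of yoke or attached-ref.

```rust
// Hypothetical covariant type: it only holds a shared reference.
struct Borrowed<'a>(&'a str);

// Hand-written version of the owned check. It compiles because
// `Borrowed<'long>` is a subtype of `Borrowed<'short>` when `'long: 'short`.
fn coerce_owned<'long: 'short, 'short>(this: Borrowed<'long>) -> Borrowed<'short> {
    this
}

// Hand-written version of the by-reference check.
fn coerce_ref<'r, 'long: 'short, 'short>(this: &'r Borrowed<'long>) -> &'r Borrowed<'short> {
    this
}

// Small helper (an assumption for this example) that exercises both checks
// and returns the combined length of the strings seen through each of them.
fn shorten_demo(text: &str) -> usize {
    let long = Borrowed(text);
    let by_ref_len = coerce_ref(&long).0.len();
    let owned = coerce_owned(long);
    by_ref_len + owned.0.len()
}
```

Both functions only ever return their argument unchanged, so at run time they are no-ops; all the work happens in the type checker.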

An obstacle: Deref

However, proving that T<'long> can coerce to T<'short> in a few covariant positions doesn't imply that a covariant coercion was the thing responsible. There are several other possible coercions, listed under "Type coercions" in The Rust Reference (variance-based coercions fall in the category of subtyping coercions).

The most "interesting" to me is Deref coercion, which lets you run arbitrary code and perhaps produce some very pathological implementations. (See also https://internals.rust-lang.org/t/unsoundness-in-pin/11311.) I was able to whip up the following, which (for at least a little while) made me fear for derive(Yokeable)'s soundness:

struct Invariant<'a>(&'a mut &'a str);
struct Covariant<'a>(&'a str);

// Idea:
// &'long Invariant<'long> coerces (via Deref) to
// &'long Covariant<'long> which coerces (via subtyping) to
// &'short Covariant<'short> which coerces (via Deref) to
// &'short Invariant<'short>

impl<'a> core::ops::Deref for Invariant<'a> {
    type Target = Covariant<'a>;

    fn deref(&self) -> &Self::Target {
        Box::leak(Box::new(Covariant("hello")))
    }
}

impl<'a> core::ops::Deref for Covariant<'a> {
    type Target = Invariant<'a>;

    fn deref(&self) -> &Self::Target {
        Box::leak(Box::new(
            Invariant(
                Box::leak(Box::new("hi"))
            )
        ))
    }
}
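To spell out the chain from the "Idea" comment, here is a sketch in which each step is written as its own `let` with an explicit type, so the compiler does not have to discover the route itself. `shorten_by_hand` and `chain_demo` are hypothetical helpers added for illustration. Everything is safe code, and the returned reference points at a freshly leaked value rather than at the original `Invariant`, so this is not itself an exploit; it only shows that the individual coercions exist.

```rust
use core::ops::Deref;

struct Invariant<'a>(&'a mut &'a str);
#[allow(dead_code)]
struct Covariant<'a>(&'a str);

impl<'a> Deref for Invariant<'a> {
    type Target = Covariant<'a>;
    fn deref(&self) -> &Self::Target {
        Box::leak(Box::new(Covariant("hello")))
    }
}

impl<'a> Deref for Covariant<'a> {
    type Target = Invariant<'a>;
    fn deref(&self) -> &Self::Target {
        Box::leak(Box::new(Invariant(Box::leak(Box::new("hi")))))
    }
}

// Hypothetical helper: walks the three steps from the comment one at a time.
fn shorten_by_hand<'long: 'short, 'short>(
    this: &'short Invariant<'long>,
) -> &'short Invariant<'short> {
    let step1: &'short Covariant<'long> = this; // Deref coercion
    let step2: &'short Covariant<'short> = step1; // subtyping (covariance)
    let step3: &'short Invariant<'short> = step2; // Deref coercion
    step3
}

// Hypothetical helper: reads the string reachable through the result.
fn chain_demo() -> String {
    let mut text: &str = "original";
    let invariant = Invariant(&mut text);
    let shortened = shorten_by_hand(&invariant);
    // This is the value leaked in `Covariant::deref`, not `text`.
    String::from(*shortened.0)
}
```

Note that every call leaks a couple of small allocations, which is fine for a demonstration but is another reason not to treat this as anything other than a pathological example.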

Fortunately, the impl generated by derive(Yokeable) fails, and the closest I could come to causing a problem is the following:

unsafe impl<'a> yoke::Yokeable<'a> for Invariant<'static> {
    type Output = Invariant<'a>;
    #[inline]
    fn transform(&'a self) -> &'a Self::Output {
        self as &Covariant<'_>// as &Invariant<'_>
    }
    #[inline]
    fn transform_owned(self) -> Self::Output {
        *(&self as &Covariant<'_> as &Invariant<'_>)
    }

    // ...
}

Crucially, an arbitrary owned self is not necessarily a position where deref coercion can occur; I needed to throw in a & to enable the coercions. The compiler should never insert a *& in an implicit coercion. (Plus, it fails for other reasons, too; implicit autoderefs would hit the recursion limit, and while a Sufficiently Smart Compiler™ might figure out which coercions to apply to transitively coerce Invariant<'long> to Invariant<'short>, the current compiler would need to be manually informed of the middle Covariant step.)

Moreover, derive(Yokeable) cannot be applied to any of the std types for which deref coercion can occur (that is, derive(Yokeable) does not have to work when Self is &'static Invariant<'static>, in which case deref coercion could apply).

However, I'm not implementing yoke. My crate is much more generalized, and, unfortunately, I do also need to care about T<'varying> HKTs where the outermost type may be (for instance) a reference, including the &'varying Invariant<'varying> HKT. While I could simply rely on rustc being far from a Sufficiently Smart Compiler™ capable of causing a problem... I hate that approach.

Properly (I hope) asserting covariance

So, I need a wrapper W around T<'varying> such that W<T<'long>> coerces to W<T<'short>> if and only if T<'varying> is covariant over 'varying. Assuming that W is covariant over its parameter, that's equivalent to requiring that a subtyping coercion be the only possible coercion from W<T<'long>> to W<T<'short>>. My revised attempt, then, is:

macro_rules! assert_covariant_improved {
    ($type_path:path) => {
        fn __coerce_option<'long: 'short, 'short>(
            this: ::core::option::Option<*const $type_path<'long>>
        ) -> ::core::option::Option<*const $type_path<'short>> {
            this
        }
    };
}

Going through the possible coercions (and using 'lt_1 and 'lt_2, since the below reasoning should also apply to showing that T<'varying> is contravariant over 'varying):

  • Option<*const T<'lt_1>> can coerce to Option<*const T<'lt_2>> if Option<*const T<'lt_1>> is a subtype of Option<*const T<'lt_2>>. (By the covariance of Option and *const over their sole parameters, this subtyping occurs if T<'lt_1> is a subtype of T<'lt_2>. Unless I'm missing something really important, at least in simple cases like Option and *const even if not in some pathological edge case I'm unaware of, it also occurs only if T<'lt_1> is a subtype of T<'lt_2>.)
  • Transitive coercions build on other coercions, and subtyping relationships are transitive; so, if subtyping coercions are the only possible coercion from Option<*const T<'lt_1>> to Option<*const T<'lt_2>> other than transitive coercions, then all possible coercions between the types are subtyping coercions.
  • &mut T to &T, *mut T to *const T, &T to *const T, and &mut T to *mut T do not apply, since Option<*const T<'lt_1>> is not a reference or raw pointer. Only the outer type matters (and a quick test on the playground, just in case the reference is wrong, confirmed as much).
  • Neither the Deref coercion (&T or &mut T to &U if T implements Deref<Target = U>) nor the DerefMut coercion (&mut T to &mut U if T implements DerefMut<Target = U>) applies, since Option<*const T<'lt_1>> is not a reference. The dreaded pathological Deref coercion cannot harm us!
  • Unsizing coercions cannot coerce Option<*const T<'lt_1>> to Option<*const T<'lt_2>>, since Option<*const T<'lt_1>> is not &_, &mut _, *const _, *mut _, or Box<_>.
  • Option<*const T<'lt_1>> is not a function item type or a non-capturing closure.
  • Option<*const T<'lt_1>> is not !.

Therefore, only subtyping coercions can apply; assert_covariant_improved should work. I suppose one (extreme) potential fear, then, is that some new coercion gets introduced in a future Rust version which somehow manages to break no safe code and no unsafe code other than mine (by letting more code compile than I assume possible), making my code unsound.
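As a point of reference, here is a sketch of the improved check applied to the two example types from earlier. It is a simplified variant, not the macro above: the hypothetical `assert_covariant_ident` takes a bare identifier instead of a path, and `option_demo` is a helper added only to exercise the generated function.

```rust
// Simplified variant of the check that takes a type name as an identifier.
macro_rules! assert_covariant_ident {
    ($name:ident) => {
        #[allow(dead_code)]
        fn __coerce_option<'long: 'short, 'short>(
            this: ::core::option::Option<*const $name<'long>>,
        ) -> ::core::option::Option<*const $name<'short>> {
            this
        }
    };
}

#[allow(dead_code)]
struct Covariant<'a>(&'a str);
#[allow(dead_code)]
struct Invariant<'a>(&'a mut &'a str);

// Accepted: `Covariant<'long>` is a subtype of `Covariant<'short>`.
assert_covariant_ident!(Covariant);

// Expected to be rejected with a lifetime error if uncommented, since
// `Invariant` is invariant and no other coercion applies to `Option<_>`:
// assert_covariant_ident!(Invariant);

// Hypothetical helper: the generated function returns its input unchanged.
fn option_demo() -> bool {
    let value = Covariant("hello");
    let ptr: *const Covariant<'_> = &value;
    __coerce_option(Some(ptr)) == Some(ptr)
}
```

Because the generated function has a fixed name, this sketch can only be invoked once per module; a real macro would need to scope or rename the function.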

Review

Since there's not much point in worrying about hypothetical breakage, the most relevant fear is that something about my above reasoning is wrong. (Maybe there's a form of coercion not listed in the reference that I'm unaware of.) That's why I've marked this post as a code review; I'd like a few more eyes on these ideas. Is assert_covariant_improved as correct as I think it is?

I threw together a few things in this playground (some of it slightly out of date relative to the code above) that might serve as a starting point for messing around.


  1. This is needed to provide a macro that soundly implements an unsafe trait indicating that a HKT is covariant, for user convenience in cases simple enough for the macro to implement the trait.

  2. Setting aside details like generic parameters and where-bounds.

IIRC, ouroboros also does some variance sanity checks like this from a macro. I don't recall off the top of my head exactly what approach they're using, but it could be worth finding out as a point of comparison. You're right to not use an owned-to-owned coercion as a test, not only because of Deref but also because of unsizing coercions, etc.

(I don't believe I ever considered Deref very closely myself when reviewing that crate. So it's possible that could be another avenue for finding new soundness issues in ouroboros :thinking:.)

I tried to find unsoundness in ouroboros for a while, even using really strange proc macros. I found out that apparently you can use a proc macro to generate the type of one of the self-referential fields, and have the type substantially differ depending on the lifetime fed to it ('static, 'this, and 'this0 are the three I saw ouroboros use).

None of that amounted to much, though, since ouroboros almost entirely relies on the compiler's coercions. I think my understanding of the covariant and not_covariant markers, then, is that they genuinely just determine what methods are generated; even if I trick it into generating covariance-requiring methods on a type that's actually invariant, it just blindly generates the method and causes a compiler error with no room to exploit the oddity. The "variance sanity checks" seem to be about providing convenient defaults to avoid always requiring variance notations, but ouroboros doesn't need to be correct about variance, in the sense that they're load-bearing for "does this compile" but not "is this sound".

I suppose it's for the best my efforts came to nothing.

In ouroboros, that is. I caused a use-after-free fairly easily via yoke with a simple attribute macro that changes the definition of the struct that derive(Yokeable) had been applied to. I'm sure most derive macros out there aren't incredibly rigorous about ensuring their limited reflection capabilities weren't horribly manipulated. There are so many other hygiene concerns that I'm not sure whether anyone's encountered this possibility before.

You may want to look at rust-lang/rust issue #148423, "`derive(PartialEq)` on enums is unsound with user-defined attribute macros", which has a similar concern.

I'm really unsure how, or if, this will ever be fixed.
IMO what would be necessary is a mechanism that completely separates macros that can modify a struct's definition from the ones that can only add traits and impls, so that all of the first group run before the second.
