Rust contravariance, Regarding the contravariance of `fn(&'a i32) -> ()`

As is well known, contravariance is quite rare in Rust. According to the Rust Reference (Subtyping and Variance - The Rust Reference), only fn(T) -> () is contravariant on T. I’ve never fully understood how the following conclusion is inferred:

Since &'static i32 <: &'a i32, it follows that fn(&'a i32) -> () <: fn(&'static i32) -> ().

Why is fn(&'a i32) a subtype of fn(&'static i32)?
Can someone please provide a detailed derivation of this? Thank you very much!

when type A is subtype of B, then it is allowed to assign a value of A to a variable of type B.

in the case of a function, if I say I need a function that take &'a i32 as argument, it means I have (or can produce) some value of subtype of &'a i32, i.e. reference of i32 with lifetime 'b outliving 'a, so I can call the function.

on the other hand, the function's parameter type can be any type, of which the &'a i32 type is a subtype, in other words, you can give me a function if the parameter type is a supertype of what is required to be.

use your example:

  • &'static i32 is a subtype of &'a i32, means a value of &'static i32 can be assigned to a variable of type &'a i32;
  • conversely, fn(&'a i32) is a subtype of fn(&'static i32) means, when a value of type fn (&'static i32) is required (which means the caller can provide a &'static i32 value), you can give a function of type fn(&'a i32), which can happily accept the &'static i32 argument that the caller were able to provide.

the key to understand why function types is contravariant to the parameter types, is the function "demands" or "consumes" the argument, not "give" or "produce" values of that type. you might have heard people referring the parameter types as "in negative position" and the return types as "in positive position".

6 Likes

Could you explain in detail why fn(&'a i32) is a subtype of fn(&'static i32)?

you can sort of think subtyping like the "is-a" relation (as in object-oriented languages).

when we say A "is-a" B, it really means, everywhere I can "use" B, I must be able to also "use" A in the exact same manner.

the type fn(&'static i32) is a function that "demands" an argument of &'static i32, while fn(&'a i32) "demands" &'a i32 (for some lifetime 'a), in some sense, it "demands less".

for the function types, what does it mean to "use" a function? to call it! and in order to call fn(&'static i32), I must be able to provide a &'static i32.

if you think about it, everywhere I can call fn(&'static i32), I can also call (i.e. "use") fn(&'a i32), in the exact same manner (meaning, with the exact same argument).

this subtyping relation can be demonstrated with this snippet:

fn f1(_: &'static i32) {}
fn f2<'a>(_: &'a i32) {}

static x: i32 = 42;

// declare a function pointer of type `fn(&'static i32)`
let mut fp: fn(&'static i32);
// this assignment is valid: `f1` is a function of the exact type
fp = f1;
// "use" it, give it an argument of `&'static i32`
(fp)(&x);

// this assignment is also valid, type of `f2` is a subtype
fp = f2;
// "use" it exactly the same way
(fp)(&x);
2 Likes

From a logical deduction perspective, the conclusion I reached is that fn(&'static i32) is a subtype of fn(&'a i32). Clearly, this is incorrect. Here's my reasoning:

Since &'static i32 <: &'a i32, I concluded that in places where the return type is &'a i32, we could replace it with &'static i32. This is covariance, and such replacement follows the Liskov Substitution Principle (LSP).

For the types fn(&'a i32) and fn(&'static i32), I reasoned that wherever fn(&'a i32) is used, we can replace it with fn(&'static i32) because the Rust compiler will automatically downcast 'static to 'a. Thus, &'static i32 can naturally become &'a i32. This is also what we discussed before: &'static i32 is effectively a &'a i32.

According to this logic, it seems that fn(&'static i32) is the subtype and fn(&'a i32) is the supertype. So, we would conclude: if &'static i32 <: &'a i32, then fn(&'static i32) <: fn(&'a i32), which is not contravariance.

Though this conclusion is incorrect, I'm not sure where I misunderstood.

As for the code snippet you provided above, it's the result of the compiler's compilation process, which can only verify that contravariance does indeed exist in Rust. I don't think it proves contravariance itself.

Well, you are assuming covariance here, of course you get covariance (which is the wrong result).

2 Likes

We can't. If we use fn(&'a i32), it means that we can handle this short-lived lifetime - e.g. we won't shove it into thread-local to live forever (since it would dangle). If we have fn(&'static i32) though, it can treat this reference as "living forever", and therefore we can't use it in place where &'a i32 is passed.

On the other hand, if we have &'static i32, we can pass it to fn(&'a i32) - the function would simply not use the extra capabilities. Therefore, fn(&'a i32) can be used where fn(&'static i32) is expected.

1 Like

only for the "return" value, NOT for the arguments.

the information of the arguments are not "contained" within the value of fn(&'static i32). you can "downcast" when you "give", but you must "upcast" when you "demand".

2 Likes

Is this what you mean: When we need to provide a type as a function return type, only subtype can be converted to supertype, and when we define a function signature that requires a type as a parameter, only supertype can be converted to subtype.

However, where can I find a detailed explanation of these conversion restrictions under the terms "provide" and "require"? I haven't seen such detailed explanations in the Rust Reference or Rust Onomicon.

Could you explain in more detail? The issue with short-lived references should be something the compiler can recognize. As for the second point you mentioned, I didn't quite understand what you meant.

Yes, I did make that assumption. However, when a generic with a lifetime parameter appears in the parameter list, doesn't the compiler automatically infer the lifetime parameter?

And it does, since it doesn't allow use fn(&'static i32) where fn(&'a i32) is expected.

1 Like

Nope. It's always "subtype -> supertype", no matter what. It's just that for function types, "type of function A can be converted to type of function B" means "argument for function B can be converted to argument for function A".

1 Like

What you just said: "For function types, 'the type of function A can be converted to the type of function B' means 'the argument for function B can be converted to the argument for function A.'"

Where can I find a more detailed explanation of this point? I think this should be a feature supported by most programming languages with generics. Other languages, besides Rust, should have more detailed explanations on this, but I'm not very familiar with them.

Frankly, the most of the complexity that you see in IT (and everywhere) is result of an attempt to oversimplify. Everything should be as simple as it can be, but not simpler! When we try to oversimplify… we end up with the end result that's more complex… or, sometimes, makes no sense at all.

Forget about “long words”. Go to the beginning. The main building blocks here are references. Shared and exclusive, not mutable and and immutable, as we know.

Exclusive reference is “simple”: they are, well, unique. While one reference exists, another couldn't exist. This leads to somewhat non-trivial complication with reborrows, but for variance it's simple: invariant… unchangeable. That's it.

Now. Shared references are, well… copyable. You can have many of them. In fact, in your head (and in compiler's prover) you may have as many of them as needed. You can imagine that one, single, 8-byte slot in a machine code carries many references, maybe even infinitely many… maybe not… but there are as many as needed. The catch here is that they all have to point to one object and thus have the same representation in memory (lifetimes only exist during compilation time, but actual address exists at runtime) and we can only go from long lifetime to short one (in particular if we know that reference is valid for 'static then it's valid for any other lifetime, too).

But, well, it would be nice to be able to pass these entities around. Again, with mutable references everything is simple: these are invariant, functions that accept them are invariant, too.

And with shared references it's like this:

If we “carry together” infinite number of references and can always say “we have a long-living one, thus we have a short-living one, too” then functions would invert that: “if function is Ok with a short reference, then we can pass long-living, one, too… function would “cut-it-to-size” as needed”.

And now we can semi-arbitrarily assign names: shared references would be covariant and functions would be contravariant… we can assign names in the opposite style and say functions are covariant and references are contravariant… it's the same endless discussion about whether Square is the subtype of Rectangle or the other way around.

But the key insight is that precise sentence: properties of types and function that accept these types are complementary. They go in the opposite direction: if type can be “cut-to-size” then functions would “want” to get largest possible type they can get, if types can be “inflated as needed” then functions would “want” to get smallest possible thingie. You can say 'static is top or bottom, but functions would always “want” it (but would rarely get).

2 Likes

You don't even need generics for that. If you have foo(Animal) in any OOP language then you can create a new, more concrete, function |c: Cat| foo(c) and thus go from foo(Animal) to foo(Cat) (even if language doesn't support that). Like this, in C++:

extern void foo(Animal*);

using Bar = void (*)(Cat*);

Bar ptr_to_foo = [](Cat* cat) { return foo(cat);};
1 Like

In your C++ code, it only demonstrates converting Cat* to Animal*, which is a subtype-to-supertype conversion. However, the function type void (*)(T*) itself is not demonstrated.

What do you mean? I had void foo(Animal*), I have got ptr_to_foo of type void (*)(Cat*).

But that's the thing! It's because we can go from Cat* to Animal* we can use function that accept Animal* where function that accepts Car* is needed.

That's the whole story of covariance and contravariance, really: if we can go from B to A, somehow, then we can go f(A) to f(B) by creating such “adapter” and, if said adaptor is zero-sized (as in Rust with lifetimes that are always removed from machine code), then f(A) can be converted to f(B).

Conversion of types and conversion of functions that accept these types always go “in the opposite direction”.

2 Likes

I don't really understand what you mean with that phrase, BTW. The fact that compiler doesn't do that automatically? Well, duh: not everything that can be added to some programming language is automatically added to it. One may imagine Rust where covariance/contravariance doesn't exist and one have to convert from &'static i32 to &'a i32 or from fn(&'a i32) to fn(&'static i32) manually.

It would be harder to use, but it would be possible to more-or-less mechanically convert from Rust as it exist today into such language anyway.

In most language with “normal” types that's not done simply because it's not too coomon of a need, but with lifetimes refusal to do that would make language very unergonomic. But manually, in C++, you can do go from function to function like this, if you don't like lambdas:

template<auto fn, typename T>
void Adapt(T* t) {
    return fn(t);
}

extern void foo(Animal*);

using FnAnimal = void (*)(Animal*);
using FnCat = void (*)(Cat*);
using FnBlackCat = void (*)(BlackCat*);

FnAnimal fn_animal = foo;
FnCat fn_cat = Adapt<foo, Cat>;
FnBlackCat fn_black_cat = Adapt<Adapt<foo, Cat>, BlackCat>;

Here you can see how we are going from foo to Adapt<foo, Cat> and then from Adapt<foo, Cat> to Adapt<Adapt<foo, Cat>, BlackCat>.

We can go from void(Animal*) to void(Cat*) and from void(Cat*) to void(BlackCat*) exactly and precisely because we can go from Cat* to Animal* and from BlackCat* to Cat*.

1 Like

Hm ... let me provide a demonstration somewhat closer to formal logic:

(all code here is only pseudo-rust, I hope it can be understood)

The "A is subtype of B" means (can be viewed as) "there exists a function coerce<A, B>: A -> B (that is sufficiently trivial (so it can be applied implicitly))".

Now, given that A is subtype of B: what is the relation of fn(A) -> C and fn(B) -> C?

We can implement coerce<fn(B) -> C, fn(A) -> C> like this:

fn coerce(fn_b_to_c: fn(B) -> C) -> (fn(A) -> C) where A <: B {
    // i.e. compose given `fn_b_to_c` with `coerce<A, B>`
    // the existence of `coerce<A, B>` follows from the assumption of `A <: B`
    |arg: A| fn_b_to_c(coerce<A, B>(arg))
}

whether that is considered sufficiently trvial would depend on the language. When it is, we have "fn(B) -> C is subtype of fn(A) -> C".

Now if you are claiming "fn(A) -> C is subtype of fn(B) -> C" (still under the assumption of "A is subtype of B"), can you provide an implementation of coerce: (fn(A) -> C) -> (fn(B) -> C)?

In general (i.e. unless more details about A and B are known) it is not possible (you cannot easily convert an apple-consumer into a fruit-consumer).

So in most (all?) programming languages function types are either contravariant (as demonstrated above) or invariant (when the above coercion is not applied implicitly) in their argument types, but never covariant.

3 Likes