ELI5 "existential"

So there's this RFC for named existential types

When I read through it, I kinda get what's going on in an "oh I'd like to be able to do that" sorta way, but I don't really understand the jargon.

Maybe it's due to a fuzzy grasp on impl Trait which this apparently builds on, but in addition to that, I don't really get what "existential" means here.

Are these types unhappy with the identities they create for themselves, and therefore experience anguish and confusion stemming from the apparent meaninglessness of a world in which they are are compelled to find or create meaning? (cribbed from Wikipedia on Existentialism)

Ok that's a bad joke, of course this has nothing to do with existential philosophy or narrative, but can someone please explain "[named] existential types", ideally just in the context of Rust, without relying on too much general type theory knowledge?

(of course a more advanced explanation that does dig into the math and theory background is appreciated and would be welcome here, I'd just also really appreciate a simple ELI5 style explanation, like what makes a type "existential" in the context of Rust)

4 Likes

The name "existential" comes from the fact that when you type impl Trait, you are saying "there exists a concrete type that goes here". This is as opposed to generics where you say "all types that implement this trait can go here".

Having this pair of concepts with one meaning "all of them" and the other meaning "there is at least one of them" is really common in mathematics.

7 Likes

The name comes from the existential quantification operator βˆƒ from predicate logic. The existential type impl Trait represents some type that implements Trait, without explicitly specifying which type it is. It’s most useful for things like closures or futures, which don’t have a name you can write out explicitly.

For example, there’s no other way to represent the output type of this function:

fn filter_min<I>(iter: I, min: i32)->impl Iterator<Item=i32>
    where I: IntoIterator<Item=i32>
{
    iter.into_iter().filter(move |&x| x > min)
}

The type_alias_impl_trait feature extends this ability to more contexts. In particular, it allows these unnameable types to be used as trait associated types, so you can write something like this:

struct MinFilter<I>(I, i32);

impl<I> IntoIterator for MinFilter
    where I: IntoIterator<Item=i32>
{
    type Item=i32;
    type IntoIter=impl Iterator<Item=i32>;
    fn into_iter(self)->Self::IntoIter {
        let min=self.1;
        self.0.into_iter().filter(move |&x| x>min)
    }
}
6 Likes

For impl Trait doesn't that depend on where it's being used? e.g. in function return position it can only be "precisely one", not "at least one"?

Well yes, but you could imagine an extension where, if you try to return two different types in two branches in the function, the compiler would generate an anonymous enum of those two types and return that. This would correspond better to the "at least" one concept.

3 Likes

Ok, so... I think I get something here - from a practical perspective this is awesome because it allows the impl to define the type instead of having to define it at the call site, which would be impossible?

1 Like

Yeah. We already have impl Trait in return position, which lets us say "this function returns a specific type that implements the trait". The feature you linked would let you do that anywhere, not just in the return position of a function.

2 Likes

ok interesting, that actually fills in a lot of gaps of how this is different than generics :slight_smile:

The missing piece for me I think is how this relates to the definition of "all possible" vs. "at least one"...

I mean, I do see that for this to work the return type needs to be something that exists and it can't be everything...

But on the other hand, it's possible to be confused since generics also get monomorphized by the compiler into "things that exist". They aren't actually universal the way dynamic dispatch is.

The only difference to me seems to be where the "things that exist" are determined. For generics it's determined by the caller and for existentials it's determined by the callee? This doesn't quite click for me how it relates to existance or sets.

Maybe it'll sink in more when the RFC lands and I play more with being able to say "this thing exists I just don't know what it is, let the compiler figure it out"

Thanks!

In practical terms, that’s correct. One way to think about it more theoretically is to consider a generic function as defining a relationship between types. For example, the function signature

fn foo<A:Trait1>(a:A)->impl Trait2

can be translated to a statement like β€œFor every type A which implements Trait1, there is some type B which implements Trait2 such that calling foo(A) produces B.” In logic notation, that would look something like this:

βˆ€(A ∈ Trait1). βˆƒ(B ∈ Trait2). foo(A)->B

(Note: I’m out of practice with this sort of notation, so there’s probably a mistake in there somewhere)

4 Likes

That the final binary wont include a version of the method for literally all types due to monomorphization is just an implementation detail. It's irrelevant from the perspective of us trying to understand the type system.

As for who determines what things are, that is really a layer on top of the whole "exists" vs "for all" thing. When callers determine the type, this corresponds to "for all" because there may be many callers that don't agree on what it should be. However there is only one function definition, so when the callee decides, then it is an "exists".

3 Likes

ah... interesting! So, just for the sake of talking it out... cementing it a bit more for myself and maybe future readers, this is ok:

fn foo<A:Trait1, B: Trait2>(a:A) {
}

because it would mean:

"For every combo of type A which implements Trait1 and type B which implements Trait2, there is a function foo(A)"

The "universality" here is that as the programmer we are free to define all those combos if we want... we can define the same function multiple times, provided we're doing it for different combos of A,B.

However, if we tried to do:

fn foo<A:Trait1, B: Trait2>(a:A) -> B {
}

We get stuck (edit: this is wrong if some info about B is provided, see @alice 's response below) because it would mean:

"For every combo of type A which implements Trait1 and type B which implements Trait2, calling foo(A) produces every type B"

We can't begin to satisfy that the way we could the generics, because we can't define multiple return types the way we could define the function multiple times. The non-universality here is that as the programmer we must teach the compiler what that return type is, it can't be "everything" (but it can be a single "unnameable type" like a closure, or a single container for multiple types like an enum)

Trait Objects blurry the lines though.. they do allow returning "every Trait2"... so are these not really part of the same rigorous type system somehow?

2 Likes

Actually you can have methods like your foo. For example:

use std::iter::FromIterator;

fn foo<A, B>(arg: A) -> B
where
    A: Iterator,
    B: FromIterator<A::Item>,
{
    arg.collect()
}

Here, for every combo A and B that satisfies the constraints, there is indeed a method you can call that takes an A and returns a B.

As for trait objects, those are a single specific type that any implementer of the trait can be converted into.

3 Likes

Can this be inferred or does foo need to be called with the explicit types?

edit oh I see, yeah it can... cool!

let bar:Vec<u32> = foo(vec![42].into_iter());

It will infer what B is if you use it in a manner where there is only one possible choice, e.g. if you pass the return type to a function that takes a Vec<i32>. Otherwise it will need your help.

There's also this, which muddles the waters a bit:

fn foo<A, B>(mut arg: A) -> Option<B>
where
    A: Iterator<Item = B>,
{
    arg.next()
}

Here there really is only one possible choice B for each choice of A, but it is still an "for all". It just so happens that you are saying "for all" in a set containing only one element.

3 Likes

Just to note - in my foo there was not enough information about B so the compiler can't figure out what the function returns. In yours, B is somewhat defined, albeit via A.

Not sure if this is veering too far off-topic, but with min_type_alias_impl_trait I'm not sure why the callback defined in new() isn't enough and the compiler needs a separate function call to get it?

playground

#![feature(min_type_alias_impl_trait)]

type CallbackFn = impl Fn();

struct Foo {
    cb: CallbackFn
}

impl Foo {
    pub fn new() -> Self {
        Self {
            //This doesn't work
            //cb: || {}
            
            //This does:
            cb: Self::make_callback()
        }
    }
    
    fn make_callback() -> CallbackFn {
        || {}
    }
}

The compiler has a fixed list of possible defining uses, and cb: || {} isn't one of them. :shrug:

1 Like

My own short take on this: to be honest, using the word existential for type_alias_impl_trait or for -> impl Trait alone is not really required, since it boils down to a convoluted way to express a much simpler thing.

Let's take a simpler example: say you want me to write down a number between 1 and 10 on a piece of paper. So I do that, and write "some such number". That's the clearer way to put it. But another way to put it, which, granted, does sound a bit weird out of context (but which is technically correct), is that there exists a number between 1 and 10 such that that number equals the one I wrote down.

That is, some unknown but fixed thing is equivalent to "there exists an element [upholding the mentioned properties] that is that thing". So besides playing with words / terminology, nothing complicated going on.

That being said, you may notice that word of fixed I have just introduced. Indeed, in mathematics (in the broader sense, that is, one that involves any form of logical reasoning, it doesn't have to be numerical) we like to express how a given state depends on some parameters / inputs, and then study the changes that occur (or the changes that do not occur / the invariants) when that parameter changes / moves.

Back to Rust, the parameter that may definitely move are type parameters, here: despite being constrained to upholding some property (the trait bound applied to the type), depending on the usage, there may be multiple possible types used there, or not.

And this is where the impl Trait notation, in and of itself, comes into play. As of today's Rust, that notation can be used in two (three on nightly) places:

  • in a function's argument / parameter (type) position;
    fn foo(f: impl Fn(), …) -> …

  • in a function's return (type) position;
    -> impl Fn()

  • and on nightly, within a type alias definition.
    type Bar = impl Fn();

In all these cases, we know the type described by the impl … syntax is:

  1. a type…

  2. … that implements / meets the given (trait, lifetime) bounds.

But from that 1. line, "a type", stems a legitimate question:

Does "a type" refer to some type, or to any type ?

In other words, is there freeway to move / change that type (generic type parameter), or is it a fixed (but still potentially unknown) type?

  • In mathematical terms, this question refers to the quantification of this parameter.

And it turns out that impl Trait can mean either one (depending on where it is used)! Thus, one has to be very explicit about the quantification of the impl Trait type.

  • in a function's argument / parameter (type) position, impl Trait is sugar for a generic type parameter: "(for) any type that implements Fn(), we can take a parameter called f of that type"

    • //         +--------------+
      //         |              |
      //         |          ↓↓↓↓↓↓↓↓↓
      fn foo  ↓↓↓↓↓↓↓↓  (f: impl Fn(), …) -> …
      fn foo <F : Fn()> (f: F        , …) -> …
      

    So in argument position, impl Trait stands for a generic type parameter, i.e., any arbitrary type that the caller of the function may choose, thus, the function's body must be able to handle any type implementing that Trait: it must thus be able to handle all the types implementing that Trait: it is thus a universal quantification.

  • whereas in a function's return (type) position and a type alias definition, impl Trait stands for a fixed (but potentially unknown) type:

    • //               +-----------------------------+
      //               |                             |
      //           vvvvvvvvvv                        |
      fn baz () -> impl Trait                     // |
      {                                           // |
          struct UnnameableOutside { … }          // |
          impl Trait for UnnameableOutside { … }  // |
          …                                       // |
          return UnnameableOutside { … };         // | fixed type
      } //       ^^^^^^^^^^^^^^^^^                   | but unnameable
        //               |                           | and thus "unknown"
        //               +---------------------------+ to the caller
      

    So, in return position (or in type alias definition), it stands for one single fixed type, and as I mentioned at the beginning of this post, this can be mathematically expressed with the property that there must exist an actual concrete type that "is" / matches the type used there.

While that phrasing looks weird on its own, existential and universal are "duals" / "opposites" of each other; and the universal terminology fits very well that of a generic parameter. Thus, existential is used to mark an opposition, to explicit a semantic distinction with the universal impl Trait.

5 Likes

Thanks for that awesome explanation @Yandros ! :smiley:

1 Like

As already mentioned, it comes from existential quantification which is related to predicate logic.

It can be described in many ways.

Relating it to generics, one can think of an existential (βˆƒ) as a fixed but unknown constant type parameter while all-quantified (βˆ€) type parameters (what we see in rust as normal type parameters) are still volatile given the scope they are declared in, s.t.

fn fun<βˆƒS:Trait,βˆ€T>(s:S,t:T){...}

is equal to f<S>: //of type: Type=>(S,T)=>() (f is applied to type S)

where f is

fn f<S>{return fn fun<T>(s:S,t:T){...}} // of type: Type=>Type=>(S,T)=>()

In return, an existentially quantified type parameter was an all-quantified type parameter before which got bounded by a previous function application (yielded by monomorphization), see post.
So, an existential type parameter is some captured closure parameter, but we don't know what it is because we don't know which closure (in our case f) captured that type parameter and to which type it was set to.

A more set theoretic view of existential types is given in another post.

1 Like