Why `match` arms have incompatible types?

Hello,

I am trying to implement a simple adjacency matrix and now I am stuck with `Iterator' for all existing connections. Below is a minimal example of the problem I am facing:

use fixedbitset::FixedBitSet;

enum Order<T> {
    Row(T),
    Column(T),
}

struct AdjacencyMatrix {
    values: Order<Vec<FixedBitSet>>,
}

pub struct Pivot {
    row: usize,
    column: usize,
}

impl AdjacencyMatrix {
    pub fn connections(&self) -> impl Iterator<Item = Pivot> {
        match self.values {
            Order::Row(ref values) => values
                .iter()
                .enumerate()
                .flat_map(|(row, values)| values.ones().map(|column| Pivot { row, column })),
            Order::Column(ref values) => values
                .iter()
                .enumerate()
                .flat_map(|(column, values)| values.ones().map(|row| Pivot { row, column })),
        }
    }
}

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
error[E0308]: `match` arms have incompatible types
  --> src/lib.rs:24:42
   |
19 |           match self.values {
   |           ----------------- `match` arms have incompatible types
20 |               Order::Row(ref values) => values
   |  _______________________________________-
21 | |                 .iter()
22 | |                 .enumerate()
23 | |                 .flat_map(|(row, values)| values.ones().map(|column| Pivot { row, column })),
   | |___________________________---------------___________________--------_______________________- this is found to be of type `FlatMap<Enumerate<std::slice::Iter<'_, FixedBitSet>>, Map<Ones<'_>, [closure@src/lib.rs:23:61: 23:69]>, [closure@src/lib.rs:23:27: 23:42]>`
   |                             |                                 |
   |                             |                                 one of the expected closures
   |                             one of the expected closures
24 |               Order::Column(ref values) => values
   |  __________________________________________^
25 | |                 .iter()
26 | |                 .enumerate()
27 | |                 .flat_map(|(column, values)| values.ones().map(|row| Pivot { row, column })),
   | |____________________________________________________________________________________________^ expected closure, found a different closure
   |
   = note: expected struct `FlatMap<Enumerate<std::slice::Iter<'_, _>>, Map<Ones<'_>, [closure@src/lib.rs:23:61: 23:69]>, [closure@src/lib.rs:23:27: 23:42]>`
              found struct `FlatMap<Enumerate<std::slice::Iter<'_, _>>, Map<Ones<'_>, [closure@src/lib.rs:27:64: 27:69]>, [closure@src/lib.rs:27:27: 27:45]>`
   = note: no two closures, even if identical, have the same type
   = help: consider boxing your closure and/or using it as a trait object
help: you could change the return type to be a boxed trait object
   |
18 |     pub fn connections(&self) -> Box<dyn Iterator<Item = Pivot>> {
   |                                  ~~~~~~~                       +
help: if you change the return type to expect trait objects, box the returned expressions
   |
20 ~             Order::Row(ref values) => Box::new(values
21 |                 .iter()
22 |                 .enumerate()
23 ~                 .flat_map(|(row, values)| values.ones().map(|column| Pivot { row, column }))),
24 ~             Order::Column(ref values) => Box::new(values
25 |                 .iter()
26 |                 .enumerate()
27 ~                 .flat_map(|(column, values)| values.ones().map(|row| Pivot { row, column }))),
   |

For more information about this error, try `rustc --explain E0308`.
error: could not compile `playground` due to previous error

I can put that into the Box, however first I need to comprehend why compiler consider following types as different:

expected struct `FlatMap<Enumerate<std::slice::Iter<'_, _>>, Map<Ones<'_>, [closure@src/lib.rs:23:61: 23:69]>, [closure@src/lib.rs:23:27: 23:42]>`
   found struct `FlatMap<Enumerate<std::slice::Iter<'_, _>>, Map<Ones<'_>, [closure@src/lib.rs:27:64: 27:69]>, [closure@src/lib.rs:27:27: 27:45]>`

It looks weird for me... :confused:

Every closure in Rust has its unique type:

A closure expression produces a closure value with a unique, anonymous type that cannot be written out.

So the closures you pass to flat_map and values.ones().map have different types in your match arms. Because of this you have two distinct, non-matching concrete types implementing Iterator returned by your two match arms. As you said, you can circumvent this with a boxed trait object.

2 Likes

Try returning a concrete type such as Vec instead of impl Iterator.

Thank you, that was useful. I just was not aware of this particular aspect of closures design in Rust. BTW, based on that I was able to find a workaround:

fn connections<'a>(&'a self) -> impl Iterator<Item = Pivot> + 'a {
    let wrap = |bitset: &'a _| match self.values {
        Row(_) => Row(bitset),
        Column(_) => Column(bitset),
    };
    
    self.values
        .as_ref()
        .into_iter()
        .map(wrap)
        .enumerate()
        .map(|(idx, order)| match order {
            Row(bitset) => (Row(idx), bitset),
            Column(bitset) => (Column(idx), bitset),
        })
        .flat_map(|(order, bitset)| bitset.ones().map(move |idx| (order.clone(), idx)))
        .map(|pivot| match pivot {
            (Row(row), column) => Pivot { row, column },
            (Column(column), row) => Pivot { row, column },
        })
}

It looks creepy, but it does what I need it to do.

Probably it is not the best feature in the language design. I think comparison should be by interface (input & output types), but not some anonymous type that cannot be written out. Why do I need to care about inwards of a closure logic? What could be a possible case?

Don't quote me on that but I assume it is because closures can capture variables from the surrounding scope. Note that non-capturing closures can be coerced to fn pointers: Closure types - The Rust Reference, which backs me up I'd say :sweat_smile:. That won't help you though, because in your original example, the inner closure captures row or column from the outer closure respectively.

Closures can be thought of as a struct and a function taking that struct as one of its arguments. The generated struct is all of the captured state. This can be important in a language such as Rust which has users who care about the size of their code.

Joke tweet I like about that:

This is actually explained at the top of the reference page linked by @jofas

2 Likes

Closures can be thought of as a struct and a function taking that struct as one of its arguments.

I don't see how it contradicts to my words. Let say we have a closure |x, y| -> v which captures z, which means that we have function of 3 arguments (|x, y, z| -> v), but not 2. I cannot understand why compiler see it differently? If I define both closures in the same scope and they refer to the same variables, it means both closures have identical arguments.

I have refactored my workaround a bit harder and now it looks much cleaner:

fn connections(&self) -> impl Iterator<Item = Pivot> + '_ {
    self.values
        .as_ref()
        .into_iter()
        .enumerate()
        .flat_map(move |(major, bitset)| {
            bitset.ones().map(move |minor| match self.values {
                Row(_) => Pivot::new(major, minor),
                Column(_) => Pivot::new(minor, major),
            })
        })
}

Everything I did is just moved match statement into the body of the Iterator and now compiler happy with it. And as you can see now I capture self.values from the closure.

It's not to contradict you. It's to help you understand how closures work "under the hood" and why your first version could not compile.

Different closures, in the general case, don't capture the same environment, and also have unique types. The compiler won't try to reason about what's equivalent to what.

1 Like

Yes, it's possible complication that one may add to the language but what will it buy in your case? You closures refer different variables: one closure refers row and other refers cols.

To have a low-level language without GC?

In this particular example closures may be coerced to one type because types of captured arguments actually match.

But that wouldn't be a case always and, more importantly, you can not predict what will happen if closures are not created by your function directly but by some other function.

Or, rather, you can predict precisely because it's closure is it's own type.

If you declare all closures with the same arguments having the same type then you can not do that and your only choice would be to automatically box everything, place closures on the heap (and, eventually, adding GC).

Is it possible? Sure, there are hundreds of languages like that, some of them very popular: C#, Java, you name it.

Is it feasible? I don't think so: Rust is supposed to work in a very constrained environments without allocator, too and such design would make it impossible.

Would have Rust ever have become popular is it were just another Java clone? IDK, but that's not what Rust makers tried to do.

Yes. And, more importantly, now you have one closure and thus one type.

Rust haven't invented that approach to closures. C++ uses the same approach and for the exact same reason.

1 Like

As in the last example, which is compiles with no issue. There is also different order.

Is it obvious at compile time? I think - Yes.

This is unclear to me. In C I can define as many functions as I need from int to int (fn(i32)->i32). I just need to care about naming, or scope (namespace).

I'm trying to comprehend what approach it was intended. This is far from the first or dozen's language I learn.

It's always possible to find out whether they actually match but it's unclear whether they are supposed to match.

And you can do the exact same thing in Rust, too! As long as we are talking about functions (that is: fn(i32)->i32 and not Fn(i32)->i32… note the fn vs Fn difference) it works in the same way.

But you can not declare even one function which carries payload inside in C which makes discussion about closures kinda irrelevant: C doesn't have them, there's nothing to discuss.

Of course you need to care about more than that! Just try to replicate the following few functions in C and you'll find out that you need to invent a way to carry payload (42 or 3, 5, 7, 9) somewhere, you would need to invent a way to pass it along with function pointer and so on:

fn make_closure1(num: i32) -> impl Fn(i32) -> bool {
    move |x| x == num
}

fn make_closure2<const N: usize>(nums: [i32; N]) -> impl Fn(i32) -> bool {
    move |x| nums.iter().position(|y| &x == y).is_some()
}

fn make_closure3(nums: &'static [i32]) -> impl Fn(i32) -> bool {
    move |x| nums.iter().position(|y| &x == y).is_some()
}

This is far from the first or dozen's language I learn.

Which languages do you know which have closures (means: not functions but “functions with payload”) and don't rely on a GC or hidden memory allocations (all others are kinda out of scope)?

Maybe we can, then, compare implementations of closures in Rust and these languages.

Consider the example above: closure1 and closure2 have different sizes, thus they are clearly incompatible, but closure2 and closure3 have the same size and can be made compatible… on 64bit platform. Is it obvious at compile time that they have to be the same type or not?

1 Like

A simple way to think of it is this:

  1. Two or more structs containing all the same fields are not interchangeable.
  2. Closures are more like structs than they are functions.

If you have a closure (lambda) that doesn't really capture anything, then it can be accessed as a simple fn pointer, but if it captures something then it cannot. Because closures don't have names, there is no way to indicate which closures with the same 'fields' should be the same type, so the compiler treats them all as unique. If you need a dynamic collection of multiple closures, then the traits Fn, FnMut, and FnOnce can be used, just as you would use traits to store a heterogenous collection of structs.

1 Like

Remember that C's int(*)(int) is a function pointer, which is an erased type. It has non-zero size, so multiple different things can be put behind it, and calling it is an indirect call because of that. If you have closures that don't capture anything, they can be coerced to fns, and the same things work. But Rust defaults to zero-sized types that use static dispatch, though. Closures: Magic Functions | Rusty Yato

This is basically the same as how "put different iterators in the same variable" that you might do in Java doesn't immediately work in Rust, but it does if you use Box<dyn Iterator> instead to use allocated erased types and dynamic dispatch like happens by default in Java.

1 Like

@khimru, @m51, @scottmcm

I am a bit limited in time to discuss all the variety of irrelevant examples, so I just made a minimal reproducible example (MRE) that does not compile:

enum Signed<T> {
    Positive(T),
    Negative(T),
}

fn mre(a: i64, b: Signed<u8>) -> i64 {
    let apply = match b {
        Signed::Positive(ref b) => |a: i64| a + *b as i64,
        Signed::Negative(ref b) => |a: i64| a - *b as i64,
    };
    apply(a)
}

What can go wrong here? The closure is defined in the same scope where it is used as in the original problem. It is also obvious that such a problem can be solved in general without boxing or any other hacks.

The fact that the MRE code looks creepy doesn't mean the logic/arithmetic is wrong.

The fact that dynamic dispatch (including dynamic choice of the closure used) is always explicit. Yes, here its cost will be negligible - but it is non-zero anyway, and there's no way to strictly specify where "negligible" becomes "non-negligible".

What kind of solution do you mean here? I can come up with two (implicit boxing and anonymous enum), and both can be classified as "hacks".

2 Likes

Have you heard the definition of "example" word? Let's discuss this, and not something else.

I thought it was pretty obvious:

fn mre(a: i64, b: Signed<u8>) -> i64 {
    match b {
        Signed::Positive(b) => a + b as i64,
        Signed::Negative(b) => a - b as i64,
    }
}

The question is, how do you define it formally, as an always-correct transformation? In this example this will work, but one specific code can't be the motivation for language-wide changes, since there are much more examples which are not so obvious.

1 Like

Here in this thread we already have two precedents, the first is the original one, where I moved the initial match to the AsRef trait, which is kind of a hack, and the second is the MRE, where I just inlined the closure. Both precedents had the same input and output arguments and types all the time.

I can give many more such examples which will produce the same compilation errors, and they will all be logically correct. However, I think one MRE is enough.

Let me ask a concrete question. Given the example

Example, repeated
enum Signed<T> {
    Positive(T),
    Negative(T),
}

fn mre(a: i64, b: Signed<u8>) -> i64 {
    let apply = match b {
        Signed::Positive(ref b) => |a: i64| a + *b as i64,
        Signed::Negative(ref b) => |a: i64| a - *b as i64,
    };
    apply(a)
}

What should std::mem::size_of_val(&apply) be?

1 Like

The idea that may invent some cool solution for some very narrow case, add that to the language, repeat 100 times and end up with something nice.

What you end up instead is fractal of bad design, language where string "1000" is equal to string "1e7" (or something equally ridiculous).

Yes, but only if you discuss “complicated cases” which don't adhere to your simple reasoning.

If you don't have time to discuss “irrelevant examples” then we couldn't discuss anything at all. Before any feature can be added to the language all tricky corner-cases must be discussed.

1 Like