Is type inference affected by the order in which statements occur?

Why does this compile:

fn main() {
    let mut m = None;
    m = Some("");
    m.unwrap().len();
}

but this does not?

fn main() {
    let mut m = None;
    m.unwrap().len();
    m = Some("");
}

The error is:

error[E0619]: the type of this value must be known in this context
  --> /home/jimb/.emacs.d/rust-playground/at-2017-11-14-162619/snippet.rs:11:5
   |
11 |     m.unwrap().len();
   |     ^^^^^^^^^^^^^^^^

I believe the compiler stops as soon as it reaches a point where a type must be known. In the second example, it needs the type of m.unwrap() in order to look up len(), and that type is still unknown at that point.

I'm really surprised to learn that there are cases where, even though there is a unique assignment of types that could make the program work, Rust's ability to infer that assignment depends on the order in which statements occur.

Is this unavoidable? Do other languages with type classes / traits have the same limitation?

I don't know why, but types must be fully known for method receivers, and this is hard to improve.

I've always wondered about this myself, since Rust's type inference is clearly capable of working backwards from the output type of a function... and it's not like calling a method can change the type of the receiver...
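As a rough illustration of where the line seems to fall (a sketch with made-up function names, not a statement of the compiler's exact rules): a later statement can fill in a type parameter of the receiver, as long as the receiver's outer type is already known when the method is looked up. In the failing program, the receiver of len() is the result of m.unwrap(), which is entirely unknown at that point.

```rust
// Hypothetical helper names; only std types are used.

fn later_statement_fixes_element_type() -> usize {
    let mut v = Vec::new(); // Vec<_>: element type not yet known
    let n = v.len();        // fine: the receiver is known to be a Vec
    v.push("abc");          // this later statement fixes the element type to &str
    n + v.len()
}

fn option_with_unknown_payload() -> bool {
    let mut m = None;            // Option<_>: payload type not yet known
    let was_none = m.is_none();  // fine: the receiver is known to be an Option
    m = Some("");                // payload type fixed afterwards
    was_none && m.is_some()
}
```

Both functions compile even though the statement that pins down the inner type comes after a method call on the binding.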

I think bailing at the first point where the type needs to be known simplifies things quite a bit, both from an implementation standpoint and from a code maintenance perspective. If you were to let inference continue beyond this point, where would you draw the line? What if further calls are made? What if a call escapes into a different crate where the body of a method isn’t visible? What if downstream code changes and inference deduces the wrong type, causing a compilation failure? I think a lot of questions would surface, and doing what rustc does today makes things simpler IMO.

I don't think that wanting the second program to compile opens Pandora's box quite the way you suggest. In the present case, type inference could remain completely local to the function body and still discover the type of m.

For a trivial function like that, yeah - agreed. But suppose you have a much more complex one, with complicated control flow, more method calls, and so on. You’d still need to implement some sort of machinery to "backtrack" to the first point where the type must be known. I feel like even confining it to a single method/function opens Pandora’s box (just not as much as global scope would). Unless you do this in some principled manner, you’d need heuristics or some other policy to decide where to stop. And then people will start guessing why some cases work and others don’t.

Also, as a reader of code, if the compiler doesn’t know the type at some location, then I certainly don’t. I’d hate to maintain code where I need to mentally walk forwards and backwards to figure out a simple thing like the type of a binding. Maybe that’s just me.

I also don’t think the current rules are such an ergonomic hit - inference almost always gets things right and things just work out. For the other cases, restructuring the code or annotating the types seems like a good tradeoff (again, just IMO).
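For the program in the original question, the annotation route could look like the sketch below. The function name is made up, and unwrap_or("") stands in for the original unwrap() so that the example does not panic on None at run time.

```rust
fn annotated() -> usize {
    // The annotation supplies the type that inference cannot see yet.
    let mut m: Option<&str> = None;
    // The method call now comes before the assignment, as in the failing program.
    let before = m.unwrap_or("").len();
    m = Some("abc");
    before + m.unwrap().len()
}
```

With the type written on the binding, the order of the two statements no longer matters to the type checker.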

@comex Perfect, that's exactly what I was looking for. Thanks very much!

Yeah, type checking is order-dependent, and there's even this funny hack inside the compiler to check closure parameters last: https://github.com/rust-lang/rust/blob/d762b1d6c67db12e117186d94d70e46cddb22965/src/librustc_typeck/check/mod.rs#L2548-L2552

That is, you can break type checking by switching from Fn(Foo) -> Bar to (Fn(Foo) -> Bar, ).
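A small sketch of the everyday consequence, using a hypothetical apply function: a closure written directly as an argument gets its parameter types from the callee's Fn bound, while the same closure bound with let first has nothing to go on when its body is checked and needs an annotation.

```rust
// `apply` is a made-up function for illustration.
fn apply<F: Fn(&str) -> usize>(f: F) -> usize {
    f("hello")
}

fn direct() -> usize {
    // The closure is an argument, so `s` is known to be &str from the bound on F.
    apply(|s| s.len())
}

fn annotated_binding() -> usize {
    // Without `: &str`, `let f = |s| s.len();` is rejected, because the body
    // is checked before the later call to `apply` is seen.
    let f = |s: &str| s.len();
    apply(f)
}
```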

I ran into the limitations of that closure hack about a year ago when trying to write a macro which could be used like this:

zip_with!(
    (frac, radii, self.sorted.as_ref(), self.hints.as_mut())
    |x, r, set, hints| {
        hints.start = update_lower_hint(hints.start, &set.keys, x - r);
        hints.end   = update_upper_hint(hints.end,   &set.keys, x + r);
    });

The trouble is that eta expansion pretty much destroys the order-based type checker if you do it anywhere except directly as a function parameter:

// works just fine
expr.map(|((x, y), z)| x.stuff);
// ERROR: the type of x must be known
expr.map(|((x, y), z)| (|x, y, z| x.stuff)(x, y, z));

Of course, the latter is what my macro actually expanded into. To work around this, I created a helper type which allowed the user's closure to appear directly as an argument. Basically, the macro was changed to expand into:

obj.map(|((x, y), z)| $crate::FlippedCall((x, y, z)).on(|x, y, z| x.stuff)); // ok
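A minimal sketch of that kind of helper, for two arguments only. The names Args and call_with are invented for this example and are not the FlippedCall type from the post; the point is just that the closure ends up as a method argument, so its parameter types come from the FnOnce bound.

```rust
// Hypothetical helper: wraps a tuple of arguments so that a closure can be
// passed to a method instead of being called directly.
struct Args<T>(T);

impl<A, B> Args<(A, B)> {
    fn on<R, F: FnOnce(A, B) -> R>(self, f: F) -> R {
        let (a, b) = self.0;
        f(a, b)
    }
}

// Hypothetical macro: expands to `Args((a, b)).on(f)` rather than `f(a, b)`.
macro_rules! call_with {
    (($a:expr, $b:expr), $f:expr) => {
        Args(($a, $b)).on($f)
    };
}

fn demo() -> usize {
    // `s` and `n` carry no annotations; their types come from the bound on `on`.
    call_with!(("abc", 2usize), |s, n| s.len() * n)
}
```

Expanding to ($f)($a, $b) instead would put the closure in callee position, which is the form the post above reports as failing with "the type of x must be known".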

tl;dr: In a macro where $f:expr, thou shalt not write $f($args).