Why is the compiler unable to infer the type of `out` here?

Consider the following snippet:

pub fn extract_token_parts(token: &str) -> Result<Vec<&str>, Error> {
    let out = token.split('.').collect();

    if out.len() != 3 {
        return Err(Error::InvalidToken);
    }

    Ok(out)
}

rustc currently fails to infer the type of out as Vec<&str>:

error[E0282]: type annotations needed
   --> crates/niko-auth/src/lib.rs:140:9
    |
    |     let out = token.split('.').collect();
    |         ^^^
    |
    |     if out.len() != 3 {
    |        --- type must be known at this point
    |
help: consider giving `out` an explicit type
    |
    |     let out: Vec<_> = token.split('.').collect();
    |            ++++++++

I'm obviously not opposed to adding an explicit annotation, but given that rustc already does bidirectional inference, this seems like a case it should handle easily, and I wonder why it can't.

If out was of type Result<Vec<&str>, Error> like in the return type, then you couldn't call .len() on it.

If it was of type Vec<&str> so you could .len(), then it wouldn't match the return type.

The compiler can't infer the type because both would be wrong. (I'm guessing you meant to add ? after collect().)

I don't know the underlying reason for why it can't, but rustc can't infer types across method calls. As soon as you have a method call, it has to know the type of the receiver, and it (in my experience) has to be able to infer it from the prior code.

I'm not aware of anything that would prevent rustc from inferring it by starting at the return type and working backwards. It could just be that no one has bothered to put the work in, or maybe there's some inherent difficulty I'm not knowledgeable/smart enough to know about.
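For reference, a minimal sketch of the fix the compiler suggests, so that the receiver type is known before `.len()` is called. The `Error` enum here is a stand-in, since the original definition isn't shown in the question.

```rust
// Stand-in for the question's error type (its real definition is not shown).
#[derive(Debug, PartialEq)]
pub enum Error {
    InvalidToken,
}

pub fn extract_token_parts(token: &str) -> Result<Vec<&str>, Error> {
    // Naming the container is enough; the element type `_` is still inferred
    // from the iterator's item type (`&str`).
    let out: Vec<_> = token.split('.').collect();
    // Equivalent turbofish form: token.split('.').collect::<Vec<_>>()

    // The method call now type-checks because `out` is known to be a Vec.
    if out.len() != 3 {
        return Err(Error::InvalidToken);
    }

    Ok(out)
}
```

Either form gives rustc the type of `out` before the method call, which is the point it complains about.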

No, that's wrong. out is explicitly wrapped in an Ok at the end of the function.

It's wrapped in an Ok() at the end, and in simpler cases rustc is clever enough to infer the type of its inner value, so I don't think that's the "issue" here (and .collect() is infallible, so I can't imagine that I could put ? after it).

Yeah, sounds reasonable. Maybe the compiler just doesn't look from bottom to top yet. Thanks

You're right, sorry!

There's usually the problem of interaction between not just method resolution and type inference, but also implicit coercions. Both implicit coercions and method resolution are features not found in the kinds of lambda calculus and functional programming languages that the inference algorithm of Hindley and Milner was made for.

In this code example, in the return expression Ok(out), the value out could in principle also be a subtype of, or otherwise implicitly coerced into, the type Vec<&str>. For coercions, contingency mechanisms seem to be in place so that type inference in Rust can make the assumption that coercions just don't happen in places where otherwise types would be ambiguous.

I believe that similar fallbacks for method calls, which could allow more code to be type-inferred, don't exist. They would probably need to be a different kind of mechanism, not as straightforward as for coercions, where you can simply say "well, no coercion then, done deal". Here you would need some way of saying "let's hold off on this method resolution until some-later-point™ and try again then". That would probably require some mechanism to defer the method resolution to a later point, which may or may not be a complication. Another possible difficulty: in cases where the method call has additional arguments, maybe even whole expressions, would inferring the types for those be deferred too, and how would that interact with the rest of the system? By the way, I don't actually know the true algorithms at play; this is just my intuition as a user of Rust, seeing what works and what doesn't work.

This compiles:

pub fn extract_token_parts(token: &str) -> Result<Vec<&str>, ()> {
    let out = token.split('.').collect();
    if false { return Ok(out); } 
    if out.len() != 3 {
        return Err(());
    }
    Ok(out)
}

For method calls only information before the call in program text order seems to be used for type inference. This seems like a weakness that in theory doesn't have to exist.

Maybe read @steffahn's answer again. It seems very reasonable that inferring types backwards could lead to complications, maybe even combinatorial explosions. It's perhaps similar to why function argument and return types always have to be declared explicitly: so that a small change in one place can't change the meaning somewhere else. And since we tend to read a scope from top to bottom, this direction is closer to the way we reason about code.

This is indeed a very frequently discussed limitation of type inference in the compiler, and a papercut that pretty much every Rust developer is bound to run into at some point.

Fun fact: The compiler actually contains a hack to make it perform out-of-source-order unification in one very specific (but extremely common) scenario:

  • when calling a function that takes some generic callable, e.g. where F: FnMut(i32, &str)...
  • if the argument you supply is an inline closure...
  • it will unify the types of the closure parameters with the Fn trait bound's parameters before it type checks the closure itself

That's the reason why

vec!["a"].into_iter().map(|x| x.len());

works while

let func = |x| x.len();
vec!["a"].into_iter().map(func);

gives a "Type annotations needed" error.
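As a small sketch of the usual workaround for the second form: annotating the closure parameter gives the compiler the receiver type of `x.len()` at the point where the closure is defined. The function name `lengths` and the sample data are only for illustration.

```rust
pub fn lengths() -> Vec<usize> {
    // With the parameter type written out, `x.len()` can be resolved right
    // here, without waiting for the later call to `map`.
    let func = |x: &str| x.len();

    vec!["a", "bcd"].into_iter().map(func).collect()
}
```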

Sorry, I'm confused. Can you give an example of another type out could be? Thanks!

My point here was more about the general case. For Vec<&str> specifically, I don't think there's anything that coerces to it, except for the type Vec<&str> itself, possibly with a longer lifetime. But for other return types, coercions could be possible, and the type inference algorithm must be designed to somehow take this possibility into account.
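To illustrate the general case, here is a minimal sketch of two return types where the returned value has a different type than the signature says, because an implicit coercion happens at the return site. The function names are made up for the example.

```rust
use std::fmt::Display;

// `out` has type Box<i32>; it is coerced to Box<dyn Display> when returned
// (an unsized coercion), so the return type alone does not determine the
// type of `out`.
pub fn boxed_answer() -> Box<dyn Display> {
    let out = Box::new(42_i32);
    out
}

// `v` has type &Vec<i32>; it is coerced to &[i32] when returned
// (a deref coercion).
pub fn as_slice(v: &Vec<i32>) -> &[i32] {
    v
}
```

In both functions, working backwards from the return type would suggest a type for the local value that is not the one it actually has.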

I have the same problem with collect most of the time as well.

IMHO Rust's std should just add collect_vec, like in the itertools library, as this is probably by far what people want most of the time. I know, I know, collect is made to handle multiple different return types, etc. But most of the time people just want a Vec, and I don't see a big problem with adding this helper for convenience.

PS: collect::<Vec<_>>() will also work but, frankly, my fingers just break typing this every single time :)
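For anyone who wants this without the extra dependency, a helper like this can be sketched as a small extension trait. This is an illustration using only std, not the itertools implementation; the trait name `CollectVec` and the function `token_parts` are hypothetical.

```rust
// Hypothetical extension trait: adds `collect_vec` to every iterator.
pub trait CollectVec: Iterator + Sized {
    fn collect_vec(self) -> Vec<Self::Item> {
        // The return type fixes the target of `collect`, so callers never
        // need an annotation or a turbofish.
        self.collect()
    }
}

// Blanket implementation for all iterators.
impl<I: Iterator> CollectVec for I {}

pub fn token_parts(token: &str) -> Vec<&str> {
    let out = token.split('.').collect_vec();
    // `out` is known to be Vec<&str> here, so calling methods on it is fine.
    debug_assert!(!out.is_empty());
    out
}
```

Because `collect_vec` has a concrete return type, the type of `out` is known before any method is called on it.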

What’s wrong with using itertools?
