Strange type inference with ? syntax

use std::result::Result;

fn print_type_of<T>() {
    println!("{}", std::any::type_name::<T>());
}

fn parse<T: Default>() -> Result<T, ()> {
    print_type_of::<T>();
    Ok(T::default())
}

fn main() -> Result<(), ()> {
    parse()?;
    Ok(())
}

(Playground)

Logically, this must be an error: cannot infer type of the type parameter T declared on the function parse. And it is, if you replace ? with .unwrap(). But with ?, it uses () for the type parameter for some unknown reason. This looks like a bug in the implementation of the ? operator.

4 Likes

You shouldn't jump to drastic conclusions like this. Compiler bugs are rare to come by accidentally. Also, since when does inferring a reasonable default count as a "bug"?

Interesting, I wasn’t aware of this behavior before. Others were, of course; there are a lot of old issues and some interesting developments and back-and-forth on the topic. AFAICT, to start looking into it, the following issue is a good read and links to / is linked by other places.

This may be one of the rare cases:

I remember defaulting to () was required for compatibility, and it blocked defaulting to !:

2 Likes

This certainly is an odd case. What's happening isn't directly because of ?; it happens even when using a match, such as

fn main() -> Result<(), ()> {
    let _ = match parse() {
        Ok(val) => val,
        Err(_) => never(),
    };
    Ok(())
}

fn never() -> ! {
    panic!();
}

It seems to be that if the compiler is required to unify ! (the “never” type, used to indicate the lack of a value, such as via divergence/panicking) and a type inference variable, it will choose (). The fact that this choice happens when the type inference variable would otherwise be a “cannot infer” type error if it weren't unified with ! is pretty clearly unintentional behavior, imho. On the other hand, though, this is also likely behavior we're stuck with to maintain backwards compatibility, at least until someone puts in the justification and effort to have different behavior in the next edition.
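To see that unification in isolation, here's a minimal sketch; any is a hypothetical stand-in for parse, not anything from the original post:

fn any<T: Default>() -> T {
    // prints the inferred T, like the OP's print_type_of
    println!("{}", std::any::type_name::<T>());
    T::default()
}

fn main() {
    // The else branch has type !, and that unification is the only
    // constraint the inference variable for any() ever receives.
    // Instead of "cannot infer", the fallback picks (), so this
    // compiles and prints "()".
    let _x = if true { any() } else { return };
}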

For fun, note that if we change to a trait which () doesn't implement, e.g.

fn parse<T: std::fmt::Display>() -> Result<T, ()> {
    print_type_of::<T>();
    Ok(todo!())
}

we get an error that () is not Display

error[E0277]: `()` doesn't implement `std::fmt::Display`
  --> src/main.rs:11:19
   |
11 |     let _ = match parse() {
   |                   ^^^^^ `()` cannot be formatted with the default formatter
   |
   = help: the trait `std::fmt::Display` is not implemented for `()`
   = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead
note: required by a bound in `parse`
  --> src/main.rs:5:13
   |
5  | fn parse<T: std::fmt::Display>() -> Result<T, ()> {
   |             ^^^^^^^^^^^^^^^^^ required by this bound in `parse`

due to type inference's choice of () here.

It might also have something to do with the () requirement for block-like expressions (e.g. match) to be treated as a statement and elide the ;. E.g. given

fn parse<T>() -> T {
    print_type_of::<T>();
    todo!()
}

fn main() {
    if true {
        parse()
    } else {
        never()
    }
    dbg!();
}

the code compiles with T=(). A bare parse(); statement, by contrast, results in the expected type error.

It's not just this, since ? isn't considered a block-like expression, and match parse() { v => v } causes a type error. But it's likely descended from this; an important case is that if all arms diverge, the block-like expression is still considered a statement, and the ; isn't required. Before ! was a proper type, the way to do this would be to make the type collapse to (). With ! as a proper type, it would make more sense to unify the type as !, but allow ! as well as () for expression statements.

? is a really interesting special case, though. If defined as a macro-like desugaring, it's a match with a !-valued break arm and a T-valued continue arm... but unifying to the ! type doesn't make any sense semantically. ? would actually prefer that the inference of the continue type be completely unimpacted.
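For reference, a simplified sketch of that desugaring; the real one goes through the Try/FromResidual machinery, but the shape is the same:

fn parse<T: Default>() -> Result<T, ()> {
    Ok(T::default())
}

fn main() -> Result<(), ()> {
    // Roughly what parse()?; expands to:
    let _val = match parse() {
        Ok(v) => v,              // the "continue" arm, type T
        Err(e) => return Err(e), // the "break" arm, type !
    };
    Ok(())
}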

! is just really odd in how it could be expected to behave. The behavior we have on stable is mostly just an organic evolution of what was convenient rather than anything planned.

Right now, I think I would say that unifying ?0 and ! because both are assigned to the same place should place no implications on ?0, but assigning a value of type ?0 to a place of type ! should imply ?0 = !. Assigning a value of type ?0 to a place of inferred type ! should probably change the place to type ?0, unless the place's type is explicitly stated rather than inferred.

This all comes from !'s dual role: eventually, we want to be able to use it deliberately as a hole in generics, but it's also a "weak" type hole for divergence that should both decay to a real type but also not influence the choice of real types. It seems convenient to default to !, e.g. Err(0i32) is Result<!, i32>, but there's a difference between an "explicit !" and a "type inference !," in that the former should influence further type inference (e.g. by unification with !), but the latter probably shouldn't (e.g. a diverging arm making other arms infer to be !-typed rather than a type error).
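The "shouldn't influence" half of that is already visible on stable: a diverging arm doesn't force the other arms to be !-typed, it just gets out of the way. A quick sketch:

fn main() {
    // The false arm has type !, but it doesn't influence inference:
    // the match takes its type (i32) from the non-diverging arm.
    let x = match true {
        true => 1i32,
        false => panic!(),
    };
    println!("{x}");
}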

As a silly example, consider feature(exhaustive_patterns) arm elision; given fn parse<T>() -> Result<T, !>, match parse() { Ok(x) => x } is perhaps surprisingly not a type error, because we get T=(). This is just clearly wrong IMHO, and T=! would be even more wrong; there's no ! type we're unifying with here at all, except for the deleted arm.

4 Likes

Why though? That seems fundamentally confused. ! is a concrete type, it's not an inference variable. Even T-that-coerced-from-! is completely different from _.

2 Likes

For exactly the reason in the OP: it makes no sense to me that the inference behavior of

// compiles, unifies with !
let x = match parse() {
    Ok(x) => x,
    _ => return,
};

and

// compile error, cannot infer type
let Ok(x) = parse() else {
    return;
};

should differ, but they do.

The thing is that we can't write -> _ for “returns an inference hole,” we just write -> !. When -> ! is “diverges” (the only stable meaning), it's a “weak” ! that shouldn't impact the inference of other inference variables, imho.

1 Like

This example works because the type of the match subexpression must unify with the return type of main, which is (). Type inference gives a fairly obvious derivation for T=() here. It works the same for other return types in main.

I think the same reasoning applies, except we're unifying two branches of a match instead of a subexpression in the return position of a function. Note that the type of return is never; the fact that it does "something special" doesn't really matter to the typechecker. In this case, the inferred type of T will be whatever the other match arm returns.
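For instance, a sketch reusing the OP's parse: give the other arm a concrete type and T is pinned to it, so print_type_of reports "u8" rather than "()".

fn main() {
    // The Ok arm has type T and the Err arm has type u8,
    // so unifying the arms forces T = u8.
    let _ = match parse() {
        Ok(val) => val,
        Err(_) => 42u8,
    };
}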

2 Likes

Ah, I apparently forgot to test this version with a dbg!() after, to ensure I had a statement-match instead of an expression-match.

Perhaps interestingly, match parse() { Ok(x) => x, }; is a type error, but match parse() { Ok(x) => x, } dbg!() resolves T=(), and this isn't unifying with the function return type. If it's unifying with some () constraint, it's the “block-like expressions are only valid as statements if they evaluate to ()” one.

I attempted to do some testing that I thought showed that the "statement ()" didn't impact type inference, but whatever test I did was flawed; it clearly does.

My poor attempts at reducing the example notwithstanding, the original example stands:

fn magic<T: Default>() -> Result<T, ()> {
    print_type_of::<T>();
    Ok(T::default())
}

fn main() -> Result<(), ()> {

    let _ = magic()?;

    let _ = match magic() {
        Ok(x) => x,
        _ => return Err(()),
    };

    Ok(())
}

There is no () that is being unified with here. That's trivial to prove: use let _: u8 instead and u8 will be selected instead of ().
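Concretely, a sketch against the magic defined above: the annotation dominates the ! unification, and print_type_of reports "u8" instead of "()".

fn main() -> Result<(), ()> {
    // The explicit u8 annotation is the constraint that wins;
    // the () fallback never comes into play.
    let _: u8 = magic()?;
    Ok(())
}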

My argument is that choosing ! is just as wrong as choosing (). The type inference variable is unbound; we know this must be true, because introducing any new bound will cause that type to be chosen instead.

Type unification in the face of coercion is just odd. Consider:

// T = Box<()>
let _ = match magic() {
    Ok(x) => x,
    _ => Box::new(()),
};

// T = Box<()>
let _ = match magic() {
    Ok(x) => x,
    _ => Box::new(()),
} as Box<dyn Send>;

// T = Box<dyn Send>
let _: Box<dyn Send> = match magic() {
    Ok(x) => x,
    _ => Box::new(()),
};

It's the interaction of a few reasonable enough choices which results in the “bug”:

  • A ! type implication is immediately dominated by any other implication.
  • When only a ! type implication is present, this shouldn't be an error. This is common (any diverging statement, e.g. return;).
    • The default fallback type was chosen to be ().
    • Misfeature, IMHO: that this type fallback was ever observable; it should have been a type error anywhere except statements. Since 1.0, it's been observable, as f(panic!()) instantiates f::<()> (see the sketch after this list).
  • ? is simply desugared to a fancy match, meaning the output type is unified with !.
    • Misfeature, IMHO: ? should not put any implication on the continue type; for contrast, let-else doesn't.
  • Thus: parse()?; sees a ?0 type inference variable for the continue type, and the only implication is a unification with !. The default kicks in, and we get ?0 = (), despite nothing in the source putting any bound on it.
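A minimal sketch of that observable fallback (it panics at runtime, of course; the point is that it compiles at all):

fn f<T>(_: T) {}

fn main() {
    // panic!() has type !, and T gets no other constraint, so the
    // fallback instantiates f::<()>. Adding a bound like
    // T: std::fmt::Display makes this line fail to compile with
    // "() doesn't implement std::fmt::Display", showing () was chosen.
    f(panic!());
}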

I think my position is clear from this. Given fn id<T>(_: T), fn diverge() -> !, fn any<T>() -> T, and fn rand() -> bool, I would like to somehow have

  • id(diverge()) compile as id::<!>; but
  • if rand() { any() } else { diverge() } fails inference, cannot infer type T for any (contrast today's actual behavior, sketched below).
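For contrast, a sketch of what stable Rust actually does today with those signatures: both lines compile, and T falls back to () in both.

fn id<T>(_: T) {}
fn diverge() -> ! {
    panic!()
}
fn any<T>() -> T {
    todo!()
}
fn rand() -> bool {
    true
}

fn main() {
    id(diverge()); // today: id::<()>, not id::<!>
    let _ = if rand() { any() } else { diverge() }; // today: T = ()
}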

I'll freely admit I don't think there's a coherent inference system that will do “what I want” in all cases; I'm not sure “what I want” is even consistent. I personally feel that what any existing diverging fn() -> ! wants is the “type hole” behavior: accepted as a statement, but a 100% unbound type inference variable as an expression. Once ! is a proper type, though, it should be possible to manipulate without requiring ! type annotations everywhere.

If there's any consistent guideline I can draw from this it's that any explicit -> ! (including control flow keywords) should never impact type inference, but the ! type introduced in any other way (including as a return type from generic instantiation) should infer type ! (until dominated by a coercion to a real type).

1 Like

This is certainly very strange. Even these two compile differently:

fn main() {
    match parse() {
        Ok(x) => x,
    } // <--
    dbg!()
}

and

fn main() {
    match parse() {
        Ok(x) => x,
    }; // <--
    dbg!()
}

yet as far as I know these are identical ASTs; in the trailing statement the semicolon is meaningful, but surely between two statements the semicolon has no effect (it's merely a syntactic separator)?

This seems very strange to me. It would seem that any block somehow gets special treatment for type inference. For function blocks with a definite return type it's straightforward: the trailing expression must unify with the function return type. But in this example, surely we should unify the return type of magic (say T0) with the type of the entire block (say X1) and find that X1 = T0, but still that X1 is totally unconstrained? Yet we end up assigning X1 = T0 = ().

Consider another example:

fn main() {
    let _ : (_, _) = {
        magic()
    };
    ()
}

Clearly the inferred type is ambiguous, and the compiler seems to agree here.

It's definitely confusing. Maybe this is needed for some super common situation which would otherwise give ambiguous type errors regularly? Picking an arbitrary assignment for an underconstrained type is "technically correct" but typically confusing. Haskell does it in specific limited cases to solve a particular problem, so here maybe Rust is doing the same to solve specific limited cases which I don't yet understand?

Yes, it seems that in a generic context specifically, any ! in return type position would mean that some instantiation has caused something to "go wrong" (not technically wrong, but confusing). The typical way of producing generic return types is parsing some sort of data, and that can never produce !. But there are probably exceptions, and it's hard for the compiler to gauge this sort of intent. Maybe this is the type of thing that is solved via attributes + linting, similar to must_use.

1 Like

(I am a member of the (essentially inactive) grammar working group; this post is my own and not in any way representative of the group.)

The reason that these differ is that a match is not a statement; it's an expression. This is why you can let x = match or have it be the trailing expression/return, etc.

If you write match expr {};, you have an "expression statement" just the same as expr;. The type of the expression can be any type, which is then dropped. The same goes for if/else, as well as every other block-statement in Rust — they're all expressions which evaluate to some value. Even for loops and if-without-else, interestingly enough, despite the fact that they can't evaluate to anything except ().

The fact that everything's an expression is why you'll sometimes see Rust ('s syntax) described as "expression-oriented." (To show this, try a snippet like drop(if true {});.)

The special case is that — because requiring ; after what are statements in other C-like languages seems very strange — for a list of "block like expressions" (very roughly, expressions that have a trailing block; things that would probably be statements in C++ or Java) the semicolon is optional if and only if the expression is ()-valued.[1] (The , between match arms acts the same way, and it's there that you see actual blocks as block-like expressions the most.)
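A minimal sketch of the rule, separate from the thread's examples:

fn main() {
    // A block-like expression used as a statement may omit the ;,
    // but then it must be ()-typed:
    if true { println!("hi") } // ok: both arms are ()-typed

    // if true { 1 } else { 2 }    // error: expected `()`, found integer

    let _ = if true { 1 } else { 2 }; // fine as an expression
}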

Personally, I'm not super fond of this rule, because it kind of means that the grammatical validity of a Rust program depends on type inference. In practice, it's not actually problematic, because unlike the C "lexer hack" it can't change the interpretation of a program, only the validity. Instead of "to be a statement, it must be ()-typed," the rule is actually closer to "when used as a statement (the ; is elided), the type is constrained to ()."

It's that rule that makes the difference. In the statement magic();, the return type of fn magic<T>() -> T is unconstrained. In the statement (not expression) {magic()}, though, the return type is constrained to be (), because it's not a valid statement if not ()-valued.

error[E0308]: mismatched types
 --> src/main.rs:6:6
  |
6 |     {magic::<u32>()}
  |      ^^^^^^^^^^^^^^- help: consider using a semicolon here: `;`
  |      |
  |      expected `()`, found `u32`

That this "block-like expression statements must have type () unless terminated by a ;" rule impacts type inference may certainly be surprising, but this is at least where this ()-constraint is coming from.


As to the rest of the concepts, I have nothing more to add. The () statement rule explains some behaviors, but not the unification with ! choosing () observed with parse()?;, where no block-like expression is involved. (The match exists in the desugaring, but the ? is not a block-like expression, and we have a semicolon which we've shown removes the () constraint.)


  1. This is almost certainly related to why ! decays to () if not bound further: such that block-like expressions which have a diverging tail (have type !) are also allowed to be statements. However, given that the () requirement actually impacts type inference, that's not the whole story; it probably also has to do with how divergence was originally handled in early compiler versions (where ! wasn't a proper type yet).

5 Likes
