Why is `return 5;` an expression?

From the Rust book Chapter 3.3 Statements and Expressions
Expressions do not include ending semicolons. If you add a semicolon to the end of an expression, you turn it into a statement, and it will then not return a value.

So a line of code ending with a semicolon is a statement, but here is a simple function that returns 5:

fn return_five() -> i32
{
    return 5;
}

The line return 5; ends with a semicolon, yet it returns 5 (an i32 value) to match the function's signature, which seems to contradict what the Rust book says. I know removing the semicolon from return 5; is fine too.

Why?
Is it because the return keyword is a special case?

Thanks.

3 Likes

return 5; causes the enclosing function to return 5. Considered as an expression, it never returns a value at all because it bypasses the rest of the function.

For example:

fn return_five() -> i32 {
    accept({ return 5; });
}

fn accept(_: i32) {
    panic!("oops");
}

accept won't be called with any i32 value, because the return has no value as an expression, but changes the control flow.

The same is true without the inner block:

accept(return 5);

9 Likes

Small nitpick: the fact that it doesn't "yield" a value, but performs control flow instead, is precisely why it isn't an expression but a statement.
By their very definition, expressions, when evaluated/executed, yield a value.

1 Like

Could this argument apply to an expression which calls a function which panics? Or to an if where one branch produces a value and the other branch does a break, continue, or return?

As far as I understand, return is a special case to allow this.

2 Likes

Sort of, but in a roundabout way, and not because of the panic, not directly at least.

There is such a thing as an expression statement, which is essentially an expression executed for its side-effects. Calling a function that returns () is a good example of this.
Syntactically, an expression statement looks like an expression followed by a semicolon; e.g. hi_tharr(); is an expression statement.

You should read that more as "produce", since the way it's used here is more like how 2 + 4 can be said to "return" 6, even though there's not specifically a function return.

4 evaluates to the value 4.
4; is a statement where the statement doesn't evaluate to anything (because they never do), though it contains an expression that evaluated to 4, which was discarded.

The fact that a return expression is in the statement doesn't change how statements work. You can (uselessly) let x = (return 4); because return 4 is an expression, but you can't let x = (return 4;); because return 4; is a statement and statements don't evaluate to anything.

(And not evaluating to anything is different to evaluating to an uninhabited type.)
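To make the expression/statement distinction concrete, here is a minimal sketch. The names compute, via_let and discarded are made up for illustration; the type annotation on x is added only so the coercion from ! is visible.

```rust
// Hypothetical helper, used only so there is a value to discard.
fn compute() -> i32 {
    4
}

// `return 4` is an expression of type `!`, so it can appear on the
// right-hand side of a `let`, where it coerces to the annotated type.
// The function returns before `x` is ever initialized.
#[allow(unreachable_code, unused_variables)]
fn via_let() -> i32 {
    let x: i32 = return 4;
    x
}

// A block containing only an expression statement evaluates to `()`:
// the i32 produced by `compute()` is discarded by the semicolon.
fn discarded() {
    let unit: () = { compute(); };
    unit
}

fn main() {
    assert_eq!(via_let(), 4);
    discarded();
}
```

The compiler flags the code after the return as unreachable, hence the allow attribute.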

6 Likes

Looks like this is an undocumented special case.

According to the reference, the block is of type ():

The type of a block is the type of the final operand, or () if the final operand is omitted.

But from this it appears that if the last statement of the block diverges (e.g. it's a return or break statement), then the type of the block is actually ! rather than ().

And ! is a special "impossible" type that coerces to i32.

5 Likes

No, that is not correct. Statement vs. expression is a syntactic question, and if return were itself a statement, then you couldn't use it where an expression was expected, only in blocks.

9 Likes

Code after a diverging expression is a special case, but it's not ! either. It's more like certain categories of dead code in the tail of a block are ignored, I think? (I made a bunch of examples to see what works and what doesn't some time ago (years, probably) but don't recall the details. Something for Rust trivia night.)

4 Likes

I learned today that return can be used within other exprs, and with that it's an expression, albeit a strange one.

While there's something to be said for the syntactic aspect, that alone is incomplete (and hence impractical) as a definition since it doesn't talk about the role or semantics of that syntax at runtime: it's not an expression if it doesn't yield a value. I say that as someone who's written multiple AST walking interpreters, parsers, a fully-fledged SGLR parser generator, and a compiler/runtime combo.

To take another perspective, for a tool that doesn't have a runtime counterpart, e.g. a code formatting tool, there isn't much of a distinction between an expr, a statement, and any other kind of grammatical construct. It's just productions all the way down, with corresponding formatting rules.

I found it.

4 Likes

A common example of using return as an expression, and especially of its ability to take on any type you want (technically because it has the “never” type “!”, which can coerce into any other type), is a desugared ?-operator. I.e. you’re writing a function returning some Option<…> type, and inside you want to unwrap some Option<…> type and return early in case it was None.

With ? operator it would be e.g.

fn first_as_string(numbers: &[i32]) -> Option<String> {
    let option_first: Option<&i32> = numbers.get(0);
    let first: &i32 = option_first?;
    Some(first.to_string())
}

The relevant line, let first: &i32 = option_first?;, is desugared (ignoring fancy Try trait generalizations) into

let first: &i32 = match option_first {
    Some(n) => n,
    None => return None,
};

Syntactically, this says that when None is encountered, the value of return None is assigned to first. Of course, no such value is ever produced; but the type checker must still handle the expression somehow, and it does so by giving it the “never” type (read: “the expression will never have a value”), which it allows to be coerced to &i32, too.
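For anyone who wants to run the two forms side by side, here is a small self-contained sketch. first_as_string_manual is a name invented for this comparison, and the hand-written match is only an approximation of what ? does (it ignores the Try trait machinery).

```rust
// Early return written with the `?` operator.
fn first_as_string(numbers: &[i32]) -> Option<String> {
    let first: &i32 = numbers.get(0)?;
    Some(first.to_string())
}

// Roughly the same logic with the early return spelled out by hand.
// The `None` arm is the expression `return None`, whose type `!`
// coerces to `&i32`, so both match arms agree on a type.
fn first_as_string_manual(numbers: &[i32]) -> Option<String> {
    let first: &i32 = match numbers.get(0) {
        Some(n) => n,
        None => return None,
    };
    Some(first.to_string())
}

fn main() {
    assert_eq!(first_as_string(&[7, 8]), Some("7".to_string()));
    assert_eq!(first_as_string_manual(&[7, 8]), Some("7".to_string()));
    assert_eq!(first_as_string(&[]), None);
    assert_eq!(first_as_string_manual(&[]), None);
}
```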

Never type is a little bit more magical, as it interacts with reachability analysis that can also make the return value of blocks without final expressions turn into !. E.g. let’s print something in the None case, too, using a block-style match arm:

let first: &i32 = match option_first {
    Some(n) => n,
    None => {
        println!("returning early!");
        return None;
    }
};

(full code in playground)

Now, without reachability analysis, the block

{
    println!("returning early!");
    return None;
}

would have type (), resulting in type errors. E.g. if we “outsmart” the analysis with an if true, we’d get an error like this:

error[E0308]: mismatched types
 --> src/main.rs:5:17
  |
5 |           None => {
  |  _________________^
6 | |             println!("returning early!");
7 | |             if true { return None };
8 | |         }
  | |_________^ expected `&i32`, found `()`

If the value of the expression were to be used directly, it’d be this instead:

let first: &i32 = match option_first {
    Some(n) => n,
    None => {
        println!("returning early!");
        return None // <- no semicolon!!
    }
};

however, blocks whose end is unreachable are made to have ! type themselves. You can see unreachable analysis also creating warnings if further code is present:

let first: &i32 = match option_first {
    Some(n) => n,
    None => {
        println!("returning early!");
        return None;
        println!("returned early!");
    }
};

warning: unreachable statement
 --> src/main.rs:8:9
  |
7 |         return None;
  |         ----------- any code following this expression is unreachable
8 |         println!("returned early!");
  |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^ unreachable statement
  |
  = note: `#[warn(unreachable_code)]` on by default
  = note: this warning originates in the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)

And you can see the same behavior reproduced with a function returning !-type (e.g. by panicking) like

fn never_returns() -> ! { panic!() }
let first: &i32 = match option_first {
    Some(n) => n,
    None => {
        println!("returning early!");
        never_returns();
        println!("returned early!");
    }
};

which compiles fine (so the block is !-type, not ()-type), and also gives the unreachable code warning like above.
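A runnable sketch of the same idea, in case someone wants to experiment. The names bail and first_or_panic are invented here; the point is only that a block-style arm ending in a statement that calls a diverging function is accepted where &i32 is expected.

```rust
// Hypothetical diverging helper: the `!` return type says it never returns.
fn bail(msg: &str) -> ! {
    panic!("{msg}")
}

fn first_or_panic(numbers: &[i32]) -> i32 {
    let first: &i32 = match numbers.first() {
        Some(n) => n,
        // No tail expression here, only statements. Because the end of
        // the block is unreachable, the block is accepted in a position
        // that expects `&i32` instead of being rejected as `()`.
        None => {
            println!("bailing out!");
            bail("empty slice");
        }
    };
    *first
}

fn main() {
    assert_eq!(first_or_panic(&[7, 8]), 7);
}
```

Calling first_or_panic(&[]) panics inside bail, so main only exercises the non-empty case.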

4 Likes

I don't think this is related to ! implementing the trait or not. playground:

fn works() -> impl Trait {
    let x: ! = return 0;
    x
}

fn does_not_work() -> impl Trait {
    let x: ! = loop {};
    x
}

That () had to be a special case fallback in type inference. This is actually a problem that prevents ! from being stabilized.

4 Likes

I still think it is !. The rule seems to be: if you have a block without the final expression, and it never reaches the end, then the type of the block is ! rather than ().

The type of the ex function is deduced to be i32, the block has type !, and that ! coerces into i32.

The type of the nope function is deduced to be ().

1 Like

Yeah, I guess that example couldn't possibly show what I was trying to suss out. And I now agree with you, though it takes some more understanding of the fallback mechanism to really understand where () is (conditionally) coming from.

Here's the code that backs up your comment.

And the additional parts to understand are

  • Values of type ! are coerced into "diverging (inference) type variables" (which is what allows them to coerce to anything)

  • After inference runs, any remaining unconstrained diverging type variables have a concrete fallback, similar to the way that unconstrained integer literals fall back to i32

  • The fallback is () or back to !; things are a bit complicated on this front...

    • ...but I believe on stable the only way to get ! in this context is if the return is !
    • ...generic returns are one of the explicit () backwards compatibility cases
    • ...and it seems RPIT acts the same as that
    • (or on nightly you can use #![feature(never_type_fallback)])

In Rust there are no "statements" except let bindings and item (fn, struct, const etc) definitions. Most everything else is an expression and thus has a type and can appear on the right-hand side of an assignment operator (patterns are one exception). Let bindings could in principle be unit-typed expressions like assignments are, but that would just be confusing wrt the scope of the bound names.

The semantics of ; are similar to the comma operator in C and C++, in that both sequence two expressions, discarding the value of the left-hand side. In Rust there’s additionally a convenience rule that a semicolon can appear after the last expression in a block, in which case it is equivalent to ; ().

1 Like

It is not equivalent, as half or more of this thread has explored.

4 Likes

Actually items aren't statements either. You can tell they behave differently because they don't execute in source order:

let x: u32 = 1;
println!("{}", x); // works

println!("{}", x); // fails
let x: u32 = 1;

println!("{}", x); // works
const x: u32 = 1;

(In fact, items don't really execute at all, they are simply declarations in scope for the current block and are not ordered with respect to statements or each other.)

The only statements in rust are let statements like let x = 1; and discarded expressions like 1;.
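Here is a compilable sketch of the ordering difference between items and let. The names uses_items_before_declaration, BASE and double are made up for the example.

```rust
fn uses_items_before_declaration() -> u32 {
    // Both names resolve even though the items appear later in the
    // block: items are in scope for the whole block, regardless of
    // where they are written.
    let value = double(BASE);

    const BASE: u32 = 21;
    fn double(n: u32) -> u32 {
        n * 2
    }

    // By contrast, a `let` binding is only visible after it runs.
    // Uncommenting the next two lines fails with E0425
    // (cannot find value `y` in this scope):
    // println!("{}", y);
    // let y: u32 = 1;

    value
}

fn main() {
    assert_eq!(uses_items_before_declaration(), 42);
}
```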

2 Likes

“Statement” is arguably a syntactic category, and as such, it shouldn’t matter whether or not the statement “executes” like other statements. Of course, the alternative standpoint is that items are merely allowed in statement position; ultimately, how you define the terms is an arbitrary decision. However, at least as far as the Rust Reference is concerned, items are listed as “Statement”s, along with let statements and expression statements; macro invocations in statement position perhaps count as their own thing, too:[1]

Statements

Syntax
Statement :
;
| Item
| LetStatement
| ExpressionStatement
| MacroInvocationSemi

A statement is a component of a block, which is in turn a component of an outer expression or function.

Rust has two kinds of statement: declaration statements and expression statements.

Declaration statements

A declaration statement is one that introduces one or more names into the enclosing statement block. The declared names may denote new variables or new items.

The two kinds of declaration statements are item declarations and let statements.

Item declarations

An item declaration statement has a syntactic form identical to an item declaration within a module. Declaring an item within a statement block restricts its scope to the block containing the statement. The item is not given a canonical path nor are any sub-items it may declare. […]

let statements

[…]

A let statement introduces a new set of variables, given by a pattern. The pattern is followed optionally by a type annotation and then either ends, or is followed by an initializer expression plus an optional else block. […]

Expression statements

[…]

An expression statement is one that evaluates an expression and ignores its result. As a rule, an expression statement's purpose is to trigger the effects of evaluating its expression.

An expression that consists of only a block expression or control flow expression, if used in a context where a statement is permitted, can omit the trailing semicolon. […]


Edit: On second read, I’m noticing that your reply was to a claim that let statements and items are statements, but expression statements aren’t. I suppose replying to such a claim, I empathise with the claim that items aren’t true “statements” either, because they are really items-as-statements, just like the others we excluded were expressions-as-statements.

On the other hand, on second read of the claim by @jdahlstrom you replied to, maybe their point was a different one, too…

Most everything else is an expression and thus has a type and can appear on the right-hand side of an assignment operator (patterns are one exception).

This sounds more like a “most things in Rust are expressions” kind of claim, and in this context, “statements that aren’t also expressions” is the actual thing being called out. So perhaps it would be best to reword “In Rust there are no "statements" except let bindings and item definitions.” into a more accurate claim like “In Rust, almost everything is an expression, including most things commonly used as "statements", except for let bindings and item definitions.”


  1. macro invocations in statement position are special (compared to expression position) as they can expand to a sequence of statements [which in turn also allows for items, of course] instead of just expanding to a single expression; and with {} braces, the trailing semicolon is optional ↩︎
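A small sketch of that footnote, using a made-up macro named declare_pair. In statement position it expands to two let statements whose bindings stay visible in the enclosing block, and the brace-delimited invocation needs no trailing semicolon.

```rust
// Hypothetical macro that expands to a sequence of two statements.
macro_rules! declare_pair {
    ($a:ident, $b:ident) => {
        let $a = 1;
        let $b = 2;
    };
}

fn sum_of_pair() -> i32 {
    // Statement position: both `let`s land in this block, so `x` and
    // `y` are usable afterwards.
    declare_pair!(x, y);
    x + y
}

fn sum_with_braces() -> i32 {
    // With `{}` delimiters the trailing semicolon can be left out.
    declare_pair! { p, q }
    p + q
}

fn main() {
    assert_eq!(sum_of_pair(), 3);
    assert_eq!(sum_with_braces(), 3);
}
```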

1 Like