Why is the semicolon after the return statement optional?

Hello everyone.
What is the difference between these two pieces of code?

  1. return 3 is an expression, its value is 3
  2. return 3; is a statement, and its return type is unit ()

But both of the following pieces of code compile successfully. What is wrong with my understanding?

fn foo() -> i32 {
    return 3; // type is unit ()
}

and

fn foo() -> i32 {
    return 3 // type is i32
}

The idiomatic form omits the return keyword completely:

fn test_fn() -> i32 {
    3
}

Omitting the semicolon returns the result of the expression on the last line. The Book describes this better than I ever could, especially where it leads into the section "Functions with Return Values".

Thank you for your response, but it does not answer my question, and the book does not explain the difference between return 3 and return 3;


I would say

return 3

should be the same as

{ return 3; () }

But this won't compile:

fn run() -> i32 {
    { return 3; () }
}

fn main() {
    let _i = run();
}

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
warning: unreachable expression
 --> src/main.rs:2:17
  |
2 |     { return 3; () }
  |       --------  ^^ unreachable expression
  |       |
  |       any code following this expression is unreachable
  |
  = note: `#[warn(unreachable_code)]` on by default

error[E0308]: mismatched types
 --> src/main.rs:2:17
  |
1 | fn run() -> i32 {
  |             --- expected `i32` because of return type
2 |     { return 3; () }
  |                 ^^ expected `i32`, found `()`

For more information about this error, try `rustc --explain E0308`.
warning: `playground` (bin "playground") generated 1 warning
error: could not compile `playground` due to previous error; 1 warning emitted

So I understand your confusion about it.

But let's try:

fn run() -> i32 {
    let _j: String = return 5;
    6
}

fn main() {
    let _i = run();
}

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
warning: unreachable expression
 --> src/main.rs:3:5
  |
2 |     let _j: String = return 5;
  |                      -------- any code following this expression is unreachable
3 |     6
  |     ^ unreachable expression
  |
  = note: `#[warn(unreachable_code)]` on by default

warning: `playground` (bin "playground") generated 1 warning
    Finished dev [unoptimized + debuginfo] target(s) in 1.04s
     Running `target/debug/playground`

Not an error, but only a warning. So this works. Apparently, the type of return x isn't () but can be coerced into any type, like the never type, I think?

Note that return is a control flow expression. It doesn't have a value within the context it is written in, per se:

// Compiles
fn test_fn() -> i32 {
    let _: String = return 3;
}

The compiler knows that what follows the return is unreachable, and the assignment is unreachable just like everything after it. The type of the expression is probably technically the never type (!), though I didn't try to confirm this. That way, you can do things like

pub fn test_fn() -> i32 {
    let name = if random() { "Joe" } else { return 13 };
    println!("{}", name);
    42
}

Being able to end with just return value; has been around a long time -- 2011 or before. However, it's actually more permissive than that. Playing with it some, you can apparently have any number of unreachable statements, but no unreachable trailing expression unless it matches the return type. So this is fine:

fn test_fn() -> i32 {
    let _: String = return 3;
    (); (); (); ();
}

And so is this:

fn test_fn() -> i32 {
    let _: String = return 3;
    (); (); (); ();
    7
}

But this will complain about a return type mismatch:

fn test_fn() -> i32 {
    let _: String = return 3;
    (); (); (); ();
    ()
}

I can imagine how the middle one especially is useful in coordination with, for example, macros -- it can be hard to tell if you unconditionally return or not. Similarly, you might end up with something like...

fn f() -> i32 {
    #[cfg(all(A, B))]
    return 3;
    #[cfg(not(A))]
    do_stuff();
    8
}

Which might correspond to any of...

fn f() -> i32 {
//  !A !B          !A B           A !B           A B 
                                                 return 3;
    do_stuff();    do_stuff();
    8              8              8              8
}

Which is a not-great contrived example, but hopefully illustrates some possible utility.
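For anyone who wants to try the cfg case without setting custom flags, here is a minimal sketch. It assumes the predicates `all()` (always true) and `any()` (always false) as stand-ins for the A and B flags above; the function names are made up for illustration.

```rust
// `cfg(all())` is always true, so the early return is kept and the
// trailing `8` becomes unreachable (a warning, not an error).
#[allow(unreachable_code)]
fn always_returns_early() -> i32 {
    #[cfg(all())]
    return 3;
    8
}

// `cfg(any())` is always false, so the `return 3;` statement is removed
// before type checking and the trailing `8` is the function's value.
fn falls_through() -> i32 {
    #[cfg(any())]
    return 3;
    8
}
```

Calling `always_returns_early()` should give 3 and `falls_through()` should give 8, with the same function body shape in both cases.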


TL; DR:

  • Return expressions are diverging expressions that can coerce to any type
    • (The expressions themselves, not the values they return from the function)
  • Unreachable statements are allowed
    • But not unreachable trailing or return expressions, unless they also match the return type
  • There is, therefore, no practical difference between any of these:
fn f() -> i32 { 0 }
fn g() -> i32 { return 0 }
fn h() -> i32 { return 0; }
fn i() -> i32 { return 0; "Whew, made it."; }

I answered the question posed in the topic. To go into more detail, return 3; is an expression. It is not a statement that evaluates to unit.

edit: The fact that it has a trailing semicolon doesn't matter. You can add as many as you like. You will just get a warning about trailing semicolons.

pub fn test() -> u32 {
    return 3;;;;;;
}

Not sure if you meant return 3; or return 3 (without semicolon). The first is a statement (I think?), while the latter is an expression: See Return expression.

That is literally the same link I posted. The semicolon, as far as I understand it from the ref is a "Terminator for various items and statements". The trailing semicolon in return 3; would technically be a statement. But the link we both posted clarifies that the return expression destroys the current function activation frame. Which is why the unreachable things following the expression are unreachable.

Oh sorry, I didn't notice.

I would say adding the semicolon turns the return expression into an expression statement.

So I don't see return 3; as two subsequent things (expression + empty statement) but a single thing (an expression statement); while return 3 is the contained expression.

However, now I'm confused about why

fn foo() -> i32 {
    return 5;
}

works :grinning_face_with_smiling_eyes:. Ah, now I see: It works because return 5; doesn't evaluate to () but can be coerced into any type (including i32). (And of course return 5 will exit the function so the rest becomes unreachable, but as @quinedot pointed out, the last expression of such unreachable code still needs to match the type if there is one.)


P.S.: I try to be nitpicky here on purpose, just to figure out the internals of the language. It's mostly irrelevant for practical programming, of course.


Not just "like" the never type, it is the never type! Any diverging expression (return, loop without break, etc.) returns !
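A small sketch of a few diverging expressions sitting where a value of a concrete type is expected. The function names are made up for illustration, and whether the compiler internally assigns exactly `!` to each of them is not something this code proves; it only shows that the coercion is accepted.

```rust
// `return` used as the initializer of a String binding.
#[allow(unreachable_code)]
fn via_return() -> i32 {
    let _s: String = return 1;
}

// `continue` used as the `else` arm of an i32-valued `if`.
fn first_even(xs: &[i32]) -> Option<i32> {
    for &x in xs {
        let even: i32 = if x % 2 == 0 { x } else { continue };
        return Some(even);
    }
    None
}

// `panic!` used as a match arm next to an i32 arm.
fn parse_or_panic(s: &str) -> i32 {
    match s.parse::<i32>() {
        Ok(n) => n,
        Err(_) => panic!("not a number: {}", s),
    }
}
```

In each case the other branch (or the annotation) fixes the type, and the diverging expression simply fits in.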


Yes, and the "EBNF" on that page shows that semicolons are optional! (whoops. Optional for ExpressionWithBlock...)

An ExpressionWithBlock has an optional semicolon, but the ReturnExpression is an ExpressionWithoutBlock (see Reference on Expressions).

Yep ^^

(You corrected while I was typing my reply.)

It seems to me that the never type is the reason for the optional semicolon following return expressions, as pointed out by @Cyborus04. It also explains why the semicolon is optional in this case:

pub fn test() -> u32 {
    panic!("oh no!")
}

I thought the never type was unstable, but apparently it already exists in certain places, as the reference on the never type indicates:

The never type ! is a type with no values, representing the result of computations that never complete. Expressions of type ! can be coerced into any other type.

[…]

NB. The never type was expected to be stabilized in 1.41, but due to some last minute regressions detected the stabilization was temporarily reverted. The ! type can only appear in function return types presently. See the tracking issue for more details.

I didn't find any indication (yet) that return 3 really has the type ! (and it's not a function return type here, even though it does make the function return). But maybe it doesn't matter as it behaves exactly like the ! type, I would assume. And perhaps it's implemented internally as the never type as well, even if not mentioned in the reference as such.

I think it's technically an unbound inference variable, right now. Which falls back to (), if you trick the code into being able to tell that somehow. This is why the stabilization of never type has been turning out to be more difficult than expected.


Is that really true? I think "return 5;" doesn't really evaluate to anything as it's not an expression but a statement.

The function body is a BlockExpression, and in case of

fn foo() -> i32 {
    return 5;
}

that BlockExpression { return 5; } doesn't have a final expression. Now the reference says about BlockExpressions:

The type of a block is the type of the final operand, or () if the final operand is omitted.

Thus { return 5; } would have the type () :scream: and not the never type or whatever "thing" that can be coerced into an i32.

Hence my confusion now about why the semicolon at the end of the return statement is allowed.

The following is valid as well:

fn run() -> i32 {
    let _j: String = { return 5; };
    6
}

fn main() {
    let _i = run();
}

(Playground)

Is this a special rule regarding the return statement? It doesn't look like it is, because we can replace the return expression with an expression of the never type:

fn never() -> ! {
    panic!("never ever");
}

fn five() -> i32 {
    5
}

fn run() -> i32 {
    let _s1: String = { never(); () }; // not allowed
    let _s2: String = { never(); }; // OK
    let _s3: String = { five(); }; // not allowed
    let _s4: String = { never(); five() };  // not allowed
    let _s5: String = { never(); five(); }; // OK
    6
}

fn main() {
    let _i = run();
}

(Playground)

So maybe the reference is not 100% correct with saying:

The type of a block is the type of the final operand, or () if the final operand is omitted.

If that was true, then why can we do the following?

let _s2: String = { never(); };

From the experiments shown here, it appears the current rule is something like:

The type of a block is the first of these that applies:

  1. The type of the final operand, if present.
  2. The never type !, if any operand evaluates to !
  3. The unit type ()
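The three cases can be sketched as one function each. This is only an illustration of the observed behavior discussed in this thread, not a statement of how the compiler is specified; the `rule_*` names are made up.

```rust
fn five() -> i32 {
    5
}

// Case 1: a final operand is present, so the block has its type (i32).
fn rule_one() -> i32 {
    let x: i32 = { five(); five() + 1 };
    x
}

// Case 2: no final operand, but a statement diverges, so the block is
// accepted where a String is expected. The function returns 7 and the
// trailing 0 is unreachable.
#[allow(unreachable_code)]
fn rule_two() -> i32 {
    let _s: String = { return 7; };
    0
}

// Case 3: no final operand and nothing diverges, so the block is ().
fn rule_three() {
    let unit: () = { five(); };
    unit
}
```

Changing the annotation in `rule_three` to `String` should produce a mismatched types error, which matches the `_s3` line in the earlier experiment.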

This theory would also be consistent with the following example (which is basically the same as in the previous examples, but written more nicely and demonstrating this with the BlockExpression as a function body):

fn five() -> i32 {
    5
}

fn never() -> ! {
    panic!("never ever"); five();
}

/*
fn doesnt_compile() -> ! {
    panic!("never ever"); five()
}
*/

fn main() {
    never();
}

(Playground)

If that's true, is that a case for opening an issue to fix the reference?

As pointed out, the actual behavior seems to be:

Any other opinions on this? Should the reference be updated/fixed? Maybe I misunderstood something or we just missed some other part in the reference that explains the behavior?