What is the difference between "break...;" and "break ..."

zhenyu · August 24, 2024, 6:14am

Hello,
I heard that Rust is a very rigorous language. Here are two blocks of code that behave the same when run.

fn main(){
    let mut counter=0;
    let result=loop{
        counter+=1;
        if counter==10{
            break counter*2
        }
    };
    println!("The result is {result}");
}

fn main(){
    let mut counter=0;
    let result=loop{
        counter+=1;
        if counter==10{
            break counter*2;
        }
    };
    println!("The result is {result}");
}

What is the difference between break counter*2; and break counter*2?

paramagnetic · August 24, 2024, 6:37am

There isn't any.

simonbuchan · August 24, 2024, 6:56am

technically, the former is a block ending with a never-typed final expression, making it never typed itself (i.e. it never evaluates to a value); while the latter is a block without a final expression, which would normally make it unit typed (the empty value ()), but since it's unreachable, it ends up being never typed there too.

so yeah, no actual difference.
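A minimal sketch to check this, wrapping the two loops from the question in functions (the names without_semicolon and with_semicolon are made up for illustration):

```rust
// Hypothetical helper names; the loop bodies are the ones from the question.
fn without_semicolon() -> i32 {
    let mut counter = 0;
    loop {
        counter += 1;
        if counter == 10 {
            break counter * 2 // final expression of the `if` block
        }
    }
}

fn with_semicolon() -> i32 {
    let mut counter = 0;
    loop {
        counter += 1;
        if counter == 10 {
            break counter * 2; // statement; the block has no final expression
        }
    }
}

fn main() {
    // Both loops break on the tenth iteration with the same value.
    assert_eq!(without_semicolon(), with_semicolon());
    println!("The result is {}", with_semicolon());
}
```

Both functions compile on stable Rust and return the same value, so the choice between the two forms is purely stylistic here.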

nickm-BY · August 24, 2024, 7:04am

Try this in OCaml to see what "rigorous" with semicolons is.
Rust will tell you if you have an unintended semicolon:

fn add_one(x: i32) -> i32 {
    x + 1; // error[E0308]: mismatched types (expected `i32`, found `()`); rustc suggests removing this semicolon
}

See "expressions" and "statements": Statements and expressions - The Rust Reference
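For comparison, a sketch of the same function with the semicolon dropped, so that x + 1 becomes the block's final expression and is returned:

```rust
fn add_one(x: i32) -> i32 {
    x + 1 // no semicolon: this is the block's final expression, so it is the return value
}

fn main() {
    println!("{}", add_one(41));
}
```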

paramagnetic · August 24, 2024, 2:29pm

It absolutely is a value. It's not any less of a value than any other value.

No, that's incorrect – it's divergent, so its type is "never" (spelled !, although it's unstable). Unit can't coerce to anything else; never/! coerces to all types.

Redglyph · August 24, 2024, 2:45pm

Yes, pedantically, a value of a type that admits only one value. Everything has a value, anyway. A slight abuse of language to make it somewhat clearer: a value that isn't really one.

Are you sure?

It would be divergent in the absence of break, as shown below, because there is no return from that loop, but break does the exact opposite by giving a value to it.

To illustrate it explicitly:

let a: () = loop { break };
let b: ! = loop {}; // writing `!` as a type annotation needs a nightly compiler (never_type feature)

As stated in the reference,

In the case a loop has an associated break, it is not considered diverging, and the loop must have a type compatible with each break expression. break without an expression is considered identical to break with expression ().

paramagnetic · August 24, 2024, 2:54pm

Yea, 100%.

You are diverting the discussion. In your previous post, you were talking about break itself, which always has type !. Now you are talking about the loop. Those are not the same thing. You wrote above:

This unambiguously refers to the type of the break expression, not that of the loop. And that's never (). It is ! (which can coerce to any other type, including (), but the converse is not true.)

This doesn't illustrate anything, though, because you can assign the (non-existent) result of a diverging loop to () just fine.

Let me show you the actual proof of my assertion: no matter what the type of the argument of break is, the break itself always has type !:

    let () = loop {
        let _: ! = break;
    };
    let () = loop {
        let _: ! = break ();
    };
    let _: i32 = loop {
        let _: ! = break 42_i32;
    };
    let _: f64 = loop {
        let _: String = break 13.37;
    };
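The let _: ! = … lines above need a nightly compiler, since ! as a type annotation is unstable. The same coercion can be seen on stable Rust in a more everyday shape. The following is only a sketch; first_even is a hypothetical helper, not something from this thread:

```rust
// Hypothetical helper: returns the first even number in `values`, if any.
fn first_even(values: &[i32]) -> Option<i32> {
    let mut iter = values.iter();
    loop {
        // Both match arms must agree on the type `i32`. The second arm is a
        // `break` expression of type `!`, which coerces to `i32`, even though
        // the value carried by the `break` is an `Option<i32>`.
        let v: i32 = match iter.next() {
            Some(v) => *v,
            None => break None,
        };
        if v % 2 == 0 {
            break Some(v);
        }
    }
}

fn main() {
    println!("{:?}", first_even(&[1, 3, 4, 5]));
}
```

The argument of each break decides the type of the loop (Option<i32>), while the break expressions themselves fit wherever an i32 or a () is expected.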

Redglyph · August 24, 2024, 3:01pm

That's interesting, thanks for the explanation.

[EDIT, since my post was altered by someone else] To answer your accusation, no, I wasn't trying to divert the discussion, it was a simple misinterpretation on my part which had no bearing on the main points: the use of semicolon and break returning the value of the loop.

steffahn · August 24, 2024, 7:55pm

While semicolons often make a difference in Rust, this is a case where it turns out to work either way, with the same behavior. Stylistically, I’d usually prefer including the semicolon here… in fact, it turns out rustfmt will even add it for you here!

Blocks in Rust are usually of the form

{
    first_statement;
    second_statement;
    …
    last_statement;
    final_expression
}

Blocks are expressions, and evaluate to the value of the final expression.

So if you do

let x = {
    some_statement;
    an_expression
}

then some_statement will be executed; afterwards an_expression is evaluated, finally, the result of that becomes the value of the whole block and gets assigned to x.
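As a concrete sketch (block_value is a made-up name for illustration):

```rust
fn block_value() -> i32 {
    let x = {
        let a = 2; // statement
        let b = a + 1; // statement
        a * b // final expression: its value becomes the value of the block
    };
    x
}

fn main() {
    println!("{}", block_value());
}
```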

If a block doesn’t need a resulting value, it can evaluate to the “unit” value instead, in Rust syntax the value () of type ().

A block of the form

{
    first_statement;
    second_statement;
    …
    last_statement;
    // no final expression here
}

is usually equivalent to

{
    first_statement;
    second_statement;
    …
    last_statement;
    () // unit-value final expression here
}

The full details on how blocks work are a bit more complex (the so-called “never type” gets involved if the block always “diverges”), but this doesn’t matter here.
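A small sketch of the equivalence between a block with no final expression and one ending in (); unit_blocks and the strings it logs are made up for illustration:

```rust
fn unit_blocks() -> Vec<&'static str> {
    let mut log = Vec::new();

    // No final expression: the block evaluates to `()`.
    let implicit: () = {
        log.push("first");
        log.push("second");
    };

    // Explicit `()` as the final expression: same type, same value.
    let explicit: () = {
        log.push("third");
        log.push("fourth");
        ()
    };

    assert_eq!(implicit, explicit);
    log
}

fn main() {
    println!("{:?}", unit_blocks());
}
```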

An if expression can have 2 blocks with else

if condition {
    // block 1
} else {
    // block 2
}

As blocks are expressions and evaluate to things, if expressions can do the same.

An if expression without else

if condition {
    // block 1
}

is equivalent to an empty else block

if condition {
    // block 1
} else {
}

which is also equivalent, as discussed above, to a block with unit-value final expression:

if condition {
    // block 1
} else {
    ()
}

It’s this consistency that motivates why the block of an if expression without else can have a final expression in the first place, as long as that final expression has type “()”. It is, however, rarely useful; hence the stylistic suggestion to usually put the semicolon there rather than not.
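To make that concrete, here is a sketch with a hypothetical function record_positive, where the final expression of the if block happens to have type ():

```rust
// Hypothetical helper: remembers `n` in `seen` only if it is positive.
fn record_positive(n: i32, seen: &mut Vec<i32>) {
    if n > 0 {
        seen.push(n) // `Vec::push` returns `()`, so leaving off the semicolon is accepted
    }
    // By contrast, `if n > 0 { n }` would be rejected here: without an `else`,
    // the block's final expression must have type `()`.
}

fn main() {
    let mut seen = Vec::new();
    record_positive(3, &mut seen);
    record_positive(-1, &mut seen);
    println!("{seen:?}");
}
```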

To explain all of the syntactical considerations that make this code…

loop {
    counter+=1;
    if counter==10 {
        break counter*2
    }
};

…work, we would also need to talk about why the if expression doesn’t need a semicolon at the end of the loop body; really it’s the same story: consistency. For consistency with all the other usages of blocks in Rust, even loop bodies support a final expression, as long as it’s of type (). And if there were further statements after the if, we could even talk about the special semicolon rules that allow certain statements, like an if-expression statement, to stay semicolon-free.
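A sketch of that last case, with a further statement after the if. The function doubled_at and the passes counter are made up for illustration, and the sketch assumes limit is at least 1 (with 0 the counter would never match):

```rust
// Hypothetical variation of the loop from the question.
fn doubled_at(limit: u32) -> (u32, u32) {
    let mut counter = 0;
    let mut passes = 0; // counts how often the statement after the `if` runs
    let result = loop {
        counter += 1;
        if counter == limit {
            break counter * 2
        } // no semicolon after the `if`, although another statement follows
        passes += 1;
    };
    (result, passes)
}

fn main() {
    let (result, passes) = doubled_at(10);
    println!("The result is {result} after {passes} full passes");
}
```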

So, if you look at it in full detail, Rust syntax is somewhat complex, after all. Fortunately, the type system makes sure that it’s almost impossible to have this complexity result in a bug in your program, anyway; you’ll usually get a compilation error that advises you to add or remove a semicolon as necessary.

And then we would also get to the value of break expressions. The effect of evaluating such an expression is clear: it breaks the loop and makes the surrounding loop expression evaluate to something. But inside of the loop body, break … is also an expression. That’s what some comments above already explored; it’s another case where the “never type” is technically involved. For the context of if expressions, however, all that matters is that the break … itself is considered an expression that can evaluate to () (after a coercion), which is why it is allowed here without the semicolon in the first place, just as if you were to put something like () there directly, and unlike the compiler error you get when you put a value of a different type.
