Why is the temporary in the last statement dropped after the local variable?

There is an example in Destructors - The Rust Reference that confuses me:

let local_var = PrintOnDrop("local var");
// ...

// Dropped at the end of the function, after local variables.
// Changing this to a statement containing a return expression would make the
// temporary be dropped before the local variables. Binding to a variable
// which is then returned would also make the temporary be dropped first.
match PrintOnDrop("Matched value in final expression") /* #1 */ {
    // Dropped once the condition has been evaluated
    _ if PrintOnDrop("guard condition").0 == "" => (),
    _ => (),
}

Why is the temporary created in the match expression at #1 dropped after the local variables? IIUC, Drop scopes says

Each variable or temporary is associated to a drop scope. When control flow leaves a drop scope all variables associated to that scope are dropped in reverse order of declaration (for variables) or creation (for temporaries).

The temporary in the match expression is created after the local variable, hence it should be dropped before the local variable. For the return case, I also cannot understand why the temporary is instead dropped before the local variable.

It's designed to be the case I guess. The explanation is above the example code:

Temporaries that are created in the final expression of a function body are dropped after any named variables bound in the function body, as there is no smaller enclosing temporary scope.

The scrutinee of a match expression is not a temporary scope, so temporaries in the scrutinee can be dropped after the match expression. For example, the temporary for 1 in match 1 { ref mut z => z }; lives until the end of the statement.

I'm curious where the normative wording behind this note is:

Temporaries that are created in the final expression of a function body are dropped after any named variables bound in the function body, as there is no smaller enclosing temporary scope.

To illustrate what the following two sentences mean

  • Temporaries that are created in the final expression of a function body are dropped after any named variables bound in the function body ...
  • The scrutinee of a match expression ... lives until the end of the statement.

Normally, we have this: Rust Playground

fn main() {
    let local_var = PrintOnDrop("local var");
    match PrintOnDrop("Matched value in final expression") /* var */ {
        _ if PrintOnDrop("guard condition").0 == "" => (),
        _ => (),
    }; // Note: the `;` here, thus var drops at the `;`
       // i.e. the scrutinee lives until the end of the statement
}
// output
drop(guard condition)
drop(Matched value in final expression)
drop(local var)

Why can the scrutinee live until the end of the statement?

    /*let var2 = */ match PrintOnDrop("Matched value in final expression") /* var1 */ {
        _ if PrintOnDrop("guard condition").0 == "" => (),
        x => x,
    }; // because var1 can be moved into x and then into var2 if the scrutinee is bound to a variable through the match,
       // so var1 has to be dropped outside the whole match expression
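
To make the "var1 can go to x and then var2" point concrete, here is a minimal sketch. The PrintOnDrop definition and the LOG/take_log helpers are my own assumptions (the Reference example does not show them); they only record the drop order so it can be inspected.

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical helper, not part of the Reference example: remembers drop order.
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

struct PrintOnDrop(&'static str);

impl Drop for PrintOnDrop {
    fn drop(&mut self) {
        println!("drop({})", self.0);
        LOG.with(|log| log.borrow_mut().push(self.0));
    }
}

// Returns the recorded drop order and clears the log.
fn take_log() -> Vec<&'static str> {
    LOG.with(|log| std::mem::take(&mut *log.borrow_mut()))
}

fn moved_out() {
    // `x => x` moves the scrutinee's value out of the match, so it ends up
    // owned by `_kept` and is dropped like any other local variable.
    let _kept = match PrintOnDrop("scrutinee") {
        x => x,
    };
    let _later = PrintOnDrop("later");
    // Locals are dropped in reverse order of declaration: `_later`, then `_kept`.
}

fn main() {
    moved_out();
    println!("{:?}", take_log());
}
```

Since the value can escape through an arm like this, the match itself cannot be the point where the scrutinee is dropped.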

And for the case in the question: Rust Playground

fn main() {
    let local_var = PrintOnDrop("local var");
    match PrintOnDrop("Matched value in final expression") /* var */ {
        _ if PrintOnDrop("guard condition").0 == "" => (),
        _ => (),
    } // Note: no `;` here, so this is the final expression
      // i.e. var is dropped after any named variables (including local_var)
}
// output
drop(guard condition)
drop(local var)
drop(Matched value in final expression)
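
If you want to run both variants side by side, this is a self-contained sketch. PrintOnDrop, LOG and take_log are assumptions of mine, written only to capture the order. One caveat, as far as I know: the outputs quoted in this thread are for editions up to 2021; the Rust 2024 edition changed tail-expression temporaries to be dropped before the block's locals, so the second function prints a different order there.

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical helper, not part of the Reference example: remembers drop order.
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

struct PrintOnDrop(&'static str);

impl Drop for PrintOnDrop {
    fn drop(&mut self) {
        println!("drop({})", self.0);
        LOG.with(|log| log.borrow_mut().push(self.0));
    }
}

// Returns the recorded drop order and clears the log.
fn take_log() -> Vec<&'static str> {
    LOG.with(|log| std::mem::take(&mut *log.borrow_mut()))
}

fn match_as_statement() {
    let _local_var = PrintOnDrop("local var");
    match PrintOnDrop("matched value") {
        _ if PrintOnDrop("guard condition").0 == "" => (),
        _ => (),
    }; // the `;` makes this a statement, so the scrutinee is dropped here
}

fn match_as_tail() {
    let _local_var = PrintOnDrop("local var");
    match PrintOnDrop("matched value") {
        _ if PrintOnDrop("guard condition").0 == "" => (),
        _ => (),
    } // no `;`: this is the final expression of the function body
}

fn main() {
    match_as_statement();
    println!("statement: {:?}", take_log());
    match_as_tail();
    // The order printed here depends on the edition, see the caveat above.
    println!("tail:      {:?}", take_log());
}
```

In both variants the guard temporary goes first, because a match guard is its own temporary scope.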

I just asked: where can I find the formal wording specifying that "Temporaries that are created in the final expression of a function body are dropped after any named variables bound in the function body" in the Rust Reference? In other words, what is the formal wording that the note is trying to interpret?

It's right in the part of the snippet which you edited out:

Notes:

Temporaries that are created in the final expression of a function body are dropped after any named variables bound in the function body, as there is no smaller enclosing temporary scope.

The match expression in your example specifies the function's return value. The return type of the function is (), and the returned value is thus also (), but a value's a value. This means that the returned value drop rules apply.

If you change the match expression into a match statement by adding ; after its closing brace, the rules for statements will apply, and the drop order will be as you expect.

The temporary in question isn't part of a statement, and hence its drop scope is the entire function:

Apart from lifetime extension, the temporary scope of an expression is the smallest scope that contains the expression and is one of the following:

  • The entire function body.
  • A statement.
    [...]

The scope of the local variable is the function body block, which is contained within the scope of the entire function:

Given a function, or closure, there are drop scopes for:

  • The entire function
  • Each statement
  • Each expression
  • Each block, including the function body
    • In the case of a block expression, the scope for the block and the expression are the same scope.
  • Each arm of a match expression

Hence, the local variable is dropped first:

Drop scopes are nested within one another as follows. When multiple scopes are left at once, such as when returning from a function, variables are dropped from the inside outwards.

  • The entire function scope is the outer most scope.
  • The function body block is contained within the scope of the entire function.
    [...]
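
The "inside outwards" nesting quoted above can be seen without any temporaries at all. A small sketch, where PrintOnDrop, LOG and take_log are helpers I am assuming for illustration:

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical helper: remembers drop order.
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

struct PrintOnDrop(&'static str);

impl Drop for PrintOnDrop {
    fn drop(&mut self) {
        println!("drop({})", self.0);
        LOG.with(|log| log.borrow_mut().push(self.0));
    }
}

// Returns the recorded drop order and clears the log.
fn take_log() -> Vec<&'static str> {
    LOG.with(|log| std::mem::take(&mut *log.borrow_mut()))
}

fn nested_scopes() {
    let _outer = PrintOnDrop("outer");
    {
        let _a = PrintOnDrop("inner a");
        let _b = PrintOnDrop("inner b");
    } // leaving the inner block: `_b` first, then `_a` (reverse declaration order)
    let _late = PrintOnDrop("late");
} // leaving the function body block: `_late` first, then `_outer`

fn main() {
    nested_scopes();
    println!("{:?}", take_log());
}
```

Inner scopes are emptied before the scopes that contain them, which is the same mechanism that puts the function body's locals ahead of anything attached to the enclosing function scope.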

FWIW, temporary lifetimes are widely considered one of the more confusing parts of Rust, and people are exploring what other options might look like to be more predictable/simple while still being useful. See https://rust-lang.zulipchat.com/#narrow/stream/403629-t-lang.2Ftemporary-lifetimes-2024/topic/stream.20events/near/389265212 if you're interested in getting involved.

According to the syntax

Function :
FunctionQualifiers fn IDENTIFIER GenericParams?
( FunctionParameters ? )
FunctionReturnType ? WhereClause?
( BlockExpression | ; )

BlockExpression :
{
InnerAttribute*
Statements ?
}
Statements :
Statement+
| Statement+ ExpressionWithoutBlock
| ExpressionWithoutBlock

Since the match expression is not an ExpressionWithoutBlock, it must be a component of a Statement.

Moreover, where is the part that says the temporaries of a tail expression are dropped in the scope of the entire function?

Where is this rule in the Rust reference? In other words, where is the normative rule that says the returned temporary value is dropped after all local variables?

Indeed, this syntax definition is super confusing.

What happens is that this syntax doesn't quite correspond to what is and what isn't a statement. If, according to this syntax, the last "Statement" is an "ExpressionWithBlock" without the trailing semicolon, then in fact it is not a "statement", despite the production rule being called that. Instead, it is the "final operand" expression.

Statements are usually required to be followed by a semicolon, with two exceptions:

  1. Item declaration statements do not need to be followed by a semicolon.
  2. Expression statements usually require a following semicolon except if its outer expression is a flow control expression.

For instance, in { { 5 } }, the { 5 } is the "final operand" expression, not a statement, despite corresponding to the "Statement" production in the syntax tree.
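
A short sketch of that distinction, with function names I made up for illustration. The same match, written without a trailing `;`, is the final operand in one function and an expression statement in the other; only its position in the block differs.

```rust
// The match is the last thing in the block and has no `;`:
// it is the final operand, and its value is the function's value.
fn match_is_final_operand() -> &'static str {
    match 1 {
        _ => "from the match",
    }
}

// The same shape of match, but something follows it: here it is an
// expression statement with the `;` omitted, so its type must be `()`.
fn match_is_statement() -> &'static str {
    match 1 {
        _ => (),
    }
    "from the tail expression"
}

// `{ { 5 } }`: the inner block `{ 5 }` is the final operand of the outer block.
#[allow(unused_braces)]
fn nested_blocks() -> i32 {
    { 5 }
}

fn main() {
    println!("{}", match_is_final_operand());
    println!("{}", match_is_statement());
    println!("{}", nested_blocks());
}
```

So the grammar production alone does not tell you which one you have; you need to know whether anything follows it in the block.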

Here:

Temporary scopes

The temporary scope of an expression is the scope that is used for the temporary variable that holds the result of that expression when used in a place context, unless it is promoted.

Apart from lifetime extension, the temporary scope of an expression is the smallest scope that contains the expression and is one of the following:

  • The entire function body.
  • A statement.
  • The body of an if, while or loop expression.
  • The else block of an if expression.
  • The condition expression of an if or while expression, or a match guard.
  • The body expression for a match arm.
  • The second operand of a lazy boolean expression.

As explained in the already quoted note, for the final expression in the function body, the "entire function body" is the smallest scope in the list (because the last expression isn't a statement, despite the confusing syntax tree definition).
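
Two of the entries in that list are easy to observe directly. This is only a sketch: PrintOnDrop, LOG and take_log are helpers I am assuming, not something the Reference provides.

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical helper: remembers drop order.
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

struct PrintOnDrop(&'static str);

impl Drop for PrintOnDrop {
    fn drop(&mut self) {
        println!("drop({})", self.0);
        LOG.with(|log| log.borrow_mut().push(self.0));
    }
}

// Returns the recorded drop order and clears the log.
fn take_log() -> Vec<&'static str> {
    LOG.with(|log| std::mem::take(&mut *log.borrow_mut()))
}

fn temporary_scopes() {
    let _local = PrintOnDrop("local");

    // "A statement": the temporary is dropped at the `;` of this `let`.
    let _len = PrintOnDrop("statement temporary").0.len();

    // "The condition expression of an if": the temporary is dropped as soon
    // as the condition has been evaluated, before either branch runs.
    if PrintOnDrop("if condition").0.is_empty() {
        let _then = PrintOnDrop("then local");
    } else {
        let _else = PrintOnDrop("else local");
    }
}

fn main() {
    temporary_scopes();
    println!("{:?}", take_log());
}
```

In both cases there is a listed scope smaller than the function body, so the temporary never gets as far as outliving `_local`.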

The final expression in Statements must be an ExpressionWithoutBlock. A match expression is obviously not an ExpressionWithoutBlock; it is an ExpressionWithBlock, so such a match expression must fall under

Statement

  • ExpressionStatement
  • ExpressionWithBlock ; ?

with the trailing semicolon omitted. So the match expression is a statement anyway.

The match expression indeed parses as the "Statement -> ExpressionStatement -> ExpressionWithBlock ;?" BNF production, but it is not actually a statement because there is a special exception about this. The last ExpressionStatement in a block isn't a statement when it doesn't end with a semicolon. It then becomes the final operand expression. A statement is, confusingly, not the same thing as "Statement" in the BNF syntax.

The reference is indeed confusing about this. In fact I don't think it actually states this exception unambiguously anywhere. It should be improved.

It would also be useful to either refactor the BNF syntax or rename the "Statement" symbol there to remove the confusion where a Statement isn't actually a statement.

Where is the formal wording in the Rust Reference that says so? After all, a statement can be an expression statement, as described in Statements - The Rust Reference:

Rust has two kinds of statement: declaration statements and expression statements.

There isn't, as far as I can tell (as I have already said).

All right. Even if we agree that the final expression is not a statement, that only tells us that the smallest scope containing the expression is

  • The entire function body.

which is also the containing scope of the local variables. You said

The temporary in question isn't part of a statement, and hence its drop scope is the entire function:

However, the entire function body (scope) is different from the entire function scope. Or do you mean that "the entire function body" is not the function body block, but rather the entire function scope?

IMO, super let seems to make the scope of a temporary object more complex. The resulting scope of the temporary no longer seems intuitive, judging by the introduction in Temporary lifetimes - HackMD.

It's pretty intuitive and simple if you think about it. It just really seems that you are reading manuals the way a compiler might read Rust code.

Like this:

  • Forget common sense… trying… trying… impossible.
  • SUDO forget common sense… attempting… attempting… done.
  • Disable imagination… disabling… disabling… finished.
  • Turn off intuition… trying… trying… doesn't work.
  • SUDO turn off intuition… attempting… attempting… done.
  • NOW let's read the documentation.

That's the wrong way of doing it. Compilers don't have imagination, intuition, or common sense, but humans do (well, they are supposed to have these, at least). It's wrong to assume that a compiler would use common sense when it "reads" your program, but it's equally wrong to assume that a human wouldn't.

Forget "last expression in a function". That's a red herring. Let's consider the following example without such issues:

fn main() {
    let foo = {
        let _local_var = PrintOnDrop("local var");
        match PrintOnDrop("Matched value in final expression") {
            // Dropped once the condition has been evaluated
            _ if PrintOnDrop("guard condition").0 == "" => "Foo",
            _ => "Bar",
        }
    };
    println!("foo: {foo}");
}

It works in the obvious way:

guard condition
local var
Matched value in final expression
foo: Bar

Is it confusing? No, it's not (at least for me): the guard is dropped where the expression ends, at the closest closing semicolon, more or less. Like in C++.

But Rust is an expression language, not a statement language. That means we can take the internals and put them into a function:

fn internal_function() -> &'static str {
    let _local_var = PrintOnDrop("local var");
    match PrintOnDrop("Matched value in final expression") {
        // Dropped once the condition has been evaluated
        _ if PrintOnDrop("guard condition").0 == "" => "Foo",
        _ => "Bar",
    }
}

fn main() {
    let foo = internal_function();
    println!("foo: {foo}");
}

It still works in the same way. As it should. Why would the same code moved from one place to another behave differently?
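
Here is a sketch that checks the "same code, same drop order" claim mechanically. PrintOnDrop, LOG, take_log and in_block are my own scaffolding, assumed for illustration. The comparison should hold on any edition; only the concrete order differs (the Rust 2024 edition, as far as I know, drops tail-expression temporaries before the block's locals, in both positions).

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical helper: remembers drop order.
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

struct PrintOnDrop(&'static str);

impl Drop for PrintOnDrop {
    fn drop(&mut self) {
        LOG.with(|log| log.borrow_mut().push(self.0));
    }
}

// Returns the recorded drop order and clears the log.
fn take_log() -> Vec<&'static str> {
    LOG.with(|log| std::mem::take(&mut *log.borrow_mut()))
}

// The match is the tail expression of a block inside a `let` statement.
fn in_block() -> &'static str {
    let foo = {
        let _local_var = PrintOnDrop("local var");
        match PrintOnDrop("Matched value in final expression") {
            _ if PrintOnDrop("guard condition").0 == "" => "Foo",
            _ => "Bar",
        }
    };
    foo
}

// The same code, moved so the match is the tail expression of a function body.
fn internal_function() -> &'static str {
    let _local_var = PrintOnDrop("local var");
    match PrintOnDrop("Matched value in final expression") {
        _ if PrintOnDrop("guard condition").0 == "" => "Foo",
        _ => "Bar",
    }
}

fn main() {
    let a = in_block();
    let block_order = take_log();
    let b = internal_function();
    let fn_order = take_log();
    println!("{a}: {block_order:?}");
    println!("{b}: {fn_order:?}");
    println!("same order: {}", block_order == fn_order);
}
```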

Now, if you phrased the situation like "yes, these explanations make sense and everything works intuitively, but the documentation doesn't correctly explain the situation", then I might agree with you.

Rust documentation is, famously, not yet finished and suggestions about improvements are welcome.

But you claim that it's not a documentation issue but rather an issue with Rust's definition. That's strange: Rust behaves in a way that any normal human (with common sense, imagination and intuition not forcibly disabled) would expect it to behave. It would be strange to see a temporary variable "dropped before the semicolon", and it would be strange to see it dropped differently when the code is refactored.

Now, if you look at this example:

fn get_drop(str: &'static str) -> Option<PrintOnDrop> {
    Some(PrintOnDrop(str))
}

fn main() {
    println!("1");
    if get_drop("first").is_some() && get_drop("1.5").is_some() && let None = get_drop("last") {
        println!("2");
    } else {
        println!("2");
    }
    println!("3");
}

It produces:

1
Dropped! 1.5
Dropped! last
2
Dropped! first
3

Which is really not intuitive… and that's why that construct is not stabilized. Not because it doesn't match the documentation, but because the way it's behaving doesn't make any sense to a normal human.

Please try to distinguish cases where Rust behaves strangely and cases where documentation doesn't adequately describe perfectly sane behavior.

These are different things (even if distinguishing between them requires imagination, intuition and common sense… all those qualities which compilers lack… but humans are not compilers!)

I just asked why the temporary created in the tail expression of the function body is dropped after the local variable in that same function body. I tried to find the relevant rule in the Rust Reference, which has no normative wording for it, only a note.