Drop semantics and temporary value lifetimes in while let

Back when I was first learning rust, I filed this issue with the book. In short:

while let Ok(job) = receiver.lock().unwrap().recv() {
    // do something time-consuming with job
}

looks nice, but if we want to not hold the lock during the body of the while, we need to instead write:

loop {
    let job = receiver.lock().unwrap().recv().unwrap();
    …
}

I now understand that the temporary values in the expression which is bound by while let are scoped to the end of the subsequent block. That is, in

while let PAT = EXPR { … }

all the values in EXPR live until the closing brace.

I've read Expressions - The Rust Reference

When a temporary value expression is being created that is assigned into a let declaration, however, the temporary is created with the lifetime of the enclosing block instead, as using the enclosing let declaration would be a guaranteed error (since a pointer to the temporary would be stored into a variable, but the temporary would be freed before the variable could be used). The compiler uses simple syntactic rules to decide which values are being assigned into a let binding, and therefore deserve a longer temporary lifetime.

But I don't understand. In the loop version, if the temporaries (notably the guard) in

let job = receiver.lock().unwrap().recv().unwrap();

can be dropped as soon as the statement ends, yet allow job to be accessed afterwards, it seems like in the context of

while let Ok(job) = receiver.lock().unwrap().recv() {

the temporaries (and guard) could be dropped before entering the block.

What am I missing here?

P. S. I understand that according to the predicate pattern loops docs

while let PAT = EXPR {
    /* loop body */
}

is equivalent to

loop {
    match EXPR {
        PAT => { /* loop body */ },
        _ => break,
    }
}

but my question remains. It seems like temporaries involved in EXPR, but not owned by bindings in PAT could be dropped before /* loop body */.

1 Like

I'm surprised to see no replies here. Is this more appropriate for the internals forum?

No, they can't be dropped as soon as the statement ends, because if they were then the lock guard would be released meaning that the mutex would be allowed to aquire a new lock which creates aliasing unique references. One from this loop and another from the new mutex lock.
The current behavior is correct.

Also a match statement doesn't operate on values, but on places (a very subtle difference), this allows things in EXPR to be alive for the entire match statement. And allows this behavior.

let x : &[_] = &[0, 1, 2];
match *x {
    _ => ()
}

Even though *x is !Sized so it can't be on the stack!

No, the internals forum is for design work of Rust itself or about the compiler, not general questions about the language.

Thanks for the reply!

How is that different than what happens in the

loop {
    let job = receiver.lock().unwrap().recv().unwrap();
    …
}

formulation? Once the execution hits finishes the let statement, the MutexGuard is dropped, right?

Where can I learn more about places and how they differ from values?

My question was originally inspired by this comment:

I'm more and more convinced that locks (or anything else where drop timing matters) should always have explicit drop calls, because the implicit drop mechanics seem to be pretty varied and aren't properly documented anywhere.

I'm not in favor of explicit drop calls and I wanted to reply that the drop mechanics are clear and well-documented, but I admit I'm finding this case to be quite subtle and hard to find clear docs on. What would you say to that?

No, the mutex guard is not dropped. It will be dropped at the end of the enclosing scope.

I'm not sure, I learned about them from reading other people's posts.

Yes, this is quite subtle and that's why people didn't answer this question.

Maybe I'm not being clear. I know that's the case with the while let formulation, but with a loop:

loop {
    let job = receiver.lock().unwrap().recv().unwrap();
    /* mutex guard is dropped here, right? */
    job.call_box();
}

That way other threads can access the receiver and accept work while job.call_box() runs, yes? Or am I misunderstanding?

Nope, the lock will not be released until after call_box. You can check this by creating two threads, one that holds a mutex lock forever and another one that tries to acquire a mutex lock after the first thread, and once it acquires the lock prints something out. You will observe that nothing will be printed.

What? Is this excerpt (based on my submission) from the book incorrect?

By using loop instead and acquiring the lock and a job within the block rather than outside it, the MutexGuard returned from the lock method is dropped as soon as the let job statement ends. This ensures that the lock is held during the call to recv , but it is released before the call to job.call_box() , allowing multiple requests to be serviced concurrently.

Or am I really missing something? I definitely remember the while let and loop versions behaving differently.

It seems that I was wrong, sorry about that. Your call_box code just fails to compile with

error[E0716]: temporary value dropped while borrowed
use std::sync::{Arc, Mutex};

fn main() {
  let mutex = Arc::new(Mutex::new("Hello World".to_string()));
  
  let mutex_clone = Arc::clone(&mutex);
  
  let foo: &str = mutex.lock().unwrap().as_str();
  
  std::thread::spawn(move || {
    std::thread::sleep_ms(100);
      mutex_clone.lock();
      println!("Mutex Lock was acquired");
  });

  println!("{}", foo);
  
  loop {
      std::thread::sleep_ms(100);
  }
}

So, yes the temporary is dropped right after the let

Changing to a match gives this error

error[E0597]: `mutex` does not live long enough
use std::sync::{Arc, Mutex};

fn main() {
    let mutex = Arc::new(Mutex::new("Hello World".to_string()));

    let mutex_clone = Arc::clone(&mutex);

    match mutex.lock().unwrap().as_str() {
        foo => {
            std::thread::spawn(move || {
                std::thread::sleep_ms(100);
                mutex_clone.lock();
                println!("Mutex Lock was acquired");
            });

            println!("{}", foo);

            loop {
                std::thread::sleep_ms(100);
            }
        }
    }
}

But if you don't use the result of the mutex.lock().unwrap().foo, then it compiles because nothing is used so it can be dropped immediately.

(I should really check my work before posting next time)

Note that if you call a function that does not depend on the lifetime of the MutexGuard (like String::len), then that also works even if it is used after the guard is dropped because there is no lifetime dependence.

Both while let and loop versions seem to compile fine in the playground for me. Are you seeing something different?

I see, ok so what's happening is like my last little note about String::len

mspc::Recieber::recv gives an owned value, so returned value's lifetime is independent of the mutex guard. The guard is dropped immediately after the recv finishes, and the mutex can be locked again from a different thread. Note that this only works because recv gives back an owned value, if it returned a reference then it wouldn't compile.

It seems to affect everything "match related" (playground).
I found this issue from 2016, I don't know if a RFC was ever made.

2 Likes

Yeah, that makes sense to me, but it doesn't explain why the while let and loop cases should be different. After looking at the issue @leudz linked (thanks so much for that!), I'm inclined to think it's actually a bug. But it would apparently require an RFC to fix since it would indeed change drop semantics.

Yes that does seem to be the case.

while let $pat = $expr
    $block

is the same as

loop {
    if let $pat = $expr
        $block
    else {
        break;
    }
}

So this is not really about while let and loop being different; it is about let $pat = $expr and if let $pat = $expr (or equivalently a match binding) being different / having different semantics.

So let's go back to a more minimal example, getting rid of loops since they don't play any role:

match vs. let

First of all, let's set up the stuff that replicates a mutex guard behavior:

  • mutex.lock() constructs a value with drop glue, i.e., that mem::needs_drop():

    #[derive(Default)] // default() <=> mutex.lock()
    struct WithDropGlue;
    impl Drop for WithDropGlue { fn drop (&mut self) {} }
    
  • value.recv() borrows this value to, in turn, create an "owned thing", i.e., something that is : 'static (hence value could be dropped while this thing exists without causing any issue whatsoever):

    impl HasDropGlue {
        fn recv (self: &'_ Self) -> () {}
    }
    

match case

fn foo ()
{
    //    |-- mutex.lock() ----|
    match HasDropGlue::default().recv() { _smth => {
        match_body();
    }}
}

gives the following MIR (with panic = abort to remove unwind paths):

The temporary is indeed only dropped at the end of the whole match expression

let

fn foo ()
{
    {
        //          |-- mutex.lock() ----|
        let _smth = HasDropGlue::default().recv();
        body();
    }
}

The temporary is dropped right after the let assignment

Summary

fn foo ()
{
    //    |-- mutex.lock() ----|
    match HasDropGlue::default().recv() { _smth => {
        match_body();
    }} // drop::<HasDropGlue>(__guard__)
}

fn foo ()
{
    {
        //          |-- mutex.lock() ----|
        let _smth = HasDropGlue::default().recv();
        // drop::<HasDropGlue>(__guard__)
        body();
    }
}

what if revc() had returned something tied to the '_ input lifetime ?

Then the the match version would still work, with a temporary held alive as long as needed; whereas the classic let version would fail to compile with a "temporary does not live long enough" error.

  • Note that when raw pointers are involved, such as with CString::new().as_ptr(), Rust does not see the borrow and can thus lead to a use-after-free unsoundness.

What to make of all this

This is one of the "quirks" of Rust, which proves that if explicit RAII semantics are wanted (such as with lock guards), then anonymous temporaries should be avoided. You may submit an RFC to change this, to get a change of temporaries drop placement, but know that:

  1. this would require, at least, a new edition

  2. Having program semantics change with an edition change would be very weird (why should a line in Cargo.toml change the semantics of a program without a compilation error? c.f. CString::new().as_ptr())

Thus this behavior cannot be changed.

On the other hand, I can see the issue of this being a particularly obscure thing for Rust, which goes against its "explicit semantics" philosphy: I thus think that filing an RFC for a warning lint against temporaries with drop glue in match would be a great thing to do, and if it fails, the lint could be added to clippy instead.

  • Actually, it would be great if, in a similar fashion to #[must_use] types, we could have #[time_sensitive_drop] or #[must_drop_explicitely] types.
6 Likes

I have opened an issue for that for clippy. In fact, I also started the implementation and opened a pull request. At the time, the implementation was almost complete, but I did not have the energy to explain the motivation behind this lint and to finish the implementation. Feel free to take my work as a starting point.

If I understand you correctly, this is the same as what is called a "linear type system". There is an interesting blog post explaining the problems that must be overcome to implement this in Rust.

1 Like

Well, I just want that functionality as warning lints, in the same vein that #[must_use] is; hence it wouldn't require true linear types. In other words, #[must_use] is more like #[should_use], and I am suggesting a #[should_drop_explicitely] added lint attribute.

Nice! I'll try to see if I have time to work on the lint thing :slight_smile:

2 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.