Blog Post: Breaking News: Non-Lexical Lifetimes Arrives for Everyone

I wrote a post that attempts to enumerate the cases where NLL is going to break your code!

Enjoy! Feedback welcome!

15 Likes

Really interesting article, thank you :slight_smile:

Few comments:

unsafe {
    let r = Reader::new(ManuallyDrop::take(&mut self.file));
    std::mem::forget(self);
    r
}

Using both mem::forget and ManuallyDrop is a code smell; In this case it feels like the intended code was

unsafe {
    let r = Reader::new(ptr::read(&mut self.file as *mut _ as *const _));
    std::mem::forget(self);
    r
}
  • which is not exception safe: if Reader::new() panics, there will be a double drop of self.file
unsafe {
    let at_file: *mut _ = &mut self.file;
    mem::forget(self);
    Reader::new(ptr::read(at_file as *const _))
}

The case with the data race on the discriminant of layout-optimized type is quite curious, to be honest:

  • abstraction-wise the code is perfectly fine

    • for instance, with [single-threaded] interior mutability on the innards there is no issue, or when the writes are atomic, since in both cases they are not allowed to write zero (e.g., hypothetical AtomicNonZero... type), so that x's discriminant remains well-defined as Some(_)
  • and the problem lies implementation/machine-wise;

This is quite funny, since it is usually the other way around (code being UB despite it working under the current implementation) :upside_down_face:

2 Likes

fixed as_reader, thanks! (I admit that I may have taken a short-cut in how I handled *mut _ and *const _ in my adaptation of your fix to as_reader. Feel free to point out why I cannot go straight to *const &mut F...)

As for the data-race on the discriminant: Isn't it still a problem to have a read of the non-zero word at the same instant that the other thread writes a different non-zero word? It certainly sounds like a data-race to me, at least under the perhaps simplified abstraction that I keep in my own head.

1 Like

AFAIK, as long as the pointer comes from a &mut _, it is fine.

It is a data race unless atomic operations are used to perform both the read and the write. With a hypothetical AtomicNonZeroUsize, I expected the following code to be fine ...

fn main ()
{
    use ::std::{
        mem,
        num::NonZeroUsize,
        sync::atomic::Ordering::*,
        thread,
        time::Duration,
    };
    use sync::atomic::AtomicNonZeroUsize;

    assert_eq!(
        dbg!(mem::size_of::<AtomicNonZeroUsize>()),
        dbg!(mem::size_of::<Option<AtomicNonZeroUsize>>()),
    );
    
    let ref x = Some(AtomicNonZeroUsize::new(NonZeroUsize::new(42).unwrap()));
    
    match *x {
        | Some(ref inner) => {
            assert!(::core::ptr::eq(
                dbg!(x as *const _ as *const ()),
                dbg!(inner as *const _ as *const ()),
            ));
            ::crossbeam::thread::scope(|scope| {
                scope.spawn(|_| (1 ..= 400).for_each(|n| {
                    inner.store(NonZeroUsize::new(n).unwrap(), Relaxed);
                    thread::sleep(Duration::from_millis(5));
                }));

                scope.spawn(|_| (0 .. 40).for_each(|_|
                    match *x { // READ DISCRIMINANT
                        | Some(_) => {
                            dbg!(inner.load(Relaxed));
                            thread::sleep(Duration::from_millis(50));
                        },
                        | None => panic!("Race condition!"),
                    }
                ));
            }).expect("Some thread panicked");
        },
        | None => unreachable!(),
    }
}

... but now that I come to think about it, the read of the discriminant is not atomic, so it seems indeed unsound; if that's the case, atomic types cannot have niches (cc @RalfJung ?), and there will never be a thing such as AtomicNonZeroUsize.


(In a single-threaded scenario it should be fine though, since the read of a discriminant should have the same semantics as Cell::get).

I agree with this. Atomic types cannot have niches.

3 Likes

Thanks for the post!

Regarding the current as_reader implementation:
I think that ptr::read-ing after the mem::forget is technically UB. The documentation specifically says that it is not guaranteed that pointers remain valid. Instead, perhaps something like:

let file: &mut F = unsafe {
    let f: *const &mut F = &mut self.file;
    std::ptr::read(f)
};
std::mem::forget(self);

Is better? I think this avoids the double-free issue, as the forget is done immediately after the "new" reference is created.

1 Like

Ah you make a good point. I didn't realize that forget didn't specify that pointers to the state in question would remain valid (and in fact I can easily imagine improved versions of the compiler that would corrupt the state in question.)

I think your suggestion looks good, I'm going to try to adapt it.

(I'm also going to check with Miri folks to see if they have plans to try to detect this scenario, since AFAICT they currently do not do so.)

Good point! This complicates everything, sadly :thinking:

For instance, I think your code is UB in that between ptr::read and mem::forget, *self.file is aliased (by file and self) and at least one of them (in this case, both) is pretending to have a unique access (&mut F).

  • (Or is this seen as a reborrow and all is fine?)

Another way around is using *mut:

let file: &mut F = unsafe {
    let raw_file: *mut F = ptr::read(
        &mut self.file // &mut &mut F
            as *mut &mut F
            as *const &mut F
            as *const *mut F
    );
    ::core::mem::forget(self);
    &mut *raw_file
};

Sadly this is quite hard to generalize; maybe if we think of *mut T as MaybeUninit<&mut T> we could have the following generalization:

let inner: Inner = unsafe {
    let raw_inner: MaybeUninit<Inner> = ::core::ptr::read(
        &mut self.inner // &mut Inner
            as *mut Inner
            as *const MaybeUninit<Inner>
    );
    ::core::mem::forget(self);
    raw_inner.assume_init()
};
  • cc @RalfJung: does using MaybeUninit<&mut T> instead of &mut T prevent the aliasing problem?
1 Like

Hmm, interestingly enough an issue regarding this problem has appeared within "real code", (regarding CString):

    /// Bypass "move out of struct which implements [`Drop`] trait" restriction.
    ///
    /// [`Drop`]: ../ops/trait.Drop.html
    fn into_inner(self) -> Box<[u8]> {
        unsafe {
            let result = ptr::read(&self.inner);
            mem::forget(self); // <- Error, breaks uniqueness
            result
        }
    }

Now that the blog post has been updated, I cannot see the wrong as_reader implementation any more. :wink: Does someone have a copy? I'd like to see what you think the problem is.

1 Like

The above issue regarding CString has been fixed using the following pattern:

(General) Solution

// Having
let Wrapper { inner: _, .. } = wrapper;
// this is how to extract `inner` from `wrapper` even when `Wrapper : Drop`
let inner: Inner = unsafe {
    let wrapper_no_drop = ManuallyDrop::new(wrapper);
    ptr::read(&wrapper_no_drop.inner)
};

Thanks for noticing this! I actually realized this myself later, while reading the doc for UnsafeCell, but I hadn't gotten around to posting here :). Aliasing &mut is mentioned there specifically, and UnsafeCell as the "solution"; I believe it should indeed provide another way to solve this issue, I'll try to write code late.

It seems to me though, that the general solution with ManuallyDrop doesn't work. If you have
Inner<'a>(&'a mut RealInner) you would get the exact same aliasing issue as in my code (after the ptr::read and before leaving scope and dropping the ManuallyDrop). The solution of holding a pointer, and converting it to a reference after the mem::forget works, as far as I can tell, but this has to be done "deeply".

The MaybeUninit solution seems like it could work. @RalfJung, which version of the code are you looking for? I believe that the first post here by @Yandros quotes both the original version and his suggestion for improvement; and my first post here suggested an improvement upon that one.

No, mutating & is mentioned there!

There is no legal way to have aliasing &mut, and UnsafeCell makes no difference for &mut. If you find any docs saying anything else, please let me know and I'll see that fixed.

I guess I just don't know which is which, then. But if you can switch to the ManuallyDrop pattern we now use for CString, that should work. :slight_smile:

1 Like

The previous versions can all be accessed via the source code repository for my blog:

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.