Why doesn't this UnsafeCell usage break?

Hi Rustaceans!

I was seeing Jon Gjengset's Crust of Rust on Smart Pointers and Interior Mutability, on the Cell topic, when someone asked what would happen with a wrong implementation. He then intentionally let his Cell::get() method retrieve a reference to its content and implemented a test that held that reference and tried to use it after it was dropped. The objective was to see a SEGFAULT or whatever the effect of trying to use a dangling pointer is. But, to everyone's bewilderment including Jon's, the code worked! :scream:

Here is the video at the exact time, but I've improved the test code here and discovered a few things myself:

  • the Drop is indeed being called on the first content
  • I intentionally set a second String bigger than the first one so it would be required to allocate more space for it (he says in the video the deallocator happened not to have deallocated the first one yet, but for me, it doesn't make sense)
  • bizarrely, the reference returned by get() after the second set() is the same as the previous one

Well, thinking deeper about it, since get() returned a reference with the exact same address as before (the value field of its struct, I suppose) and its inner content has surely changed, does this mean that the stack part of the String (pointer to heap, length, and capacity) was reused into the same UnsafeCell field space? It is repr(transparent) after all... :bulb:
(Awesome! I came to this conclusion while trying to conclude this post :stuck_out_tongue_winking_eye:)

So, does this mean Jon would not be able to demonstrate it breaking, at all?
EDIT: Humm, if that doesn't really break this Cell, why std's Cell can't return a reference to its content? :thinking:

Thanks.

use std::cell::UnsafeCell;

pub struct Cell<T> {
    value: UnsafeCell<T>,
}

impl<T> Cell<T> {
    pub fn new(value: T) -> Self {
        Cell {
            value: UnsafeCell::new(value),
        }
    }

    pub fn set(&self, value: T) {
        unsafe { *self.value.get() = value };
    }

    // this is WRONG, intentionally.
    pub fn get(&self) -> &T {
        unsafe { &*self.value.get() }
    }
}

#[test]
fn why_this_works() {
    #[derive(Debug)]
    struct DropString(String);
    impl Drop for DropString {
        fn drop(&mut self) {
            println!(">>  drop: {}", self.0);
        }
    }

    let x = Cell::new(DropString(String::from("first")));
    let s = x.get();
    println!("addr of s    : {s:p}");
    x.set(DropString(String::from("oops, a new value!")));
    println!("addr of get(): {:p}", x.get());
    println!("print {s:?}");
}

It prints:

cargo test -- --nocapture

running 1 test
addr of s    : 0x16d6b6598
>>  drop: first
addr of get(): 0x16d6b6598
print DropString("oops, a new value!")
>>  drop: oops, a new value!
test why_this_works ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

(I didn't watch the clip.)

You're not trying to print the dropped String. There is still UB from having a live shared reference (no longer involving UnsafeCell) when the DropString location is overwritten. (Run with Miri -- e.g. under Tools in the playground, top right -- to see an UB error.)

You can turn it into trying to print the dropped String like so...

    let s = &*x.get().0;

...and thus even if you ignore the immediate UB, the Cell is clearly broken.

5 Likes

Wow, now I see it! Even though the test passes, it is triggering UB!! :scream:

error: Undefined Behavior: constructing invalid value: encountered a dangling reference (use-after-free)

And we can see it breaking by deref'ing the return!

print "\0\0\0\0\0"

Amazing response, as always... Thank you very much @quinedot!

Another cool thing @quinedot, if I change it back to let s = x.get() (which seemed to work), Miri reports another error, a very interesting one:

error: Undefined Behavior: trying to retag from <2901> for SharedReadOnly permission at alloc1406[0x0], but that tag does not exist in the borrow stack for this location
    --> /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/mod.rs:2392:1
     |
2392 | fmt_refs! { Debug, Display, Octal, Binary, LowerHex, UpperHex, LowerExp, UpperExp }
     | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     | |
     | trying to retag from <2901> for SharedReadOnly permission at alloc1406[0x0], but that tag does not exist in the borrow stack for this location
     | this error occurs as part of retag at alloc1406[0x0..0x18]
     |
     = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
     = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <2901> was created by a SharedReadOnly retag at offsets [0x0..0x18]
    --> src/main.rs:38:13
     |
38   |     let s = x.get();
     |             ^^^^^^^
help: <2901> was later invalidated at offsets [0x0..0x18] by a Unique retag
    --> src/main.rs:15:18
     |
15   |         unsafe { *self.value.get() = value };
     |                  ^^^^^^^^^^^^^^^^^
     = note: BACKTRACE (of the first span):
     = note: inside `<&why_this_works::DropString as std::fmt::Debug>::fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/mod.rs:2382:71: 2382:78
     = note: inside `core::fmt::rt::Argument::<'_>::fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/rt.rs:177:76: 177:95
     = note: inside `std::fmt::write` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/mod.rs:1178:21: 1178:44
     = note: inside `<std::io::StdoutLock<'_> as std::io::Write>::write_fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/mod.rs:1823:15: 1823:43
     = note: inside `<&std::io::Stdout as std::io::Write>::write_fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/stdio.rs:786:9: 786:36
     = note: inside `<std::io::Stdout as std::io::Write>::write_fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/stdio.rs:760:9: 760:33
     = note: inside `std::io::stdio::print_to::<std::io::Stdout>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/stdio.rs:1116:21: 1116:47
     = note: inside `std::io::_print` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/stdio.rs:1226:5: 1226:37
note: inside `why_this_works`
    --> src/main.rs:42:5
     |
42   |     println!("print {s:?}");
     |     ^^^^^^^^^^^^^^^^^^^^^^^
note: inside `main`
    --> src/main.rs:25:5
     |
25   |     why_this_works();
     |     ^^^^^^^^^^^^^^^^
     = note: this error originates in the macro `fmt_refs` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

error: aborting due to 1 previous error

It's my first time running Miri, but I was surprised how it pinpointed the precise unsafe block that originated the error! I have minimal experience with unsafe, but I know debugging them can be a can of worms, and Miri seemed reassuring... :relieved:

Yeah, that's the "you read through a shared reference (with no UnsafeCell)[1] but the memory you read from it was written to after you obtained it" UB.


  1. the returned &DropString involves no UnsafeCell ↩︎

2 Likes

When miri finds the UB, it can be very helpful.
One thing to keep in mind though is that it can not always detect unsoundness/UB[1]. So if a particular code path is not executed during the miri run, it could contain UB that is missed by miri.


  1. If it could, we could just make the compiler error in those cases, making the language "100% safe" ↩︎

4 Likes

You are in epistemic confusion.

Undefined Behavior means "anything may happen".

It does NOT mean "anything WILL happen".

It does NOT mean "something IN PARTICULAR will happen".

It does NOT mean "guaranteed crash".

It does NOT mean "guaranteed incorrect results".

It means what it means. Which is: anything MAY happen.

2 Likes

That reminded me of a talk by Kevlin Henney :smile:

1 Like

And, really importantly, UB can mean "this time, the exact outcome you wanted happened". That's a big part of what makes UB so painful - it's completely legitimate for the system to do what you intended, even though the code contains UB, right up until something changes and the compiler comes up with a new "this is what I'm going to interpret this code as".

2 Likes

Yes, intermittent apparent "correct" behavior is the worst part of it all.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.