Requesting help reviewing unsafe code

Hey there,

I've been working on a UI library, which has an API that requires me to add elements to a HashMap while returning references to its elements. (This is used to implement a version of React's use_state.)

In order to accomplish that, I've written a wrapper type that allows me to do that. A slightly modified version with project-specific types is here.

While I'm reasonably confident in the code and it executes in Miri successfully, this is my first serious use of unsafe and I'd really appreciate some second opinions. Is this code, specifically the method get_or_insert_with, safe?

Thanks in advance!


Note: I have edited the example to add additional safety invariants in the comments.

I'm not sure it is sound. Rewriting the method without unsafe (Playground) leads to the following error:

error[E0499]: cannot borrow `wrapper` as mutable more than once at a time
  --> src/main.rs:20:16
   |
19 |     let val1 = wrapper.get_or_insert_with("1", || Rc::new(RefCell::new(1)));
   |                ------- first mutable borrow occurs here
20 |     let val2 = wrapper.get_or_insert_with("2", || Rc::new(RefCell::new(2)));
   |                ^^^^^^^ second mutable borrow occurs here
...
23 |     println!("{}", val1.borrow());
   |                    ---- first borrow later used here

And indeed, the method returns a borrow over the lifetime 'a generated from a mutable borrow of the same lifetime, which usually means that the mutable borrow must still be considered to be in use.


I do realize that the point of using an Rc is to provide an extra level of indirection in case the HashMap moves its elements around. (Why not simply use a Box for that, by the way?) However, I don't believe the borrow derived from a smart pointer can be used after the pointer itself will have moved, so unfortunately, the lifetime error does seem legitimate to me.

According to the contract of Rc::as_ptr, the pointer derived from an Rc is valid so long as its underlying strong_count is one or greater, which is not affected by whether Rc instances are moved, only if they are Dropped. Coercing to a reference might introduce unsoundness, but I don't see why it would. However, I don't know the rules for these coercions so I might be missing something.

As for why I didn't use Box, it does not have an equivalent as_ptr method. I think the reason it doesn't is because Box has a DerefMut implementation, which would make it rather easy to overlap mut and non-mut borrows, leading to a rather hard-to-use API. An immutable version of Box is something I might look into in the future but I'll take the slight inefficiency of Rc for now.

You can use the into_raw method for boxes.

2 Likes

That is correct, although I would only consider this true for the literal raw pointer returned by as_ptr(), and not necessarily for any references derived from it. The compiler is allowed to make fewer assumptions about raw pointers than it is allowed to make about references (e.g. interior mutability works with raw pointers just as well as with UnsafeCell).

And regular references are very sensitive, so to speak, to moving stuff around. There's also the question of provenance: what is the provenance of the reference you finally obtain? Does it come from the Rc, or does it come from its pointee? If it's the former, the the code is not sound because the reference can be held while the object providing its provenance moves. If, however, it's the latter, i.e. it is considered to come directly from the pointed object, then it should be fine.

I'm not competent enough to decide which of the two
scenarios is actually true. Maybe @RalfJung can chime
in?

1 Like

Thanks for the suggestion, but that doesn't really work in my case because I anticipate this method being called more than once per key, meaning I need multiple references at a time, but I can only call into_raw once. I think the better approach for me is to wrap a Box with only a Deref impl and provide my own as_ptr impl. (However, that makes the unsafe code a bit more suspect, so not sure if I should be going that direction for what's probably only a modest increase in performance/memory efficiency.)

Fun results and a UAF in Miri with

     pub(crate) fn get_or_insert_with(&mut self, key: &'static str, value: impl FnOnce() -> Rc<RefCell<u32>>) -> &'a RefCell<u32> {
+        self.inner.remove("1");
         let rc_elem = self.inner.entry(key).or_insert_with(value);

Edit: I see now I violated your comment on the implementation constraints. I'll keep pondering; the thing I was really poking at is that you're creating references that last over multiple &mut borrows, and I believe each such borrow invalidates the previous references.

2 Likes

Haha, I tried that exact thing when testing this :smile:

Here's a variation more directly related to what I was thinking (stacked borrow violation in Miri). It's not a slam dunk still, you could tighten your restriction to not allow generating any mut references to the inner values whatsoever.

1 Like

You should be able to use this type instead of Rc.

struct ShareBox<T> {
    value: *mut T,
    _marker: PhantomData<T>,
}
impl<T> ShareBox<T> {
    fn new(value: T) -> Self {
        Self {
            value: Box::into_raw(Box::new(value)),
            _marker: PhantomData,
        }
    }
    /// safety: Caller must ensure that the box is destroyed after the end of
    /// the provided lifetime 'a.
    unsafe fn get_ref<'a>(&self) -> &'a T {
        &*self.value
    }
}
impl<T> Drop for ShareBox<T> {
    fn drop(&mut self) {
        unsafe {
            drop(Box::from_raw(self.value));
        }
    }
}

full example

3 Likes

Good catch! I've modified the playground link to contain that additional safety invariant. I wasn't indending on creating such references, but that should be documented.

Thanks for this example, I really appreciate it! It inspired me to create a variant that uses a bit less unsafe:

pub struct SharedBox<T> {
    inner: Box<T>
}

impl<T> SharedBox<T> {
    pub fn new(val: T) -> Self {
        Self {
            inner: Box::new(val)
        }
    }
    
    /// Returns a pointer valid for the lifetime of `shared`.
    pub fn as_ptr(shared: &Self) -> *const T {
        std::ptr::addr_of!(*shared.inner)
    }
}

impl<T> Deref for SharedBox<T> {
    type Target = T;
    
    fn deref(&self) -> &T {
        &*self.inner
    }
}

full usage here

Some would argue that even one unsafe block is one too many, so I'm trying to write as little unsafe as possible. Hopefully I didn't add some subtle UB somewhere :sweat_smile:

The safety gets a lot more complicated if you change inner to hold a Box<T> rather than a raw pointer because any kind of access to a Box implicitly asserts unique access. I would definitely use a raw pointer to be on the safe side.

The number of unsafe blocks does not directly correspond to how clear the soundness is.

2 Likes

Hmm interesting, I'm out of my depth here so I'll take your word for it. Thanks for the guidance. Out of curiosity, what do you mean by "unique access" here? Can't multiple immutable references and const pointers refer to the same block of memory owned by Box?

They can, but they must be created from the Box, and you cannot access the Box until you are done using the references. Touching the box makes them invalid.

2 Likes

I.e. it is exactly the same concern I had about using Rc, but with Box.

So does creating a reference from a pointer derived from Box not count as creating it from Box itself? Is there a place where these provenance rules are documented? (And would this situation change if as_ptr were a method of Box itself?)

The created-from relation generally forms a tree, but references created from a reference created from a Box is generally like a reference created from a Box. I'm mostly worried about the box being moved around by the hash map, as moves assert uniqueness.

The rules are described in the stacked borrows model. One description of it can be found here.

2 Likes

Ah, I see. By using into_raw we don't move the Box after construction, but instead a pointer, which the compiler assumes less things about leading to valid code. That's very clever (and subtle)! Thanks for walking me through that. By using your variant of Shared<Box>, is the Wrapper as you adapted in your example safe to use, or are there any lingering concerns? Again, thanks for the explanations!

I don't have any lingering concerns about the version I posted.

1 Like