`RefMut` → `Ref` conversion

I'm a bit confused about the methods on std::cell::Ref. Are there any use cases for the clone, map, and map_split methods? I understand their value for RefMut (since there can only be one mutable borrow, you can change what it points to or split one borrow into multiple references), but for Ref, is there any reason those operations can't already be done after dereferencing? (i.e. ref_cell.borrow().deref().clone() vs ref_cell.borrow().clone().deref()). It seems like both map methods simply call the provided closure, and that clone simply clones the inner reference. Sorry if I'm missing something.

For example, look at the doctests:

Ref::map:

use std::cell::{RefCell, Ref};

let c = RefCell::new((5, 'b'));
let b1: Ref<(u32, char)> = c.borrow();
// Why not: let b2: &u32 = &b1.0;
let b2: Ref<u32> = Ref::map(b1, |t| &t.0);
assert_eq!(*b2, 5);

Ref::map_split:

use std::cell::{Ref, RefCell};

let cell = RefCell::new([1, 2, 3, 4]);
let borrow = cell.borrow();
// Why not: let (begin, end) = borrow.split_at(2);
let (begin, end) = Ref::map_split(borrow, |slice| slice.split_at(2));
assert_eq!(*begin, [1, 2]);
assert_eq!(*end, [3, 4]);

I know that they have to remain for backward compatibility anyway but my question is what is/was their purpose? I can't seem to find anything about it.

Because an object containing a RefCell cannot hand out borrows to things inside the RefCell.

struct Thing(RefCell<HashSet<u32>>);

impl Thing {
    fn foo_ref(&self) -> Ref<'_, HashSet<u32>> {
        self.0.borrow()
    }

    fn foo_borrow(&self) -> &HashSet<u32> {
        // this is impossible to implement.
        // (When would the lock be released?)
    }
}

It follows that, if you want to give a reference to a field of the thing inside a RefCell, you require a way to transform a Ref without turning it into an &T.

struct Thing(RefCell<(HashSet<u32>, Vec<i32>)>);

impl Thing {
    fn foo_ref(&self) -> Ref<'_, HashSet<u32>> {
        Ref::map(self.0.borrow(), |x| &x.0)
    }
}
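To make the return-from-a-method point concrete, here is a self-contained, runnable sketch repeating the second Thing from above (the field types are just placeholders):

```rust
use std::cell::{Ref, RefCell};
use std::collections::HashSet;

struct Thing(RefCell<(HashSet<u32>, Vec<i32>)>);

impl Thing {
    fn foo_ref(&self) -> Ref<'_, HashSet<u32>> {
        // Ref::map keeps the RefCell's borrow flag alive, so the
        // returned Ref is tied to `self`, not to a local variable.
        Ref::map(self.0.borrow(), |x| &x.0)
    }
}

fn main() {
    let thing = Thing(RefCell::new((HashSet::new(), Vec::new())));
    let set: Ref<'_, HashSet<u32>> = thing.foo_ref(); // compiles: borrows from `thing`
    assert!(set.is_empty());
} // `set` dropped here, releasing the RefCell's read lock
```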

To expand on this, using .deref().x() gives you a reference borrowing from the Ref, while .map(|v| v.x()) gives you a reference borrowing from the RefCell.

With Ref::map, since the result is still only borrowing from the RefCell, you can return it from a method like in @ExpHP's code. If you used .deref().x(), the resulting value would be borrowing from the local Ref, which is dropped at the end of the method, so the reference could not be returned.

At runtime, the difference between these two strategies is who is responsible for decrementing the ref-count. If you use Ref::map, the new Ref will have a different value inside, but it'll be the one which decrements the ref-count when dropped. This is why it's safe to send it off - it is self-contained, and will correctly handle the RefCell's ref-count.

If you use .deref().x(), though, the result is only some &T - this is a value, but it is only a value. It won't decrement the ref-count for the RefCell. Thus, to keep safety, it's tied to the original Ref that deref() was called on. That original Ref is the one which will be decrementing the ref-count. This is also why it's not sound to return that &T from a method of a struct owning the RefCell. The Ref is no longer attached to it, and thus there's no way to decrement the ref-count at the correct time when the &T is no longer used.
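Here is a small runnable illustration of who holds the borrow flag (the cell contents are arbitrary):

```rust
use std::cell::{Ref, RefCell};

fn main() {
    let cell = RefCell::new((1u32, 'a'));
    let whole: Ref<(u32, char)> = cell.borrow();
    // `map` consumes `whole`; the new Ref now owns the borrow flag
    // and will decrement the read count when it is dropped.
    let first: Ref<u32> = Ref::map(whole, |t| &t.0);
    assert_eq!(*first, 1);
    assert!(cell.try_borrow_mut().is_err()); // still read-locked
    drop(first); // releases the borrow
    assert!(cell.try_borrow_mut().is_ok());
}
```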


Thank you, I see; it's a matter of ownership. Ref is unlike a regular reference in that it actually affects future borrows from the RefCell when it's dropped (very similar to Rc). Is that correct?

Also, in that case, shouldn't there exist a way to map a RefMut into a Ref? Let's say I have a RefCell<HashMap<u8, u8>> and I call:

let ref_mut = RefMut::map(hash_map.borrow_mut(), |m| m.entry(key).or_default());

Shouldn't there be a way to convert that into a Ref? After a quick look I found a PR for that but it seems to have been abandoned?


Shouldn't there be a way to convert that into a Ref ? After a quick look I found a PR for that but it seems to have been abandoned?

This is valid in this particular case, but not in the general case. I think the discussion in that RFC pull request mentions it, but an immutable borrow derived from a mutable borrow keeps the thing it's borrowed from "mutably" borrowed.

For instance, if I write a method like the following:

struct X { }

impl X {
    fn method(&mut self) -> &u32 { ... }
}
fn main() {
    let mut x = X { };
    let y = x.method();
}

Rust will count x as mutably (not immutably) borrowed for as long as y is kept around.

I believe the reason for this is interior mutability - if I have a &mut u32, I can wrap it in a Mutex or another RefCell, and then pass out an immutable reference to it. The resulting reference being immutable doesn't mean that the whole borrow is immutable.
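This can be demonstrated on stable Rust with Cell::from_mut:

```rust
use std::cell::Cell;

fn main() {
    let mut x: u32 = 1;
    // Wrap an exclusive reference so it can be mutated through shared refs.
    let cell: &Cell<u32> = Cell::from_mut(&mut x);
    let alias: &Cell<u32> = cell; // a *shared* reference...
    alias.set(2);                 // ...that still mutates the value
    assert_eq!(cell.get(), 2);
}
```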


There's a concrete example of an (&mut T) -> &U mapping being unsound at https://github.com/rust-lang/rust/pull/57401#discussion_r246515764, from that pull request. Or at least, I think it would make this unsound?

If the method existed and returned a Ref, you could write:

let ref_mut = RefMut::map(hash_map.borrow_mut(), |m| m.entry(key).or_default());
let so_a_ref = ref_mut.to_ref(|mutable_ref| Cell::from_mut(mutable_ref));
// since so_a_ref is now a `Ref` and not a `RefMut`, we can make new refs:
let second_ref = Ref::map(hash_map.borrow(), |m| m.get(&key).unwrap());
let val = &second_ref.some_method_borrowing_internally();
let val2 = so_a_ref.set(new_value); // calling Cell::set
// now `val` is totally invalid!

@daboross I agree with the first part of your comment but as for the second part, I'm not 100% convinced. Specifically I think the line: ref_mut.to_ref(|mutable_ref| Cell::from_mut(mutable_ref)) would not compile since the closure still takes a mutable reference to the hash_map. However there are plenty of closures that don't need to do this, such as simply |m| m.get(key) or even |m| m.

In other words what I am saying is that the closure should be given an immutable reference; there should be no closure that "converts" a mutable reference to an immutable one:

RefMut::map(ref_mut, |m| m.entry(key).or_default()).downgrade();

Whatever mutable things you need to do can be done in the existing ref_mut.map() with a Fn(&mut T) -> &mut U, strictly before converting to a Ref; then you could map further with the existing ref.map() with a Fn(&T) -> &U. I can't think of any valid reason you would ever need to have Fn(&mut T) -> &U.

All of the breaking examples in the PR rely on a function that takes a mutable reference and returns an immutable one (and as for this example, I think the author is mistaken, since .borrow().lock().unwrap() would give you an immutable reference). If we trash the downgrade_map method (to_ref in your example), I think this is sound. The difference this makes is that we say: if you are given a mutable reference, whatever you give back still needs to be a mutable reference; this takes care of the Cell::from_mut issue. Let me know if you have another example of unsound behavior or if you think these examples are still unsound.

Nope, Mutex::lock gives you an exclusive reference. And that exclusive reference would alias the shared reference from the downgrade. So we would need to hold on to an exclusive reference.
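You can see that the guard is exclusive even though lock takes &self:

```rust
use std::sync::Mutex;

fn main() {
    let m = Mutex::new(0);
    {
        // lock() takes &self, but the guard hands out &mut T via DerefMut.
        let mut guard = m.lock().unwrap();
        *guard += 1;
    } // guard dropped here, unlocking the mutex
    assert_eq!(*m.lock().unwrap(), 1);
}
```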


Could you clarify what you mean? If I understand right, the code would look something like this:

let ref_cell: RefCell<Mutex<T>> = ...;
let value1_ref: Ref<T> = RefMut::map(ref_cell.borrow_mut(), |m| m.get_mut().unwrap())
    .downgrade(); // hypothetical
let value1: &T = &*value1_ref;
let value2_ref: MutexGuard<T> = ref_cell.borrow().lock().unwrap();
let value2: &T = &*value2_ref;

I can see how this seems unsound. We called Mutex::get_mut() so now we should be unable to use the mutex, right? But in the end it has the same result as this:

let ref_cell: RefCell<Mutex<T>> = ...;
let value_ref: Ref<T> = RefMut::map(ref_cell.borrow_mut(), |m| m.get_mut().unwrap())
    .downgrade(); // hypothetical
// OR: let value_ref = Ref::map(ref_cell.borrow(), |m| m.lock().unwrap())
let value1: &T = &*value_ref;
let value2: &T = value1;

My point is that no matter what, you'll still have to follow the rules that only 1 thread at a time can read the Mutex, and that the inner value still follows the normal borrowing rules: 1 mutable reference or multiple immutable references. It is definitely important to make sure that all of the unsafe code in Mutex still behaves properly for this approach, but I think that this would still be sound in theory without changing anything public (or maybe anything at all). I can't think of any example that would break those rules.

To see the problem, change the last line to:

let value2: &mut T = &mut *value2_ref;

Under normal safe circumstances it is impossible to call lock() while the result of get_mut() is in use.

(Note: also could instead of Mutex use nested RefCell for same functionality.)
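Concretely, on stable Rust the borrow checker already enforces this separation:

```rust
use std::sync::Mutex;

fn main() {
    let mut m = Mutex::new(5);
    {
        // get_mut needs &mut Mutex<T>, so no locking is required:
        let v: &mut i32 = m.get_mut().unwrap();
        *v += 1;
        // m.lock() here would not compile: `m` is exclusively
        // borrowed through `v` until `v` goes out of scope.
    }
    assert_eq!(*m.lock().unwrap(), 6);
}
```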


Ok that's a good point, here's a new example:

let ref_cell: RefCell<Mutex<T>> = ...;
let value1_ref: Ref<T> = ref_cell
    .borrow_mut()
    .map(Mutex::get_mut)
    .downgrade();
let value1: &T = value1_ref.borrow();
let value2_ref: MutexGuard<T> = ref_cell.borrow().lock().unwrap();
let value2: &mut T = &mut *value2_ref;

You're right, in this case value1 and value2 should not both exist. I didn't realize that MutexGuard<T> effectively acts as a mutable reference, but that makes sense. You also make a good point about RefCell<RefCell<T>>. I guess the issue is that methods like lock effectively have signatures of the shape fn(&self) -> &mut T. I wonder if you could get around this issue with a marker trait for structs that aren't interior-mutable? This is getting more and more complex :grimacing:

Let's call the marker trait InteriorImmutable. Like other marker traits, it is implemented for all structs where each field is InteriorImmutable. The only structs that do not implement this trait would be RefCell, Cell, Mutex, etc. and structs containing those. We could do:

impl<'b, T: ?Sized + InteriorImmutable> RefMut<'b, T> {
    fn downgrade(self) -> Ref<'b, T> { ... }
}

Now the question is: is that really worth it...? Downgrading RefMut to Ref is a small use case but I wonder if this is the only way to make it work.
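On stable Rust (without auto traits) the idea could be sketched with a manual unsafe marker trait; every name here is hypothetical, not a std API:

```rust
/// Hypothetical marker: implementors promise they contain no interior
/// mutability (`unsafe` because unsafe code could rely on that promise).
unsafe trait InteriorImmutable {}

unsafe impl InteriorImmutable for u32 {}
unsafe impl<T: InteriorImmutable> InteriorImmutable for Vec<T> {}
// Deliberately NOT implemented for Cell, RefCell, Mutex, ...

// Stand-in for the proposed bound on a RefMut-to-Ref downgrade:
fn downgradeable<T: InteriorImmutable + ?Sized>(_value: &T) {}

fn main() {
    downgradeable(&7u32);
    downgradeable(&vec![1u32, 2]);
    // downgradeable(&std::cell::Cell::new(1)); // would not compile
}
```

A real version would need impls (or an auto trait) covering the rest of std, which is exactly the kind of blanket reasoning auto traits exist for.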

This is an interesting idea. The compiler already has this information, because it can use it to optimize loads around things that don't contain an UnsafeCell (which is the basis for all interior mutability; the other Cells, Mutex, Atomics, etc. are built using UnsafeCell). Even if this particular use is pretty niche, it might be more generally useful to have this distinction available at the type level.

However, something just occurred to me -- hypothetically, if you removed get_mut from Mutex's API, would it not then be safe? It's not necessarily interior mutability that makes downgrade unsound; it's the combination of interior mutability with regular mutability, i.e. the ability to get &mut T from &mut Mutex<T>. If the only way to get a &mut T is by locking the mutex, I think the problem goes away without having to forbid all interior mutability.


You're sort of right. There's also RefCell, Rc/Arc (so you don't get a dangling reference), RwLock, probably all of the atomics (many things that use UnsafeCell), and this gets very confusing with Pin, etc. But I do think that it's only smart pointers like these that are preventing this from being sound, at least in std. Still, there are a lot of these types that are not interior-immutable, which is why I think it's best to have a marker trait to distinguish them. Plus this way we don't need to remove any of these get_mut methods, preserving backwards compatibility.

The one main problem I've thought of is that this might break unsafe code that relies on current borrowing rules. Also I'm not entirely sure if this would cause other unintended side effects or unexpected behavior (but I can't think of any).

Right, I should clarify -- I meant, if Mutex::get_mut hypothetically had never existed, downgrade would be safe specifically for RefCell<Mutex<T>>, not for all RefCells generally.

I think InteriorImmutable is a blunter tool than you might want. Concretely, I don't think Arc and Rc pose a problem for RefCell::downgrade, since they don't expose &mut access to their refcounts. You can get a &mut T from a &mut Arc<T>, but doing so doesn't borrow the refcount, only the T. So, if you were to use InteriorImmutable for this bound, it would be overconservative in the case of Arc and Rc.
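Indeed, Rc::get_mut only succeeds while the refcount is 1, and it borrows only the inner T:

```rust
use std::rc::Rc;

fn main() {
    let mut rc = Rc::new(10);
    // Exclusive access to the inner value; the refcount itself is untouched.
    *Rc::get_mut(&mut rc).unwrap() += 1;
    assert_eq!(*rc, 11);

    let other = Rc::clone(&rc);
    // With two strong references, get_mut refuses to hand out &mut T:
    assert!(Rc::get_mut(&mut rc).is_none());
    drop(other);
    assert!(Rc::get_mut(&mut rc).is_some());
}
```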


Removing get_mut is a breaking change that cannot be patched over, as such there is no way that we will ever do it. So it is pointless to discuss that.

Also, I think that it is a bigger problem if getting an exclusive reference on a Mutex does not guarantee that we also have exclusive access to the inner value (if we can't go from &mut Mutex<T> to &mut T). This seems like something that should obviously be correct.


What I meant with Rc is that you would be able to call get_mut and downgrade, then later clone, deref, and drop the Rc; that would cause the value to be dropped and leave the first reference dangling (this could be fixed internally, though, by making get_mut behave like deref_mut, and maybe deprecating it too). I admit I'm not 100% sure about the internals here, but I think you're right: conforming to InteriorImmutable should be possible in theory. I do think that using this marker trait is definitely necessary; we just need to make sure we can make Rc, Arc, etc. conform.

I realize you're right, see below.

I think that this would work if we made new types, MutexShared/RefCellShared (with better names), that implement InteriorImmutable and don't have get_mut methods (only lock_mut and borrow_mut). You could convert those types into their corresponding non-shared versions, but not vice versa. Then we could deprecate the non-shared versions. We might also need to do this with RwLock but AFAIK aside from the problem of 3rd party crates that's it. Thoughts?

Cloning the Rc increments the refcount (let's say to 2), and dropping the clone just decrements it back to 1, so that's not a way to create a dangling pointer. Cloning the Ref doesn't, but you can't drop the Rc because you can't get a RefMut while a Ref exists. I don't think RefCell<Rc<_>> leads to unsoundness like RefCell<Mutex<_>> does.
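The refcount bookkeeping is easy to check directly with Rc::strong_count:

```rust
use std::rc::Rc;

fn main() {
    let rc = Rc::new(1);
    assert_eq!(Rc::strong_count(&rc), 1);
    let clone = Rc::clone(&rc);
    assert_eq!(Rc::strong_count(&rc), 2); // clone incremented the count
    drop(clone);
    assert_eq!(Rc::strong_count(&rc), 1); // drop decremented it; nothing dangles
}
```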

An interesting thought experiment; however, I can't come up with a practical use for it. downgrade seemed like a good idea at first, but I don't see enough benefit to it that it would be worth adding a marker trait to std just for this purpose. You could easily make a crate for this, BTW, with an unsafe auto trait InteriorImmutable and a copy of the RefCell implementation plus downgrade with the appropriate bounds.

If there is a case to be made that downgrade should exist, we'd want to also explore other ways to implement it. Making a crate is a way to measure how useful this feature is and explore potential issues with the API.

That said, I still think InteriorImmutable could have some interesting unexplored properties besides just making downgrade sound. Are there other uses of this trait that might justify adding it?