Why does Miri dislike a function returning NonNull?

Here is the simplest code I could mimick a strange Miri error for. It may look hacky, but it's a lot simpler than the original BTreeMap code already.

use std::ptr::NonNull;

struct S {
    a: i32,
    b: i32,
}

struct Holder {
    s: S,
}

struct Proxy<'a> {
    ref_a: &'a mut i32,
    ptr_s: NonNull<S>,
}

fn point_me_to(s: &mut S) -> NonNull<S> {
    unsafe { NonNull::new_unchecked(s as *mut _) }
}

impl S {
    fn gimme_an_a(&mut self) -> &mut i32 {
        &mut self.a
    }
}

impl Holder {
    fn make_proxy(&mut self) -> Proxy {
        let ref_s = &mut self.s;
        let ptr_s = point_me_to(ref_s);
        let ptr_s = unsafe { NonNull::new_unchecked(ref_s as *mut _) };
        Proxy { ref_a: ref_s.gimme_an_a(), ptr_s }
    }
}

impl Proxy<'_> {
    fn change(mut self) {
        *self.ref_a *= 5;
        unsafe {
            self.ptr_s.as_mut().b *= 7;
        }
    }
}

fn main() {
    let mut h = Holder {
        s: S { a: 2, b: 3 },
    };
    let p = h.make_proxy();
    p.change();
    assert_eq!(h.s.a, 10);
    assert_eq!(h.s.b, 21);
}

(Playground)

Building gives an obvious warning, but that's not what my question is about. Remove the 2nd definition of ptr_s, which is the same as the 1st one inlined anyway, and the warning is gone. But then running with Miri starts complaining: "Undefined Behavior: no item granting read access to tag found in borrow stack.". Regardless of how ptr_s is obtained, the code seems to run fine without Miri's scrutiny, not just this example, but plenty of btree tests mocking about with similar code, so I'm pretty sure that point_me_to is not returning a plainly wrong pointer. So what is wrong about point_me_to?

It doesn't happen when I replace the NonNull with a raw pointer, and not when I cut out the Holder level from the example.

Isn't it just saying you have defined ptr_s but never used it?

If you want to get rid of the warning you can use

_ptr_s

as it suggests.

This is a legitimate warning. You've created a ptr_s variable then immediately created a second ptr_s variable which shadows the first.

What you're seeing is called Variable Shadowing. The idea is you create a second variable on the stack with the same name as another variable making it impossible to refer to the original variable by name after that point, while still leaving the value alive.

Minor nit: this doesn't have anything to do with Miri. Miri is used to evaluate code in a const context, or run your entire program if you use cargo miri run.

My question was not about the warning. It's just what playground pasted when sharing the example. My question is about running this example under Miri, with or without the line the warning is about.

1 Like

The issue is that you can't mix &mut and *mut the way you are.

Disclaimer: miri is not exact -- it may miss actual UB -- and the borrowing model it enforces is not neccesarily the exact superset of safe Rust's that is (will be?) required of unsafe code (though it is quite likely that it's close).

If I'm not mistaken, your code exhibits UB even when you inline fn point_me_to, it's just miri's nonexact checking that causes it to miss the UB in that case.

Desugaring a little should help make this clearer:

fn make_proxy(&mut self) -> Proxy<'_> {
    let ref_s = &mut self.s;
    let ptr_s = point_me_to(&mut *ref_s);
    Proxy {
        ref_a: S::gimme_an_a(&mut *ref_s),
        ptr_s,
    }
}

The underlying issue is that &mut S is a unique reference to the entirety of S. When you use it, that invalidates any raw references derived from it, which includes ptr_s, meaning they can no longer be used soundly.

I believe miri misses the UB when you inline point_me_to because that eliminates the reborrow involved in calling it, but misses that the reborrow to call S::gimme_an_a invalidates the pointer due to inexact tracking.

Without seeing larger context, I believe the solution for your BTreeMap is just to use *mut more and stop trying to mix &mut and *mut to the same "allocated object".

3 Likes

I hope with "is just to use *mut more", you mean it's enough to let gimme_an_a operate on a raw pointer, but that its result can still be a reference to a part of S. Like in this variation that makes Miri quiet again. Because in the bigger picture, that reference to a part of S is the chief reference in NodeRef and I wouldn't want to go over to raw pointers there.

So long as the region covered by the &mut and accessed by the *mut are disjoint, you should be fine.

A &mut region within a *mut accessible region should is fine so long as

  • the &mut is derived from the *mut, and
  • the &mut is not "used" after the *mut is used to access the covered region.

(What a "use" of a &mut is, is still under discussion. It definitely includes dereferencing it, and potentially (probably) involves passing it around (which reborrows it).)

1 Like

(as a side note: when you're writing within the standard library / compiler, it's perfectly "fine" to take advantage of things which are defined as UB for consumers but known to the compiler to not cause problems, so long as it's done deliberately, on purpose, and documented for if the assumption ever changes.)

1 Like

If you ran:

fn main ()
{
    let mut s = 42;
    let _ = {
        let ref_s = &mut s;
        let ptr_s = &mut *ref_s;
        *ref_s = 0;
        (ref_s, ptr_s)
    };
}

you'd get:

error[E0506]: cannot assign to `*ref_s` because it is borrowed
 --> src/main.rs:7:9
  |
6 |         let ptr_s = &mut *ref_s;
  |                     ----------- borrow of `*ref_s` occurs here
7 |         *ref_s = 0;
  |         ^^^^^^^^^^ assignment to borrowed `*ref_s` occurs here
8 |         (ref_s, ptr_s)
  |                 ----- borrow later used here

which expresses that the operation on line 7 is operating on an exclusive &mut reference, which, by definition, allows Rust to exploit that property and would be UB if the reference weren't truly exclusive / unique. So, that ptr_s that is also pointing to s gets invalidated when the &mut ref it was reborrowing is used with plain &mut powers.
So, if we don't use the invalidated ptr_s anymore all is fine, which corresponds to NLL (Non-Lexical Lifetimes), borrow-checker-wise.
However, here, in my example which "reproduces" your code, we do use the invalidated ptr_s (borrow later used here), hence the error.

Now, you may say that you were using raw pointers, thus freeing you from that constraint. But that's not the case, since the problem lies within the &mut-based mutation, which invalidates any pointers that may have been created / produced from it. In a way, using raw pointers (and unsafe) does not give you more freedom, it gives you more responsibility. Indeed, your code compiled, and despite it not following these required aliasing rules, thus exhibiting UB.

How do we do this properly then?

Do not use exclusive references &mut. They are almost impossible to use soundly when raw pointers are involved.

This leads to two options:

  • only use raw pointers when performing mutation:

    fn main ()
    {
        let mut s = 42;
        let (_, ptr_s) = unsafe {
            let ref_s: *mut _ = &mut s;
            let ptr_s: *mut _ = ref_s;
            *ref_s = 0; // no longer assumes non-aliasing; it only assumes non-parallel accesses
            (ref_s, ptr_s)
        };
        unsafe {
            dbg!(*ptr_s);
        }
    }
    

    A way to phrase the difference is that &mut-based mutation requires lack of concurrent pointers, whereas *mut-based mutation requires lack of parallel / simultaneously active pointers.

  • Use the safe Rust abstraction (references) associated to this usage of *mut pointers for concurrent / aliased pointers: &UnsafeCell <_> (the UnsafeCell may be wrapped within other structs).

    Indeed, in a way, &UnsafeCell would be seen as a third kind of Rust reference, provided we renamed &mut to &uniq (for its uniqueness guarantee). If we do that, then &UnsafeCell would be what in our minds &mut is: a mutable reference, which may, or may not be, aliased.

    With that vision, we can then see some obvious patterns emerge:

    • we can downgrade a &uniq to a &mut, but going in the other direction is, again, very dangerous.

      • This lets us better understand why the code with raw pointers passed fine:

        when we did: let ref_s: *mut i32 = &uniq s;, we can see the mut on the left hand side and the uniq on the right hand side: this was implictly doing this downgrade operation:

        let ref_s: *mut i32 =
            &uniq s
              // : &uniq i32
                as &mut i32
                as *mut i32
        ;
        // Or, without the `uniq` notation, and back to the current usage of `&mut`:
        let ref_s: *mut i32 =
            Cell::from_mut(
                &mut s // : &mut i32
            ) // : &Cell<i32> ~ &UnsafeCell<i32>
            as *const Cell<i32> as *const UnsafeCell<i32>
            as *mut i32
        ;
        

        By that I mean that our *mut i32 was a &Cell<i32> in disguise!

        So, without unsafe, we could have simply written:

        let mut s = 42;
        // let (_, ptr_s) = unsafe {
        //     let ref_s: *mut _ = &mut s;
        //     let ptr_s: *mut _ = ref_s;
        //     *ref_s = 0;
        //     (ref_s, ptr_s)
        // };
        // unsafe {
        //     dbg!(*ptr_s);
        // }
        let (_, ptr_s) = {
           use ::core::cell::Cell;
           let ref_s: &Cell <_> = Cell::from_mut(&mut s);
           let ptr_s: &Cell <_> = ref_s;
           ref_s.set(0);
           (ref_s, ptr_s)
        };
        dbg!(ptr_s.get());
        

I hope you can now kind of see what was wrong with your initial code, and realize how dangerous it is, when writing unsafe code, to have that simplistic idea of &mut = mutable reference. If there is a TL,DR, it is that what people currently call &mut is rather an &uniq, which should make it obvious within your code how that asserted uniqueness was a lie.
If you want "intuitive &mut _", use &UnsafeCell<_> (there should be a from_mut for that too).

  • in practice, this ends up being instead of &mut, use & shared references, and for the moments where you need mutation, use a shared mutability wrapper arounds the specific fields that need it, (depending on your needs, a Cell may suffice, but other times a RwLock or a finer-grained AtomicCell will be required. If you really want a Sync wrapper without paying extra synchronization costs, since you know parallel mutations cannot possibly happen, then you may be able to use UnsafeCell, which yields *mut _ raw pointers that do not assert uniqueness when used to perform mutation).
1 Like

Thanks for the clear and elaborate answers. Beyond my initial question, I don't really grasp the possibilities of UnsafeCell or even Cell yet, but it seems they are too flexible for the job. What is happening in this example, and at least some of the BTree code, is stack based and it feels like it should be possible to write it without any kind of ambiguity.

Say I, the end user, have a BTreeMap (Holder in the example above) and invoke entry. This gives me a proxy to an entry in the map. Let's only consider the case where the key existed already. I can then read and write the value of the entry and all the time the borrow checker makes sure I don't touch the map until I'm done with the entry (implicitly - I don't even have to drop the entry or let it go out of scope). If that was the whole API, we wouldn't even need a pointer to implement this (well, plenty of pointers in the tree held by the map, but that's a different story).

Now add a little complication: allow removing the entry from the map. The remove_entry method that does this naturally consumes the entry, so it's obvious the entry cannot be used any longer. When implementing this, we will operate in the same safe(?) order: do stuff with the entry, and then continue to use the whole of the map. But we want to borrow that map reference just a little earlier than the paragraph above, temporarily at the end of the implementation of remove_entry, at a point where it is obvious that the previous borrow (the entry) is gone, or should I say inaccessible. That changeover point is the middle of the change function in the example. Though that example didn't model the fact that the thing storing ref_a is actually consumed at that point (and didn't even touch the a data again, so it could have stored a plain reference to b). So here's a finetuned example. In the middle of the change function, can we exploit the borrow checker to make sure that we didn't touch self.ptr_data up to that point and/or that self.ptr_data is again a unique, borrowed reference from then on?

I'm not really sure what you are asking, but I will answer what I think you are asking. The reason it breaks when you move self.handle.consume(); to the end of change is that the mutable reference inside Handle assumes it has exclusive access until its last use, so any other accesses to a before the mutable reference's last use is UB.

I'm not sure what I was asking either - it wasn't that, but it's good you explain. What I'm asking is: can we organize the code such that if you were to move the consume line, that rustc would stop you immediately? Rather than having to realize it's UB or have Miri tell you.

Drawing the parallel with the simple use of BTreeMap's entry, I guess if someone moved the consume line to the end, rustc should complain at the point where ptr_data is used, that self.handle is still borrowed.

Not directly. You can't really tell the compiler much about field's relation to each other. This kind of thing would have to rely on stuff such as borrowing or consuming the Proxy object. For example:

struct Proxy<'a> {
    pub handle: Handle<'a>,
    /// This field can only be accessed after converting to Proxy2.
    ptr_data: NonNull<Data>,
}
struct Proxy2<'a> {
    pub ptr_data: NonNull<Data>,
    // + some phantomdata stuff for the lifetime (or a real reference, even)
}
impl<'a> Proxy<'a> {
    pub fn to_proxy2(self) -> Proxy2<'a> {
        Proxy2 {
            ptr_data: self.ptr_data,
        }
    }
}

In the case of the entry api, what makes it fail is that handle is a borrow of the map, and the ptr_data thingy is not created until after you are done using the handle. It is not really the use of ptr_data that makes the compiler unhappy, but the fact that creating ptr_data requires accessing the map, which is currently borrowed due to handle.

In our case, both objects already exist when entering the function, so there is not a single object whose borrowed-state we can make use of, unless we use the Proxy object for this purpose. And even then, we have to actually consume it, as allowing access to ptr_data through a borrow would allow you access handle again after the ptr_data borrow ended.

1 Like

That's indeed the key property to build &uniq-based abstractions. Your finetuned example is thus sound (and passes miri!).

If I understand correctly what you are looking for, the change function is an example of what the user of the library may (have to) write, and you'd like to enforce that pattern.

Then indeed what @alice suggested is the way to go: encode transitions within the type system, since you can restrict the APIs of each type.

To further illustrate that idea, I have rewritten the your finetuned example using it, with an added privacy boundary (unsafe code must always be accompanied with an explicit privacy boundary). For the sake of simplicity, I have chosen to have type Proxy2<'holder> = &'holder mut Data, but more complex structs with extra associated methods would of course be possible too:

@alice Sorry, I basically don't understand any point you made.

would have to rely on stuff such as borrowing or consuming the Proxy object.

But consuming is what the change function already does?

struct Proxy2<'a>

I understand this would be useful if there was a need to continue using the whole object, but for the change method (or the remove_entry that it models) that it not necessary. We "just" need a quick, temporary reference to the whole object, clean it up, and return to the end user, with all borrows over.

// + some phantomdata stuff for the lifetime (or a real reference, even)

I don't understand why you would not use a real reference in Proxy2, but I suppose it's necessary when thing get even more complicated.

In the case of the entry api, what makes it fail

You lost me... what is "it", what is "failing" here?

It is not really the use of ptr_data that makes the compiler unhappy

Do you mean Miri instead of the compiler? And in what circumstances, when the self.handle.consume() line is moved?

both objects already exist when entering the function

That's the handle wrapping the reference (to a part), and the NonNull wrapping the raw pointer (to the whole), entering function change?

unless we use the Proxy object for this purpose

Well I'm cool with that, even though I'm not sure what it means. I guess it's impossible in this case, because I want to do the magic in a function consuming the proxy and not after that function?

@Yandros

If I understand correctly what you are looking for, the change function is an example of what the user of the library may (have to) write, and you'd like to enforce that pattern.

It depends on the definition of library of course, but I don't think I got my intention across. The change function (or remove_entry is part of the library, and I'd like to enforce (at build time) the proper implementation of that library function.

For instance, in your example, I'd like to remove the unsafe tag on line 71, and have the borrow checker make sure there is no more borrow of the same thing. Which thing? I'm not sure. Then how's the borrow checker supposed to know which thing, at compile time? I don't know.

One thing that probably causes confusion is the Data type in my example. It's not at all intended to be exposed publicly, it's the private data of Holder (which models BTreeMap itself). It seemed necessary to cause the original Miri error to appear or disappear (but I only tried once to yank out that Data level). I should have named it better and removed it in the refined example. I hope this further refined example is less confusing.

Another thing that confused me for a moment: why doesn't Proxy simply have a single regular reference to the Holder and let change do everything off that? Because in the real code, the handle in the entry contains the result of an expensive key lookup done by the entry method. In the example, you could still stick a and b into an array, and pass an index as the handle, but that trick doesn't scale out to BTreeMap.

Yes, from the caller of change's perspective, but I'm talking about inside the body of the function instead. In that body, we have ownership of a Proxy<'a>. To make sure that you cannot use handle after using ptr_data, you must set up something that makes the compiler prevent you from using handle after using ptr_data. One way to do this is to consume the Proxy<'a>, producing a Proxy2<'a> in a way such that you can only access handle with the Proxy<'a>, and only ptr_data with Proxy2<'a>. Since you can't go back once you convert to Proxy2<'a>, you cannot access handle after accessing ptr_data.

Using a real reference is probably preferable. I didn't because it's a raw pointer in your code, and I just didn't change it.

In this case, I am referring to trying to the following sequence of operations:

  1. Obtain an entry into the map.
  2. Access the map in some other manner.
  3. Access the map through the entry.

I used handle to refer to the entry, and ptr_data to refer to the other manner, as to make an analogy to our current situation where you cannot access handle after accessing ptr_data. In the case of our entry situation, what makes this fail to compile with a borrow error is that you have to actually access the map directly in step two, and the pointer into the map (our analogy to ptr_data) obtained in step two does not actually exist during step one, as it is created in step two.

What makes the borrow-checker detect the invalid use here, and not in our Proxy::change situation with raw pointers, is, among other things, the fact that you have to create ptr_data after you are done using handle. This is not the case in your snippet as ptr_data already exists when you call self.handle.consume().

Both. It's just that miri tells you about it, whereas the compiler silently miscompiles your code (sometimes).

To make the compiler emit borrowing errors, you have to try to borrow a single specific value in multiple different ways. In the case of the entry-api of BTreeMap, this single object is the map, but in the case of Proxy::change, there is no such object since the references were already created before we entered the function, and as such the single object is not visible to the compiler when it borrow-checks Proxy::change.

My suggestion was to use the Proxy value as our single specific value instead, even though it is not the object we actually borrowed the data from.

I have yet to understand all that, but here's what I picked up: we can stash the unsafe pointer down into a layer that hides it entirely, until the handle is consumed. Something like this (4 × updated). To me it looks better than the current code.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.