Self-referential type, must use unique reference (not pointer)

Hi there,

So... I have a use case involving a type whose API I don't control, and it needs a unique reference (&mut) to a buffer. I would like to store the two together. In other words, I have the classic self-referential type issue, but I can't swap the reference for a raw pointer because I don't control the API of the referencing type.

So, basically, this (in some make-believe version of Rust where this stuff is simple):


// I do not have control over the API or implementation of this type
struct FiveWriter<'a>(&'a mut [u8]);

impl<'a> FiveWriter<'a> {
    fn write_fives(&mut self) {
        self.0.copy_from_slice(&[5, 5, 5, 5, 5, 5, 5, 5, 5, 5]);
    }
}

pub struct WriterAndBuffer {
    writer: FiveWriter<'_>, // points at buffer below
    buffer: [u8; 10],
}

impl WriterAndBuffer {
    pub fn write_fives(&mut self) {
        self.writer.write_fives()
    }
    pub fn read_buffer(&mut self) -> [u8; 10] {
        self.buffer
    }
}

I have already managed to solve this issue with Box::into_raw(), and that works fine... But I don't want to pay the price for heap allocation! This is replacing existing C++ code, and folks would view heap allocation as a regression from what we already have.
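For context, a heap-based version along those lines might look roughly like the sketch below. It is a hypothetical reconstruction, not the exact code referred to above: the name BoxedWriterAndBuffer, the Option wrapper around the writer, and the Drop impl are all illustrative assumptions.

```rust
// `FiveWriter` mirrors the type above; everything else is an illustrative assumption.
struct FiveWriter<'a>(&'a mut [u8]);

impl<'a> FiveWriter<'a> {
    fn write_fives(&mut self) {
        self.0.copy_from_slice(&[5; 10]);
    }
}

pub struct BoxedWriterAndBuffer {
    // `None` whenever the buffer is being read directly.
    writer: Option<FiveWriter<'static>>,
    // Heap allocation obtained from Box::into_raw, so moving the struct
    // never moves the bytes the writer points at.
    buffer: *mut [u8; 10],
}

impl BoxedWriterAndBuffer {
    pub fn new() -> Self {
        Self {
            writer: None,
            buffer: Box::into_raw(Box::new([0u8; 10])),
        }
    }

    pub fn write_fives(&mut self) {
        if self.writer.is_none() {
            // SAFETY: `buffer` stays allocated until Drop, and nothing else
            // accesses it while a writer exists.
            self.writer = Some(FiveWriter(unsafe { &mut *self.buffer }));
        }
        if let Some(writer) = self.writer.as_mut() {
            writer.write_fives();
        }
    }

    pub fn read_buffer(&mut self) -> [u8; 10] {
        // Discard the writer (and its unique reference) before reading.
        self.writer = None;
        // SAFETY: the allocation is live and no reference to it remains.
        unsafe { *self.buffer }
    }
}

impl Drop for BoxedWriterAndBuffer {
    fn drop(&mut self) {
        self.writer = None;
        // SAFETY: the pointer came from Box::into_raw in `new` and is freed once.
        unsafe { drop(Box::from_raw(self.buffer)) };
    }
}

fn main() {
    let mut wb = BoxedWriterAndBuffer::new();
    wb.write_fives();
    println!("{:?}", wb.read_buffer());
}
```

The cost is the single Box allocation in new(), which is exactly what the rest of this post tries to avoid.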

So... After trying to solve this a few simpler ways and having Miri spit at me, I decided to take the most conservative approach I could come up with: wrap my entire struct in MaybeUninit<T> and access every field through pointers only -- Surely that couldn't violate aliasing rules!

(Note that I intentionally left out pinning and the Drop implementation, as they don't affect whether Miri rejects this code or not.)

use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

// I do not have control over the API or implementation of this type
struct FiveWriter<'a>(&'a mut [u8]);

impl<'a> FiveWriter<'a> {
    fn write_fives(&mut self) {
        self.0.copy_from_slice(&[5, 5, 5, 5, 5, 5, 5, 5, 5, 5]);
    }
}

pub struct WriterAndBuffer {
    inner: MaybeUninit<Inner>,
}

struct Inner {
    initialized: bool,
    writer: FiveWriter<'static>,
    buffer: [u8; 10],
}

// To ensure we don't violate aliasing rules, we will just treat Inner like a bag of bytes and only
// ever access its fields through pointers.
struct InnerPointer {
    initialized: *mut bool,
    writer: *mut FiveWriter<'static>,
    buffer: *mut [u8; 10],
}

impl WriterAndBuffer {
    fn inner_pointer(&mut self) -> InnerPointer {
        unsafe {
            let inner_mut_ptr = self.inner.as_mut_ptr();
            InnerPointer {
                initialized: addr_of_mut!((*inner_mut_ptr).initialized),
                writer: addr_of_mut!((*inner_mut_ptr).writer),
                buffer: addr_of_mut!((*inner_mut_ptr).buffer),
            }
        }
    }
    fn ensure_inner(&mut self) -> InnerPointer {
        unsafe {
            let inner = self.inner_pointer();
            if !inner.initialized.read() {
                inner.buffer.write([0u8; 10]);
                inner.writer.write(FiveWriter(&mut *inner.buffer));
                inner.initialized.write(true);
            }
            inner
        }
    }
    pub fn new() -> WriterAndBuffer {
        unsafe {
            let mut this = WriterAndBuffer {
                inner: MaybeUninit::uninit(),
            };
            let inner = this.inner_pointer();
            inner.initialized.write(false);
            this
        }
    }
    pub fn write_fives(&mut self) {
        unsafe {
            let inner = self.ensure_inner();
            (*inner.writer).write_fives()
        }
    }
    pub fn read_buffer(&mut self) -> [u8; 10] {
        unsafe {
            let inner = self.ensure_inner();
            inner.initialized.write(false);
            // Make sure `writer` is dropped before reading the contents
            // of `buffer`
            drop(inner.writer.read());
            inner.buffer.read()
        }
    }
}

fn main() {
    let mut writer_and_buffer = WriterAndBuffer::new();
    for _ in 0..5 {
        writer_and_buffer.write_fives();
        let buffer = writer_and_buffer.read_buffer();
        println!("{buffer:?}");
    }
}

Unfortunately, Miri is not happy:


error: Undefined Behavior: trying to retag from <2213> for Unique permission at alloc816[0x10], but that tag does not exist in the borrow stack for this location
  --> src\main.rs:75:18
   |
75 |             drop(inner.writer.read());
   |                  ^^^^^^^^^^^^^^^^^^^
   |                  |
   |                  trying to retag from <2213> for Unique permission at alloc816[0x10], but that tag does not exist in the borrow stack for this location
   |                  this error occurs as part of retag (of a reference/box inside this compound value) at alloc816[0x10..0x1a]
   |                  errors for retagging in fields are fairly new; please reach out to us (e.g. at <https://rust-lang.zulipchat.com/#narrow/stream/269128-miri>) if you find this error troubling
   |
   = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
   = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <2213> was created by a Unique retag at offsets [0x10..0x1a]
  --> src\main.rs:47:17
   |
47 |                 inner.writer.write(FiveWriter(&mut *inner.buffer));
   |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
help: <2213> was later invalidated at offsets [0x0..0x20] by a Unique function-entry retag inside this call
  --> src\main.rs:85:22
   |
85 |         let buffer = writer_and_buffer.read_buffer();
   |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   = note: BACKTRACE (of the first span):
   = note: inside `WriterAndBuffer::read_buffer` at src\main.rs:75:18: 75:37
note: inside `main`
  --> src\main.rs:85:22
   |
85 |         let buffer = writer_and_buffer.read_buffer();
   |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If I understand this correctly, it appears that Miri retags buffer when I create the &mut self reference for read_buffer(). Then, when I re-materialize FiveWriter from the bytes at inner.writer, it tries to ensure the reference is still valid, but it's not anymore.

So... Where can I go from here? Is there some easier way to do this that I'm missing? I don't think I'm actually breaking any aliasing rules because I'm only accessing my inner type through pointers, and I never read from or write to buffer's memory while writer is holding a unique reference to it.

But it's almost like I need a way to say to Miri "Yeah, I know that I could theoretically access the memory behind MaybeUninit<T> without going through a pointer... but I don't".

Thanks for any help -- I'm pretty stuck here!

I am not knowledgeable enough to comment on the approach you're trying to take, but there is a crate that I believe solves this problem and has been through extensive review and has no known UB: self_cell. As far as I can tell from looking at issues, warnings, etc, it is the safest of the crates that attempt to solve this.

Thank you for the tip! It looks like this crate only allows the dependent to borrow the owner through a shared reference, though. Since I have no control over the API, there is no way for me to force safe interior mutability using a RefCell<T>... Or is there another way to work around that?

Pin for !Unpin types does actually affect this, at least as a temporary workaround until something like UnsafePinned is added to the language. Here's an example of how you can exploit it: Rust Playground

Do note though that once something like UnsafePinned is added, this workaround may no longer work.

AFAIK these are all the ways you can do this. Either use indirection (e.g. a Box), use the temporary workaround or wait for UnsafePinned to be stabilized.

FYI self_cell does make an allocation under the hood.
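The Pin-based workaround mentioned above could be sketched roughly as follows. This is not the code from the linked playground, and it has not been checked under Miri here; it only assumes the behaviour described in this thread, namely that a mutable reference to a !Unpin type is currently not treated as unique. The name PinnedWriterAndBuffer, the Option wrapper, and the PhantomPinned field are illustrative choices.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::ptr::addr_of_mut;

// `FiveWriter` mirrors the type from the question.
struct FiveWriter<'a>(&'a mut [u8]);

impl<'a> FiveWriter<'a> {
    fn write_fives(&mut self) {
        self.0.copy_from_slice(&[5; 10]);
    }
}

pub struct PinnedWriterAndBuffer {
    writer: Option<FiveWriter<'static>>, // points at `buffer` when `Some`
    buffer: [u8; 10],
    _pin: PhantomPinned, // makes the type !Unpin, so it cannot be safely unpinned
}

impl PinnedWriterAndBuffer {
    pub fn new() -> Self {
        Self { writer: None, buffer: [0; 10], _pin: PhantomPinned }
    }

    pub fn write_fives(self: Pin<&mut Self>) {
        unsafe {
            // SAFETY: the value is never moved out of `this`.
            let this: *mut Self = self.get_unchecked_mut();
            let writer = &mut *addr_of_mut!((*this).writer);
            if writer.is_none() {
                // Created only after pinning, so the address stays valid.
                let buffer: *mut [u8; 10] = addr_of_mut!((*this).buffer);
                *writer = Some(FiveWriter(&mut *buffer));
            }
            if let Some(w) = writer.as_mut() {
                w.write_fives();
            }
        }
    }

    pub fn read_buffer(self: Pin<&mut Self>) -> [u8; 10] {
        unsafe {
            let this: *mut Self = self.get_unchecked_mut();
            // Discard the writer before touching the buffer directly.
            *addr_of_mut!((*this).writer) = None;
            addr_of_mut!((*this).buffer).read()
        }
    }
}

fn main() {
    // Pin on the stack: no heap allocation involved.
    let mut wb = std::pin::pin!(PinnedWriterAndBuffer::new());
    for _ in 0..5 {
        wb.as_mut().write_fives();
        let buffer = wb.as_mut().read_buffer();
        println!("{buffer:?}");
    }
}
```

The self-reference is only created inside a Pin<&mut Self> method, so it never exists while the value can still be moved.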


Oh wow, thank you! So... IIUC then, Pin<&mut Self> when Self: !Unpin has some special logic right now to say "&mut Self may not actually be a unique reference"?

EDIT: Is this documented somewhere, by any chance?

Thanks, I should have caught that.

This is mostly documented in this issue: "Self-referential generators vs `dereferenceable`" (Issue #381, rust-lang/unsafe-code-guidelines on GitHub).

Sidenote: although both PRs linked in the issue specify that this special behaviour applies only to !Unpin types, my playground seems to pass Miri even without WriterAndBuffer being !Unpin (though that's still required for soundness, otherwise it could be safely unpinned and everything would break).