How can I reference a Pin struct field from another field?

Hi everyone,

I have a use case that, from my understanding, should be possible with Pin. My goal is to have a struct that contains a Box<[u8]> buffer as one field, and another field that is a slice of that buffer.
(actually the slices will be nested deeper inside the type of the other field, but I simplified it for this question).

So something like

struct S<'a> {
    buf: Pin<Box<[u8]>>,
    slice: Option<&'a [u8]>,
}

The non-working short example is here:

I can only find examples on how to reference fields by pinning the whole struct, which should not be necessary, right? The [u8] inside the box cannot move, as the Pin should make it impossible to get to a &mut [u8] (safely). And as the Box is on the heap, I can move around my parent struct S freely without affecting the [u8] location in memory.

How can I get this code to compile (and also work correctly 🙂)?

Pinning the whole struct is necessary, since without pinning it's legal to move the field buf out from behind a &mut S (for example with std::mem::replace), which invalidates the field slice.
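
To make that hazard concrete, here is a minimal sketch. The names `Unpinned` and `take_buf` are made up for illustration, and `slice` is a raw pointer rather than a reference so that the struct compiles at all. The sketch only compares addresses and never dereferences the stale pointer.

```rust
use std::mem;

// Hypothetical stand-in for `S`: `slice` is a raw pointer into `buf`.
struct Unpinned {
    buf: Box<[u8]>,
    slice: *const u8,
}

impl Unpinned {
    fn new(data: Vec<u8>) -> Unpinned {
        let buf = data.into_boxed_slice();
        // Moving the Box into the struct does not move the heap data,
        // so this address stays correct for now.
        let slice = buf.as_ptr();
        Unpinned { buf, slice }
    }
}

// Entirely safe code: swap the buffer out from behind a `&mut`.
// Afterwards `s.slice` still points into the old buffer, which the
// struct no longer owns.
fn take_buf(s: &mut Unpinned) -> Box<[u8]> {
    mem::replace(&mut s.buf, vec![9u8; 4].into_boxed_slice())
}
```

Once the returned box is dropped, `slice` dangles, and nothing in the type system flags it.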

Self-referential structs in general cannot be constructed with safe Rust. You need to add a std::marker::PhantomPinned field to your struct to make it !Unpin, and store raw pointers instead of references to avoid aliasing issues.
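
A rough sketch of that pattern, modelled on the example in the std::pin documentation. The struct name `SelfRef` and its fields are hypothetical, and the buffer is stored inline here (not in a separate Box) so that the struct really does care about its own address.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::ptr::NonNull;

// Hypothetical self-referential struct: `view` points into `data`.
struct SelfRef {
    data: [u8; 4],
    view: NonNull<[u8]>,
    // Makes the type !Unpin, so safe code cannot get a `&mut SelfRef`
    // out of a `Pin<Box<SelfRef>>` and move the value.
    _pin: PhantomPinned,
}

impl SelfRef {
    fn new(data: [u8; 4]) -> Pin<Box<SelfRef>> {
        let empty: &[u8] = &[];
        let mut boxed = Box::new(SelfRef {
            data,
            view: NonNull::from(empty), // placeholder until the value is on the heap
            _pin: PhantomPinned,
        });
        // The value now has its final heap address; record a pointer into it.
        let bytes: &[u8] = &boxed.data[1..];
        boxed.view = NonNull::from(bytes);
        Box::into_pin(boxed)
    }

    fn view(&self) -> &[u8] {
        // SAFETY: `view` points into `self.data`, and the value cannot have
        // moved since `new` because it has only ever been exposed pinned.
        unsafe { self.view.as_ref() }
    }
}
```

Moving the `Pin<Box<SelfRef>>` around is fine, since that only moves the pointer and not the heap value it points to.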

Self-referential structs are definitely an advanced topic in Rust, if you want to squeeze out every little bit of performance.

For normal use-cases, just store a range, instead of a shared borrowed subslice and then you don't have to deal with Pin and unsafe.

use std::ops::Range;

// …

struct S {
    buf: Box<[u8]>,
    slice: Option<Range<usize>>,
}

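// get the subslice (the method name is just for illustration):
impl S {
    fn subslice(&self) -> Option<&[u8]> {
        self.slice
            .clone() // Range<usize> is not Copy, so clone it out of &self
            .map(|idx| self.buf.get(idx))
            .unwrap_or_else(|| Some(&*self.buf))
    }
}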

I wasn't sure what you needed the Option for, so I just kept it; the example simply returns the whole buffer if no range is stored. If it returns None, the stored range is out of bounds, which is probably a logic error. If you'd rather crash the program in that case, you could use &self.buf[idx] instead, drop the Some from the unwrap_or_else, and drop the Option from the return type.
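
For completeness, a small sketch of that panicking variant. The constructor `new` and the method name `subslice` are made up for illustration.

```rust
use std::ops::Range;

struct S {
    buf: Box<[u8]>,
    slice: Option<Range<usize>>,
}

impl S {
    // Hypothetical constructor: takes ownership of the bytes and an optional range.
    fn new(buf: Vec<u8>, slice: Option<Range<usize>>) -> S {
        S { buf: buf.into_boxed_slice(), slice }
    }

    // Panics if the stored range is out of bounds for `buf`.
    fn subslice(&self) -> &[u8] {
        match &self.slice {
            Some(range) => &self.buf[range.clone()],
            None => &self.buf[..],
        }
    }
}
```

Because the range is just two indices, `S` can be moved freely and the subslice is recomputed from `buf` on every call, with no unsafe code involved.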

I had the option from another try where I wanted to fill the slice in after creating the struct, so here it is not necessary anymore.

The reason why I can't go this way is that I don't have such a simplified version that I want to use, but instead of the pure slice I have a deeper structure containing slices that comes from an external crate.
This is also expensive to compute, so I can't just make a method that creates the referencing substructure on-demand.

My only reason for wanting a self-referential struct was to make a clean interface. I would love to have just

S::from_file(file: &Path) -> S

Where the struct owns the data from reading the file. Otherwise I'll have

let buf = std::fs::read(file)?;
let s = S::from(&buf);

everywhere.
But I guess I'll go with the less clean interface 🙁 or try this at a later point in time.

Too bad... so the reference itself also has to be inside the Pin, not only the referenced item?

You'll have to go into more detail, then. We can't help you if you can't translate the proposed solution from the simplified example back to your original code yourself, but also don't share the original code.

You cannot write this struct without unsafe code, and Pin cannot reduce the amount of unsafe code you need. Here's how you can build the struct without Pin:

struct S {
    buf: *mut [u8],
    slice: Option<*const [u8]>,
}
impl Drop for S {
    fn drop(&mut self) {
        let b = unsafe { Box::from_raw(self.buf) };
        drop(b);
    }
}

impl S {
    pub fn new() -> Self {
        let buf = vec![0u8, 1, 2, 3].into_boxed_slice();
        let buf_ptr = Box::into_raw(buf);
        let slice = Some(buf_ptr as *const [u8]);

        S {
            buf: buf_ptr,
            slice,
        }
    }
    fn buf(&self) -> &[u8] {
        unsafe { &*self.buf }
    }
    fn slice(&self) -> Option<&[u8]> {
        unsafe { self.slice.map(|ptr| &*ptr) }
    }
}

fn main() {
    let s = S::new();
    assert_eq!(s.slice().unwrap(), s.buf());
}


You may want to read this post of mine, which explains where Pin is useful, namely when two pieces of unsafe code need to communicate guarantees through unknown non-unsafe user code. When everything happens in a single module, Pin does not help you at all, and module privacy can provide all the things you need to write correct (but wildly unsafe) self-referential structs.

You're right, my bad.

So here is my actual code that I'm using:

extern crate goblin;

pub struct Elf<'buf> {
    elf: goblin::elf::Elf<'buf>,
    buf: &'buf [u8],
}

impl<'buf> Elf<'buf> {
    pub fn parse(buf: &'buf [u8]) -> Elf {
        Elf {
            elf: goblin::elf::Elf::parse(buf).unwrap(), // parse returns a Result
            buf
        }
    }
// more methods...
}

The user of my Elf struct has to carry around the Buffer (which is just an elf file's content) in addition to my elf struct. goblin::elf::Elf contains many slices of the parsed buffer.
I would like to have the Elf struct also own the buffer:

extern crate goblin;

pub struct Elf {
    elf: goblin::elf::Elf,
    buf: Pin<Box<[u8]>>,
}

impl Elf {
    pub fn parse(file: &str) -> Elf {
        let buf: Pin<Box<[u8]>> = Pin::new(std::fs::read(file).unwrap().into_boxed_slice());
        Elf {
            elf: goblin::elf::Elf::parse(&buf),  // somehow get this to work
            buf
        }
    }
}

This is interesting! So as long as my public interface is not intended to be messed with in unsafe blocks somewhere else, I could achieve the thing I'm trying with exactly the same amount of unsafe, even without Pin?

In principle. Of course, unsafe code is difficult to get right in practice. There are many non-obvious pitfalls where the code will appear to work, but be subtly wrong.

Privacy rules can ensure this without pinning the struct. If outside code doesn’t have direct access to buf and your code never moves out of buf there isn’t a problem. At that point, though, the existing Pin is only serving as an internal guardrail that prevents you from accidentally violating some (but not all) safety conditions.
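
As a rough illustration of relying on privacy alone, here is a sketch that owns a buffer together with a borrowed parse result. `Parsed`, `parse`, and `Owned` are hypothetical stand-ins (for something like goblin's borrowed Elf type) and are not part of any real crate. It assumes the parsed type has no destructor that reads the buffer, and it should be treated as a starting point for review, not as vetted code.

```rust
// Hypothetical stand-in for a borrowed parse result.
struct Parsed<'a> {
    magic: &'a [u8],
    body: &'a [u8],
}

// Hypothetical parser: splits off a 4-byte magic number.
fn parse(buf: &[u8]) -> Option<Parsed<'_>> {
    if buf.len() < 4 {
        return None;
    }
    let (magic, body) = buf.split_at(4);
    Some(Parsed { magic, body })
}

pub struct Owned {
    // The 'static is a private fiction: the data lives only as long as `buf`.
    // It must never be exposed outside this module with that lifetime.
    parsed: Parsed<'static>,
    // Raw pointer from Box::into_raw, freed exactly once in Drop.
    buf: *mut [u8],
}

impl Owned {
    pub fn new(data: Vec<u8>) -> Option<Owned> {
        let buf = Box::into_raw(data.into_boxed_slice());
        // SAFETY: `buf` is a live heap allocation that is neither freed nor
        // mutated until `Owned` is dropped.
        let bytes: &'static [u8] = unsafe { &*buf };
        match parse(bytes) {
            Some(parsed) => Some(Owned { parsed, buf }),
            None => {
                // SAFETY: parsing failed, so nothing borrowed from the buffer survives.
                unsafe { drop(Box::from_raw(buf)) };
                None
            }
        }
    }

    // Accessors hand out borrows tied to `&self`, never 'static.
    pub fn magic(&self) -> &[u8] {
        self.parsed.magic
    }

    pub fn body(&self) -> &[u8] {
        self.parsed.body
    }
}

impl Drop for Owned {
    fn drop(&mut self) {
        // SAFETY: `buf` came from Box::into_raw in `new`. `Parsed` has no
        // destructor, so its references are never read after this point.
        unsafe { drop(Box::from_raw(self.buf)) };
    }
}
```

Because both fields are private and no method moves or frees the buffer before Drop, outside code has no safe way to invalidate the references, and no Pin is involved.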

I'm not convinced the Pin even really counts as a guard-rail here. The contents of the box are Unpin.
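
A short sketch of what that means in practice. The function names `assert_unpin` and `mutate_through_pin` are made up for illustration. Everything here is safe code operating on a Pin<Box<[u8]>>.

```rust
use std::pin::Pin;

// Compile-time check that a type is Unpin (hypothetical helper).
fn assert_unpin<T: ?Sized + Unpin>() {}

fn mutate_through_pin() -> (Vec<u8>, Vec<u8>) {
    assert_unpin::<[u8]>();

    let mut pinned: Pin<Box<[u8]>> = Pin::new(vec![1u8, 2, 3].into_boxed_slice());

    // DerefMut is available because [u8]: Unpin.
    pinned[0] = 42;

    // So is a plain `&mut [u8]`.
    let inner: &mut [u8] = pinned.as_mut().get_mut();
    inner[1] = 7;

    // The Box can even be taken back out of the Pin and replaced.
    let mut unpinned: Box<[u8]> = Pin::into_inner(pinned);
    let old = std::mem::replace(&mut unpinned, vec![0u8].into_boxed_slice());
    (old.into_vec(), unpinned.into_vec())
}
```

None of these calls would compile without unsafe if the pointee were !Unpin.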

You’re right; that always trips me up when trying to think about Pin. It only does anything if the internal type cares about its own memory location; what I usually have, though, is some external type that cares.

The idea of letting types opt out of Pin’s guarantees feels wrong, and auto-implementing that for most types doubly so. On the other hand, I suspect there’s some compelling reason for it; I can’t imagine the libs team letting such an odd API through without cause.

I think it makes a lot of sense. The entire purpose of Pin is when you're dealing with a type that cares about itself moving. External types that care are not the reason Pin exists — that's the job of module privacy.

Types auto-implement Unpin because most types don't care about getting moved.

Right; I just forget that a lot because most other adapters provide services to their owner rather than to the inner type.

If you want to think of it in that manner, the service that a Pinned pointer provides to its owner is the ability to call self: Pin<&mut Self> methods on the contained type.
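
A minimal sketch of that "service", using a hypothetical `Counter` type that opts out of Unpin. Its `bump` method can only be called through a pinned pointer.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical !Unpin type.
struct Counter {
    hits: u32,
    _pin: PhantomPinned,
}

impl Counter {
    fn new() -> Counter {
        Counter { hits: 0, _pin: PhantomPinned }
    }

    // Callable only on a pinned Counter.
    fn bump(self: Pin<&mut Self>) -> u32 {
        // SAFETY: a field is modified in place; the value is never moved out.
        let this = unsafe { self.get_unchecked_mut() };
        this.hits += 1;
        this.hits
    }
}

fn bump_twice() -> u32 {
    // Owning a Pin<Box<Counter>> is what lets the caller invoke `bump`.
    let mut pinned: Pin<Box<Counter>> = Box::pin(Counter::new());
    pinned.as_mut().bump();
    pinned.as_mut().bump()
}
```

A plain `Counter` or `&mut Counter` has no safe way to call `bump`; the caller has to pin the value first.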
