Proper way to create self referential structs (with or without `Pin`)

Let's say I have 2 structs, where:

  • A will hold a reference to B
  • B will hold a reference to A<'_> in Vec<*const T>
  • During A's init, A will store a reference to itself in Vec<*const T> of B

use std::{
    cell::RefCell,
    marker::{PhantomData, PhantomPinned},
    pin::Pin,
    rc::Rc,
};

type RcCell<T> = Rc<RefCell<T>>;

struct A<'a> {
    b: RcCell<B<'a>>,
    marker: PhantomPinned,  // Is this necessary?
}

struct B<'a, T: 'a = A<'a>> {
    a: Vec<*const T>,
    marker: PhantomData<&'a T>,
}

My questions are:

  1. Is this the correct way to do this?

    impl<'a> A<'a> {
        fn new(b: RcCell<B<'a>>) -> Pin<Box<Self>> {
            let mut x = Box::pin(Self {
                b: Rc::clone(&b),
                marker: PhantomPinned,  // Is this necessary?
            });
    
            let val: *const Self = &*x.as_ref();
    
            x.as_mut().b.borrow_mut().a.push(val);
    
            x
        }
    }
    
  2. Is the PhantomPinned in A<'_> necessary? Why or why not?

  3. How am I supposed to implement methods on A<'_>?

    One way I know is to use Pin<&Self> and Pin<&mut Self>

    impl<'a> A<'a> {
        fn with_pin(self: Pin<&Self>) {
            // ...
        }
    
        fn without_pin(&mut self) {
            // ...
        }
    }
    

    But this prevents me from calling the with_pin methods through the pointers stored in B<'a, T>'s a field:

    let first = b.borrow().a.first().unwrap(); // ‣first: &*const A
    // How will I use `first`?
    

    Another way is to use the without_pin methods. Is that a good idea?

  4. I'm very sceptical about all these methods. Are there any other ways to achieve the same goal without resorting to unsafe?

    I tried using Rc<T>, but it always ended with reference cycles and memory leaks.

In general, self-referential structs should not have lifetimes on them.

Did you try changing the Rc in one of A or B to a Weak? Then there's no cycle.
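A minimal sketch of that suggestion, assuming simplified stand-ins for A and B with no lifetimes and no raw pointers. The `id` field and the `live_ids` helper are hypothetical, added only to make the effect observable:

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical stand-in for the thread's `B`.
struct B {
    // Weak back-references: they do not keep any `A` alive,
    // so there is no strong reference cycle.
    a: Vec<Weak<A>>,
}

// Hypothetical stand-in for the thread's `A`.
struct A {
    // Strong reference: every `A` keeps the shared `B` alive.
    b: Rc<RefCell<B>>,
    id: u32,
}

impl A {
    /// Creates an `A` and registers a weak handle to it in `b`.
    fn new(id: u32, b: &Rc<RefCell<B>>) -> Rc<A> {
        let a = Rc::new(A { b: Rc::clone(b), id });
        b.borrow_mut().a.push(Rc::downgrade(&a));
        a
    }
}

/// Ids of the `A`s registered in `b` that are still alive.
fn live_ids(b: &Rc<RefCell<B>>) -> Vec<u32> {
    b.borrow()
        .a
        .iter()
        .filter_map(|weak| weak.upgrade()) // `None` once the `A` is gone
        .map(|a| a.id)
        .collect()
}
```

Dropping the last `Rc<A>` frees that `A` and releases its strong reference to `B`. The stale `Weak` entry stays in the vector but simply fails to upgrade.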

Thanks for the reply.

I tried using Weak. Unfortunately, the struct that I want to use must derive PartialOrd, Ord etc. among other things.

Can you elaborate on this?

You could try to manually implement PartialOrd and Ord. Alternatively you could try adding a destructor to A struct that removes itself from the vector in B.
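As a rough illustration of the first option, here is a sketch that assumes the ordering should depend on a single hypothetical `key` field and ignore the `Weak` bookkeeping entirely. The `Node` type, its fields, and the `node` helper are made up for this example:

```rust
use std::cell::RefCell;
use std::cmp::Ordering;
use std::rc::{Rc, Weak};

// Hypothetical type: `key` is the data to order by, `peers` is the
// field that prevents `#[derive(PartialOrd, Ord)]` because `Weak`
// implements none of the comparison traits.
struct Node {
    key: u32,
    peers: RefCell<Vec<Weak<Node>>>,
}

// All four impls look only at `key`, so they stay consistent with each other.
impl PartialEq for Node {
    fn eq(&self, other: &Self) -> bool {
        self.key == other.key
    }
}

impl Eq for Node {}

impl PartialOrd for Node {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for Node {
    fn cmp(&self, other: &Self) -> Ordering {
        self.key.cmp(&other.key)
    }
}

/// Convenience constructor for the sketch.
fn node(key: u32) -> Rc<Node> {
    Rc::new(Node { key, peers: RefCell::new(Vec::new()) })
}
```

Because `Rc<T>` forwards comparisons to `T`, a `Vec<Rc<Node>>` can then be sorted directly with `sort()`.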

Lifetimes cannot describe references that point into the struct that holds them, nor structs that reference each other cyclically.

I'm sorry, but I'm still unable to grasp your point.

Correct me if I'm wrong here:

  • Here, I wanted to track T.
  • But since it is a raw pointer, lifetime tracking is not possible.
  • By using PhantomData, I'm telling the compiler that T (and hence, *const T) will be alive for 'a.

The problem is that if you annotate a struct with a lifetime, then that means that it borrows some other value, and that other value cannot be a field in the struct itself. For example, what does the A struct borrow? It has a lifetime, so it must borrow something. It can't be B, because you are storing B inside A itself.

The lifetime parameter in A is there because B stores a "reference" to A. Removing it causes an error.

AFAIK, the &'a T tells the compiler that it "stores a lifetime" of some type T, even though it is not actually storing a borrow, but a raw pointer.

Also as far as I can tell, not having a lifetime would remove any guarantees of the object of type T being alive for 'a.

Consider this piece of code:

struct MyA<'a> {
    myb_by_val: MyB<'a>,
}

struct MyB<'a> {
    a_ref: &'a i32,
}

What does MyA store? MyB<'a> by value. Furthermore, MyB stores a reference to some i32, hence the lifetime.

This is fine, but it's also different from your code because there's no cycle. The i32 is not inside MyA.
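To make the acyclic case concrete, here is a small sketch that repeats the two structs and adds a hypothetical `read` helper. The point is that the i32 is owned by the caller, outside MyA, so 'a names a borrow of something external:

```rust
struct MyB<'a> {
    a_ref: &'a i32,
}

struct MyA<'a> {
    myb_by_val: MyB<'a>,
}

/// Hypothetical helper: follows the borrow out of the struct.
fn read(a: &MyA<'_>) -> i32 {
    *a.myb_by_val.a_ref
}

/// Builds a `MyA` that borrows `n`, which lives in the caller.
fn wrap(n: &i32) -> MyA<'_> {
    MyA { myb_by_val: MyB { a_ref: n } }
}
```

The borrowed value has to outlive the `MyA` that refers to it, which is exactly what cannot hold when the target is a field of the same struct.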

I see. So you mean that lifetimes are rendered useless once they are in the context of circular references?

And if they are, how am I supposed to solve this problem without worrying about unsafe, deref to null etc. pitfalls?

Either break the cycle using Weak or do something else entirely.

What about Pin? I've seen it used mostly in contexts where self-referential structs are involved. How can I use Pin here?

(Because I've almost zero intention of abandoning the idea of pinning, although I'm open to new ideas)

The purpose of Pin is to let you use a self-referential struct safely without putting it on the heap. Writing self-referential structs still requires unsafe, Pin or not. In fact, if the self-referential data is on the heap, then you don't even need Pin at all.
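A small sketch of the heap point, using only safe code. Moving a Box moves the pointer, not the allocation, so an address taken earlier still matches afterwards. The `Payload` type and both function names are hypothetical:

```rust
// Hypothetical payload type, used only to have something to point at.
struct Payload {
    value: u64,
}

/// Address of the payload itself (the heap allocation when called on a box).
fn addr(p: &Payload) -> *const Payload {
    p as *const Payload
}

/// Moves the box into a `Vec` and back out, returning the payload's
/// address before and after the moves. The pointers are only compared,
/// never dereferenced.
fn move_box_around(b: Box<Payload>) -> (*const Payload, *const Payload) {
    let before = addr(&b);
    let mut holder = vec![b]; // the `Box` is moved into the vector
    let moved = holder.pop().expect("holder contains exactly one box");
    let after = addr(&moved);
    (before, after)
}
```

The same comparison on a value stored directly on the stack carries no such guarantee, which is the situation Pin is meant to address.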

Let's say, hypothetically, that I made up my mind on using Pin, even if it carries certain drawbacks and corner cases.

Now, how am I supposed to solve the problem described above?

It doesn't make sense to use Pin for what you're trying to do.

Pin is more like a type-system warning that there's a dangerous-to-use struct hidden behind it, but apart from limiting access, it doesn't really do anything. It certainly can't help make a self-referential struct safely.
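To illustrate the "limiting access" part, here is a rough sketch with a hypothetical `Pinned` type. Shared access through Pin<&Self> works as usual, while mutation of a !Unpin type has to go through an unsafe call where the author promises not to move the value:

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical type; `PhantomPinned` makes it `!Unpin`.
struct Pinned {
    value: u32,
    _pin: PhantomPinned,
}

impl Pinned {
    /// Shared access needs no unsafe: `Pin<&Self>` derefs to `&Self`.
    fn value(self: Pin<&Self>) -> u32 {
        self.value
    }

    /// Mutable access to a `!Unpin` type requires `get_unchecked_mut`.
    fn set_value(self: Pin<&mut Self>, v: u32) {
        // SAFETY: only a plain field is overwritten; the value is never
        // moved out of its pinned location.
        unsafe {
            self.get_unchecked_mut().value = v;
        }
    }
}

/// Allocates the value on the heap and pins it there.
fn make(value: u32) -> Pin<Box<Pinned>> {
    Box::pin(Pinned { value, _pin: PhantomPinned })
}
```

Nothing here creates or validates a self-reference. Pin only removes the safe ways of getting a `&mut Pinned`, and with them the safe ways of moving the value.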

You do need PhantomPinned. Self-referential structs are a hack that the Rust compiler doesn't understand and could in theory miscompile. PhantomPinned has been agreed on as a signal to the compiler that the struct is self-referential, so that it hopefully won't miscompile it.
