Does borrowing checker forbid the self-referential case?

An imaginary self-referential mode that would work in Rust might be

struct A<'a> {
    ptr: Option<&'a A<'a>>,
}
fn main() {
    let mut a = A { ptr: None };
    a.ptr = Some(&a);
}

However, this code is not compiled. The error is

error[E0506]: cannot assign to `a.ptr` because it is borrowed
 --> src/main.rs:6:5
  |
6 |     a.ptr = Some(&a);
  |     ^^^^^^^^^^^^^--^
  |     |            |
  |     |            `a.ptr` is borrowed here
  |     `a.ptr` is assigned to here but it was already borrowed
  |     borrow later used here

Does the borrowing checker forbid us from writing the self-referential structure in Rust?

NLL RFC seems to cover this case.

&a is borrowed for lifetime 'a, which keeps alive until a deads. According to the RFC: An assignment statement LV = RV is a shallow write to LV;

So, the relevant borrowings for a.ptr are determined by this rule:

For shallow accesses to the path lvalue, we consider borrows relevant if they meet one of the following criteria:

  • [...]
  • there is a loan for some prefix of the path lvalue;

So, &a is that relevant borrowing, which is the in-scope loan at LV = RV. The access to a.ptr is Write, and the relevant borrowing exists. They conflict with each other. So, the borrowing checker reports the error.

Is this the reason why we cannot write out self-referential structures in Rust?

Is this the mechanism of how the borrowing checker forbids us from creating a self-referential structure?

Right. The borrow checker forbids self references unless you do something more fancy.

It does not forbid using multiple instances of the same type, e.g.

struct A<'a> {
    ptr: Option<&'a A<'a>>,
}
fn main() {
    let b = A { ptr: None };
    let mut a = A { ptr: Some(&b) };
}

Is this the reason why we cannot write out self-referential structures in Rust?

Rust depends on being able to perform static analysis (at compile time) to prove that a reference will be valid, and to automatically put the drop code and deallocations in the right place. Self-references don't permit the analysis to happen in advance.

If you're trying to build a complex graph structures where you can legitimately have self references that are checked at runtime, checkout Rc / Arc

The last question should be: Is this the mechanism of how borrowing checker forbids us from creating self-referential structure?

Yes, you can't write a.ptr because a is borrowed in the example.

Any self-referential structure? No, but since you have to borrow the structure forever to construct it, such structures aren't practical (as you can't move it, etc).

There's no "forbid self-referential structures" specific rule. Personally I find it impressive that you can construct self-referential struct soundly with safe code through the mechanisms that do exist (borrow splitting and reborrowing), without special rules. (But the end result is something not practical to use -- because soundness requires that you can't move it, etc.)

2 Likes

Is this the mechanism of how borrowing checker forbids us from creating self-referential structure?

Yeah. Basically. A self-reference can't fulfill all of the borrowing and exclusivity requirements. Unless you use one of the patterns that defers some of the checking to runtime. Like Rc, etc.

Here's a modification of your OP that compiles by using a Cell to avoid direct assignment. (At which point it's shared-borrowed forever and you can't move it or take a &mut to it.)

2 Likes

This is not the self-referential structure of what I was thinking. What I meant here is the field references containing structure, as shown in my example.

@quinedot I have a question in your article.

struct Snek<'a> {
    owned: String,
    // Like if you want this to point to the `owned` field
    borrowed: &'a str,
}
impl<'a> Snek<'a> {
    fn bite(&'a mut self) {
        self.borrowed = &self.owned;
    }
}

You said: The only way to use it at all is via a (reborrowed) return value from the method call that required &'a mut self.

However, if you change the bite to return the self

impl<'a> Snek<'a> {
    fn bite(&'a mut self)->&'a mut Self {
        self.borrowed = &self.owned;
        self
    }
}

The code will get an error that:

error[E0502]: cannot borrow `*self` as mutable because it is also borrowed as immutable
 --> src/main.rs:9:9
  |
6 | impl<'a> Snek<'a> {
  |      -- lifetime `'a` defined here
7 |     fn bite(&'a mut self)->&'a mut Self {
8 |         self.borrowed = &self.owned;
  |         ---------------------------
  |         |               |
  |         |               immutable borrow occurs here
  |         assignment requires that `self.owned` is borrowed for `'a`
9 |         self
  |         ^^^^ mutable borrow occurs here

This result adheres what the RFC rules.

Perhaps it's worth talking about some reasons why the OP is forbidden but the Cell version is sound.

  1. Imagine you had a borrow of &a, and you sent it to a scope thread and continuously did reads through it. Then writing to a.ptr would be a data race (UB).

  2. Imagine the optimizer saw that a was immutably[1] borrowed, and cached the contents behind the reference. The assumptions of such an optimization would be violated by a write, which could lead to UB.

Why does Cell solve these?

  1. Cell isn't Sync or Send.

  2. UnsafeCell (used by Cell) disables the "direct contents of &_ are immutable" rule so such optmizations don't happen. :asterisk:[2]

Here's what I meant.

    fn bite(&'a mut self) -> &'a Self {
        self.borrowed = &self.owned;
        self
    }

The -> &mut Self version doesn't work because the &mut Self overlaps with the outstanding borrow of the owned field. But you can shared-reborrow *self and it only conflicts with outstanding &mut _ borrows, not outstanding &_ borrows.


  1. :asterisk: ↩︎

  2. Normally I'd call &_ a shared reference, not an immutable reference, to avoid confusion. But in this case the presence or absence of an immutability guarantee is important to the conversation. ↩︎

1 Like

Oh, that means we can never use a mutable reference to Snek anymore. Changing the return type from &'a mut Self to &'a Self works is that, the returned value can be rewritten to & mut *self and &*self, the former is a deep write to lvalue *self whereas the latter is a deep read to *self, in both cases, the relevant loan are &self.owned, according to

For deep accesses to the path lvalue, we consider borrows relevant if they meet one of the following criteria:

  • [...]
  • lvalue is a supporting prefix of the loan path

&self.owned is desugared to & (*self).owned, its supporting prefixes are: (*self).owned, *self, and self. So, this loan is relevant due to *self. The deep write conflicts with the relevant reading borrowing while the deep read do not. So, returning self is Ok.

BTW, I didn't find the document that says that &self.owned is desugared to & (*self).owned, even though I know we should do when analyzing the relevant loans.

Correct. Aliasing would be UB, and if you can get a &mut you can move the referent too.

I can't remember if it's explicit in the RFC or not. It's mentioned here:

Do you mean self has type & mut T and the blanket implementation is used

impl<T: ?Sized> const Deref for &mut T {

    type Target = T;
    fn deref(&self) -> &T {
        *self
    }

}

So, self will be automatically dereferenced to access the field owned?

I just meant I can't remember if the NLL points out such desugarings. A quick glance makes me think that it doesn't talk about it, it is just implicit because we're operating at the MIR level.

When the RFC is talking about * it is talking about built-in operations.

Ok, the transformation is not documented. The consensus is that whenever we meet operand.field where operand refers to the containing struct of field, we should first transform it to (*operand).field for loan path analysis, right?

1 Like

Right.