Why does the borrow checker raise an error when I keep a reference to an element in a vector that is appended to within a loop?

I have two structs, Holder and Held. Holder holds a reference to Held. Held holds an i32:

struct Holder<'a> {
    val: &'a Held,
}

#[derive(Debug)]
struct Held(i32);

I want to create 10 Holders in a Vec<_> named holders. Since Holder takes a reference to Held struct, I also create a Vec<_> named heldvals that will store the Held structs for the scope of main function:

pub fn main() {
    // contains the `Holder`s
    let mut holders = vec![];

    // contains the `Held`s
    let mut heldvals = vec![];

    for i in 0..10 {
        heldvals.push(Held(i));

        holders.push(Holder {
            val: &heldvals.last().unwrap(),
        });
    }
}

When I attempt to compile this program, I get an error:

error[E0502]: cannot borrow `heldvals` as mutable because it is also borrowed as immutable
   |
   |         heldvals.push(Held(i));
   |         ^^^^^^^^^^^^^^^^^^^^^^ mutable borrow occurs here
   | 
   |         holders.push(Holder {
   |         ------- immutable borrow later used here
   |             val: &heldvals.last().unwrap(),
   |                   -------- immutable borrow occurs here

As a workaround, I reluctantly decided to use unsafe, which works without any errors. I even implemented the Drop trait to confirm that there is no memory issue.

// ...
impl Drop for Held {
    fn drop(&mut self) {
        dbg!(self);
    }
}

pub fn main() {
    let mut holders = vec![];

    let mut heldvals = vec![];
    let hptr = &mut heldvals as *mut Vec<Held>;

    for i in 0..10 {
        println!("creation");
        unsafe {
            (*hptr).push(Held(i));
        }

        holders.push(Holder {
            val: &heldvals.last().unwrap(),
        });
        println!("replacement");
    }
}

Running the code above gives this (reduced) output:

creation
replacement (10 times)
[src/main.rs:12] self = Held(
    0,
)
... 
[src/main.rs:12] self = Held(
    9,
)

Valgrind shows no memory leaks or issues either:

HEAP SUMMARY:
    in use at exit: 0 bytes in 0 blocks
  total heap usage: 18 allocs, 18 frees, 3,521 bytes allocated

All heap blocks were freed -- no leaks are possible

ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Is there a way to avoid the usage of unsafe? I found out about Vec::reserve(), is that a good idea?

Note that usage of reference counters is impossible for me. I want a simple way to hold references until program exits.

The compiler is correct and your code is unsound. When pusing into heldvals, the buffer will need to be periodically reallocated which may require it to move in memory. All of the references in holders will then be pointing to unallocated memory.

The fact that it happens to work as expected in that specific test is just down to luck.

14 Likes

This is in general called iterator invalidation, and it's the very kind of mistake that Rust is trying to prevent.

In general, if you have a borrow checker error, the compiler is right and you shouldn't circumvent it using unsafe. In 99% of the cases, unsafe is only legitimately useful for doing FFI, and not for second guessing the safety mechanisms built into the language. There are legitimate use cases for optimizing code using unsafe, but you shouldn't try to do that if you don't even understand the memory model of safe Rust yet.

Is there a way to avoid the usage of unsafe?

Yes, use indexes instead.

5 Likes

By indexes, I suppose you mean:

        holders.push(Holder {
            val: &heldvals[some_index],
        });

Isn't that the same as before, just with less extra method calls?

No, I mean you use indexes instead of references. Instead of storing a reference to the element of the vector, you store its index so that you can look it up later.

2 Likes

How am I supposed to rewrite the code to suit to your suggestion of using indices?

Note that this is just an MCVE. The original application that I'm working on requires me to store references and unfortunately there are no alternatives to it.

struct Holder {
    index: usize,
}

pub fn main() {
    // contains the `Holder`s
    let mut holders = vec![];

    // contains the `Held`s
    let mut heldvals = vec![];

    for i in 0..10 {
        heldvals.push(Held(i));

        holders.push(Holder {
            index: heldvals.len() - 1,
        });
    }
}

You can't hold references to the elements of a vector and then push to the vector, period. If for some reason you can't switch to using indices, you will have to work around the problem in some other way. For example, you can clone the original vector, push elements to the clone, and then update the references so that they point into the new vector.

2 Likes

The minimal rewrite of the example is to build the data-holding structure completely and then build the structure holding many references into it:

pub fn main() {
    let heldvals: Vec<_> = (0..10).map(Held).collect();
    let holders: Vec<_> = heldvals.iter().map(|val| Holder { val }).collect();
}

So long as holders is alive, heldvals is immutably borrowed.

5 Likes

You can make heldvals a typed_arena::Arena to write the original program without needing to use indices. (Internally, it works similarly to what you're trying, but uses additional Vecs so that it doesn't reallocate one that's in use.)

4 Likes

This just happens to work because you never actually use the references in heldvars. Your Drop doesn't count as a use because that's called by the vec on the correct Helds, instead you should have implemented it for Holder. If you do that you'll see it prints either garbage or zeroes for the first four items (the ones created before the vec reallocated).

3 Likes

Thanks to everyone for their valuable suggestions! Right now, if I were to produce an MCVE of current application state, that would be:

use std::{cell::RefCell, rc::Rc};

#[derive(Debug)]
struct Held(String, Rc<RefCell<String>>);

#[derive(Debug)]
struct Holder<'a> {
    val: &'a String,
    another: &'a String,
}

fn writer(vals: &[Held]) {
    let mut holders = vec![];

    for each in vals {
        holders.push(Holder {
            val: &each.0,
            another: unsafe { &*each.1.as_ptr() }, // is this safe!?
        });
    }

    // perform a write operation here (actually done with diesel)
    println!("{:#?}", holders);
}

pub fn main() {
    let helds = vec![
        Held(
            "1st val".into(),
            Rc::new(RefCell::new("Rc companion 1".into())),
        ),
        Held(
            "2nd val".into(),
            Rc::new(RefCell::new("Rc companion 2".into())),
        ),
    ];

    writer(&helds[..]);
}

Note that one of fields of Held uses a Rc<RefCell<T>> (which is practically unavoidable in original application). Is this line of code sound?:

    another: unsafe { &*each.1.as_ptr() }, // is this safe!?

My rationalisation would be that since Rc<RefCell<T>> points to a valid value for the lifetime of the program (or context), it should be valid.

However I'm still unable to avoid the usage of unsafe. Is there any way to circumvent this?

No, it's not ok. The following has UB if you run it with miri:

let mut guard = helds[0].1.borrow_mut();
let mut_ref: &mut String = &mut *guard;
writer(&helds[..]);
*mut_ref = "foo".to_string();

Rust requires that when you have a mutable reference, nothing else may access that mutable reference's target between the creation and last use of that mutable reference, which you are violating by reading the value in writer.

This would be sound: (assuming that nobody calls borrow_mut inside the write operation)

for each in vals {
    // this will panic if there are any active borrow_mut's to this RefCell
    // currently
    let _ = each.1.borrow();
    holders.push(Holder {
        val: &each.0,
        another: unsafe { &*each.1.as_ptr() },
    });
}

But I would not exactly call it idiomatic. It would be better to do something like this:

#[derive(Debug)]
struct Holder<'a> {
    val: &'a String,
    another: std::cell::Ref<'a, String>,
}

fn writer(vals: &[Held]) {
    let mut holders = vec![];

    for each in vals {
        holders.push(Holder {
            val: &each.0,
            another: each.1.borrow(),
        });
    }

    // perform a write operation here (actually done with diesel)
    println!("{:#?}", holders);
}
3 Likes

Or sounds a lot like you're attempting to do something inherently unsafe. You seem to need to hold these references and also need to hold opened data with no relationship to the references. Can you explain why you want that?

Using an arena would seem the natural solution, which would ensure that your data outlives references to it. Alternatively, if it turns out that your data isn't going to be freed before program exit, then leaking the data is a natural solution.

What is the constraint that is forcing you into this error? Addressing that problem would be the key, but you aren't telling us what it is.

5 Likes

Here is the issue. I'm using Diesel (+ Sqlite) for Insertion and Read purposes. An "insertable" struct demands that the fields of the struct must hold a reference to the value to be inserted.

Here is an example from diesel's guides:

use schema::users;

#[derive(Deserialize, Insertable)]
#[table_name = "users"]
pub struct UserForm<'a> {
    name: &'a str,
    hair_color: Option<&'a str>,
}

Currently, I'm inserting each value individually, which is very inefficient. (See IncompleteInsertStatement::values)

Now the problem appears when I try to batch insert the values. I cannot manage references properly. I cannot manage the RefCells properly.

And I just want to leverage the efficiency, and avoid any chance of using unsafe and/or slow alternatives.

This is an excellent solution for me because I'm forced to use normal references only, thanks to Diesel.

WRT my previous code (about Held(i32)), it occurred to me that storing the Helds first, and then inserting the references into Holder later is the best way to solve this situation.

#[derive(Debug)]
struct Holder<'a> {
    val: &'a Held,
}

#[derive(Debug)]
struct Held(i32);

pub fn main() {
    // contains the `Holder`s
    let mut holders = vec![];

    // contains the `Held`s
    let mut heldvals = vec![];

    let lim = 10;
    
    // store the Held first
    for i in 0..lim {
        heldvals.push(Held(i as i32));
    }

    // now that they all have stable memory locations, we can insert Helds!
    for i in 0..lim {
        holders.push(Holder {
            val: &heldvals[i],
        });
    }

    // check 'em out
    for cur in &holders {
        println!("val: {:?}", cur);
    }

    // invalidate the holders:
    // Rust quickly detects the potential error (memory invalidation) that
    // might happen, and blocks us!
    // heldvals.push(Held(15));

    println!("{:#?}", holders[1]);
}

Try it in Playground!

1 Like

Another option is to put the Ref objects in a separate vector, then take references into elements in that vector.

That's exactly what my previous comment does (using iterators and collect).

3 Likes

I don't think this is true, is it?

use schema::users;

#[derive(Deserialize, Insertable)]
#[table_name = "users"]
pub struct UserForm {
    name: String,
    hair_color: Option<String>,
}

should be perfectly fine, for example.

2 Likes

The basic idea behind rust borrowing is that you can't borrow something while you are mutating it.

For your use case, you want to batch insert UserForm. This is doable by creating a Vec<UserForm> which must borrow from somewhere else, for example, Vec<UserData>. The key point is that you are borrowing the whole Vec<UserData> not each UserData. If you update Vec<UserData> and Vec<UserForm> simultaneously, Rust compiler can't statically guarantee that the former is not changed while you are borrowing it.

2 Likes

Absolutely!

1 Like