(Not sure it is appropriate to post this here. Also I am no expert so pardon me if I am wrong)
I have been following Rust for a few years and have lately decided to give it a serious try. While learning Rust, I was trying to understand why borrow limits are imposed even for single-threaded programs, and (I think) I have gained some understanding here. It seems that multiple mutable borrows have no consequence for the memory safety of single-threaded programs if and only if no structure type has a pointer to heap-allocated data[1]. Things only get problematic if a structure type has a pointer to the heap and some function that invalidates that heap allocation is called after the object has already lent out a reference.
Update: The following strategy is less expressive and has bad granularity, among other problems. Skip to Update 3.
A strategy I have come up with as an alternative to borrow checking is to track memory invalidation and reference flow "effects" in terms of the formal parameters of individual functions. For example, a function that invalidates a heap allocation of some formal parameter p is said to have a memory invalidation effect (memoryinv for short) on p. So, Vec::push has a memoryinv effect on the self parameter. Similarly, a function is said to have an effect (a <- b) if there is reference flow from b to a, given that a and b are reference typed. These effects are built for individual functions and propagated to the root of the call graph. Local "reference flow" effects can be built using points-to analysis. Also, move semantics could be taken into account.
For example,
struct Foo {}

struct FooHolder {
    foo: &mut Foo,
}

// This function has the effect list [(y <- x)]
fn func1(x: &mut FooHolder, y: &mut FooHolder) {
    y.foo = x.foo;
}

// This function has the effect list [(a <- b)]
fn func2(a: &mut FooHolder, b: &mut FooHolder) {
    func1(b, a); // Add (a <- b) to the effect list since 'func1' has effect (y <- x)
}

// This function has the effect list [memoryinv(vec)] since it calls 'Vec::push' on 'vec'
fn func3(vec: &mut Vec<i32>) {
    vec.push(3);
}

struct BoxHolder {
    b: &mut Box<i32>,
}

// This function has the effect list [memoryinv(bh), memoryinv(b)]
fn func4(bh: &mut BoxHolder, b: &mut Box<i32>) {
    bh.b = b;
    std::mem::drop(b); // Add 'memoryinv' for both 'bh' and 'b' since 'bh' holds a reference to 'b'
}

fn main() {
    let mut outer = FooHolder { foo: &mut Foo {} };
    {
        let mut inner = FooHolder { foo: &mut Foo {} };
        // Error: 'func2' has an effect (a <- b) but 'outer' outlives 'inner'
        func2(&mut outer, &mut inner);
    }
    {
        let mut vec = vec![1, 2, 3];
        let mut itemref1 = &mut vec[0];
        let mut itemref2 = &mut vec[1];
        // Error: 'vec' has already lent out a reference but 'func3' has a memory invalidation
        // effect on 'vec'
        func3(&mut vec);
    }
    {
        let mut b = Box::<i32>::new(2);
        let mut bref1: &mut i32 = &mut b;
        // Error: 'b' has already lent out a reference but 'std::mem::drop' has a memory
        // invalidation effect on 'b'
        std::mem::drop(b);
    }
}
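To make the propagation step a bit more concrete, here is a minimal sketch of how effect lists could be renamed from callee parameters to caller parameters. Everything in it (the Effect, Call and Function types, effects_of, example_program) is a hypothetical toy model I made up for illustration, not an existing compiler API. It identifies parameters by position and assumes a call graph without recursion.

```rust
use std::collections::{BTreeSet, HashMap};

/// An effect expressed over a function's formal parameters (by position).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Effect {
    /// The function may invalidate heap memory reachable from this parameter.
    MemoryInv(usize),
    /// A reference flows from parameter `from` into parameter `to`.
    Flow { to: usize, from: usize },
}

/// A call site: `args[i]` is the caller's parameter passed as callee parameter `i`.
struct Call {
    callee: &'static str,
    args: Vec<usize>,
}

/// A function summary: effects known locally plus the calls it makes.
struct Function {
    local: Vec<Effect>,
    calls: Vec<Call>,
}

type Program = HashMap<&'static str, Function>;

/// Collects a function's effects by renaming each callee effect to the
/// caller's parameters. Assumes every callee is present in `program` and
/// that the call graph has no recursion (hypothetical simplifications).
fn effects_of(name: &str, program: &Program) -> BTreeSet<Effect> {
    let function = &program[name];
    let mut effects: BTreeSet<Effect> = function.local.iter().copied().collect();
    for call in &function.calls {
        for effect in effects_of(call.callee, program) {
            effects.insert(match effect {
                Effect::MemoryInv(p) => Effect::MemoryInv(call.args[p]),
                Effect::Flow { to, from } => Effect::Flow {
                    to: call.args[to],
                    from: call.args[from],
                },
            });
        }
    }
    effects
}

/// Summaries for 'Vec::push', 'func1', 'func2' and 'func3' from the example.
fn example_program() -> Program {
    let mut program = Program::new();
    // Manually annotated: 'Vec::push' invalidates memory of parameter 0 (self).
    program.insert("Vec::push", Function { local: vec![Effect::MemoryInv(0)], calls: vec![] });
    // func1(x, y) has the local effect (y <- x).
    program.insert("func1", Function { local: vec![Effect::Flow { to: 1, from: 0 }], calls: vec![] });
    // func2(a, b) calls func1(b, a).
    program.insert("func2", Function { local: vec![], calls: vec![Call { callee: "func1", args: vec![1, 0] }] });
    // func3(vec) calls Vec::push(vec, ..).
    program.insert("func3", Function { local: vec![], calls: vec![Call { callee: "Vec::push", args: vec![0] }] });
    program
}
```

Calling effects_of("func2", &example_program()) yields the single effect Flow { to: 0, from: 1 }, which is (a <- b) in the notation used here, and effects_of("func3", ...) yields MemoryInv(0).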
Is this reasoning correct or am I missing something?
Update 1: Of course, the compiler cannot infer memory invalidation effects. The memory invalidation effect must be manually annotated alongside the function signature. For example, the Vec::push method must be manually annotated since it might reallocate its heap buffer. And yes, missing annotations can be a source of bugs, but annotations are needed only for data structures implemented using raw pointers, and those cases mostly involve unsafe blocks.
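As a small illustration of why Vec::push would need such an annotation, the following sketch (the function name capacity_around_push is mine) fills a vector up to its capacity and then pushes once more. That last push cannot fit into the existing buffer, so the vector has to obtain a larger allocation, and the old buffer may be freed or moved.

```rust
/// Fills a vector to capacity, then pushes once more.
/// Returns the capacity before and after that final push.
fn capacity_around_push() -> (usize, usize) {
    let mut v: Vec<i32> = Vec::with_capacity(2);
    // Use up all the space that was allocated up front.
    while v.len() < v.capacity() {
        v.push(0);
    }
    let before = v.capacity();
    // The buffer is full, so this push has to grow the allocation.
    // Any pointer into the old buffer could dangle afterwards.
    v.push(1);
    (before, v.capacity())
}
```

The second capacity is always larger than the first. Whether the buffer actually moves to a new address is up to the allocator, which is why the annotation has to be conservative.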
Update 2:
[1] - As @trentj pointed out, this post originally didn't take type-punning enums into account. Marking the enum assignment operator as memory invalidating could solve this problem. This might seem hacky, like "plugging the holes", but it actually is not. The basic definition of the memory invalidation effect is that any references lent out by an object get invalidated. So, assigning a value of a different variant to an enum object could be considered memory invalidation. The premise of this proposal therefore still holds: multiple mutable references to an object are benign unless the memory invalidation effect this proposal describes is exerted on the object later in the same scope.
Update 3:
After starting this thread, I read a fair amount of Rust code (and wrote a little of my own), and I think I understand Rust a lot better now. I find explicit lifetimes to be a good thing (and not as painful as I thought) for the most part. It seems I have been reasoning about lifetimes in C++ as well, but only implicitly. And Rust makes up for the extra thinking/typing with better ergonomics (when compared to C++), plus safety guarantees. That said, I am still uncertain about the exclusive mutability restriction. The title of this thread has been changed to reflect this.
In this post, the memory invalidation idea (maybe "effect" gives the wrong impression) still sounds reasonable. For instance, Vec<>::push would be annotated as memory invalidating for self, and other functions that accept a Vec<> parameter and call Vec<>::push on it would automatically be considered memory invalidating for that parameter. The only invariant is that a local variable must not be passed as an argument to a function that invalidates its memory after it has lent out a reference. So, no mutability or borrow limits for thread-local objects. Note that only Vec<>::push needs to be annotated in the example, and the functions that need to be annotated most likely involve unsafe. This method eliminates all the errors that the exclusive mutable aliasing restriction prevents in single-threaded contexts (like iterator invalidation et al.).
For example,
mod std {
    mod vec {
        impl<T> Vec<T> {
            ...
            // Hypothetical attribute proposed in this post
            #[invalidates_memory(self)]
            pub fn push(&mut self, value: T) {
                ...
            }
            ...
        }
    }
}

// This function is inferred to be memory invalidating for 'vec'
// since it calls 'Vec<>::push' on 'vec'
fn push_to(vec: &mut Vec<i32>) {
    vec.push(2);
}

fn main() {
    let mut vec = vec![1, 2, 3];
    let item_ref = &mut vec[0];
    let item_ref2 = &mut vec[1];
    // Error since 'push_to' invalidates 'vec' but 'vec' has already lent out
    // at least one reference
    push_to(&mut vec);
}
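The invariant itself is simple enough to sketch as a check over a straight-line function body. The Step type and first_violation function below are a hypothetical toy model of mine, not a real analysis. It assumes that every lent-out reference stays live until the end of the body and that the set of memory invalidating calls is already known.

```rust
use std::collections::HashSet;

/// One step in a straight-line function body (hypothetical model).
enum Step<'a> {
    /// `var` lends out a reference, e.g. `&mut vec[0]`.
    Lend { var: &'a str },
    /// `var` is passed to a function that is memory invalidating for it.
    Invalidate { var: &'a str },
}

/// Returns the index of the first step that invalidates a variable
/// which has already lent out a reference, or `None` if there is none.
fn first_violation(steps: &[Step]) -> Option<usize> {
    let mut lent: HashSet<&str> = HashSet::new();
    for (index, step) in steps.iter().enumerate() {
        match step {
            Step::Lend { var } => {
                // Any number of references may be lent out, mutable or not.
                lent.insert(*var);
            }
            Step::Invalidate { var } => {
                if lent.contains(var) {
                    return Some(index);
                }
            }
        }
    }
    None
}
```

Modelling the main function above as Lend("vec"), Lend("vec"), Invalidate("vec") reports a violation at the third step (index 2), while invalidating first and lending afterwards is accepted.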
An enum constructor could also be considered memory invalidating. So in @trentj's example,
enum Foo {
    Bool(bool),
    Int(i8),
}

let mut foo = Foo::Int(10);
let mut i = match foo {
    Foo::Int(ref mut i) => i, // 'foo' lends out a reference here
    _ => unreachable!(),
};
// Error, since this invalidates the memory of 'foo' but 'foo' has already
// lent out a reference
foo = Foo::Bool(true);
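For comparison, here is a sketch of the same sequence arranged so that the reference is no longer in use when the enum is reassigned. The function name mutate_then_replace and the derive line are my additions. Today's borrow checker accepts this version, and as far as I can tell the memory invalidation rule would accept it too, since no lent-out reference is live at the assignment.

```rust
#[derive(Debug, PartialEq)]
enum Foo {
    Bool(bool),
    Int(i8),
}

/// Mutates the `Int` payload through a reference, then replaces the variant.
fn mutate_then_replace() -> (i8, Foo) {
    let mut foo = Foo::Int(10);
    let seen = match foo {
        Foo::Int(ref mut i) => {
            *i += 1; // 'foo' lends out a reference only inside this arm
            *i       // copy the value out, so the reference ends here
        }
        _ => unreachable!(),
    };
    // No reference into 'foo' is live any more, so changing the variant is fine.
    foo = Foo::Bool(true);
    (seen, foo)
}
```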
So, wouldn't it reduce a lot of friction to segregate objects into thread-safe and non-thread-safe ones, enforcing the exclusive mutability restriction only for thread-safe objects and this memory invalidation restriction for the rest?