Workaround for: cannot borrow `*self` as mutable more than once at a time


#21

I think it’s perfectly reasonable to reach for unsafe code if that’s what it takes to achieve the desired functionality (and/or performance). Just need to be careful with that code and not snooze since the compiler won’t help detect violations. In @godmar’s case, that might be a valid approach since the root issue is not being able to convince the compiler that accesses are valid. In particular, since the functions/methods in question are private, the unsafety could be fully self-contained in the impl (or parts thereof, rather). I’d explore safe impl choices first, but if nothing palatable comes of it, I’d go with unsafe (and a quick prayer! j/k :slight_smile:).


#22

Oh, I was not intending to mean that using unsafe would be unnecessary here. My point of view on this matter is currently neutral: I do not understand @godmar’s intent well enough to tell if reaching for unsafe is called for or not, and would personally need more code before forming a firm opinion on this matter.

All I was trying to do is to explain a bit why the Rust language has this peculiar design which often causes much trouble to newcomers from other languages. It was meant as an encouraging thought, in a “we mean no harm, and are not trying to sell you a broken product, everything is working as intended and if you give it a bit of time I hope you will find Rust to become a friendly fellow” kind of way.


#23

I intended that post for @godmar, it wasn’t a rebuttal to your points :slight_smile:

I think the issue there is going to bottom out in borrowck not understanding disjointness of array slots. If it’s the case that conflicting references need to be alive at the same time, it won’t borrow check.

We’ve discussed some ways to deal with that but given this is an allocator, it might not be prudent to pay perf penalties, even if ever so slight, and instead just take the unsafe plunge. But maybe not - I agree that it depends on the exact code.


#24

To benefit from Rust’s promise of efficient and typesafe code, I’d like to keep the amount of unsafe code to a minimum. In addition, I do not understand how the compiler’s understanding of aliases would even interact with unsafe code. In unsafe blocks, does the compiler assume references may be aliased (as in C code) and issue the necessary memory references?

I will admit that I don’t understand why it is apparently so difficult to maintain references to internal fields that do not change (and why the ability to tell the compiler that a field doesn’t change was removed, but I haven’t picked apart the corresponding thread I found online and linked to earlier in this thread). Naively, it seems that the immutability of an object is separate from the immutability of its components (that’s at least how it is in other languages). In Rust, the two seem to be interwoven for reasons that aren’t quite clear. Maybe consider adding a chapter to the 3rd edition of the book.


#25

Here is a modification to your last playground that uses raw ptrs (no unsafe code yet because there’re no derefs of the ptr).

It’s not really unsafe per se but rather raw ptr usage that will allow escaping the sandbox; deref’ing them will require unsafe blocks but that’s separate. A raw ptr isn’t tracked by borrowck. The aliasing and lifetime aspects still need to be enforced, except you do it manually now - compiler assumes you got it right but isn’t able to check it.

It’s not difficult to maintain references to fields as long as they’re disjoint, as we’ve discussed. Moreover, disjointness is important only if you want to hold a unique/mutable reference.

The difficulty is more to do with compiler not understanding disjointness beyond fields.


#26

Thank you. So if I cast as *const _ I avoid the borrow checker, but then have to prefix unsafe a simple dereference. (Better?)

I’m trying the index based approach now. I’ll include a snippet of the actual code for a change (it’s not complete, but compiles)

    pub fn alloc(&mut self, layout: Layout) -> Result<*mut u8, AllocErr> {
        let neededsize = cmp::max(layout.size(), layout.align());
        let clazzi = self.find_size_class_index(neededsize)?;
        //let clazz = &self.classes[clazzi];

        if self.classes[clazzi].free.is_empty() {
            let block = align_up(self.brk, self.classes[clazzi].sz);
            // [self.brk, block] must be added to free list
            let tmp = self.brk;
            self.add_range_to_free_list(tmp, block);
            let newbrk = block.saturating_add(self.classes[clazzi].sz);
            if newbrk > self.end {
                return Err(AllocErr::Exhausted { request: layout })
            }
            self.brk = newbrk;
            Ok(block as *mut u8)
        } else {
            let blk = self.classes[clazzi].free.pop().unwrap() as usize;
            // [blk + layout.size(), blk + clazz.sz] must be added iff size() < align()
            let tmp = self.classes[clazzi].sz;
            self.add_range_to_free_list(blk.saturating_add(layout.size()), blk.saturating_add(tmp));
            Ok(blk as *mut u8)
        }
    }

It’s tough to like that.

As a side note, if I left this statement

let clazz = &self.classes[clazzi];

in there — without any use of clazz anywhere, it would cause the borrow checker to complain. (How do you rationalize that? Dead assignments certainly can’t cause violations.)

Then, I have to replace every occurrence of clazz with self.classes[clazzi] - not exactly pretty and not DRY.

I’ll also admit I don’t understand why I need all these tmp variables. What again is the danger in writing

self.add_range_to_free_list(self.brk, block);

that is avoided by writing

let tmp = self.brk;
self.add_range_to_free_list(tmp, block);

instead?

(I think it’s because the evaluation of method arguments is somehow not considered to occur before the method call, so it appears to the borrowck as though there are 2 mutable borrows simultaneously?)


#27

I think those two things you mention (dead locals + a local to avoid borrowck errors) are fixed with NLL. You can try by putting #![feature(nll)] in your crate root and compiling with the nightly compiler.


#28

Deref needs an unsafe block, yes - that’s a mechanical thing: bigger concern is you need to ensure the validity of what you’re doing through the raw pointers. But that would be the case 100% of time in, say, C or C++. Here you can try to confine these regions to be as small as possible.

There’s a healthy portion of std that uses unsafe internally. You’re writing a low level component (allocator) - I’ll be very impressed if you can pull off a performant and lean one without using unsafe in a few places :slight_smile:

Using indices is safe in terms of memory safety but you gain logical bugs too since an index can potentially become invalid over time. The best approach really depends on circumstances.


#29

This is indeed a key design difference between Rust and C. Rust is heavily opinionated against shared mutable state. In fact, it is considered undefined behaviour to even have an &mut and another reference to a memory location live at the same time. The compiler assumes that references will not be mutably aliased (from a C point of view, everything is restrict), and third-party unsafe code also assumes that they won’t be (often leveraging &mut as a token of unique access to data). It is one of the things that you need to be careful about when writing unsafe code.

Since shared mutable state is occasionally useful (thread synchronization primitives come to mind), Rust provides a way to opt-in to shared mutability, using UnsafeCell and higher-level abstractions built on top of it (Cell, RefCell, Mutex…). With these, you are still not allowed to alias references mutably, but raw pointers will assume mutable aliasing the way you would expect in C. This can be contrasted with the C approach, where the assumption of mutable aliasing is opt-out.


This opinionated part of Rust’s design originates from an old tradition of encouraging programmers to avoid shared mutable state, which can be traced back to two programming language families:

  • Functional programming languages, which discourage mutable state
  • Actor / Communicating processes languages, which discourage sharing

Rust innovates with respect to these two language families by allowing data to switch between a “shared” and a “mutable” state, effectively putting it under a compile-time reader-writer lock, and by generally being more suitable for low-level development thanks to the region/lifetime system avoiding the need for a garbage collector. This enables it to be more easily used in contexts where these concepts were not easily applicable before, such as embedded development.

Now, why so much dislike for shared mutable state, you may wonder?

  • It is the single biggest culprit for multi-threading’s awful reputation. Parallel shared mutation is hard.
  • It hampers many useful compiler optimizations (e.g. assumption of aliasing between input and output is one of the most common reasons for loop auto-vectorization failure)
  • It makes larger codebases harder to reason about even in a single-threaded setting (iterator invalidation is a common example, @Manishearth explored the topic more in depth on his blog)

I do not have a precise answer to this question, you may want to explore the link that was provided above in more details. But intuitively, mutability of fields seems harder to control than mutability of references. For example, in current Rust, if I have a mutable reference to a struct, I can use that to swap the entire struct with another, and that effectively mutates the supposedly immutable fields which other code could have held a reference to. This would have to be prohibited if mutability was to be controlled at the level of individual fields. I suspect that examples are numerous.

Experience with C++ also tells me that immutable fields can be uncomfortable fellows to deal with. It seals the field for further modification, which struct designers do not always have a full picture of. From a real-life example (a codebase that I work on at work), if you are building a library of geometrical objects, and later decide to link them to their nearest neighbors in the modeled shape for faster lookups, you can only do so after the whole geometry has been built, and making this simple feature addition can require removing dozens of const annotations all over the place. This definitely does not feel ideal.

The reverse feature of marking fields as mutable even when they are reached via an immutable reference is worse. It introduces “mutability surprises”, which generally make code harder to reason about, and can easily turn into heisenbugs in multi-threaded code. While immutable fields are occasionally unpleasant, mutable fields make me downright nervous, as they greatly reduce what I can assume from untrusted code. I am glad that Rust generally discourages such interior mutability.

In general, Rust’s inherited mutability seems easier to reason about and work with to me than C++'s field mutability. But I can also understand how it can be felt to be limiting.


In your specific case, as @vitalyd pointed out, you are allowed to borrow individual struct fields mutably in isolation, which can be used to approximate immutable fields. What Rust does not allow to do, at this point in time, is to carry out such disjoint borrows through method boundaries. From Rust’s point of view, methods are opaque abstractions which borrow the entire object. This is true to the spirit of object encapsulation, but can certainly feel limiting at times. Hopefully a more flexible design, that remains easy to understand for users, will be found in the future.

PS: I’ll try to look at your index-based example later on to see if I can think of a more idiomatic way to design the code, but for now I am out of time.


#30

Can you provide the implementation of add_range_to_free_list? Depending on what it does, a variant of this strategy, which @vitalyd suggested previously, may or may not work:

type SizeClassList = [SizeClass; 3];

struct Allocator {
    classes: SizeClassList,
}

impl Allocator {
    /* ... */

    pub fn find_size_class(classes: &mut SizeClassList,
                           needed_size: usize) -> &mut SizeClass
    {
        /* ... */
    }

    // NOTE: Will not compile as-is due to a conflict between "clazz" borrowing
    // self.classes and add_range_to_free_list borrowing all of self.
    pub fn alloc(&mut self, layout: Layout) -> Result<*mut u8, AllocErr> {
        let neededsize = cmp::max(layout.size(), layout.align());
        let clazz = Self::find_size_class(&mut self.classes, neededsize);

        if clazz.free.is_empty() {
            let block = align_up(self.brk, clazz.sz);
            // [self.brk, block] must be added to free list
            let tmp = self.brk;
            self.add_range_to_free_list(tmp, block);
            let newbrk = block.saturating_add(clazz.sz);
            if newbrk > self.end {
                return Err(AllocErr::Exhausted { request: layout })
            }
            self.brk = newbrk;
            Ok(block as *mut u8)
        } else {
            let blk = clazz.free.pop().unwrap() as usize;
            // [blk + layout.size(), blk + clazz.sz] must be added iff size() < align()
            let tmp = clazz.sz;
            self.add_range_to_free_list(blk.saturating_add(layout.size()), blk.saturating_add(tmp));
            Ok(blk as *mut u8)
        }
    }
}

I suspect that it won’t work as-is, because you warned at the time that it might not…

…however, without knowing what you are actually doing, it is hard to tell what should replace it.


When the Rust borrow checker was initially developed, one design goal was to keep its rules simple and easy for Rust developers to understand. In this context, the borrow analysis was purely based on the scope of references: as long as a reference is in scope, even if it is unused, the associated object is considered to be borrowed.

However, as you have observed, this rule is unnecessarily pessimistic. We don’t really care about how long a reference is in scope, we only care about it from the point where it is first used, and until the point where it is last used. Forcing ourselves into a scope-based algorithm thusly makes references live unnecessarily long, increasing the odds that one will attempt an incompatible borrow on the host object. To work around this, people are in turn forced into unwieldy contortions involving the introduction of artificial bindings and blocks.

In addition, the expected ergonomics and learnability benefits of a simple scope-based rule did not materialize. As it turns out, Rust developers had a hard time thinking in terms of reference scopes. So there wasn’t a strong motivation for keeping the existing simple rule, and there was a strong motivation for relaxing the rules of the borrow checker.

For this reason, a new set of more complex borrowing rules, based on actual usage of reference variables (and thus compiler analysis of the program control-flow graph), is now being introduced. This feature, known as non-lexical lifetimes or NLL, is due to be stabilized at some point during this year. As @vitalyd pointed out, you can already try it out on nightly compiler releases.

Yes. This is another issue that will be resolved by NLL. Basically, the Rust compiler currently parses your code as follows:

  • Borrow “self” mutably
  • Reach a method call, and start evaluating the parameters.
  • Try to access self.brk… but wait, that is incorrect: self is already borrowed mutably!

NLL improves upon this by taking into account the point of first use of a reference. Facing this code, the more clever NLL analyzer easily figures out that brk can be read before self is mutated by the add_range_to_free_list method, and that therefore the code does not violate the borrowing rules.


#31

No. First, this is part of a class assignment and I know the instructor reads this forum. I think I’ve already posted more than is kosher. Second, I haven’t actually written this function yet as I haven’t gotten past the borrowing issue. I’ll keep working on it.


#32

One issue is that if you have a mutable reference to a struct, you can replace the entire struct with a different value, e.g.

fn wonky(a: &mut Allocator) {
    *a = Allocator::new();
}

Thus, even if there were some way to declare a field as immutable, it would still be possible to change its value by replacing the whole struct - so the borrow checker still couldn’t allow you to keep an immutable reference.

You might ask, couldn’t the compiler just forbid doing that with structs that contain immutable fields? It could, but that would pose a difficulty for generic code. Today you can write:

fn set<T>(p: &mut T, val: T) {
    *p = val;
}

and the compiler guarantees this works for any T (at least, any T that is Sized, since generic parameters have a hidden Sized bound by default). If there were some types where this wasn’t legal, all functions like the above would need to add some sort of bound to assert that T isn’t one of them. In general, having to declare this everywhere would tend to add a lot of clutter; even if it were hidden and opt-out like Sized is, Responsible Library Code™ would still try to opt out whenever the bound wasn’t necessary, so the clutter just moves somewhere else, and then you’re left with implicitness breeding confusion. And either way, it would add more cases to think about when writing generic code.

This is a real debate, though. There have been proposals to add support for “immovable types” - for the sake of other lifetime-related benefits, but I think the same distinction could work for your use case. However, thus far, the potential benefits have been outweighed by concern about impacting generic code, and it looks like one of the use cases (generators) is going to be solved by a more limited proposal instead.

In any case, for your use case, interior mutability should be fine. One thing to note is that while people usually talk about RefCell, there’s also Cell, which has zero overhead and doesn’t involve a dynamic check that can fail, but provides more limited semantics (can only get and set the contained value, not borrow it).

Oh, and - class is not a keyword in Rust, so there’s no need for the z’s :slight_smile:


#33

If all else fails (and please do exhaust those options) there is always *mut T. If the receiver of the method is *mut self rather than (&mut self) you can effectively give borrowck the finger. The flip side is of course that now you’re responsible for maintaining code validity.