Help understanding this exercise

It's best if you don't access bc for that you'd need to do the exercises, which will skew the statistics, but it's from last part of Ch 8 Ownership Inventory #2 - The Rust Programming Language

they give this fn:

fn remove_zeros(v: &mut Vec<i32>) {
    for (i, t) in v.iter().enumerate().rev() {
        if *t == 0 {
            v.remove(i);
            v.shrink_to_fit();
        }
    }
}

the question is, assuming .iter() is allowed to borrow (bypass the safety of re-borrowing an already &mut vector), whether the calls below produce would undefined behaviour (both do in the solution):

let mut v = vec![1, 2, 0, 3];
remove_zeros(&mut v);
let mut v = vec![5, 5, 0];
remove_zeros(&mut v);
println!("{:?}", v);

I assumed that:

  • The iter() works by calling next() on every iteration until i is longer than the vector's length (so shrink/grow it isn't a problem, assuming the length its updated)
  • I also assume that shrinking a vector does not re-allocate it.

So in my opinion in the first case the result is what one would expect that is 1,2,3 and in the second case it is also 5,5.

I'm unsure what I misunderstood.

It'd be useful if an answer it compatible or at the level of the corresponding section https://rust-book.cs.brown.edu/ch08-01-vectors.html#safely-using-iterators, and not too advanced :slight_smile:

for is the one calling .next() on the return value of .iter().enumerate().rev(). Ultimately this ends up calling .next_back() on the return value of .iter()

iter() stores two pointers to the start and end of the sequence it hasn't yielded yet. Updating the vec (either the length or its backing allocation) won't update these pointers, so those operations will be a problem. Always fetching the length from the vec as you was suggesting is very slow and would only prevent a limited amount of issues related to mutating the vec while iterating.

shrink_to_fit() does reallocate, that's its whole purpose.

You mean the default one, shipped with std? As the docs suggest a bit of a different picture:

The behavior of this method depends on the allocator, which may either shrink the vector in-place or reallocate. The resulting vector might still have some excess capacity, just as is the case for with_capacity. See Allocator::shrink for more details.

Sure, but even a conditional allocation means that you must assume it does, from a safety perspective. It's the safety perspective that's being discussed.

Note that i included a link to the relevant section bc it's nowhere said that it's 2 pointers (unless i missed it), they say it's 1 pointer and im unsure my reasoning is wrong following their explanation.

Also, i assumed one can't use internet/docs, and bc they don't explain that fit_shrink does reallocate I assumed it wouldn't, since it seems strange (why would you move something that shrinks, when you have the room already?)

Agreed, my bad. Though one could use a different A: Allocator in their Vec<i32> (in this case), if not impl a custom one outright. As long as its own shrink() does not to reallocate, the vec's shrink_to_fit() isn't going to, either. Minor nitpick. Not that relevant to the discussion, either.

1 Like

Do they tell you that Vec::iter will borrow the vec? If not, it would be pretty difficult to know why the mutations are incompatible, without the doc.

The high level reason (no need to know implementation details) they're incompatible is that the vec is borrowed during the iteration, so it can't be mutated with remove, even if the shrink_to_fit call is removed.

With the doc we can easily see that it is borrowed, because the return value of iter has the lifetime of &self.

https://doc.rust-lang.org/std/collections/vec_deque/struct.VecDeque.html#method.iter

pub fn iter(&self) -> Iter<'_, T> ⓘ

The _ means that the lifetime of Iter defaults to lifetime of &self according to the lifetime elision rules.

yes, in a previous section, so that was not a problem.

the borrow you are supposed to ignore as i described in OP.

Well, if we're talking about the actual implementation it does use two pointers

1 Like

It's a weird question IMO. "If you ignore this particular part that is UB and imagined things still compiled, would there still be UB?" The Book has an over-emphasis on dangling pointers, but forbidding aliasing has other purposes too. To "imagine fundamental borrow conflicts are allowed" is to imagine a different language.

It's a type of leading question; they want you to make the leap that just enough changed so that you got a dangling reference t after shrink_to_fit reallocated, because that's the point they're trying to make.[1]

I didn't click through their modal so I may be missing some context.


  1. In the actual language, as was pointed out, even remove would be unsound -- that method has the same aliasing requirements, and is allowed to do the same things, as shrink_to_fit, as far as the language is concerned. â†Šī¸Ž

4 Likes

Thanks, I dont want to give a bad impression of the exercise though. For me this book has been amazing, and these exercises are normally very useful and entertaining. I think most likely I may have missed something.

Imagining what happens if the compilers allows a specific borrow is great for understanding even if a bit over simplified

1 Like

The strange thing about the question is that, if it were possible to magically iterate without borrowing the Vec, then it would be equivalent to the following, which is fine (no UB).

fn remove_zeros(v: &mut Vec<i32>) {
    for i in (0..v.len()).rev() {
        if v[i] == 0 {
            v.remove(i);
            v.shrink_to_fit();
        }
    }
}
2 Likes

If we're talking about a useful implementation of the function,

fn remove_zeros(v: &mut Vec<i32>) {
    v.retain(|&t| t != 0);
    v.shrink_to_fit();
}
1 Like

In case it's useful or I didn't describe it properly, this is the rest of the exercise's statement:

Normally if you try to compile this function, the compiler returns the following error:

error[E0502]: cannot borrow `*v` as mutable because it is also borrowed as immutable
 --> test.rs:5:13
  |
3 |     for (i, t) in v.iter().enumerate().rev() {
  |                   --------------------------
  |                   |
  |                   immutable borrow occurs here
  |                   immutable borrow later used here
4 |         if *t == 0 {
5 |             v.remove(i);
  |             ^^^^^^^^^^^ mutable borrow occurs here

(bold it mine)

Assume that the compiler did NOT reject this function. Which (if any) of the following programs would (1) pass the compiler, and (2) possibly cause undefined behavior if executed? Check each program that satisfies both criteria, OR check "None of these programs" if none are satisfying.

(which i tried to describe as: ignoring the v.iter() immutable borrowing)

cc: @jumpnbrownweasel @SkiFire13

The answer is that you'd get undefined behavior any time the line v.remove(i) is reached. At that point v would be re-borrowed twice at the same time:

  • a shared borrow in v.iter()
  • an exclusive borrow in v.remove(i)

You can't have an exclusive borrow while a shared borrow is still being used. And the shared borrow will still be used in the next iteration of the loop. So that's undefined behavior. Hence, if the vector contains any 0 you get UB.

That's irrelevant to the quiz question, but: the memory allocator may want to move something that shrinks in order to avoid memory fragmentation. For example, some memory allocators keep separate memory areas for objects of different sizes.

That's irrelevant to the quiz question, but:

I think this is not correct, and the answer they give themselves proves it:

Context : To violate memory safety, remove_zeros must be called with a vector that contains a zero after the first element. The call to v.shrink_to_fit() will deallocate memory owned by the vector (due to resizing), which will invalidate the iterator v.iter() which contains a pointer to the old data. Note that reading the vector v after calling remove_zeros is not essential to the safety violation, since the issue is internal to remove_zeros .

I guess whether or not remove(i) is UB or not depends on whether the iterator returned by iter() borrows the slice. This is unspecified, so you have to conservatively assume that it may, so it's still UB (library UB). The actual implementation doesn't currently borrow, so it's not a language UB. The same is true for shrink_to_fit: if it doesn't actually reallocate, it's not a language UB (if iter() doesn't actually borrow), but it's still library UB because it is allowed to reallocate.

1 Like

I think you've understood what the exercise wanted you to understand already. I think it would be good to also understand that aliasing an active &mut _ is, in and of itself, UB. And that borrow checking prevents more than dangling pointers.

The exercise is fine from the perspective of "so you're coming from C and want an example of why borrow checking is a good thing". But IMO the example is incomplete and somewhat misleading from a perspective of "this is an introduction to what borrow checking in Rust is and is about".[1]


  1. And very misleading if taken as an example of how to determine if something would be sound with unsafe or such. â†Šī¸Ž

I think I agree with what you're saying @jumpnbrownweasel. And in a way is exactly what is confusing me.

I really don't understand is what happens if the vector is relocated.

Would it be in both cases that they iteration now uses reference v but the address is updated, and now points to a new place in memory ? (and has also new capacity and length)

It is clear though why they are using reverse it is because that makes accesses always within the boundaries of the vector. (I think but please correct?)

In that case this would also happen with iter so there wouldn't be UB (I do understand this is a hypothetical situation where we are bypassing the compiler)

Maybe drawing this would make it simple but it takes quite a lot of time.

It would be useful if we could make the compiler ignore this check but is this possible? (although it might crash my computer)

Could it be because .iter() isn't accessing through v (like in the example you added), but is a direct pointer to the data &v[..]? @jumpnbrownweasel

Although if iter going has &self and not &self[..] then that's wrong.

1 Like