Why does looping over Vectors work in a way that can change ownership of the vector?

So, I'm another Rust newbie. I've been reading the book off and on, playing with Rustlings a bit, and poking at it for a while now, but I have yet to really build anything.

My background is PHP and other web related development (plus some bash and sysadmin related scripting). Very little experience with compiled languages or non web apps. But I do have years of scripting language experience.

Today I'm experimenting with reading a file, parsing some text into a couple kinds of structs, then outputting a table of data. In the process I've run into the way vector ownership changes when you loop over it. Specifically the behavior described here: [SOLVED] Cannot return a value from a method after its been through a for loop? - #6 by L0uisc

From that post, some experimenting, and some other results, I think that the vector gets turned into an iterator, and that iterator is dropped at the end of the loop, unless you loop over a reference to the vector.

So, this works (when referencing the vector):

let v = vec![1, 2, 3];
for i in &v {
    println!("{}",i);
    for i in &v {
        println!("{}",i);
    }
}

This doesn't work (when not referencing the vector):

let v = vec![1, 2, 3];
for i in v {
    println!("{}",i);
    for i in v {
        println!("{}",i);
    }
}

Error:

error[E0382]: use of moved value: `v`
   --> src/main.rs:5:18
    |
2   |     let v = vec![1, 2, 3];
    |         - move occurs because `v` has type `Vec<i32>`, which does not implement the `Copy` trait
3   |     for i in v {
    |              - `v` moved due to this implicit call to `.into_iter()`
4   |         println!("{}",i);
5   |         for i in v {
    |                  ^ value used here after move
    |
note: `into_iter` takes ownership of the receiver `self`, which moves `v`
   --> /home/reagand/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/iter/traits/collect.rs:271:18
    |
271 |     fn into_iter(self) -> Self::IntoIter;
    |                  ^^^^
help: consider iterating over a slice of the `Vec<i32>`'s content to avoid moving into the `for` loop
    |
3   |     for i in &v {
    |              +

So I can move on in my project now, but I am really curious as to why Rust implements loops the way it does.

Would anyone mind explaining? What potential issues are avoided by that method?

Thanks!

Sometimes, you just need the ownership of the items in the collection (for whatever reason). Sometimes, you need references to them. So it's best to have access to both ways of iteration (by value and by reference).
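To make the two (really three) ways concrete, here is a minimal sketch. The function name `iterate_three_ways` and the sample data are made up for illustration; the three loop forms are the standard `&v`, `&mut v`, and `v`.

```rust
// Hypothetical helper that walks the same vector three ways and returns
// what each loop produced.
fn iterate_three_ways() -> (usize, Vec<String>, String) {
    let mut words = vec![String::from("a"), String::from("bc")];

    // `&words`: each `w` is a `&String`; the vector is only borrowed.
    let mut total = 0;
    for w in &words {
        total += w.len();
    }

    // `&mut words`: each `w` is a `&mut String`; elements are edited in place.
    for w in &mut words {
        w.push('!');
    }
    let snapshot = words.clone();

    // `words`: each `w` is an owned `String`; the vector is consumed.
    let mut joined = String::new();
    for w in words {
        joined.push_str(&w);
    }
    // `words` can no longer be used from here on.

    (total, snapshot, joined)
}

fn main() {
    let (total, snapshot, joined) = iterate_three_ways();
    println!("{} {:?} {}", total, snapshot, joined);
}
```

Only the last loop gives up the vector; the first two leave it usable afterwards.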


So, why not make by reference the default?

Any examples come to mind of when you need to change ownership?

  1. We want to work with values by default.
  2. The collection itself is a value and represents the values. (This is a roundabout way of saying that "values are values"). It would be weird to iterate by reference given a non-reference value.
  3. In the particular case of a for loop, if it automatically took a reference, there would simply be no way syntactically to express that you want to iterate by value, so it has to be the default.

Ah, I thought it might be something like that.

Why not restore the original variable after the loop is finished?

I mean, you can pass an immutable var into a function and then use that var after calling the function just fine. At least as far as this code goes:

fn main() {
    let b = 3;
    let c = do_something(b);
    println!("{}",c);
    println!("{}",b);
}

fn do_something(s: i32) -> i32 {
    let d = s - 1;
    return d;
}

Doesn't ownership of 3 get given to s when b is passed into the do_something function? Or do I need to review the ownership sections of the docs.... :\

Wouldn't it make sense for something like:

let v = vec![1, 2, 3];
for i in v {
    println!("{}",i);
}
for i in v {
    do_something_with_val(i);
}

to work as well?

(FYI, I'm just trying to understand how Rust works a bit better so that I can more easily work through things not working as I expect. :) I'm not trying to say Rust's way of doing loops is bad.)

No, i32 is Copy, so a copy is passed to do_something. Try changing it to a non-Copy type like, say, Vec<i32>:

fn main() {
    let b = vec![1, 2, 3];
    let c = do_something(b);
    println!("{:?}",c);
    println!("{:?}",b);
}

fn do_something(mut s: Vec<i32>)-> Vec<i32> {
    s.push(42);
    return s;
}

Produces this error:

error[E0382]: borrow of moved value: `b`
 --> src/main.rs:5:21
  |
2 |     let b = vec![1, 2, 3];
  |         - move occurs because `b` has type `Vec<i32>`, which does not implement the `Copy` trait
3 |     let c = do_something(b);
  |                          - value moved here
4 |     println!("{:?}",c);
5 |     println!("{:?}",b);
  |                     ^ value borrowed here after move
  |
note: consider changing this parameter type in function `do_something` to borrow instead if owning the value isn't necessary
 --> src/main.rs:8:24
  |
8 | fn do_something(mut s: Vec<i32>)-> Vec<i32> {
  |    ------------        ^^^^^^^^ this parameter takes ownership of the value
  |    |
  |    in this function
  = note: this error originates in the macro `$crate::format_args_nl` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)
help: consider cloning the value if the performance cost is acceptable
  |
3 |     let c = do_something(b.clone());
  |                           ++++++++

For more information about this error, try `rustc --explain E0382`.

Rust could be designed in a way that this worked. But it isn't. Rust generally prefers taking references to be explicit rather than implicit. If you want to iterate over a reference to v, then you write for i in &v. If you write for i in v you are asking to iterate over the value v, so it iterates over the value, just like you told it to.

There are two big exceptions to this: methods (which can "auto-ref") and pattern matches which can also auto-ref. Both of these exist as it was felt that requiring explicit references in these cases was just too onerous. Using &v or v.iter() (the other way of iterating over a Vec or slice by reference) rather than v is not considered too much to ask. Rust's design lives on a knife's edge between "explicit for the sake of readability" and "convenience to keep people from running away screaming".
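As a rough sketch of what "iterating over the value you gave it" means: a `for` loop calls `IntoIterator::into_iter` on whatever expression you write and then pulls items with `next()`. The expansion below is approximate (the real desugaring differs in details), and the function names are made up for illustration.

```rust
// Sums a slice with an ordinary `for` loop over a reference.
fn sum_with_for(v: &[i32]) -> i32 {
    let mut sum = 0;
    for i in v {
        sum += *i; // `i` is a `&i32` because we iterate over a reference
    }
    sum
}

// Approximately what the loop above expands to.
fn sum_desugared(v: &[i32]) -> i32 {
    let mut sum = 0;
    let mut iter = IntoIterator::into_iter(v);
    while let Some(i) = iter.next() {
        sum += *i;
    }
    sum
}

fn main() {
    let v = vec![1, 2, 3];

    // `&v` and `v.iter()` both iterate by reference and leave `v` intact.
    for i in v.iter() {
        println!("{}", i);
    }
    println!("{}", sum_with_for(&v));
    println!("{}", sum_desugared(&v));

    // Passing `v` itself hands the vector to `into_iter`, which consumes it.
    let owned: Vec<i32> = IntoIterator::into_iter(v).collect();
    println!("{:?}", owned);
}
```

Seen this way, `for i in v` moving `v` is the same rule as any other function call that takes its argument by value.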


How would you do that? The whole point of ownership is that you can't access invalidated, consumed values.


Ah, and now I remember something about vectors (and other collections?) not being copied due to how much memory they can take, while something like an i32 doesn't take much memory, so it's easy to copy.

Thanks.

Which just circles back around to me not expecting a loop to invalidate or consume a value.

I think I understand better why Rust works like this. It'll take some getting used to (I still find it strange), but this discussion has helped solidify some concepts for me.

Thanks @H2CO3 and @DanielKeep for the help. :)


It's not really about the loop being a loop; it's more general than that. Consider these snippets:

let channel: std::sync::mpsc::Sender<String> = ...;
let strings: Vec<String> = ...;

for string in strings {
    channel.send(string);
}

let channel: std::sync::mpsc::Sender<String> = ...;
let string: String = ...;

channel.send(string);

In both of these cases, ownership of a String is moved to the channel (which might move it to another thread, but in any case will be processing it later and therefore needs to own and not borrow the string) because we passed it by value to the send function. Similarly, in the first snippet, ownership of strings is moved to the for loop. Here's the key thing to understand: if the Vec were taken by & reference, then it would be impossible to take ownership of the Strings in it. Ownership of a value (usually) gives you ownership of its parts; & reference does not.

This is why loops need to work by-value: given ownership, you can borrow, but given a borrow, you cannot take ownership (unless it's an &mut borrow and you're able to leave the borrowed thing empty, e.g. with Vec::drain()). So, just like function calls are by-value until you add an &, loops are by-value until you add an &. Ownership transfer (move) is the default thing because it's the most general thing; you can always make the thing moved be a reference, but you can't (in the general case) make it not-a-reference if you have a reference.
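A minimal sketch of the `&mut` exception mentioned above, using `Vec::drain`. The helper name `take_all` is hypothetical; `drain(..)` is the standard method that moves the elements out and leaves the vector empty.

```rust
// Moves every element out through a `&mut` borrow. The caller keeps the
// (now empty) vector and can go on using it.
fn take_all(strings: &mut Vec<String>) -> Vec<String> {
    strings.drain(..).collect()
}

fn main() {
    let mut strings = vec![String::from("a"), String::from("b")];

    // Each `s` here is an owned `String`, even though `strings` was only
    // mutably borrowed by `drain`.
    let mut received = Vec::new();
    for s in strings.drain(..) {
        received.push(s);
    }

    println!("{:?} {:?}", strings, received);

    // The original vector is empty but still valid.
    strings.push(String::from("c"));
    println!("{:?}", take_all(&mut strings));
}
```

With a plain `&` borrow there is no equivalent: the elements can be read or cloned, but not moved out.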


The most obvious is when you want to transform one kind of collection into another. For example:

use std::collections::HashMap;
use std::hash::Hash;

pub fn group_by<K: Hash + Eq, V>(
    items: Vec<V>,
    mut group: impl FnMut(&V) -> K,
) -> HashMap<K, Vec<V>> {
    let mut ret = HashMap::<K, Vec<V>>::new();
    for x in items {
        ret.entry(group(&x)).or_default().push(x);
    }
    ret
}

(See also @kpreid above)
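For anyone who wants to try it, here is a small usage sketch. `group_by` is repeated from the post above so the snippet compiles on its own; the word list and the choice of grouping by length are made up for illustration.

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Same function as in the post above.
pub fn group_by<K: Hash + Eq, V>(
    items: Vec<V>,
    mut group: impl FnMut(&V) -> K,
) -> HashMap<K, Vec<V>> {
    let mut ret = HashMap::<K, Vec<V>>::new();
    for x in items {
        // `x` is owned here, so it can be moved into the map without cloning.
        ret.entry(group(&x)).or_default().push(x);
    }
    ret
}

fn main() {
    let words: Vec<String> = ["apple", "fig", "pear", "plum", "kiwi"]
        .iter()
        .map(|s| s.to_string())
        .collect();

    // `words` is moved into `group_by`; the Strings end up inside the map.
    let groups = group_by(words, |w| w.len());

    // HashMap iteration order is unspecified, so the printed order may vary.
    for (len, group) in &groups {
        println!("{}: {:?}", len, group);
    }
}
```

If the loop inside `group_by` iterated over `&items` instead, every `x` would be a `&V`, and pushing it into a `Vec<V>` would need a clone.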


@kpreid and @2e71828 , those are both very good examples. Thanks! That really helps me get a better perspective.


It has nothing to do with size. Vec has a destructor (manages a resource), so it can't possibly be Copy. That would be unsound (it would cause double frees).
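A small sketch of that distinction. `Point` and `Handle` are hypothetical types made up for illustration: the first is plain data and can be `Copy`, the second has a destructor, so it can only be moved.

```rust
use std::cell::Cell;

// Plain data with no destructor: deriving `Copy` is allowed.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// A stand-in for a type that manages a resource. Its destructor bumps a
// counter so we can observe how many times it runs.
struct Handle<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Handle<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Adding `#[derive(Clone, Copy)]` to `Handle` is rejected by the compiler
// (error E0184), because a type with a destructor cannot be `Copy`.

// Creates one handle, moves it once, and reports how often `drop` ran.
fn count_drops() -> u32 {
    let drops = Cell::new(0);
    {
        let a = Handle { drops: &drops };
        let _b = a; // move: `a` is invalidated, only `_b` is dropped
    }
    drops.get()
}

fn main() {
    let p = Point { x: 1, y: 2 };
    let q = p; // copy: both `p` and `q` stay usable
    println!("{:?} {:?}", p, q);
    println!("destructor ran {} time(s)", count_drops());
}
```

If the move in `count_drops` were an implicit copy instead, the destructor would run once per copy, which for a real resource like a heap allocation is the double free mentioned above.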

IMNSHO the best way to understand why is to learn a tiny bit about the history of references and the affine type system that Rust uses.

It wasn't invented for Rust; in fact, it was added to some functional languages first to manage resources.

You probably know that you can do this in PHP:

$handle = fopen("http://www.example.com/", "r");
$handle = fopen("ftp://user:password@example.com/somefile.txt", "w");

And if you do that, you really need to keep track of where and how you access these variables.

You may put them into an array and then loop over that array with for, but you really don't want them to be accessed from two parts of a script that know nothing about each other.

A strict ownership policy is required, even in PHP, because you are accessing a remote server and can't easily roll things back.

This wouldn't work with such handles, now would it? You need to start a separate FTP or HTTP request if you want to read in parallel (and writing is even trickier).

And, well, Rust got its unique ownership-and-borrow story from affine types, which were designed to handle “heavy” types while catching many access errors. At some point people asked themselves a simple question: what if we treated the majority of variables in our code in this fashion?

Surprisingly enough, it worked. That wasn't even the initial plan, but it turned out that if you treat most variables as “heavy” (meaning you cannot implicitly clone them), initial coding becomes slower, but you save so much on debugging that in most (but not all) cases it's worth it.

And that's why Rust treats things that way: types that don't refer to anything else (inside your computer or outside it), like a simple i32 or (char, f64), get special treatment not because they are “small” but because they don't refer to anything, which means a copy can be used simultaneously with the original without confusion.

But most types are treated differently because they refer to something (a piece of data on the heap in a String, a file on disk, or maybe even something on a remote server).

And the for loop is designed to handle these types first and simple arrays of integers second.

In fact, before 2021 it wasn't even possible to write for i in v; that was a syntax error.

And when it was added… it implemented something that wasn't possible in old versions of Rust.


FYI, what wasn't possible was for i in array; for i in vec (like the OP tried to do) has always been possible. Supporting for i in array was technically possible all along; it just had two problems that significantly slowed down its addition:

  • it initially couldn't be generic over the array length, which prevented the initial implementation;
  • when it became possible to be generic over the array length it became a breaking change due to it changing the resolution of .into_iter() calls on arrays.
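For reference, a minimal sketch of by-value array iteration as it works today, assuming Rust 1.53 or later (where arrays implement `IntoIterator` by value). The helper `into_vec` is hypothetical; it is generic over the array length `N`, which is the capability the first bullet refers to.

```rust
// Consumes an array of any length and moves its elements into a Vec.
fn into_vec<T, const N: usize>(items: [T; N]) -> Vec<T> {
    let mut out = Vec::with_capacity(N);
    // By-value iteration: each `item` is an owned `T`, not a `&T`.
    for item in items {
        out.push(item);
    }
    out
}

fn main() {
    let names = [String::from("a"), String::from("b"), String::from("c")];
    let owned: Vec<String> = into_vec(names);
    // `names` has been moved, just like a Vec would be.
    println!("{:?}", owned);
}
```

The sketch deliberately uses a `for` loop rather than calling `.into_iter()` on the array, since the method call is the part whose resolution changed between editions.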
