How to avoid copying an array accidentally?

Consider the following code:

let arr = [10, 20, 30];
for v in arr.into_iter() {
    println!("{}", v);
}
println!("{}", arr.len());

My understanding is that a copy of arr is made for into_iter. Am I right? If so, this is usually not what users intend. Is there any way to avoid copying an array accidentally? It seems that none of rustc, rust-analyzer, or clippy complains about this.

Copying that array is basically free -- cheaper even than moving a String on my machine.

But you sound like you're looking for the following, which doesn't yet exist:

I’m not positive, but isn’t this using the IntoIterator trait? In which case, no, it doesn’t appear to:

However, there is also an IntoIter struct which does do a transmute_copy call.

The call to into_iter copies the array (in contrast with moving and consuming the array) because the original array is used again after the for loop. The code doesn't compile for non-Copy types like [String; 3].
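
To illustrate how the copy can be avoided, here is a minimal sketch that iterates through a shared borrow instead of by value. The helper names sum_borrowed and join_borrowed are hypothetical and exist only for this example; the second one shows that borrowing also works for a non-Copy element type such as String.

```rust
// Sum the elements through a shared borrow; the array is neither copied nor moved.
fn sum_borrowed(arr: &[i32; 3]) -> i32 {
    let mut total = 0;
    // `iter()` yields `&i32`, so only references are handed out.
    for v in arr.iter() {
        total += *v;
    }
    total
}

// The same pattern with a non-Copy element type.
fn join_borrowed(words: &[String; 3]) -> String {
    let mut out = String::new();
    for w in words.iter() {
        out.push_str(w);
    }
    out
}

fn main() {
    let arr = [10, 20, 30];
    println!("{}", sum_borrowed(&arr));
    println!("{}", arr.len()); // arr was only borrowed

    let words = [String::from("a"), String::from("b"), String::from("c")];
    println!("{}", join_borrowed(&words));
    println!("{}", words.len()); // words is still usable: it was never moved
}
```

Writing `for v in &arr` should behave the same way as `arr.iter()` here, since both iterate over references.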

By "copies the array" you meant "copies the elements", right?

I mean it copies arr, which is a [i32; 3].

(This in turn copies the elements contained within, too, but I don't think that's what you meant.)

Oh, this is surprising. I knew the elements are copied, but I wouldn't have expected arr itself to be copied, too. Shouldn't there be a warning for this hidden copy?

Like in the following case, the large array gets copied, too?

let arr = Box::new([42; 1024]);
for v in arr.into_iter() {
    println!("{}", v);
}
println!("{}", arr.len());

Such a warning is the subject of the issue that @scottmcm linked above.
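
Until such a warning exists, one way to sidestep the question for the boxed array is to borrow it. The following is a minimal sketch, assuming read-only access to the elements is enough; sum_elements is a hypothetical helper introduced only for this example.

```rust
// Sum through a slice borrow; no array is copied out of the Box.
fn sum_elements(arr: &[i32]) -> i64 {
    arr.iter().map(|&v| i64::from(v)).sum()
}

fn main() {
    let arr = Box::new([42; 1024]);
    // `iter()` is reached by auto-deref through the Box and only borrows the array.
    for v in arr.iter().take(3) {
        println!("{}", v);
    }
    println!("{}", sum_elements(&arr[..]));
    println!("{}", arr.len());
}
```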

That github issue is to add a lint, right? It would be helpful. But in this particular case, I think it probably should be a compiler warning regardless of array size.

I think you may be seeing a distinction where there is none. There is no difference in memory between "an array" and "all of its elements". An array's representation is just its elements, adjacent in memory.

I knew the distinction. I probably wasn't clear though.

With into_iter(), the elements are moved (i.e. copied) out individually, one per iteration. In the following example, the closure parameter type is i32, so the argument i is moved (copied) in each iteration:

fn main() {
    let a = Box::new([1,2,3]);
    a.into_iter().for_each(|i: i32| {
        println!("{},", i);
    });
    println!("len: {}", a.len());
}

The extra copy of the array surprised me, unless I am mistaking what everyone was saying.

They are first all moved into the iterator, and then individually moved out of it. That's what must happen, semantically. However, in many cases the unnecessary copies would be elided by the optimizer.

I am still confused. In the following case, without optimizations, is the total number of bytes copied to finish the iterations 12 bytes or 24 bytes?

    let a = [1,2,3];
    a.into_iter().for_each(|i: i32| {
        println!("{},", i);
    });
    println!("len: {}", a.len());

Assuming that we use i32, i.e. 4-byte numbers:

  • a.into_iter() copies the whole array, i.e. 12 bytes;
  • Iterator::for_each calls next, which moves each value out of the iterator, one value at a time: that's another 12 bytes;
  • then, Iterator::for_each passes the copied value into the closure, again moving it: that's another 12 bytes;
  • finally, somewhere inside println! we have to copy the value to print it; assuming that this is done only once, that's another 12 bytes.

So, semantically, that's 48 bytes copied around for the 12 bytes initially placed in a.
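
The arithmetic above can be written out as a small sketch. It only reproduces the semantic count from the list (one whole-array copy plus three per-element steps) and says nothing about what the optimizer actually emits; the function names are hypothetical.

```rust
use std::mem::size_of;

// Size of the array from the example: three 4-byte integers.
fn array_bytes() -> usize {
    size_of::<[i32; 3]>()
}

// Semantic total from the breakdown: one whole-array copy, then three
// per-element steps (next, closure call, println!) over three elements.
fn semantic_total_bytes() -> usize {
    let whole_array_copy = array_bytes();
    let per_element_steps = 3;
    let elements = 3;
    whole_array_copy + per_element_steps * elements * size_of::<i32>()
}

fn main() {
    println!("array: {} bytes", array_bytes());
    println!("semantic total: {} bytes", semantic_total_bytes());
}
```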

Thanks for the detailed breakdown.

What surprised me was the 12 bytes copied by a.into_iter(), the first step. I was expecting only the 12 bytes copied by for_each() in the second step. So these two copies are the overhead of the loop/iteration: 24 bytes in total for the example.

I am not worried by the copies in the third and fourth steps because they depend on the "business logic", i.e. what I need to do with the elements, and hence are not overhead of the loop.

I think the optimizer is smart enough to copy nothing in this case. But don't trust my word, check the generated code via godbolt.

I don't know if this adds anything or helps at all, but I haven't seen this stated explicitly, so I thought I would add it.

[T; N] is Copy if T is Copy, so all of the same rules about moving vs. copying apply to the array. If it is moved and then used again later, the previous move becomes a copy. And since .into_iter() takes self by value, it copies the array.
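
As a small illustration of this rule, here is a sketch with a function that takes an array by value, much like into_iter takes self. The names assert_copy and consume are hypothetical helpers for this example only.

```rust
// Compiles only when T is Copy; used here as a compile-time check.
fn assert_copy<T: Copy>(_: &T) {}

// Takes the array by value, just as `into_iter(self)` does.
fn consume(arr: [i32; 3]) -> i32 {
    arr.into_iter().sum()
}

fn main() {
    let arr = [10, 20, 30];
    assert_copy(&arr); // [i32; 3] is Copy because i32 is Copy
    let total = consume(arr); // passes a copy, because arr is used again below
    println!("{}", total);
    println!("{}", arr.len());
}
```

Swapping the element type for String would make the last line fail to compile, since the array would have been moved into consume.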

Even when a type does not implement Copy, a move is semantically still a memcpy, just that the borrow checker doesn't allow access to the old location any more. It's up to the optimizer to decide whether that underlying copy really needs to happen, or if it can reuse the old location "as-if" it copied to the same place.
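
A rough way to see that a move is only a shallow copy is to look at a String: moving it copies the small (pointer, length, capacity) value, while the heap buffer stays where it is. The sketch below checks that the buffer address is unchanged across a move; heap_ptr_survives_move is a hypothetical name used only here.

```rust
// Returns true if the heap buffer address is the same before and after a move.
fn heap_ptr_survives_move(s: String) -> bool {
    let before = s.as_ptr(); // raw pointer, so no borrow is kept alive
    let moved = s; // move: `s` can no longer be used after this line
    before == moved.as_ptr()
}

fn main() {
    println!("{}", heap_ptr_survives_move(String::from("hello")));
}
```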
