Efficient way of copying partial content from Vector?


#1

This is strange, but I can’t find a better way to implement a partial copy of Vector data than looping over the items at a specific offset and copying each one.

let data = vec![1, 2, 5, 44, 59, 67];
// I want to copy 3 items from "data" vector starting from 2nd position
let mut part = vec![0; 3];

for i in 1..4 {
    part[i] = data[i];
}

This is really not efficient.
For example, in C++ I would write something like memcpy(part, &data[1], 3), and it would just copy the memory, assuming that each element is 1 byte.
Is there an efficient way to do something like this in Rust?


#2
use std::iter::FromIterator;
let data = vec![1, 2, 5, 44, 59, 67];
let part = Vec::from_iter(data[1..4].iter().cloned());


That should compile down to a memcpy in release mode. It’s not guaranteed, of course, but Rust is generally pretty good at inlining iterator-related code and simplifying it.

EDIT: Fixed code to actually copy instead of creating references.


#3

There’s also copy_from_slice.


#4

(Your loop has an indexing bug :slight_smile: )
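
To spell the bug out: part has only 3 elements (indices 0 to 2), but the loop indexes it with i running from 1 to 3, so part[3] panics and part[0] is never written. A minimal corrected sketch that keeps the explicit loop (the function name copy_loop is made up for illustration):

```rust
/// Copies 3 items from `data`, starting at the 2nd position, with an explicit loop.
/// `copy_loop` is a hypothetical name used only for this sketch.
fn copy_loop(data: &[i32]) -> Vec<i32> {
    let mut part = vec![0; 3];
    for i in 0..3 {
        // Destination indices run 0..3; source indices are shifted by 1.
        part[i] = data[i + 1];
    }
    part
}
```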

There’s a method for that:

part.copy_from_slice(&data[1..4]);
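
As a slightly fuller sketch of the same call (copy_range is a hypothetical helper name, and i32 elements are assumed), note that the destination must already have exactly the same length as the source slice, otherwise copy_from_slice panics:

```rust
/// Copies `len` elements of `data`, starting at index `start`, into a new vector.
/// `copy_range` is a hypothetical helper name used only for this sketch.
fn copy_range(data: &[i32], start: usize, len: usize) -> Vec<i32> {
    // The destination must already hold exactly `len` elements:
    // `copy_from_slice` panics if the two slices differ in length.
    let mut part = vec![0; len];
    // Slicing panics if `start + len` exceeds `data.len()`.
    part.copy_from_slice(&data[start..start + len]);
    part
}
```

Calling copy_range(&data, 1, 3) on the vector from the question should give the 2nd to 4th items.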

You can also use iterators:

for (dst, src) in part.iter_mut().zip(&data[1..4]) {
    *dst = *src;
}

It’s using iterators, but it’s guaranteed to be efficient. (It’s also easy to refactor, e.g. to do a +1 on each element.)
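
Here is a rough sketch of that kind of refactoring (copy_plus_one is a made-up name, and i32 elements are assumed). One difference from copy_from_slice worth knowing: zip stops at the shorter of the two sequences, so a length mismatch is silently truncated instead of panicking.

```rust
/// Copies `src` into `part`, adding 1 to each element on the way.
/// `copy_plus_one` is a hypothetical helper name used only for this sketch.
fn copy_plus_one(part: &mut [i32], src: &[i32]) {
    // `zip` ends as soon as either side runs out of items,
    // so only the overlapping prefix is written.
    for (dst, src) in part.iter_mut().zip(src) {
        *dst = *src + 1;
    }
}
```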

If you’re copying to a new vector (not an arbitrary slice), just use collect (or from_iter, as @Nemo157 suggests):

let part: Vec<_> = data[1..4].iter().cloned().collect();

If you’re appending to a vector, you can also use the extend method:

part.extend(&data[1..4]);
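
To make the difference between the last two concrete, here is a small sketch (both function names are hypothetical, and i32 elements are assumed): collect builds a fresh vector, while extend grows an existing one and leaves its current elements alone.

```rust
/// Builds a new vector from `data[start..end]`.
/// Hypothetical helper name, for illustration only.
fn collect_range(data: &[i32], start: usize, end: usize) -> Vec<i32> {
    data[start..end].iter().cloned().collect()
}

/// Appends `data[start..end]` to the end of `dst`.
/// Hypothetical helper name, for illustration only.
fn append_range(dst: &mut Vec<i32>, data: &[i32], start: usize, end: usize) {
    // `extend` pushes onto the end; existing elements are not overwritten.
    dst.extend(&data[start..end]);
}
```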

#5

Yikes!? Not sure what you mean here.


#6

Great! Now I have the full picture about slices and iterators.
Also, in the Vec documentation https://doc.rust-lang.org/std/vec/struct.Vec.html#method.from_raw_parts I found an unsafe implementation:

let mut base = vec![1, 2, 5, 44, 59, 67];
let p = base.as_mut_ptr();
unsafe {
   p.offset(1);
   let partial = Vec::from_raw_parts(p, 3, 3);
}

But I definitely wouldn’t use it if there is a safe, Rusty way :slight_smile:


#7

@bluss We have this codegen test, which checks that such loops optimize to memcpy in release mode. So to be technically correct, we’re not guaranteeing it, but it would be really surprising for the code in the test to be optimized correctly while the same code somewhere else compiles differently. (edit: I’ve just noticed it was introduced in your PR :slight_smile: )

@tigranbs
Your unsafe code is wrong on many levels:

  • Your p.offset(1) is a no-op (it returns a new pointer and doesn’t modify p), so you’re not offsetting at all.
  • You’re creating two vectors that own the same part of memory. This would lead to a double free on drop.
  • Moreover, you’ve aliased mutable and immutable memory, which is illegal in Rust, so writes to base may be visible through partial, may be invisible, or the compiler might decide your code doesn’t make any sense and throw it all away or do something worse (as with any undefined behaviour).

If you indeed wanted the partial to use the same memory as base, you could’ve just done:

let partial = &base[1..4];
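
For contrast, a small sketch of the two safe options (the function names view and owned are made up, and the fixed range 1..4 is just the one from this thread): a borrowed slice shares memory with base, while to_vec gives an independent copy.

```rust
/// Borrowed view: no elements are copied; the result points into `base`.
/// Panics if `base` has fewer than 4 elements.
fn view(base: &[i32]) -> &[i32] {
    &base[1..4]
}

/// Owned copy: `to_vec` allocates a new vector and copies the elements into it.
/// Panics if `base` has fewer than 4 elements.
fn owned(base: &[i32]) -> Vec<i32> {
    base[1..4].to_vec()
}
```

With the borrowed view, the borrow checker refuses any write to base while the view is still in use, which is exactly the aliasing the unsafe version sidestepped.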

#8

Thanks for explanation :+1:
And what if I already have a Vector defined and want to replace part of its data?

let mut base = vec![0; 6];
let part1 = vec![2, 3, 4];
let part2 = vec![5, 6, 7];
// copy part1 to base with 0 offset and 3 length
// copy part2 to base with 3 offset and 3 length

In all the provided examples we are actually making a new vector from partial data, but what about writing partial data into an existing Vector?


#9

Only Vec::from_iter and collect create a new vector; iter_mut().zip and copy_from_slice use the existing memory as the destination. (extend is different from both of these cases: it uses the existing vector, but appends to it instead of overwriting the elements already stored in it.) So for example:

let mut base = vec![0; 6];
base[..3].copy_from_slice(&part1);
base[3..].copy_from_slice(&part2);
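
If you do this a lot, the "offset plus length" form from your comments could be wrapped up roughly like this (write_at is a hypothetical helper name, not something from std; i32 elements are assumed):

```rust
/// Overwrites `base[offset..offset + src.len()]` with the contents of `src`.
/// `write_at` is a hypothetical helper name used only for this sketch.
/// Panics if the range does not fit inside `base`.
fn write_at(base: &mut [i32], offset: usize, src: &[i32]) {
    // The destination sub-slice is built to have exactly `src.len()` elements,
    // which is what `copy_from_slice` requires.
    base[offset..offset + src.len()].copy_from_slice(src);
}
```

Then write_at(&mut base, 0, &part1) followed by write_at(&mut base, 3, &part2) fills the 6-element vector from the question.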

But I think in this particular case, extend would be nicer:

let mut base = part1.clone();
base.extend(&part2);

edit: Yet another solution to concatenate slices (this also creates a vector):

let base = [&part1[..], &part2[..]].concat();

(Using bare part1 unfortunately consumes the vector; it would work if they were just slices.)


#10

Ok great :smile:! What I wanted to push here was that yes, certain things are known to optimize well (we could be forced to remove that test due to changes in LLVM, if we are unlucky, so it’s not guaranteed); but there are also sometimes misunderstandings regarding iterators. They are not silver bullets!

  • We have good iterators for slices and vectors; they are great low level data structures that describe contiguous data, the kind of data that is most efficient of all to process.
  • Iterator itself is a trait, it’s an abstraction that applies to all kinds of lazy sequences. The implementor is free to make every iterator step as expensive as they want, and it doesn’t lend itself to having every kind of traversal be efficient.
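
To illustrate the second point, here is a deliberately bad sketch (RewalkIter, slow_sum and fast_sum are made-up names): a perfectly valid Iterator implementation whose every step re-walks a linked list from the head, next to the slice iterator that just steps through contiguous memory. Both produce the same items, but no optimizer is expected to turn the first one into a memcpy.

```rust
use std::collections::LinkedList;

/// Hypothetical iterator that re-walks the list from the head on every step,
/// so each `next` call costs O(index) and a full traversal costs O(n^2).
struct RewalkIter<'a> {
    list: &'a LinkedList<i32>,
    index: usize,
}

impl<'a> Iterator for RewalkIter<'a> {
    type Item = i32;

    fn next(&mut self) -> Option<i32> {
        // `nth` skips `index` nodes from the front before yielding one.
        let item = self.list.iter().nth(self.index).cloned();
        self.index += 1;
        item
    }
}

/// Sums the list through the deliberately slow iterator.
fn slow_sum(list: &LinkedList<i32>) -> i32 {
    RewalkIter { list, index: 0 }.sum()
}

/// Sums a slice through the standard slice iterator.
fn fast_sum(data: &[i32]) -> i32 {
    data.iter().sum()
}
```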