Efficient ways to initialize array/vector/struct containing all copyable elements?


#1

I’m writing a program that requires high performance, so I tend to use efficient APIs.
Frequently, I find I have to initialize some special data structures from existing ones.

  1. Given arrays/vectors whose underlying elements are of primitive types, initialize them from an existing slice.
  2. Initialize structs whose fields are all copyable but whose values are unknown at the beginning.

For vectors, there is a way: create the vector with length 0 and a capacity equal to the other slice’s length, then call extend_from_slice.

let mut v = Vec::with_capacity(other.len());
v.extend_from_slice(other);

But for arrays and structs, there seems to be no way to achieve this without using unsafe APIs.

I am currently trying the APIs below, but I am not sure whether they are safe.
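For reference, a minimal sketch of safe ways to build an array from a slice, at the cost of either zero-filling first or a length check. The constant `SIZE` and the function names are hypothetical and exist only for this illustration; `to_vec` is shown as the one-call counterpart of `with_capacity` plus `extend_from_slice`.

```rust
use std::convert::TryFrom;

/// Hypothetical array size, used only for this illustration.
const SIZE: usize = 4;

/// Zero-fill the array, then overwrite it with the slice contents.
/// `copy_from_slice` panics if `other.len() != SIZE`.
fn array_from_slice_zeroed(other: &[u8]) -> [u8; SIZE] {
    let mut arr = [0u8; SIZE];
    arr.copy_from_slice(other);
    arr
}

/// Convert the slice into an array; returns `None` on a length mismatch.
fn array_from_slice_checked(other: &[u8]) -> Option<[u8; SIZE]> {
    <[u8; SIZE]>::try_from(other).ok()
}

/// Safe vector counterpart: allocate and copy in a single call.
fn vec_from_slice(other: &[u8]) -> Vec<u8> {
    other.to_vec()
}
```

For small arrays of primitive types the optimizer can often remove the redundant zero-fill, but that is not guaranteed, so it is worth measuring before reaching for unsafe code.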

// array
let mut arr: [u8;SIZE] = unsafe {::std::mem::uninitialized()};
arr.copy_from_slice(other); // other.len() == SIZE at compile time

// vector
let mut vec:Vec<u8> = Vec::with_capacity(other.len());
unsafe {vec.set_len(other.len())};
vec.copy_from_slice(other);

// struct
// s is a field of another struct, and is unknown during initialization of that struct
// but will be guaranteed to be initialized before reading it
let s: MyStruct = unsafe {::std::mem::uninitialized()};
let mut ss = OtherStruct {s:s, ..};
// ...
ss.s = MyStruct{...};
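As a point of comparison, here is a minimal sketch of the array and vector cases written with `std::mem::MaybeUninit` and `ptr::copy_nonoverlapping`. `mem::uninitialized` has been deprecated since Rust 1.39 in favor of `MaybeUninit`, which is why the sketch uses it. `SIZE` and the function names are hypothetical. The key differences from the snippet above are that no typed value or reference to uninitialized memory is created, and that `set_len` is called only after the bytes have been written.

```rust
use std::mem::MaybeUninit;
use std::ptr;

/// Hypothetical array size, used only for this illustration.
const SIZE: usize = 4;

/// Fill an array from a slice without zeroing it first.
fn array_from_slice_uninit(other: &[u8]) -> [u8; SIZE] {
    assert_eq!(other.len(), SIZE, "slice length must match the array size");
    let mut arr = MaybeUninit::<[u8; SIZE]>::uninit();
    unsafe {
        // Raw copy into the uninitialized storage through a raw pointer.
        ptr::copy_nonoverlapping(other.as_ptr(), arr.as_mut_ptr() as *mut u8, SIZE);
        // Every element has been written, so the array is fully initialized.
        arr.assume_init()
    }
}

/// Fill a vector from a slice: write into the spare capacity first,
/// and only then tell the vector that the elements exist.
fn vec_from_slice_uninit(other: &[u8]) -> Vec<u8> {
    let mut vec: Vec<u8> = Vec::with_capacity(other.len());
    unsafe {
        ptr::copy_nonoverlapping(other.as_ptr(), vec.as_mut_ptr(), other.len());
        vec.set_len(other.len());
    }
    vec
}
```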

The docs for “mem::uninitialized” mention that “The only way to safely initialize an uninitialized value is with ptr::write, ptr::copy, or ptr::copy_nonoverlapping”. To me this means that using memcpy/memmove internally is fine, which I guess is also what copy_from_slice does. So is my approach guaranteed to be safe when initializing arrays/vectors?

For struct initialization, is the above code safe, and is there a more efficient way to achieve the goal (I would rather not make MyStruct itself copyable)?
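One safe way to express "this field is unknown for now" is to wrap it in `Option`, which needs neither unsafe code nor a `Copy` bound on `MyStruct`. The sketch below is only an illustration: the fields `value` and `id` and the function name are hypothetical, since the original struct definitions are not shown.

```rust
/// Hypothetical stand-in for the real `MyStruct`; note that it is not `Copy`.
#[derive(Debug, PartialEq)]
struct MyStruct {
    value: u32,
}

/// Hypothetical outer struct whose `s` field is filled in later.
struct OtherStruct {
    id: u32,
    s: Option<MyStruct>,
}

fn build_with_option() -> OtherStruct {
    // `None` marks the field as "not set yet" without any uninitialized memory.
    let mut ss = OtherStruct { id: 1, s: None };
    // ... other work happens here ...
    // A plain assignment is fine: the old value (`None`) is a valid value to drop.
    ss.s = Some(MyStruct { value: 42 });
    ss
}
```

The cost is a discriminant check on access (and possibly a few extra bytes), which is usually negligible compared with the risk of undefined behavior.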


#2
  1. No, but your implementation is an acceptable use of unsafe code.
  2. Are you aware of this syntax?
let s = MyStruct;
let ss = OtherStruct {
    s,
    // Leave all other members uninitialized
    .. unsafe { mem::uninitialized() }
};
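The same struct update syntax also works with a safe base value. A minimal sketch, assuming the remaining fields can implement `Default` (the fields `count` and `name` are hypothetical, as the real definition of `OtherStruct` is not shown):

```rust
#[derive(Debug, Default, PartialEq)]
struct MyStruct;

/// Hypothetical layout; only the field `s` appears in the thread.
#[derive(Debug, Default, PartialEq)]
struct OtherStruct {
    s: MyStruct,
    count: u32,
    name: String,
}

fn build_with_default() -> OtherStruct {
    let s = MyStruct;
    OtherStruct {
        s,
        // Fill every other member with its default value instead of
        // leaving it uninitialized.
        ..Default::default()
    }
}
```

Default values for integers, `String`, `Vec` and similar types are cheap to construct (no heap allocation), so this variant is often close in cost to the uninitialized one.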

#3

I didn’t know we could initialize the struct with part of the fields uninitialized, thanks!

Do you mean that for arrays/vectors, the practice of leaving the content uninitialized and then calling copy_from_slice immediately will not cause safety issues as long as the underlying elements are copyable?

Another question: is the Copy requirement for the underlying elements/fields necessary?

I’m asking because I’m not sure whether the behavior is undefined or implementation-specific.
I once initialized a struct with mem::uninitialized() and later assigned a concrete value to it, and that caused a segfault (that struct was more complicated, though: it contained one vector of strings, one string, one boolean, and one hashmap from string to string).

And I’m confused about what the docs say about mem::uninitialized and Vec::set_len. For example,

The only way to safely initialize an uninitialized value is with ptr::write, ptr::copy, or ptr::copy_nonoverlapping.

Does it mean that there is no safety guarantee (in the sense of the specification) unless we explicitly use ptr::write, ptr::copy, or ptr::copy_nonoverlapping? Or does it mean that it is safe as long as we use equivalent APIs (e.g., raw libc::memcpy), at all times and regardless of future changes to the internal implementation?


#4

If you have an uninitialized value of a type that implements Drop (so this is not applicable to Copy types, but I’ll mention it anyways), then a normal store to it would attempt to drop the old value - that may segfault. That’s why ptr::write (and similar functions) is needed - it doesn’t do anything to the “old” value at the location - it just copies the new value into it.

The above is a concrete example of where things can go wrong when you mix uninitialized data with “plain” stores.
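The difference can be made visible with a small sketch. `Noisy` is a hypothetical type that counts how many times it is dropped; the first function uses a plain assignment over a valid old value, and the second uses `ptr::write` into uninitialized storage (`MaybeUninit` stands in for the deprecated `mem::uninitialized`).

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;
use std::ptr;
use std::rc::Rc;

/// Hypothetical type that records every drop in a shared counter.
struct Noisy(Rc<Cell<u32>>);

impl Drop for Noisy {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Plain assignment: the old value is dropped before the new one is stored.
/// Returns the number of drops observed right after the assignment.
fn drops_after_plain_assignment() -> u32 {
    let counter = Rc::new(Cell::new(0));
    let mut slot = Noisy(counter.clone());
    slot = Noisy(counter.clone()); // drops the previous `Noisy` here
    let seen = counter.get();
    drop(slot);
    seen
}

/// `ptr::write`: nothing is read or dropped at the destination.
/// Returns the number of drops observed right after the write.
fn drops_after_ptr_write() -> u32 {
    let counter = Rc::new(Cell::new(0));
    let mut slot = MaybeUninit::<Noisy>::uninit();
    unsafe { ptr::write(slot.as_mut_ptr(), Noisy(counter.clone())) };
    let seen = counter.get();
    // The slot is initialized now, so it can be taken out and dropped normally.
    drop(unsafe { slot.assume_init() });
    seen
}
```

If the old value in the first function were uninitialized garbage instead of a valid `Noisy`, that implicit drop would run on garbage, which is the failure mode described above.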


#5

Thanks and it is clear to me now!

  1. The segfault I hit was probably caused by the “plain” stores of the real concrete values (vectors, strings, etc.): each store drops the old, uninitialized value, which is undefined behavior, so the generated code may do something dangerous or merely raise a signal.
  2. Meanwhile, when a field of a struct is Copy, a normal store has nothing to drop (Copy types cannot implement Drop), so the store itself does not touch the old value.
  3. Using ptr::write etc. to initialize a previously uninitialized field is safe since it does not invoke drop.
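Point 3 can be sketched as field-by-field initialization through raw pointers. This is an illustration under assumptions: `Record` and its fields are hypothetical, and `MaybeUninit` with `ptr::addr_of_mut!` is used in place of the deprecated `mem::uninitialized`.

```rust
use std::mem::MaybeUninit;
use std::ptr;

/// Hypothetical struct mixing a `Drop` field (`String`) and a `Copy` field.
struct Record {
    name: String,
    flag: bool,
}

fn build_record() -> Record {
    let mut slot = MaybeUninit::<Record>::uninit();
    let p = slot.as_mut_ptr();
    unsafe {
        // `addr_of_mut!` yields a raw pointer to the field without creating
        // a reference to uninitialized memory; `ptr::write` stores the new
        // value without dropping whatever bytes were there before.
        ptr::write(ptr::addr_of_mut!((*p).name), String::from("example"));
        ptr::write(ptr::addr_of_mut!((*p).flag), true);
        // All fields are written, so the whole struct is initialized.
        slot.assume_init()
    }
}
```

Writing `(*p).name = String::from("example")` instead would first drop the uninitialized `String` at that location, which is exactly the kind of store that can segfault.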