Is this use of Copy sound?

I've heard people say that the copy of a Copy type is guaranteed to be just a memcpy. This blog is one such example but I've seen many others as well.
This doesn't sound right to me because copy is a safe trait which means Rust shouldn't be allowed to make such an assumption. The code below compiles which seems to support my intuition but it requires unsafe so I don't know for sure whether this is actually allowed or not.

struct MyString {
	ptr: *const u8,
	len: usize,
	cap: usize,
}

impl Clone for MyString {
	fn clone (&self) -> Self {
		let string = unsafe{String::from_raw_parts(self.ptr as *mut _, self.len, self.cap)};
		let new_string = string.clone();
		let (ptr, len, cap) = (string.as_ptr(), string.len(), string.capacity());
		std::mem::forget(string);
		std::mem::forget(new_string);

		Self {
			ptr,
			len,
			cap
		}
	}
}
impl Copy for MyString {}

Ignoring the leaked strings, is this code actually sound?

As far as I can understand, the real distinction between non-Copy and Copy types is that Copy types are not allowed to have drop, whether it is drop glue (so no fields that are not copy) or an explicit drop. Is this intuition correct?

1 Like

That’s not quite right, because every move of a type is a memcpy too. To be precise, what Copy means is: the compiler allows users of your type to use the value in the place that was moved out of, effectively duplicating the value bytewise. An example of that is:

let x: i32 = 1;
let y = x;         // move value from x to y...
x + y              // ...but `i32: Copy`, so `x` is still accessible

If a non-Copy type were put in this code, you would get the “use of moved value” error, because the value would be moved into y and x would be unusable (considered de-initialized).

Copy makes no assumptions about what Clone does. All Copy does is enable the duplication I described above. It is your responsibility as a library author to be sensible and implement Clone such that it does the same thing as Copy, whenever you implement both. (But no soundness relies on keeping them consistent, other than that of your own unsafe code.)

Another way of looking at it is: When you implement Copy, you’re giving your type’s users permission to bypass your Clone impl.

The leaking is part of the problem here. The code is sound, but hypothetically, if you added the proper Drop impl (and somehow bypassed the restriction on Copy + Drop), the Copy impl would allow users of MyString to create double frees, by copying and then dropping both the copy and the original. So, in order to implement a correct smart pointer of this sort (both sound and doesn’t leak), you can't implement Copy.

7 Likes

By implementing Copy you're explicitly giving the compiler permission to make copies of your type – opting out from affine typing. This is only possible of all the fields are also Copy. You're allowed to additionally implement Clone however you wish, but it's almost always a mistake to do so. If your unsafe code relies on the assumption that a value of a Copy type doesn't have copies, then, well, it's the unsafe code that's wrong.

1 Like

Copying raw pointers is never unsound, even if they came from a String. In fact, you can do almost anything to a raw pointer without unsafe, except for interacting with its pointee. As the author of a type using raw pointers, it is your responsibility that your type does not do anything to the raw pointers that can be used to perform UB.

Your Clone implementation is dubious, but since you have not given us a way that an instance of MyString can be constructed, it's hard to actually make statements about it (other than that it probably assumes things about String's implementation that are not guaranteed to be true).

As a dumb example, if there is no way to construct MyString, then this API is definitely sound.

What you're saying is that any implicit copies generated by the compiler do not use Clone::clone and will directly use a memcpy?

Does that mean that in this code, the count will not be incremented for implicit copies?

static COUNT: AtomicUsize = AtomicUsize::new(0);
struct MyStruct;

impl Clone for MyStruct {
	fn clone (&self) -> Self {
		COUNT.fetch_add(1, Ordering::Relaxed);
		Self
	}
}
impl Copy for MyStruct {}

Yes, that is all correct.

1 Like

The example code wasn't going to be actually used anywhere so i didn't bother with a constructor but it would have been something like this

impl MyString {
	fn new(mut data: String) -> Self {
		let (ptr, len, cap) = (data.as_mut_ptr(), data.len(), data.capacity());
		std::mem::forget(data);
		Self {
			ptr,
			len,
			cap,
		}
	}
}

This is not relevant to the original topic, but just FYI, it is generally better practice to use ManuallyDrop instead of forget():

fn new(mut data: String) -> Self {
    let data = ManuallyDrop::new(data);
    let (ptr, len, cap) = (data.as_mut_ptr(), data.len(), data.capacity());
    ....
}

The advantage of this is that if there is a panic in an unplanned-for position, the failure mode is a leak (because ManuallyDrop disabled the String’s destructor) rather than a double-free (because both String's destructor and the destructor of what you were in the middle of constructing ran).

This isn’t a problem in this particular case because you did the forget before constructing anything, but in more complex cases it can be necessary.

Interesting, apparently that way of taking apart and reconstructing a String is explicitly documented in from_raw_parts. The clone implementation still makes me uncomfortable though. I think it can cause UB if string.clone() panics due to memory exhaustion (I'm not sure if memory exhaustion causing aborts is considered guaranteed behavior but I'm going to guess not)