std::ptr::read behavior

Can someone explain to me how std::ptr::read() works? I have read the documentation many times but can't get my head around it. It says that it "copies" the value but does not move it; however, the Drop of the read value still runs (even though the copy does not have ownership).

Plus, when I read and then modify a field of my struct, for example, the change is not reflected through the pointer, which would mean the result does not live at the same memory location (this can be verified by checking addresses).

So I really do not understand how this works compared to the * operator, which kind of does the same thing, but not really.


The ownership story of read can indeed be a bit confusing; this is because it’s an unsafe operation. It does “copy” the value, but for many non-Copy types this means that you can very easily end up with unsound results, e.g. a double-free if the original value is still also being dropped … or illegal aliasing if the original value is still being used.
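To make the "copy" concrete, here is a minimal sketch. The struct `Named` and the function name are hypothetical and only serve the illustration. It shows that the value returned by ptr::read lives at its own address, so modifying it does not change the original, and it wraps the original in ManuallyDrop so that the String's heap buffer is freed only once.

```rust
use std::mem::ManuallyDrop;
use std::ptr;

/// Hypothetical struct for illustration; the `String` field makes it non-`Copy`.
struct Named {
    id: u32,
    name: String,
}

/// Returns (id of the original, id of the copy, whether the addresses differ, name of the copy).
fn read_makes_a_separate_copy() -> (u32, u32, bool, String) {
    // `ManuallyDrop` keeps the original from being dropped, so only the copy
    // will run the destructor of the `String`.
    let original = ManuallyDrop::new(Named { id: 1, name: String::from("a") });
    let p: *const Named = &*original;

    // SAFETY: `p` is valid, aligned, and points to an initialized `Named`.
    // The original is never dropped and its `name` is never touched again.
    let mut copy = unsafe { ptr::read(p) };

    // This only changes the copy: it is a bitwise duplicate at a new address.
    copy.id = 2;

    let addresses_differ = !ptr::eq(p, &copy);
    (original.id, copy.id, addresses_differ, copy.name.clone())
}
```

Without the ManuallyDrop (or an equivalent such as mem::forget on one of the two values), both `original` and `copy` would drop the same String buffer, which is the double-free mentioned above.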

Nonetheless, ptr::read can be quite useful when building some low-level things, especially once you interact with an allocator manually. In such contexts, ptr::read is usually conceptually just a “move” operation, just that you-the-programmer are in charge of upholding the guarantees of moving, i.e. that the moved-out-of value is no longer used and also no longer dropped. One example is a function such as Vec::pop, where you can find ptr::read in the source code. The operation of Vec::pop logically moves the last element of the Vec out, and achieves this soundly by updating the len field accordingly, which the other Vec methods use to determine which part of the vec is considered initialized, effectively marking that last – now moved-out-of – element as “uninitialized” again.
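The pop pattern described above can be sketched roughly as follows. `TinyStack` is a hypothetical fixed-capacity stack made up for this example, not the real Vec implementation; it only assumes the same idea of a len field that tracks which slots are initialized.

```rust
use std::mem::MaybeUninit;
use std::ptr;

/// Hypothetical fixed-capacity stack; `slots[..len]` are the initialized elements.
pub struct TinyStack<T, const N: usize> {
    slots: [MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> TinyStack<T, N> {
    pub fn new() -> Self {
        Self { slots: std::array::from_fn(|_| MaybeUninit::uninit()), len: 0 }
    }

    /// Stores `value`, or hands it back if the stack is full.
    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len == N {
            return Err(value);
        }
        self.slots[self.len].write(value);
        self.len += 1;
        Ok(())
    }

    /// Logically moves the last element out.
    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        // Shrinking `len` first marks the slot as uninitialized again,
        // so nothing else will read or drop it.
        self.len -= 1;
        // SAFETY: the slot at the old `len - 1` was initialized by `push`
        // and is not considered initialized after this point.
        Some(unsafe { ptr::read(self.slots[self.len].as_ptr()) })
    }
}

impl<T, const N: usize> Drop for TinyStack<T, N> {
    fn drop(&mut self) {
        // Drop only the elements that are still initialized.
        while self.pop().is_some() {}
    }
}
```

The bytes of the popped element are still physically in the slot afterwards; it is only the len bookkeeping that makes the ptr::read behave like a move.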


Comparing to the * operator is a bit of an open-ended question; you’d need to narrow down what exactly to compare to.

Comparing to the * operator on *const T pointers is probably reasonable … still, the * operator creates just a “place expression”, so what exactly happens then depends a lot on how you actually use it. For maximal analogy to ptr::read, we can compare to the * operator being used as a value, i.e. something like

let x = *ptr;

Now, this operation only works for Copy types, and for those, it’s the same as let x = ptr::read(ptr);.
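As a small sketch of that comparison (the function names are made up for the example): for a Copy type both forms give the same value, while for a non-Copy type such as String the by-value dereference is rejected by the compiler and only ptr::read goes through.

```rust
use std::mem::ManuallyDrop;
use std::ptr;

/// For a `Copy` type, `*p` used as a value and `ptr::read(p)` agree.
fn deref_and_read_agree() -> (u64, u64) {
    let value: u64 = 42;
    let p: *const u64 = &value;
    // SAFETY: `p` is valid, aligned, and points to an initialized `u64`.
    let via_deref = unsafe { *p };
    let via_read = unsafe { ptr::read(p) };
    (via_deref, via_read)
}

/// For a non-`Copy` type, only `ptr::read` compiles.
fn read_non_copy() -> String {
    // `ManuallyDrop` ensures the original is not dropped after being read out.
    let s = ManuallyDrop::new(String::from("hi"));
    let p: *const String = &*s;
    // let moved = unsafe { *p }; // does not compile: cannot move out of a raw pointer
    // SAFETY: `p` is valid and the original `String` is never used or dropped again.
    unsafe { ptr::read(p) }
}
```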

Other things the * operator supports include taking a reference to the target, e.g. &*ptr; this is not something ptr::read can be used for. You could also create a mutable reference to the target or assign to the place, though these are more commonly done with *mut T pointers. [Note, however, that as long as the pointer was created with mutable access, a *const T can be used for mutation, too.] Other things you can do with a place expression: take a pointer to the target with addr_of! or addr_of_mut!, or project further to a field or slice index, which gives a new place expression with which you could in turn do any of the things I’ve listed so far. Usage of the * operator is indeed quite versatile. With a dereferenced raw pointer, each of these possible operations could have different safety implications; and besides reading the whole target by value (which, still, only actually works for Copy types), all of these are different from ptr::read.
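A short tour of those place-expression uses, as a sketch. The struct `Pair` and the function name are hypothetical; the struct has only plain-data fields, so the final ptr::read cannot cause a double free.

```rust
use std::ptr;

/// Hypothetical plain-data struct used only for this illustration.
#[derive(Debug, PartialEq)]
struct Pair {
    a: u32,
    b: [u8; 3],
}

fn place_expression_tour() -> (u32, u8, Pair) {
    let mut pair = Pair { a: 1, b: [10, 20, 30] };
    let p: *mut Pair = &mut pair;

    // SAFETY: `p` points to a live, aligned, initialized `Pair` for the whole
    // block, and no reference created here is used after a conflicting access.
    unsafe {
        // Take a shared reference to the target.
        let shared: &Pair = &*p;
        let a_before = shared.a;

        // Project to a field and assign to that place.
        (*p).a = 7;

        // Project to a field and an index, then take a raw pointer to it.
        let elem: *const u8 = ptr::addr_of!((*p).b[1]);
        let second = elem.read();

        // Create a mutable reference to the target and write through it.
        let m: &mut Pair = &mut *p;
        m.b[0] = 11;

        // Finally, read the whole target by value.
        (a_before, second, ptr::read(p))
    }
}
```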


It's implemented using compiler magic for compile-time performance reasons, but you could write it yourself. It's just this:

pub unsafe fn read<T>(p: *const T) -> T {
    let mut temp = std::mem::MaybeUninit::uninit();
    // The passed-in pointer can never overlap a local
    std::ptr::copy_nonoverlapping(p.cast(), std::ptr::addr_of_mut!(temp), 1);
    // Now that we copied something the caller said is valid, this is valid
    temp.assume_init()
}

TBH, it's also arguably wrong that it works how it does. If we were writing it today it might be -> ManuallyDrop<T> instead, to better reflect what you can do with the result in general. It's replace that makes more sense for giving a full -> T.
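To illustrate that remark, here is what such a variant might look like. `read_manually_drop`, `peek_len` and `swap_in` are hypothetical helpers written for this example, not std APIs. The point of the sketch is that a ManuallyDrop<T> result never runs the destructor by accident, whereas ptr::replace can safely hand back a full T because it leaves a valid value behind.

```rust
use std::mem::ManuallyDrop;
use std::ptr;

/// Hypothetical variant of `read` whose result is not dropped automatically.
///
/// # Safety
/// Same requirements as `ptr::read`: `p` must be valid, aligned, and initialized.
unsafe fn read_manually_drop<T>(p: *const T) -> ManuallyDrop<T> {
    // SAFETY: forwarded to the caller.
    ManuallyDrop::new(unsafe { ptr::read(p) })
}

/// Looks at a bitwise copy of the `String` without ever dropping that copy.
fn peek_len(s: &String) -> usize {
    // SAFETY: a reference is always valid, aligned, and initialized.
    // The copy is read-only and never dropped, so `s` stays the sole owner.
    let copy = unsafe { read_manually_drop(s as *const String) };
    copy.len()
}

/// `ptr::replace` moves the old value out and a new one in, so returning `T` is fine.
fn swap_in(slot: &mut String, new: String) -> String {
    // SAFETY: `slot` is a valid, aligned, initialized, exclusive location.
    unsafe { ptr::replace(slot, new) }
}
```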


Another question: why is it UB to use read on an unaligned pointer? Is it because of the underlying ASM instructions?

And from the implementation you showed me (if it basically is that), read and read_unaligned would have the same implementation.


read_unaligned uses copy_nonoverlapping with byte pointers, which has an alignment requirement of 1 which is always trivially satisfied, while read uses a typed pointer which can have a bigger alignment requirement.
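In the style of the hand-written read above, the difference can be sketched like this. `my_read_unaligned` and `u32_at_offset_one` are hypothetical names for this example; the sketch assumes nothing beyond what is said here, namely that the copy goes through u8 pointers and therefore needs a byte count instead of an element count.

```rust
use std::mem::{size_of, MaybeUninit};
use std::ptr;

/// Hypothetical re-implementation of `ptr::read_unaligned`, for illustration only.
///
/// # Safety
/// `p` must be valid for reads of an initialized `T`; it need not be aligned.
pub unsafe fn my_read_unaligned<T>(p: *const T) -> T {
    let mut temp = MaybeUninit::<T>::uninit();
    // SAFETY: the caller guarantees `p` is readable for `size_of::<T>()` bytes,
    // `temp` is a distinct local, and `u8` pointers only require alignment 1.
    unsafe {
        ptr::copy_nonoverlapping(
            p.cast::<u8>(),
            temp.as_mut_ptr().cast::<u8>(),
            size_of::<T>(),
        );
        temp.assume_init()
    }
}

/// Reads a `u32` starting one byte into a buffer, where alignment is not guaranteed.
fn u32_at_offset_one(bytes: &[u8; 8]) -> (u32, u32) {
    let p = bytes[1..].as_ptr().cast::<u32>();
    // SAFETY: four initialized bytes are readable at `p`; alignment is not needed.
    // Using plain `ptr::read(p)` here would be UB whenever `p` is misaligned.
    unsafe { (my_read_unaligned(p), ptr::read_unaligned(p)) }
}
```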


In practical terms, yes, it’s because the compiler is allowed to emit a simple load opcode or, depending on how the value returned by read() is used, the compiler might not copy the bytes at all and instead access the address directly.


It's because on some architectures you can't do unaligned reads, and sometimes there are more efficient instructions available if things are aligned. For example, movaps vs movups in x64.

For a chip, it's much easier to not have to worry about potentially

  • updating two different cache lines coherently
  • needing two different TLB pages
  • handling two different memory protection levels
  • needing two different RAM sticks
  • needing two different chips on those RAM sticks
  • etc

The big desktop chips have transistors to handle it, but especially smaller (and more power-efficient) ones don't want to bother in the normal case.
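One place where misaligned data shows up in ordinary Rust code is a packed struct. The following is a minimal sketch with a hypothetical `Packet` type: the u32 field sits at offset 1, so a pointer to it is not guaranteed to be 4-byte aligned and has to be read with read_unaligned.

```rust
use std::ptr;

/// Hypothetical packed layout: `value` starts at byte offset 1.
#[repr(C, packed)]
struct Packet {
    tag: u8,
    value: u32,
}

fn read_packed_value(packet: &Packet) -> (u8, u32) {
    // `addr_of!` produces a raw pointer to the field without first creating
    // a reference, which would not be allowed for a misaligned field.
    let p: *const u32 = ptr::addr_of!(packet.value);
    // SAFETY: `p` is valid for reading four initialized bytes; the unaligned
    // variant places no alignment requirement on it.
    let value = unsafe { p.read_unaligned() };
    (packet.tag, value)
}
```

On a target that has no unaligned loads, the compiler is then free to assemble the value from several smaller loads, which it could not know to do for a plain read.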

No, read_unaligned copies using *mut u8, rather than *mut T.

The actual implementation of read lowers it to exactly the same thing as a pointer dereference in MIR, just in a way that bypasses the "wait, you can't do that unless it's Copy" check.

(That fixed a whole bunch of "wait, why is p.read() slower than *p for integers?" kinds of bugs.)
