Why is dropping uninitialized memory considered bad?

I see that "Uninitialized Memory - The Rustonomicon" states that:

All runtime-allocated memory in a Rust program begins its life as uninitialized. In this state the value of the memory is an indeterminate pile of bits that may or may not even reflect a valid state for the type that is supposed to inhabit that location of memory. Attempting to interpret this memory as a value of any type will cause Undefined Behavior.

I am assuming that drop somehow interprets the memory - which is UB. If so, why does drop have to interpret the memory at all - initialized or uninitialized?


When a value is dropped, its destructor runs, as well as destructors for its fields (and their fields, etc.). If the value is uninitialized or otherwise invalid, the code in the destructors may access that invalid memory, causing UB.
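To make the recursive part concrete, here is a minimal sketch in safe Rust. The types `Noisy` and `Pair` and the function `drop_order` are hypothetical names invented for this illustration. `Pair` has no `Drop` impl of its own, yet dropping it still runs the destructor of each field, and each destructor reads the fields of the value it is dropping.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical type for illustration: records its name in a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        // The destructor reads `self.name` and `self.log`,
        // so both fields must hold valid values at this point.
        self.log.borrow_mut().push(self.name);
    }
}

// Hypothetical wrapper with no `Drop` impl; its fields are still dropped.
struct Pair {
    first: Noisy,
    second: Noisy,
}

/// Builds a `Pair`, drops it, and returns the order in which destructors ran.
fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let pair = Pair {
        first: Noisy { name: "first", log: Rc::clone(&log) },
        second: Noisy { name: "second", log: Rc::clone(&log) },
    };
    drop(pair); // runs the destructor of `first`, then `second` (declaration order)
    let order = log.borrow().clone();
    order
}
```

Calling `drop_order()` returns the names in the order the field destructors ran. If `pair` had been uninitialized, those same destructors would have run on garbage `name` and `log` values.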


Consider, for example, what happens when you drop an uninitialized Vec. The memory allocator will try to follow invalid pointers and free memory that wasn't allocated, likely leading to memory corruption and crashing the process.


According to the Vec documentation, it is supposed to be a contiguous buffer. Do you mean that if an item in the vector is a compound type that contains pointers, then those pointers will be followed for recursive dropping?

Vec is a struct that contains a pointer to a contiguous buffer. When it is dropped, its destructor first drops all of the items in the Vec, then it frees the buffer.

If the items in the Vec are uninitialized, and they have destructors, then those destructors will run on invalid values, potentially causing UB including following invalid pointers (as you mention).

In addition, if the Vec struct itself is uninitialized, then its own destructor will have undefined behavior because the Vec's pointer (and length, and capacity) will not point to a valid buffer.

Note that uninitialized pointers are not the only source of undefined behavior. For example, if a type that contains a bool or enum field is uninitialized, code that accesses that field could read bits that are not one of the allowed values for the type. Even integer types can cause undefined behavior when uninitialized.
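For contrast, here is a minimal sketch of how a possibly-uninitialized value can be held without any of these problems, assuming the standard library's `std::mem::MaybeUninit` is an acceptable tool for the job. The function name `init_later` is hypothetical. The point it illustrates is that dropping a `MaybeUninit<T>` never runs `T`'s destructor, so nothing interprets the uninitialized bytes.

```rust
use std::mem::MaybeUninit;

/// Drops one uninitialized slot untouched, then initializes a second
/// slot and returns the value stored in it.
fn init_later() -> Vec<u8> {
    // Never initialized. Dropping a `MaybeUninit<T>` does not run `T`'s
    // destructor, so no garbage pointer ever reaches the allocator.
    let unused: MaybeUninit<Vec<u8>> = MaybeUninit::uninit();
    drop(unused);

    let mut slot: MaybeUninit<Vec<u8>> = MaybeUninit::uninit();
    // `write` stores a value without reading or dropping the old contents.
    slot.write(vec![1, 2, 3]);
    // SAFETY: `slot` was fully initialized by `write` just above.
    unsafe { slot.assume_init() }
}
```

Only after `assume_init` does the value become an ordinary `Vec<u8>` whose destructor will run, which is why that call is the `unsafe` step.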


No. It has nothing to do with recursive dropping or whether things are contiguous or not. Vec itself contains a pointer to a heap-allocated buffer. If the value of this pointer is garbage, then that garbage pointer will be passed to the allocator, and that in itself can (and usually does) cause issues.

Allocators usually assume that a pointer that is being freed does in fact point into their internal representation. If that assumption is broken, the allocator has no 100% correct and efficient way of detecting this, and it usually just tries to dereference the garbage pointer anyway, in order to read the metadata necessary for deallocation or some other bookkeeping. It is this usage of the garbage pointer that is erroneous and UB.

In C, there is a difference between unallocated and uninitialized memory. If memory is allocated on the heap (and not yet stored into), pointers to that memory are valid with respect to free, but a load through the pointer is UB.
If I understand your answer correctly, allocators necessarily load from the memory on dealloc, and therefore dealloc of uninitialized memory is UB.

You haven't understood correctly. @H2CO3 is talking about the pointer itself being uninitialised. The allocator in both C and Rust doesn't care what is stored in the memory pointed to, but both will yield undefined behaviour if an uninitialised pointer is passed to be deallocated. For example, the following would be undefined behaviour in C:

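#include <stdlib.h>

int main(void) {
    int *ptr;   /* never initialized: holds an indeterminate value */
    free(ptr);  /* UB: ptr was not returned by malloc, calloc, or realloc */
    return 0;
}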

Nope, that's not what I'm talking about. I'm not talking about the contents of the buffer whatsoever. I'm talking about the buffer pointer of the vector. In C, this is also Undefined Behavior:

int *foo = (int *)0x123456789; // garbage, not allocated by `malloc`
free(foo);
// UB because `free()` can only accept a pointer returned by `malloc`

The equivalent of the above happens when Vec::drop() tries to deallocate its own buffer pointer.
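On the Rust side, the same three pieces of state can be made visible with a short sketch. The function name `round_trip` is hypothetical; `ManuallyDrop` and `Vec::from_raw_parts` are standard library items. The pointer, length, and capacity read here are the values that the Vec's destructor relies on, so if any of them were garbage, the deallocation would be UB just like the C snippet.

```rust
use std::mem::ManuallyDrop;

/// Splits a `Vec` into its raw (pointer, length, capacity) parts
/// and rebuilds an equivalent `Vec` from them.
fn round_trip(v: Vec<i32>) -> Vec<i32> {
    // Prevent the original handle from freeing the buffer when it goes out of scope.
    let mut v = ManuallyDrop::new(v);
    let (ptr, len, cap) = (v.as_mut_ptr(), v.len(), v.capacity());
    // SAFETY: all three values come from a live `Vec<i32>` whose destructor
    // will not run, so exactly one owner ends up freeing the buffer.
    unsafe { Vec::from_raw_parts(ptr, len, cap) }
}
```

Passing any other pointer, length, or capacity to `from_raw_parts` would hand the allocator a description of a buffer it never created once the resulting Vec is dropped.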

Thanks @H2CO3 for the clarification. I am specifically talking about UB because of dropping allocated but uninitialized memory as mentioned in the reference. Here is an example.

fn main() {
    {
        // Allocate a vector big enough for 4 elements.
        let size = 4;
        let mut x: Vec<i32> = Vec::with_capacity(size);

        // Set the length but don't initialize the elements.
        unsafe {
            x.set_len(3);
        }
    } // x dropped, UB?
}

This is my understanding.

  1. x is a valid reference
  2. When the lifetime of x ends, the vector object it points to will be dropped.
  3. If the semantics of drop are just free, then there is no UB, since the memory was allocated correctly.
  4. If the semantics of drop include a load of the memory then, following C semantics, it will be UB, since we will read the uninitialized elements at indices 0..3 and attempt to drop them individually.

Based on the answers above I believe point 4 is true for the reason I mention.

Well, the Vec owns its data so it does run the destructor of each element, otherwise it would leak memory if the destructor of the item type is nontrivial. So, if your elements are uninitialized, that is also UB, of course.

To me, this is not called an 'uninitialized vector', though – it's more like a vector with some items initialized and others not. An uninitialized vector would be a vector where the (pointer, length, capacity) triplet is itself uninitialized, as opposed to the actual memory region behind the buffer.


The documentation of set_len explicitly says the following:

The elements at old_len..new_len must be initialized.

So you triggered UB when you called set_len. It doesn't matter what happens later.

In practice, since the item type is an integer, the given example probably doesn't trigger any miscompilation, but if you had e.g. a Vec<Vec<u8>>, we would be back to the issue from before.
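For reference, one way to satisfy the set_len requirement is to initialize the elements first. This is a minimal sketch, not the only approach, and the function name `filled` is hypothetical. It uses `Vec::spare_capacity_mut`, which exposes the unused capacity as `MaybeUninit<i32>` slots, and only then raises the length.

```rust
/// Returns a `Vec<i32>` with room for at least 4 elements
/// and the first 3 initialized.
fn filled() -> Vec<i32> {
    let mut x: Vec<i32> = Vec::with_capacity(4);
    // Write into the first 3 spare slots; nothing is read before it is written.
    for (i, slot) in x.spare_capacity_mut().iter_mut().take(3).enumerate() {
        slot.write(i as i32 * 10);
    }
    // SAFETY: 3 <= capacity, and the elements at 0..3 were initialized above.
    unsafe { x.set_len(3) };
    x
}
```

With the elements initialized before the call, the precondition of set_len holds, and dropping the vector afterwards is fine for any element type.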

