Producing invalid primitive values == undefined behavior?


#1

Here are a couple of things the Rustonomicon says are undefined behavior:

  • Reading uninitialized memory
  • Producing invalid primitive values:
      1. dangling/null references
      2. a bool that isn’t 0 or 1
      3. an undefined enum discriminant
      4. a char outside the ranges [0x0, 0xD7FF] and [0xE000, 0x10FFFF]
      5. a non-UTF-8 str

Notice that it says “reading uninitialized memory”, not “producing uninitialized memory”, and that it says “producing invalid primitive values”, not “reading invalid primitive values”. I interpret this to mean that it’s undefined behavior to create an invalid primitive value even if you never read the value while it is in its invalid state. Please clarify whether I’m interpreting this correctly.
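
As a minimal sketch of the “must never be produced” reading: the safe constructors in std refuse to return a value at all when the input would be invalid. The function names below (checked_char, checked_str, checked_bool) are made up for illustration; only char::from_u32 and std::str::from_utf8 are real std APIs.

```rust
use std::str;

/// Hypothetical helper: yields a char only for a valid Unicode scalar value,
/// i.e. a code in [0x0, 0xD7FF] or [0xE000, 0x10FFFF].
pub fn checked_char(code: u32) -> Option<char> {
    char::from_u32(code)
}

/// Hypothetical helper: yields a &str only if the bytes are valid UTF-8.
pub fn checked_str(bytes: &[u8]) -> Option<&str> {
    str::from_utf8(bytes).ok()
}

/// Hypothetical helper: yields a bool only if the byte is exactly 0 or 1.
pub fn checked_bool(byte: u8) -> Option<bool> {
    match byte {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}
```

For instance, checked_char(0xD800) gives None because 0xD800 sits in the surrogate gap between the two ranges in the list, and checked_str(&[0xFF]) gives None because a lone 0xFF byte is not valid UTF-8. In both cases the invalid value is never constructed in the first place.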

For example, the following things would be considered undefined behavior according to my interpretation of the Rustonomicon:

// May produce an invalid char value
let mut a: char = unsafe { std::mem::uninitialized() };
a = 'A';

enum Enum { Foo, Bar }
// May produce an invalid Enum value
let mut e: Enum = unsafe { std::mem::uninitialized() };
e = Enum::Foo;

#2

These are the kinds of questions that are still being hashed out in the work to formalize the unsafe guidelines and memory model, but I believe the usage here of uninitialized is valid. The docs specifically talk about the bool case, for example: https://doc.rust-lang.org/std/mem/fn.uninitialized.html#undefined-behavior.

There’s an important distinction between uninitialized values and invalid values - invalid values can’t exist, while uninitialized values can’t be read.
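
For what it’s worth, mem::uninitialized has since been deprecated in favor of MaybeUninit<T>, which encodes this distinction in the type: a MaybeUninit<char> is allowed to hold uninitialized bytes, and a char only comes into existence once you call assume_init. Here is a rough sketch of the two examples from the first post written that way. The function names are hypothetical, and Enum is copied from the question.

```rust
use std::mem::MaybeUninit;

#[derive(Debug, PartialEq)]
pub enum Enum {
    Foo,
    Bar,
}

/// Hypothetical helper: reserve space for a char, initialize it later.
pub fn init_char_later() -> char {
    // No char exists yet, only possibly-uninitialized storage for one.
    let mut slot = MaybeUninit::<char>::uninit();
    slot.write('A');
    // SAFETY: `slot` was fully initialized by the `write` call above.
    unsafe { slot.assume_init() }
}

/// Hypothetical helper: the same pattern for the enum from the question.
pub fn init_enum_later() -> Enum {
    let mut slot = MaybeUninit::<Enum>::uninit();
    slot.write(Enum::Foo);
    // SAFETY: `slot` was fully initialized by the `write` call above.
    unsafe { slot.assume_init() }
}
```

The point of the design is that the “uninitialized but never read” state lives in MaybeUninit, so no value of type char or Enum is ever produced in an invalid state.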


#3

A couple of interesting implications of this:

It is sometimes UB to use mem::zeroed but not mem::uninitialized if the value you’re creating is invalid when zeroed (e.g. a reference or Box type).
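
To illustrate the zeroed side of this, here is a small sketch of two cases where all-zero bytes are a valid value. As I understand the Option docs, Option<&T> is guaranteed to use the null pointer as None, so zeroing it is fine even though zeroing a bare &T is not. The function names are made up for the example.

```rust
use std::mem;

/// Hypothetical helper: all-zero bytes are a valid u32, namely 0.
pub fn zeroed_int() -> u32 {
    // SAFETY: every bit pattern, including all zeros, is a valid u32.
    unsafe { mem::zeroed() }
}

/// Hypothetical helper: all-zero bytes are a valid Option<&u8>, namely None.
pub fn zeroed_optional_ref() -> Option<&'static u8> {
    // SAFETY: Option<&T> represents None as the null pointer, so the
    // all-zero pattern is a valid value. The same call with a bare
    // `&'static u8` return type would produce a null reference.
    unsafe { mem::zeroed() }
}
```

Wrapping the reference in Option is the usual way to get a “nullable” reference without ever producing an invalid one.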

It is (maybe?) UB to instantiate an uninhabited type like ! via mem::uninitialized since it has no valid values.