UB means that a contract with the compiler has been violated; there is no such thing as "non-issue UB". Maybe some compiler does not exploit a given instance of UB, which gives you a de facto, implementation- and platform-specific definition of the behavior. That means your crate is only "safe" for a specific version of rustc and a specific architecture. In other words: you might as well just be sharing a binary release of the program.
Taking the uninitialized integer example: maybe with some version of the compiler and on some architecture the function always returns true, because the compiler does not exploit uninitialized integers even though it could. Later on, a new version of the compiler realises it can exploit them to produce more efficient binaries, and the always_return_true function breaks. Whose fault is that? The programmer's.
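The post only names always_return_true. As a rough sketch, assuming a body along the following lines (the comparison is a hypothetical reconstruction, not taken from the post), the function is trivially true for every initialized input, which is exactly why it is tempting to believe it must also hold for an uninitialized one. The helper call_with_initialized_input is likewise made up for illustration.

```rust
use std::mem::MaybeUninit;

/// Hypothetical body: the original post only gives the function's name.
/// For any *initialized* `u8`, at least one of the two comparisons holds.
pub fn always_return_true(x: u8) -> bool {
    x < 150 || x > 120
}

/// A sound way to work with a "not yet initialized" integer: the value is
/// only read after it has been written, so no uninitialized `u8` is observed.
pub fn call_with_initialized_input() -> bool {
    let mut slot = MaybeUninit::<u8>::uninit();
    slot.write(200);
    // SAFETY: `slot` was initialized by the `write` just above.
    let x = unsafe { slot.assume_init() };
    always_return_true(x)
}
```

Passing it a u8 obtained from mem::uninitialized() instead is the situation the post warns about: nothing obliges the compiler to treat x as one single, consistent value.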
Now, coming back to mem::uninitialized, when the type is inhabited, using only ptr::write on it could be seen as fine (e.g., for integer types this is being deliberated), but there are cases when even this is clearly not fine.
Generic mem::uninitialized<T> is unsound
Take, for instance, ::array-init, with a generic (over Array and its Array::Item) usage of mem::uninitialized:
- EDIT: this comment targeted the version of ::array-init as of its writing: 0.0.4. ::array-init has since been patched to correctly use MaybeUninit.
pub fn array_init<Array, F> (mut initializer: F) -> Array
where
    Array : IsArray,
    F : FnMut(usize) -> Array::Item,
{
    let mut ret: NoDrop<Array> = NoDrop::new(unsafe { mem::uninitialized() });
    // <At this point Rust knows that **we have elements of type Array::Item**>
    for i in 0 .. Array::len() {
        Array::set(&mut ret, i, initializer(i));
    }
    ret.into_inner()
}
In Rust, it is perfectly valid to define:
enum Uninhabited {}
There is then literally no value that can be of type Uninhabited (this is not something you would know coming from a C background, since C does not have uninhabited types).
This means that if some code were to witness such a value, then that code cannot possibly be reached. So, if reaching that branch depended on some condition, Rust is allowed to skip checking the condition altogether.
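As a small illustration of that reasoning in safe code (a sketch; the helper names absurd and unwrap_infallible are made up for this example), an empty match on an uninhabited value type-checks as anything, precisely because the compiler knows that code can never run:

```rust
/// A type with no values at all.
pub enum Uninhabited {}

/// Hypothetical helper: since no `Uninhabited` value exists, a match with
/// zero arms is exhaustive, and this function can never actually be called.
pub fn absurd(x: Uninhabited) -> ! {
    match x {}
}

/// Hypothetical helper: the `Err` arm is statically unreachable, so no
/// panic path is needed to get at the `Ok` payload.
pub fn unwrap_infallible<T>(result: Result<T, Uninhabited>) -> T {
    match result {
        Ok(value) => value,
        Err(never) => absurd(never),
    }
}
```

For instance, unwrap_infallible(Ok(7)) simply yields 7; there is no way to construct the Err case without unsafe code.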
But the code shown above is able to create values of type Uninhabited:
enum Uninhabited {}

fn trust_me_this_cannot_be_false (condition: bool)
{
    if !condition {
        let unreachable: [Uninhabited; 1] = ::array_init::array_init(|_| -> Uninhabited {
            loop {} // an infinite loop typechecks with everything
        });
    }
}
If the above function is given a false condition, you may think that it will loop indefinitely. But it so happens that Rust is allowed to instead assume that the condition is never false, without even checking it (cf. the code of array-init: the closure is called after having created uninitialized inhabitants, i.e., too late).
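One way to picture what the compiler is allowed to do here is to compare it with the explicit form of the same promise. The sketch below is an analogy, not what array-init does: assume_condition and first_byte are hypothetical names, while std::hint::unreachable_unchecked is the standard-library function for asserting that a point in the code is never reached. Reaching it is UB, so the optimizer may delete the branch together with the check guarding it.

```rust
use std::hint::unreachable_unchecked;

/// Hypothetical explicit counterpart of `trust_me_this_cannot_be_false`.
///
/// # Safety
/// The caller must guarantee that `condition` is `true`.
pub unsafe fn assume_condition(condition: bool) {
    if !condition {
        // SAFETY: upheld by the caller (see the safety contract above).
        unsafe { unreachable_unchecked() }
    }
}

/// Sound use: the condition really does hold, so the promise is kept.
pub fn first_byte(bytes: &[u8; 4]) -> u8 {
    let slice: &[u8] = bytes;
    // SAFETY: a `[u8; 4]` viewed as a slice always has length 4.
    unsafe { assume_condition(slice.len() == 4) };
    slice[0]
}
```

The difference is that here the promise is spelled out and marked unsafe at the call site, whereas with the generic mem::uninitialized it hides behind a safe-looking API.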
And now we can have memory unsafety:
fn main ()
{
    let slice: &mut [u8] = &mut [];
    trust_me_this_cannot_be_false(slice.len() == usize::MAX);
    for i in 0 .. usize::MAX {
        // Since slice.len() == usize::MAX, and i < usize::MAX, bound checking can be skipped
        slice[i] = 0x42; // memory corruption
    }
}
I am not saying that the above program will always corrupt the memory (it could loop indefinitely, abort, or whatever); I am just saying that it would be legal for the compiler to corrupt the memory with it.
All this just because mem::uninitialized was used on a generic type. I wouldn't call this "non-issue UB".
The difference with MaybeUninit, by the way, is that MaybeUninit<Uninhabited> is inhabited. Only when calling assume_init, after the closure is called, would we have unreachable code. Which is fine, since a closure forging an element of an uninhabited type cannot possibly return (it must loop {} indefinitely, or die / end the thread of execution (e.g., by aborting)).
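To make the contrast concrete, here is a minimal sketch of the MaybeUninit approach. It is not the actual array-init implementation: array_init_sketch and closure_diverged are hypothetical names, and const generics stand in for the crate's IsArray trait.

```rust
use std::mem::{self, MaybeUninit};
use std::panic;

pub enum Uninhabited {}

/// Hypothetical MaybeUninit-based variant of `array_init`.
pub fn array_init_sketch<T, F, const N: usize>(mut initializer: F) -> [T; N]
where
    F: FnMut(usize) -> T,
{
    // SAFETY: an array of `MaybeUninit<T>` requires no initialization, so
    // this is fine even when `T` itself is uninhabited.
    let mut ret: [MaybeUninit<T>; N] = unsafe { MaybeUninit::uninit().assume_init() };
    for (i, slot) in ret.iter_mut().enumerate() {
        // If `initializer` diverges, execution never gets past this call.
        // (Elements already written would then be leaked, not dropped; a
        // real implementation would want to clean them up.)
        slot.write(initializer(i));
    }
    // SAFETY: every slot has been written, and `MaybeUninit<T>` has the same
    // layout as `T`. `ret` has no drop glue, so nothing is dropped twice.
    unsafe { mem::transmute_copy::<[MaybeUninit<T>; N], [T; N]>(&ret) }
}

/// Returns `true` if the closure diverged (here: panicked) before any
/// `[Uninhabited; 1]` was ever claimed to exist.
pub fn closure_diverged() -> bool {
    panic::catch_unwind(|| {
        let _array: [Uninhabited; 1] =
            array_init_sketch(|_| -> Uninhabited { panic!("no value to return") });
    })
    .is_err()
}
```

With an inhabited element type, array_init_sketch(|i| i * 2) fills the array as expected; with Uninhabited, the only thing that can happen is that the closure never returns, so the final conversion to [T; N] is never reached.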