How to create a long array with non-copyable element?

UB means that there has been a contract violation with the compiler

No, it means that the behavior is not strictly defined and the compiler can assume whatever it wants.
Since Rust is not formally standardized the way C and C++ are, it would be difficult to find a definition of UB in Rust, but I doubt that UB in Rust is any different from UB in C or C++ (since it uses LLVM).

Whose fault is that? The programmer’s.

Yes, so what?
If the programmer used UB, then he knew what he wanted.

But the code shown above is able to create values of type Uninhabited

Impractical examples are not the best ones to use.
If the value's size is 0, then obviously the compiler wouldn't write anything, and it should actually optimize away any use of such an array (aside from cases where its side effects are used, e.g. looping).
Of course, I don't know how the compiler behaves with arrays of T where size_of::<T>() == 0.
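
On the size question, a minimal sketch (the `Marker` type and `marker_array_size` helper are made up for illustration) showing that an array of a zero-sized type is itself zero-sized. This only says something about layout; it does not make it acceptable to conjure up a value of an uninhabited type.

```rust
use std::mem::size_of;

/// Hypothetical zero-sized type: it has no fields, so a value carries no data.
struct Marker;

/// Size in bytes of an array of `N` markers.
fn marker_array_size<const N: usize>() -> usize {
    size_of::<[Marker; N]>()
}

fn main() {
    assert_eq!(size_of::<Marker>(), 0);
    // An array of zero-sized elements occupies no memory either.
    assert_eq!(marker_array_size::<1000>(), 0);
    // The unit type is zero-sized as well.
    assert_eq!(size_of::<[(); 26]>(), 0);
}
```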

Nevertheless, I don't mind the deprecated uninitialized, which I use by habit instead of MaybeUninit; my point is that UB and unsafe are fine if you know what you are doing.

@RustyYato
I'm perfectly well aware of how Rust differs when it comes to unsafe.
Not to mention that I have used Rust since its first stable release.
But something as common as an uninitialized value works the same as in C++ (well, OK, excluding zero-sized values).

By the way, I've checked your example in the playground (there's a lot of boilerplate in it, since I copied two extern crates almost literally). It seems that it now goes down to mem::uninitialized and panics when trying to create uninhabited values, even in release mode. I'm not trying to downplay the danger, just commenting on this concrete example.


Cool, @RalfJung made an article yesterday about uninitialized memory. Given the current topic, I thought I would mention it here

No, it means that the behavior is not strictly defined and the compiler can assume whatever it wants.
Since Rust is not formally standardized the way C and C++ are, it would be difficult to find a definition of UB in Rust, but I doubt that UB in Rust is any different from UB in C or C++ (since it uses LLVM).

A lot of people don't know what UB means, even in C. The best definition is that UB is UB. When your program triggers any UB, you can't expect anything after that point. Compiler optimizations can make your program do things you would never expect. See a very good example: Krister Walfridsson's old blog: Why undefined behavior may call a never-called function. The best article about UB I know of is What Every C Programmer Should Know About Undefined Behavior #1/3 - The LLVM Project Blog.

Be aware that UB is not something "we deal with" like you seem to think in C.

A lot of people don't know what UB means, even in C. The best definition is that UB is UB.

Do not try to be smart.
The definition of UB is pretty simple: it is behavior that is not defined by the language specification.

Read Regehr. In C the compiler is allowed to assume that the programmer has ensured that UB will never ever happen, and reasoning based on that assumption, the compiler may eliminate or optimise accordingly, creating all kinds of unexpected scenarios. As I said, read Regehr's blog to see why that is a problem.

UB is not the same as implementation-defined.

If you argue purely for the sake of arguing then you can come to conclusions like: difficult == pretty simple...

I'll just leave the quote from the standard that I was paraphrasing:

http://eel.is/c++draft/intro#abstract-5

A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input.

However, if any such execution contains an undefined operation, this document places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation).

I never said it is implementation-defined, but in practice it is, because it is up to the compiler to treat UB code however it wants.
The approach of assuming that UB is not present exists only for the sake of optimization; it doesn't strictly define UB, it is just the most common way to treat it.

Anyway it is off-topic

If the programmer uses uninitialized memory, this is UB, and as you say, "compiler can assume whatever it wants." If that's what the programmer "knew what he wanted", then the programmer must want to be surprised, because it's not safe to assume anything about what the compiler will do with this.

So sure, if that's your attitude with UB, then go for it. But most people come to Rust to write safer code with predictable behavior, and UB is an anathema.


UB is a contract violation. This is the same in Rust, C and C++. Please make sure you understand that before continuing discussion about it.

UB means behavior is not defined at all. Not just "not strictly defined". But literally not defined. The compiler assumes that your program has no UB. Literally any statement you could make about the final program only makes any sense after establishing that there is no UB.

It's like a statement that "on Wednesdays, the shop is open". If it's not Wednesday, that statement is useless and tells you nothing. The corresponding statement for C/C++/Rust is "if the program has no UB, then the compiler will produce assembly that matches the source code you wrote". If your program has UB, this statement tells you nothing, and indeed the compiler promises you nothing.

This is exactly what a contract is: two parties (programmer and compiler) mutually agreeing to some terms where each side has its obligations (programmer: not have any UB in the program; compiler: generate machine code that matches what the programmer wrote). If the programmer violates that contract, there is nothing that can be said about what the compiler does. If you ever intend to run your program, having UB is never okay.


@pearzl The MaybeUninit docs contain an array initialization example. Does that help?


I'm learning, didn't reach that part yet.

Since you only need 26, you can use

let dict: [Vec<usize>; 26] = ArrayTools::generate(Vec::new);

which is done with no unsafe code:

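
For comparison, newer toolchains can do this with the standard library alone. A minimal sketch, assuming Rust 1.63 or later (where `std::array::from_fn` is stable); the `empty_dict` name is made up for illustration:

```rust
/// Builds 26 empty vectors, one per letter, without any `unsafe`.
fn empty_dict() -> [Vec<usize>; 26] {
    // The closure is called once per index; the index is ignored here.
    std::array::from_fn(|_| Vec::new())
}

fn main() {
    let mut dict = empty_dict();
    dict[0].push(3);
    assert_eq!(dict[0], vec![3]);
    // Every other slot is still an independent, empty vector.
    assert!(dict[1..].iter().all(|v| v.is_empty()));
}
```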

Just curious here, why does:

let x: T = ...;
let dict: [T; 26] = [x; 26];

require T: Copy, when it seems to me that T: Clone should suffice?

If nothing else, Copy makes things easier -- that also means no Drop, and therefore no partial drops. Clone could panic, and then you'd have to drop whatever was filled so far.
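
If cloning into every slot is what you want, it can be spelled out explicitly. A minimal sketch, assuming Rust 1.63 or later; `repeat_clone` is a hypothetical helper, not a standard API. My understanding is that `std::array::from_fn` takes care of dropping the elements built so far if a clone panics, which is exactly the bookkeeping mentioned above.

```rust
/// Hypothetical helper: fills an array by cloning `x` once per slot.
fn repeat_clone<T: Clone, const N: usize>(x: &T) -> [T; N] {
    std::array::from_fn(|_| x.clone())
}

fn main() {
    let x = String::from("ab");
    // `[x; 3]` would not compile here, because `String` is not `Copy`.
    let arr: [String; 3] = repeat_clone(&x);
    assert_eq!(arr, ["ab", "ab", "ab"]);
}
```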

Let's start with a hyper-realistic example. In Rust, bool is 1 byte in size and should only contain the bit pattern 0 or 1; anything else is UB. And this is some innocent-looking ice-cream-making code for your naval crew.

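#[inline(never)]
fn crypto_set_false(flag: &mut bool) {
  // Write through the reference; assigning to `flag` itself would not compile.
  *flag = false;
}

fn main() {
  // UB: an uninitialized bool may hold a bit pattern other than 0 or 1.
  let mut madness: bool = unsafe { std::mem::uninitialized() };
  crypto_set_false(&mut madness);
  if madness {
    launch_nuke();
  } else {
    make_icecream();
  }
}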

Can this code launch the nuclear missile at people's heads? Well, in the crypto_set_false() function it's totally valid to generate assembly that only sets flag's LSB to 0, because flag should only contain the bit pattern 0 or 1. But mem::uninitialized() doesn't care about that, so after crypto_set_false(&mut madness); the byte can still hold a non-zero value, which is evaluated as true in machine code, and BAAM.
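
For contrast, here is a sketch of how the same shape could be written without ever holding an invalid bool, assuming Rust 1.55 or later for `MaybeUninit::write`. The names `set_false` and `read_flag` are made up for illustration.

```rust
use std::mem::MaybeUninit;

#[inline(never)]
fn set_false(flag: &mut MaybeUninit<bool>) {
    // `write` stores a valid `bool` without reading the old contents.
    flag.write(false);
}

fn read_flag() -> bool {
    // No `bool` value exists yet, only storage for one.
    let mut flag = MaybeUninit::<bool>::uninit();
    set_false(&mut flag);
    // SAFETY: `set_false` has written a valid `bool` into `flag`.
    unsafe { flag.assume_init() }
}

fn main() {
    assert!(!read_flag());
}
```

The difference is that the type system now records that the storage may be uninitialized, so the compiler makes no assumptions about its contents until `assume_init` is called.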

If the array initializer expression is const, there's also an accepted RFC to make that work. Tracking issue:

https://github.com/rust-lang/rust/issues/49147

Making Vec::new a const fn seems realistic; it does not allocate after all.
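
As a sketch of what this looks like where it is available: on toolchains where `Vec::new` is a const fn and a `const` item is accepted as the repeat operand (my understanding is that this holds on current stable Rust), no `Copy` bound is needed. `EMPTY` and `const_empty_dict` are names made up for illustration.

```rust
/// Builds the array from a constant initializer instead of a `Copy` value.
fn const_empty_dict() -> [Vec<usize>; 26] {
    // A `const` item can be repeated even though `Vec<usize>` is not `Copy`:
    // each slot gets its own freshly evaluated empty vector.
    const EMPTY: Vec<usize> = Vec::new();
    [EMPTY; 26]
}

fn main() {
    let mut dict = const_empty_dict();
    dict[25].push(7);
    assert_eq!(dict[25], vec![7]);
    assert!(dict[0].is_empty());
}
```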


Wow, I just had a look at that example and it's almost identical to the solution I proposed at the start. It makes me really happy that I independently found the same solution for initializing unsafe memory as the docs team :blush:


Nice. :slight_smile: The only difference sticking out to me is that you did a transmute where an assume_init would have been enough.

I didn't think you could use assume_init() to go from [MaybeUninit<Vec<f32>>; 64] to [Vec<f32>; 64]?

Oh, good point. The example in the docs also uses transmute.

The hope is to get nicer APIs for arrays eventually, but that is blocked on const generics.
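
For readers who have not seen the pattern being discussed, here is a rough sketch in the spirit of the MaybeUninit docs example. It is not a copy of the docs; `make_vecs` is a made-up name, and `MaybeUninit::write` assumes Rust 1.55 or later.

```rust
use std::mem::{self, MaybeUninit};

fn make_vecs() -> [Vec<f32>; 64] {
    // SAFETY: an array of `MaybeUninit` needs no initialization itself,
    // so "assuming" the outer array is initialized is fine.
    let mut data: [MaybeUninit<Vec<f32>>; 64] =
        unsafe { MaybeUninit::uninit().assume_init() };

    // Initialize every slot. `Vec::new` does not panic, so no slot is
    // left half-built in this sketch.
    for slot in &mut data {
        slot.write(Vec::new());
    }

    // SAFETY: all 64 slots were written above, and `MaybeUninit<T>` has
    // the same layout as `T`, so the array types have the same size.
    unsafe { mem::transmute::<_, [Vec<f32>; 64]>(data) }
}

fn main() {
    let vecs = make_vecs();
    assert_eq!(vecs.len(), 64);
    assert!(vecs.iter().all(|v| v.is_empty()));
}
```

The transmute is there because `assume_init` on a single element only converts that element, not the array as a whole.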