How to create a long array with non-copyable elements?

I’m trying to do this:

``````
let dict: [Vec<usize>; 26] = [vec![]; 26];
``````

However, it doesn’t work because `Vec` does not implement `Copy`.
So I have to write it like this:

``````
let dict: [Vec<usize>; 26] = [
    vec![], vec![], vec![], vec![], vec![], vec![], vec![], vec![], vec![], vec![], vec![], vec![], vec![],
    vec![], vec![], vec![], vec![], vec![], vec![], vec![], vec![], vec![], vec![], vec![], vec![], vec![],
];
``````

Is there a better way? Thanks.

The answer probably depends on your context, but by far the easiest way is to just use a `Vec<Vec<usize>>` with `26` elements and initialize it in a loop. Arrays in Rust aren’t as useful as in other languages.
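A minimal sketch of that suggestion, assuming heap allocation is acceptable for your use case. The function names `make_dict_loop` and `make_dict` are made up for illustration. Since `Vec<usize>` is `Clone`, the `vec!` macro can also repeat an empty vector without an explicit loop:

```rust
/// Builds 26 empty buckets with an explicit loop.
/// (`make_dict_loop` is a hypothetical name, not from the question.)
fn make_dict_loop() -> Vec<Vec<usize>> {
    let mut dict = Vec::with_capacity(26);
    for _ in 0..26 {
        dict.push(Vec::new());
    }
    dict
}

/// Same result in one expression: `vec![elem; n]` only needs `Clone`, not `Copy`.
fn make_dict() -> Vec<Vec<usize>> {
    vec![Vec::new(); 26]
}
```

Either function can stand in for the array as long as nothing relies on the length being part of the type.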

I think the problem is that there’s no (safe) way to create an array of `Vec<usize>` on the stack in one operation… Normally to initialize a `[T; n]` array it’d use a `memcpy` to create `n` copies of a valid bit pattern of `T`, but because `Vec<usize>` isn’t `Copy`, that `memcpy` operation wouldn’t be sound. That means you’ve got to initialize your array of items one-by-one which isn’t really a safe operation to do (i.e. safe Rust assumes all variables contain a valid copy of the type they contain, but if only half your array is initialized then the other half isn’t, and that assumption is invalid).

Another answer is to use `std::mem::MaybeUninit` and some `unsafe` code. This is my attempt (playground) at initializing an array using non-`Copy` items.
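A minimal sketch of the `MaybeUninit` approach. This is not the playground code mentioned above, only the element-by-element pattern described in the standard library documentation for `MaybeUninit`, and it assumes a toolchain where `MaybeUninit::write` is available (Rust 1.55 or newer, as far as I know). The name `make_dict` is made up for illustration.

```rust
use std::mem::{self, MaybeUninit};

/// Hypothetical helper: builds `[Vec<usize>; 26]` element by element.
fn make_dict() -> [Vec<usize>; 26] {
    // An array of `MaybeUninit` slots needs no initialization itself,
    // so calling `assume_init` on the outer `MaybeUninit` is sound here.
    let mut slots: [MaybeUninit<Vec<usize>>; 26] =
        unsafe { MaybeUninit::uninit().assume_init() };

    for slot in slots.iter_mut() {
        // `write` stores the value without dropping the old (uninitialized) contents.
        slot.write(Vec::new());
    }

    // Every slot is initialized now; `MaybeUninit<T>` has the same layout as `T`.
    unsafe { mem::transmute::<_, [Vec<usize>; 26]>(slots) }
}
```

Note that if the element constructor could panic halfway through, the elements written so far would be leaked rather than dropped. That is not UB, but it is something a more careful version would handle.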

I know @nikomatsakis is coordinating the unsafe code workgroup, so hopefully he may know a better way of doing this or be able to point you in the right direction, because my answer using `unsafe` code kinda sucks…

There’s the array-init crate.

Use `mem::uninitialized` and then properly fill in the array.

Sadly, there is no good way to do it otherwise, other than writing a macro.

That’s what I was getting at with `MaybeUninit`. From what I’ve heard `mem::uninitialized()` is insta-UB

It’s enlightening!
Although I can’t use a third-party library right now.
Thanks anyway.
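For what it’s worth, the standard library alone may be enough here, with no `unsafe` and no third-party crate. The sketch below makes two assumptions: the first function needs Rust 1.63 or newer (where `std::array::from_fn` is stable), and the second relies on the standard `Default` impl for arrays, which only covers lengths up to 32. The function names are made up for illustration.

```rust
/// Assumes Rust 1.63 or newer, where `std::array::from_fn` is stable.
fn dict_from_fn() -> [Vec<usize>; 26] {
    // The closure receives each index; it is ignored here.
    std::array::from_fn(|_| Vec::new())
}

/// Relies on the standard library's `Default` impl for arrays,
/// which covers lengths up to 32 when the element type is `Default`.
fn dict_default() -> [Vec<usize>; 26] {
    Default::default()
}
```

`from_fn` is the more general of the two, since the closure can compute a different value for each index.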

The `::array-init` crate is not sound, since it relies on a generic and unbounded `mem::uninitialized`.

Now that `MaybeUninit` is stable, working with generic uninit buffers can finally be done soundly (although it still requires great care).

See the following stand-alone solution:

``````
macro_rules! array {(
    $closure:expr; $N:expr
) => ({
    use ::core::{
        mem::{
            forget,
            MaybeUninit,
        },
        ptr,
        slice,
    };

    const N: usize = $N;

    #[inline(always)]
    fn gen_array<T> (mut closure: impl FnMut(usize) -> T) -> [T; N]
    {
        unsafe {
            let mut array = MaybeUninit::<[T; N]>::uninit();

            // Tracks the initialized prefix so that it gets dropped if `closure` panics.
            struct PartialRawSlice<T> {
                ptr: *mut T,
                len: usize,
            }

            impl<T> Drop for PartialRawSlice<T> {
                fn drop (self: &'_ mut Self)
                {
                    unsafe {
                        ptr::drop_in_place(
                            slice::from_raw_parts_mut(
                                self.ptr,
                                self.len,
                            )
                        )
                    }
                }
            }

            let mut raw_slice = PartialRawSlice {
                ptr: array.as_mut_ptr() as *mut T,
                len: 0,
            };

            (0 .. N).for_each(|i| {
                // Write the i-th element in place, then record it as initialized.
                ptr::write(raw_slice.ptr.add(i), closure(i));
                raw_slice.len += 1;
            });

            // All N elements are initialized: disarm the guard and hand out the array.
            forget(raw_slice);
            array.assume_init()
        }
    }

    gen_array($closure)
})}

fn main ()
{
    // init by providing a FnMut closure mapping each index to each value
    let dict: [Vec<usize>; 26] = array![|idx| vec![]; 26];
    dbg!(&dict[..]);
}
``````

AFAIK, as long as you correctly use `ptr::write` to write the initial value, it is safe. The same goes for `MaybeUninit`.
It is just that `MaybeUninit` makes a clear distinction between an uninitialized value and a regular one, as you can use `assume_init` to retrieve the initialized `T`.

But both require careful writes, as you must not replace (assign over) a value inside uninitialized memory (i.e. you have to write through pointers).
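To illustrate the "write through pointers" point, here is a small sketch using `MaybeUninit<String>`. The helper name `init_string` is made up for illustration; the point is only the difference between `write` and a plain assignment.

```rust
use std::mem::MaybeUninit;

/// Hypothetical helper illustrating the "write through a pointer" rule.
fn init_string() -> String {
    let mut slot: MaybeUninit<String> = MaybeUninit::uninit();
    unsafe {
        // Correct: `write` never reads or drops the old (uninitialized) bytes.
        slot.as_mut_ptr().write(String::from("hello"));

        // Incorrect alternative (do not do this on uninitialized memory):
        //     *slot.as_mut_ptr() = String::from("hello");
        // A plain assignment first drops the "previous" `String`, which is garbage here.

        slot.assume_init()
    }
}
```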

`mem::uninitialized` will in fact be insta-UB, if the type in question has incorrect values, regardless of how carefully you’re initializing this memory afterwards.

Insta UB! No matter what. That’s the reason it has been deprecated in favor of `MaybeUninit`.

Then initialize it with a correct value; I don’t understand the problem.

You can wrongly initialize even `MaybeUninit` and then just assume it is correct, and get the same UB.
So to me it is the same shit, you need to write raw pointers to initialize such memory.

It is only a question of safer semantics with `MaybeUninit`; its nature is the same.

P.S. I’m coming from a C background, so don’t scare me with non-issue UB.

Compared to C, Rust imposes much stricter requirements for the sake of aggressive optimization. In a safe context the compiler proves them automatically, but in an unsafe context it’s your responsibility to uphold all of them. Unsafe Rust is more unsafe than C.

https://www.ralfj.de/blog/2019/07/14/uninit.html

`@RalfJung` posted a very neat article about this. Maybe give it a try

Look, pal, I started with C and C++, so I’m pretty well aware of what the risks of uninitialized memory are, especially in C++ with its object model.
So you don’t need to explain it to me. I just want people to stop disregarding any solution that has `unsafe` or UB.

P.S. Especially when there is no safe option.

“pal”…

You can’t tell a person new to Rust: “Don’t worry about unsafe, just use this `mem::uninitialized` and you’ll be fine”. That’s not how it works. That’s not how it ever should work!
Sure, you might have a C background, but OP doesn’t. He may now think: unsafe? Seems like it is the solution to all my problems! Let’s get started!
Hell no! `unsafe` should only be the solution, if you know how to deal with a summoned daemon.
Also: always disregard a solution that has UB, because it is not defined behavior. Look at the very first example in Ralf’s article:

``````
use std::mem;

fn always_returns_true(x: u8) -> bool {
    x < 150 || x > 120
}

fn main() {
    let x: u8 = unsafe { mem::uninitialized() };
    assert!(always_returns_true(x));
}
``````

this always returns false no matter what. You can’t explain this to a person new to rust. (This returns false every time in this specific environment with the exact same compiler). This is UB. Period.
When you are using `unsafe` you have to make certain guarantees to the compiler, and if you fail to do so, it is UB. There is no such thing as: let’s take this UB and use it to our advantage. That’s not how it works (not in C either).

Between two solutions, which are both `unsafe`, but:

• the first can trigger insta-UB in several cases, and it’s not entirely clear when it will and when it won’t,
• the second can be proved (not by compiler, but by programmer) not to trigger UB at all,

I think the second is clearly preferable, and that’s the point of insisting on `MaybeUninit` over `mem::uninitialized`.

The problem is that `Vec` implements `Drop`, so if the program is stopped for whatever reason (multithreading) while the array still holds uninitialized `Vec`s, dropping them puts your program in a UB state. `MaybeUninit` is a tool that prevents the `Drop` implementation from being called. That’s why it’s much better and not UB.
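A small sketch of that `Drop` behaviour. The `Noisy` type and the `count_drops` function are hypothetical names invented for this example; they only count how often the destructor runs.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;

/// Hypothetical type that counts how many times it is dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns (drops seen for a plain value, drops seen for a `MaybeUninit` wrapper).
fn count_drops() -> (u32, u32) {
    let plain = Cell::new(0);
    {
        let _value = Noisy(&plain);
    } // `_value` goes out of scope: `Drop` runs once.

    let wrapped = Cell::new(0);
    {
        let _slot = MaybeUninit::new(Noisy(&wrapped));
    } // `MaybeUninit` never drops its contents: `Drop` does not run (the value is leaked).

    (plain.get(), wrapped.get())
}
```

The flip side is that once the contents really are initialized, it is up to you to call `assume_init` (or drop them in place), otherwise they leak.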

Sure, you might have a C background, but OP doesn’t. He may now think: unsafe? Seems like it is the solution to all my problems! Let’s get started!

People learn from their mistakes; eventually we will all need to use unsafe code.

Hell no! `unsafe` should only be the solution, if you know how to deal with a summoned daemon

If that’s the case, then Rust’s unsafe is one big flaw and not a feature.
But that’s false, and you know it.
So suggesting unsafe is fine.

this always returns false no matter what. You can’t explain this to a person new to rust.

Not true?
UB means it can be false or true.
But that’s not actually the problem, because the code is perfectly safe with an uninitialized integer, as it doesn’t have a destructor.

@Cerberuser

Between two solutions, which are both `unsafe` , but:

If you mean between `MaybeUninit` then yes, of course.

It has better semantics to avoid obvious pitfalls.

@Stargateur

Sure, but you can properly initialize it and there will be no problem.
`MaybeUninit` is not good for use as a stored type, since it is needed only for one-time initialization.
So the use case would be to initialize a `MaybeUninit<[T; N]>` and then call `assume_init` to get a proper array.

UB means that there has been a contract violation with the compiler; there is no such thing as "non-issue UB". Maybe some compiler does not exploit something resulting in an implementation-based platform-based definition of behavior. Meaning that you would have a "safe" crate for a specific version of `rustc` and a specific architecture. In other words: you might as well just be sharing a binary release of the program.

Taking the uninitialized integer example, maybe with some version of the compiler and on some architecture the function always returns `true`, because the compiler does not exploit uninitialized integers even though it could. Later on, a new version of the compiler realises it can exploit them for more efficient binaries, resulting in the `always_returns_true` function breaking. Whose fault is that? The programmer's.

Now, coming back to `mem::uninitialized`, when the type is inhabited, using only `ptr::write` on it could be seen as fine (e.g., for integer types this is being deliberated), but there are cases when even this is clearly not fine.

Generic `mem::uninitialized<T>` is unsound

Take, for instance, `::array-init`, with a generic (over `Array` and its `Array::Item`) usage of `mem::uninitialized`:

• EDIT: this comment targeted the version of `::array-init` as of its writing: `0.0.4`.
`::array-init` has since been patched to correctly use `MaybeUninit`
``````
pub
fn array_init<Array, F> (mut initializer: F) -> Array
where
    Array : IsArray,
    F : FnMut(usize) -> Array::Item,
{
    let mut ret: NoDrop<Array> = NoDrop::new(unsafe { mem::uninitialized() });
    // <At this point Rust knows that **we have elements of type Array::Item**>
    for i in 0 .. Array::len() {
        Array::set(&mut ret, i, initializer(i));
    }
    ret.into_inner()
}
``````
• Knowing that in Rust it is perfectly valid to define:

``````
enum Uninhabited {}
``````

then there is literally no value that can be of type `Uninhabited` (this is not something you can know coming from a C background, since C does not have uninhabited types).
Meaning that if some code were to witness such an element, then that code cannot possibly be reached. So, if reaching that branch was based on some condition, then Rust is allowed to skip checking the condition altogether.

But the code shown above is able to create values of type `Uninhabited`:

``````
enum Uninhabited {}

fn trust_me_this_cannot_be_false (condition: bool)
{
    if !condition {
        let unreachable: [Uninhabited; 1] = ::array_init::
            array_init(|_| -> Uninhabited {
                loop {} // an infinite loop typechecks with everything
            })
        ;
    }
}
``````

If the above function is given a `false` condition, you may think that it may loop indefinitely. But it so happens that Rust is allowed to instead assume that the condition is never `false`, without even checking it (c.f., the code of `array-init`: the closure is called after having created uninitialized inhabitants, i.e, too late).

And now we can have memory unsafety:

``````
fn main ()
{
    let slice: &mut [u8] = &mut [];
    trust_me_this_cannot_be_false(slice.len() == usize::MAX);
    for i in 0 .. usize::MAX {
        // Since slice.len() == usize::MAX, and i < usize::MAX, bound checking can be skipped
        slice[i] = 0x42; // memory corruption
    }
}
``````

I am not saying that the above program will always corrupt the memory (it could loop indefinitely, abort, or whatever), I am just saying that it would be legal for the compiler to corrupt the memory with it.

All this just because `mem::uninitialized` was used on a generic type. I wouldn't call this "non-issue UB"

The difference with `MaybeUninit`, by the way, is that `MaybeUninit<Uninhabited>` is inhabited.
Only when calling `assume_init`, after the closure is called, would we have unreachable code.
Which is fine, since a closure forging an element of an uninhabited type cannot possibly return (it must `loop {}` indefinitely, or die / end the thread of execution (e.g., `abort`ing)).
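A tiny sketch of that last point. The helper names `make_slot` and `slot_size` are made up for illustration; the example only shows that creating the wrapper is allowed, and it never calls `assume_init`.

```rust
use std::mem::{size_of, MaybeUninit};

/// An enum with no variants: no value of this type can exist.
enum Uninhabited {}

/// Creating the wrapper is fine, because `MaybeUninit<Uninhabited>` has a value
/// (the "uninitialized" one) even though `Uninhabited` has none.
fn make_slot() -> MaybeUninit<Uninhabited> {
    MaybeUninit::uninit()
    // Calling `.assume_init()` on the result would be undefined behavior.
}

/// Size of the wrapper in bytes; there is nothing to store.
fn slot_size() -> usize {
    size_of::<MaybeUninit<Uninhabited>>()
}
```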

I would like to point out the obvious (I don’t mean to offend, it just seemed to get lost in this discussion). Unsafe Rust is not the same as C. So even if some things look similar (uninitialized memory, pointers), that doesn’t mean that what applies to C also applies to Rust. Rust makes more guarantees about its types to allow more aggressive optimizations. Since you seem to be new to Rust, you will need to learn some more about Rust to understand the differences from C and how they help. For example, how uninhabited types interact with control flow, as @Yandros pointed out.
