Is there a way to copy &[T] into &mut [MaybeUninit<T>] without unsafe?

mgeier · November 10, 2020, 8:50pm

There is copy_from_slice() to copy all elements of a slice into another slice, which panics if the two slices have different lengths.

Is there a similar thing to copy an initialized slice (i.e. &[T]) into an uninitialized slice (i.e. &mut [MaybeUninit<T>])?

In addition to copying the elements (and panicking on different lengths), this should also do something like assume_init() and finally return an initialized slice (i.e. &mut [T]).

All this assuming T: Copy of course.

I think I know how to do that with unsafe code, but I would feel much more comfortable with a safe wrapper.
I think that should be possible, or am I missing something?

I had a look at the docs but I didn't find anything.
The closest I could find was the nightly feature maybe_uninit_slice with slice_assume_init_mut(), which would probably take care of the last step, but I couldn't find the combined functionality in a safe package.

alice · November 10, 2020, 9:27pm

The actual copying is easy, but I don't know of a safe way to get a non-MaybeUninit slice of the copied data afterwards.

scottmcm · November 11, 2020, 12:15am

The problem is that there's fundamentally no way for the compiler to know that a function that was called actually initialized all the MaybeUninits just because it was given a &mut [MaybeUninit<T>]. There has to be something using unsafe to make the "yes, they're definitely all initialized now" promise.

Can you say more about the context in which you're doing this? There are a bunch of things that internally do this, like Vec::extend_from_slice, so there might be a way to avoid the &mut [MaybeUninit<T>] in the first place...
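
For illustration, if the destination can be a Vec<T>, the uninitialized slice never has to appear in user code. A minimal sketch, assuming a made-up helper name (`buffer_from_slice`) that is not part of any library:

```rust
/// Builds a buffer holding a copy of `src`, with room for at least
/// `capacity` elements. Hypothetical helper, named for this example only.
pub fn buffer_from_slice<T: Copy>(src: &[T], capacity: usize) -> Vec<T> {
    // The spare capacity is uninitialized memory managed by `Vec`.
    let mut buf = Vec::with_capacity(capacity.max(src.len()));
    // `extend_from_slice` writes into that memory and bumps the length,
    // so callers only ever see initialized elements.
    buf.extend_from_slice(src);
    buf
}
```

Whether this fits depends on who owns the storage; it does not help when the `&mut [MaybeUninit<T>]` is handed out by someone else's API.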

mbrubeck · November 11, 2020, 12:42am

I think this would likely be a useful safe function to add to the standard library. For now, as you said, you can make your own safe wrapper around the unsafe functions:

#![feature(maybe_uninit_slice)]

use std::{mem::MaybeUninit, ptr};

pub fn write_from_slice<'a, T: Copy>(src: &[T], dst: &'a mut [MaybeUninit<T>]) -> &'a mut [T] {
    assert_eq!(src.len(), dst.len());
    unsafe {
        ptr::copy_nonoverlapping(src.as_ptr(), MaybeUninit::slice_as_mut_ptr(dst), src.len());
        MaybeUninit::slice_assume_init_mut(dst)
    }
}

(On stable Rust, you'd need to add some raw pointer casts and slice::from_raw_parts_mut.)
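
A rough sketch of what that stable variant could look like, using only a pointer cast and slice::from_raw_parts_mut. It keeps the same signature as the nightly version above; treat it as an illustration rather than a reviewed implementation:

```rust
use std::{mem::MaybeUninit, ptr, slice};

/// Copies `src` into `dst` and returns `dst` as an initialized slice.
/// Panics if the two slices have different lengths.
pub fn write_from_slice<'a, T: Copy>(src: &[T], dst: &'a mut [MaybeUninit<T>]) -> &'a mut [T] {
    assert_eq!(src.len(), dst.len());
    // `MaybeUninit<T>` has the same layout as `T`, so the cast is valid.
    let dst_ptr = dst.as_mut_ptr() as *mut T;
    // SAFETY: both pointers are valid for `src.len()` elements, and the
    // regions cannot overlap because `dst` is an exclusive borrow.
    unsafe {
        ptr::copy_nonoverlapping(src.as_ptr(), dst_ptr, src.len());
        // Every element of `dst` has just been written, so it is sound
        // to hand it out as `&mut [T]` for the original lifetime.
        slice::from_raw_parts_mut(dst_ptr, src.len())
    }
}
```

The returned slice borrows from `dst`, so the uninitialized view cannot be used again while the initialized one is alive.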

Yandros · November 11, 2020, 1:36am

https://docs.rs/uninit/0.4.0/uninit/out_ref/struct.Out.html#method.copy_from_slice:

pub fn copy_from_slice(
    mut self: Out<'out, [T]>,
    source_slice: &'_ [T],
) -> &'out mut [T]
where
    T: Copy,

Initialize the buffer with a copy from another (already initialized) buffer.

It returns a read-writable slice to the initialized bytes for convenience (automatically assume_init-ed).

Panic

The function panics if the slices' lengths are not equal.

Guarantees (that unsafe code may rely on)

A non-panic!king return from this function guarantees that the input
slice has been (successfully) initialized, and that it is thus then
sound to .assume_init().

It also guarantees that the returned slice does correspond to the input
slice (e.g., for ReadIntoUninit's safety guarantees).

Example
use ::uninit::prelude::*;

let mut array = uninit_array![_; 13];
assert_eq!(
    array.as_out().copy_from_slice(b"Hello, World!"),
    b"Hello, World!",
);
// we can thus soundly `assume_init` our array:
let array = unsafe {
    mem::transmute::<
        [MaybeUninit<u8>; 13],
        [            u8 ; 13],
    >(array)
};
assert_eq!(
    array,
    *b"Hello, World!",
);

mgeier · November 11, 2020, 10:17am

Thanks for all the answers!

Well, I was thinking that by copying a same-sized slice of initialized values into an uninitialized slice, we can guarantee that all items will be initialized afterwards (if the sizes don't match, there's a panic and we can of course make no such guarantee).

And @mbrubeck has actually shown a concrete implementation and confirmed what I was suspecting. And @Yandros has linked to another implementation.

I'm working on a wait-free ring buffer (mgeier/rtrb on GitHub, a realtime-safe single-producer single-consumer (SPSC) ring buffer, not yet published on crates.io) which initially has a fixed amount of uninitialized storage. I provide a safe function for writing to that storage, which first uses Default::default() to initialize the items and then provides a &mut [T] for writing. However, for improved performance, I provide an additional function that returns a &mut [MaybeUninit<T>]. If users want to copy an already existing (and initialized) slice of data into that slice, why should they pay the performance cost of Default initialization when they will overwrite every single item anyway?

Right now, I provide them an efficient way to copy their data, but I would also really like to provide them a safe way to do so. And maybe even a convenient way!

Yes, please!
Would you like to propose this?
How can I help?

Yes, I was thinking of something like that, but I couldn't think of a good name nor a good function signature. You have provided both, that's great!

I was thinking about adding a method to either uninitialized or initialized slices (which may not even be technically possible):

let init_slice = uninit_slice.copy_from(other_slice);
let init_slice = other_slice.copy_to(uninit_slice);

But it seemed strange to return something from such a method. I think your signature looks better:

let init_slice = std::slice::write_from_slice(other_slice, uninit_slice);

Thanks for the link to the uninit crate!

It looks like it does what I want. I'm just not quite sure how to use it with &mut [MaybeUninit<T>]. Is the following correct?

uninit::out_ref::Out::from(uninit_slice).copy_from_slice(init_slice);

The whole Out thing seems a bit clumsy, but maybe there is a way to do this without mentioning Out?

2e71828 · November 11, 2020, 10:55am

Your assertion is stricter than it needs to be. I modified your code to allow a smaller src than dst:

#![feature(maybe_uninit_slice)]

use std::{mem::MaybeUninit, ptr};

pub fn write_from_slice<'a, T: Copy>(
    src: &[T],
    dst: &'a mut [MaybeUninit<T>],
) -> (&'a mut [T], &'a mut [MaybeUninit<T>]) {
    assert!(src.len() <= dst.len(), "Not enough space in dst");
    let (buf, extra) = dst.split_at_mut(src.len());
    (
        unsafe {
            ptr::copy_nonoverlapping(src.as_ptr(), MaybeUninit::slice_as_mut_ptr(buf), src.len());
            MaybeUninit::slice_assume_init_mut(buf)
        },
        extra,
    )
}

alice · November 11, 2020, 11:05am

I'd like to point out that the copy itself can be written in safe code, since it appears that nobody has pointed it out yet.

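use std::mem::MaybeUninit;

fn copy_data<T: Copy>(from: &[T], to: &mut [MaybeUninit<T>]) {
    // Panic on a length mismatch, like the stdlib's copy_from_slice().
    assert_eq!(from.len(), to.len());

    for (from, to) in from.iter().zip(to) {
        // Plain assignment is fine: MaybeUninit<T> never runs a destructor.
        *to = MaybeUninit::new(*from);
    }
}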
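
To show where the remaining unsafe ends up, here is a sketch that uses the same safe loop to fill the spare capacity of a Vec. The function name `append_via_spare_capacity` is invented for this example; Vec::spare_capacity_mut and Vec::set_len are the standard library methods:

```rust
use std::mem::MaybeUninit;

/// The safe element-by-element copy from the post above.
fn copy_data<T: Copy>(from: &[T], to: &mut [MaybeUninit<T>]) {
    assert_eq!(from.len(), to.len());
    for (from, to) in from.iter().zip(to) {
        *to = MaybeUninit::new(*from);
    }
}

/// Appends `src` to `vec` by writing into its spare capacity.
/// Hypothetical helper, named for this example only.
pub fn append_via_spare_capacity<T: Copy>(vec: &mut Vec<T>, src: &[T]) {
    vec.reserve(src.len());
    let old_len = vec.len();
    // Safe part: the spare capacity is exposed as `&mut [MaybeUninit<T>]`.
    copy_data(src, &mut vec.spare_capacity_mut()[..src.len()]);
    // SAFETY: capacity was reserved above and the first `src.len()` spare
    // elements have just been initialized by `copy_data`.
    unsafe { vec.set_len(old_len + src.len()) };
}
```

The copy needs no unsafe; only the final "these are initialized now" promise (`set_len` here) does.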

Yandros · November 11, 2020, 12:59pm

I think @mbrubeck followed the convention from .copy_from_slice in the stdlib, which panics "overzealously" to make sure no silent "truncation" happens.

mgeier:

It looks like it does what I want. I'm just not quite sure how to use it with &mut [MaybeUninit<T>] . Is the following correct?
uninit::out_ref::Out::from(uninit_slice).copy_from_slice(init_slice);
The whole Out thing seems a bit clumsy, but maybe there is a way to do this without mentioning Out ?

https://docs.rs/uninit/0.4.0/uninit/out_ref/struct.Out.html#method.copy_from_slice:

Example

use ::uninit::prelude::*;

let mut array = uninit_array![_; 13];
assert_eq!(
    array.as_out().copy_from_slice(b"Hello, World!"),
    b"Hello, World!",
);

As you can see, there is an .as_out() helper.

And here comes the very important part, and the reason why, if such helper functions were to be added to the stdlib, I think they should be added with &out references instead:

Casting &mut [T] to &mut [MU<T>] is unsafe / unsound / can allow non-unsafe code to trigger UB

This makes &mut [MU<T>] and its variations (&mut MU<[T]>, &mut MU<T>) far less useful when exposed in non-unsafe APIs (the main objective): we'd like &mut [T] and &mut [MU<T>] to "behave the same", as an out reference, but the former does not allow uninit-writes (they're UB) whereas the latter does.

Hence the &out _ / Out<_> abstraction, which offers the same capabilities as a &mut MU<_>, except for the ability to perform uninit-writes. And the only "cost", from a library perspective, is to use that .as_out() adapter (if the language were to support &out references, then an implicit coercion could even take place).
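
The hazard can be sketched in a few lines. The names `as_uninit_mut` and `clobber` are invented for this illustration; the cast is deliberately marked `unsafe fn`, because the entirely safe `clobber` could otherwise de-initialize live data:

```rust
use std::mem::MaybeUninit;

/// Reinterprets an initialized slice as a possibly-uninitialized one.
///
/// # Safety
/// The caller must make sure every element is initialized again before
/// the original `&mut [T]` is used. This requirement is the reason the
/// cast cannot be offered as a safe function.
pub unsafe fn as_uninit_mut<T>(slice: &mut [T]) -> &mut [MaybeUninit<T>] {
    // SAFETY: `MaybeUninit<T>` has the same layout as `T`; upholding the
    // initialization requirement is delegated to the caller.
    unsafe { &mut *(slice as *mut [T] as *mut [MaybeUninit<T>]) }
}

/// Entirely safe code that de-initializes whatever it is given.
pub fn clobber<T>(out: &mut [MaybeUninit<T>]) {
    for slot in out {
        *slot = MaybeUninit::uninit();
    }
}
```

If `as_uninit_mut` were a safe function, `clobber(as_uninit_mut(&mut some_vec))` would compile without any `unsafe` and leave `some_vec` holding uninitialized elements.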

w.r.t. the OP about a non-unsafe way to do this, indeed (barring the "assume_init" part which does need the programmer to ask the compiler to trust them and thus requires unsafe), the write-only part does not need unsafe code, as the simple loop from @alice showed, or as my own code from ::uninit showcases (it uses the stdlib .copy_from_slice! ).

To be completely honest, in order to feature a bulk copy rather than a loop-ed one, a from_ref cast is used, which did require a bit of unsafe, in order to perform a cast that is currently not featured, that of &T to &MU<T>.

Although that may look like a cast that could be blessed by the standard library, such (generic) cast can technically suffer from the same issues of the &mut T -> &mut MU<T> cast, unless the language team decides to commit to MU and UnsafeCell not commuting, which sadly they are not doing, even though the "safe" equivalent of MU, i.e., Option, does clearly not commute with UnsafeCell
- Luckily, there is no non-unsafe way to make these wrappers "commute" yet, so my crate has made the opinionated choice to blame that commute, in case of triggered UB.
- The other option would be for the language to expose their Freeze internal auto-trait, and add a T : Freeze bound to that cast (cc @RalfJung which approach do you think is the best one, here?)

scottmcm · November 11, 2020, 1:56pm

That reminds me of a recent RFC looking at a similar problem with the Read trait. Maybe its ReadBuf would be helpful to you? (Or maybe something like it, since it looks like it might be u8-only.):

github.com

rust-lang/rfcs/blob/master/text/2930-read-buf.md#summary

- Feature Name: read_buf
- Start Date: 2020/05/18
- RFC PR: [rust-lang/rfcs#2930](https://github.com/rust-lang/rfcs/pull/2930)
- Rust Issue: [rust-lang/rust#78485](https://github.com/rust-lang/rust/issues/78485)

# Summary
[summary]: #summary

The current design of the `Read` trait is nonoptimal as it requires that the buffer passed to its various methods be
pre-initialized even though the contents will be immediately overwritten. This RFC proposes an interface to allow
implementors and consumers of `Read` types to robustly and soundly work with uninitialized buffers.

# Motivation
[motivation]: #motivation

## Background
[motivation-background]: #motivation-background

The core of the `Read` trait looks like this:

This file has been truncated.
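
As a rough idea of what "something like it" could mean for a generic T, here is a sketch of a wrapper that tracks how much of a borrowed buffer has been filled. `FillBuf` and its methods are hypothetical and much simpler than the RFC's ReadBuf; this is not the RFC's API:

```rust
use std::mem::MaybeUninit;
use std::slice;

/// Hypothetical `ReadBuf`-like wrapper: a borrowed, possibly uninitialized
/// buffer plus a count of elements that are known to be initialized.
pub struct FillBuf<'a, T> {
    buf: &'a mut [MaybeUninit<T>],
    filled: usize,
}

impl<'a, T: Copy> FillBuf<'a, T> {
    pub fn new(buf: &'a mut [MaybeUninit<T>]) -> Self {
        FillBuf { buf, filled: 0 }
    }

    /// Appends `src` after the filled part; panics if it does not fit.
    pub fn append(&mut self, src: &[T]) {
        let dst = &mut self.buf[self.filled..][..src.len()];
        for (from, to) in src.iter().zip(dst) {
            *to = MaybeUninit::new(*from);
        }
        self.filled += src.len();
    }

    /// Returns the initialized prefix of the buffer.
    pub fn filled(&self) -> &[T] {
        // SAFETY: the first `self.filled` elements were written by `append`,
        // the field is private, and `MaybeUninit<T>` has the layout of `T`.
        unsafe { slice::from_raw_parts(self.buf.as_ptr() as *const T, self.filled) }
    }
}
```

The single `unsafe` block lives inside the wrapper, and its invariant (the first `filled` elements are initialized) is upheld by keeping both fields private.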

mgeier · November 11, 2020, 3:19pm

Thanks, that's good to know, I wasn't aware of this!

So this is "safe", it may or may not be as "efficient", but it's definitely not "convenient" enough for me to recommend it to users of my API.
I would still like to have an easier way, e.g. by calling a single function/method.

OK, thanks, so I think this would change my little example to:

let init_slice = uninit_slice.as_out().copy_from_slice(init_slice);

This is still too complicated for my taste.

I think I understand the problem now, thanks to the linked example, but I don't think it applies in my case. And I don't think I actually need something like the Out abstraction.

You are talking about what happens when I get a &mut [T] from somewhere out of my control and transmute() it willy-nilly to a &mut [MaybeUninit<T>].
However, I never get a &mut [T] and I completely control all my internal storage.
When I return a &mut [MaybeUninit<T>] from my API, I know that it's not referencing memory where a "living" object is stored. It may or may not be initialized, but if it's initialized, it just contains garbage values that nobody else has a reference to. Therefore, users are free to assign MaybeUninit::uninit() as much as they want. They will only ever cause their previously written objects to leak, but I think they will not be able to cause the above-mentioned problem.

In the meantime I've come up with a possible solution in my concrete case:

This will allow users to write:

other_slice.copy_to_uninit(uninit_slice);

... where uninit_slice is the &mut [MaybeUninit<T>] they got from my API and other_slice is a "normal" &[T] from wherever they want.

Without further steps this would still leak their objects, and they indeed have to call a separate unsafe function (provided by my API) that will mark their initialized items as "initialized", which will lead to them being dropped properly at a later time.

While writing this comment I've already found a problem with my initial implementation and I have removed the return value.

I think that's the problem with my original question: I was asking for returning a &mut [T] after copying, but I now think that's bad! I only actually need the copying to happen, I don't really need the return value. I was just asking for it "for completeness", because it looked like a nicely symmetric thing to do. It looked harmless, but now I think it isn't!

I don't think it's officially unsound (but I'm of course not sure), but it could lead to leaks.
I think this also applies to any function that may be added to the standard library: It should not return a &mut [T]!

When writing to &mut [MaybeUninit<T>], the contract is that the written objects may or may not be dropped at some point. Further guarantees might be given by the library implementer, but not by the language itself.
However, when assigning to a &mut [T], the contract is clearly that the previous inhabitant will be dropped immediately and the new object is guaranteed to be dropped at some later point. AFAIK, that's guaranteed by safe Rust.

The suggested function write_from_slice() cannot guarantee that, therefore it shouldn't return a &mut [T]!

The copying should still be fine, AFAICT, and it would still be a very useful function!

Thanks, I've seen that. I think that's indeed a very similar situation, but as you say, it's u8-only and I would like to be able to use a more generic T.

alice · November 11, 2020, 4:43pm

Copy types can't have destructors, so you can't actually leak anything unless you start allowing Clone.

mgeier · November 11, 2020, 5:22pm

Oh right, I forgot we are talking about copying ...

But what I said above would be true if we were allowing Clone. I could theoretically implement an extension method like this:

other_slice.clone_to_uninit(uninit_slice);

Returning a &mut [T] from this method would be problematic, as described above.
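
The drop behavior in question can be observed with Rc, whose strong count reveals whether a clone was ever dropped. This is only a sketch: `clone_to_uninit` is written here as a free function (not the extension method), and `strong_counts` is an invented demo helper:

```rust
use std::mem::MaybeUninit;
use std::rc::Rc;

/// Clones every element of `src` into the matching slot of `dst`.
/// Hypothetical helper; panics on a length mismatch.
pub fn clone_to_uninit<T: Clone>(src: &[T], dst: &mut [MaybeUninit<T>]) {
    assert_eq!(src.len(), dst.len());
    for (from, to) in src.iter().zip(dst) {
        to.write(from.clone());
    }
}

/// Returns the strong count after (a) cloning and dropping explicitly and
/// (b) cloning and simply letting the buffer go out of scope.
pub fn strong_counts() -> (usize, usize) {
    let shared = Rc::new(0u8);
    let src = [Rc::clone(&shared), Rc::clone(&shared)]; // strong count: 3

    let mut buf: [MaybeUninit<Rc<u8>>; 2] = [MaybeUninit::uninit(), MaybeUninit::uninit()];
    clone_to_uninit(&src, &mut buf); // strong count: 5
    for slot in &mut buf {
        // SAFETY: every slot was initialized by `clone_to_uninit` above.
        unsafe { slot.assume_init_drop() };
    }
    let after_explicit_drop = Rc::strong_count(&shared); // back to 3

    {
        let mut buf: [MaybeUninit<Rc<u8>>; 2] = [MaybeUninit::uninit(), MaybeUninit::uninit()];
        clone_to_uninit(&src, &mut buf); // strong count: 5
    } // `buf` is gone, but the two clones inside it were never dropped
    let after_scope_exit = Rc::strong_count(&shared); // still 5

    (after_explicit_drop, after_scope_exit)
}
```

Values written into MaybeUninit slots are only dropped if someone explicitly takes responsibility for them, which is the leak scenario described above.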

Aaaanyway, I found yet another "safe" way to achieve the copying/cloning. If we had a way to cast a &[T] into a &[MaybeUninit<T>], something like:

use std::mem::{transmute, MaybeUninit};

fn uninit_slice<T>(s: &[T]) -> &[MaybeUninit<T>] {
    unsafe { transmute(s) }
}

... we could just use the already existing method copy_from_slice(): Rust Playground.

The same would also work with the existing method clone_from_slice().

Would this make sense?

It looks a bit strange to do this:

dst.copy_from_slice(uninit_slice(&src));

Could the argument be somehow automatically coerced from a &[T] into a &[MaybeUninit<T>]?

Or could there be an impl From<&[T]> for &[MaybeUninit<T>]?
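
The Playground code is not reproduced here, so the following is a self-contained sketch of the same idea. It uses a pointer cast instead of transmute and adds a `T: Copy` bound that the snippet above does not have; `copy_via_cast` is a name made up for this example:

```rust
use std::mem::MaybeUninit;

/// Views an initialized slice as a read-only slice of `MaybeUninit<T>`.
/// The `T: Copy` bound is an extra restriction chosen for this sketch.
pub fn uninit_slice<T: Copy>(s: &[T]) -> &[MaybeUninit<T>] {
    // SAFETY: `MaybeUninit<T>` has the same layout as `T`, and the result
    // is a shared reference, so nothing can be de-initialized through it.
    unsafe { &*(s as *const [T] as *const [MaybeUninit<T>]) }
}

/// Bulk copy built on the existing `copy_from_slice()`.
/// Panics if the two slices have different lengths.
pub fn copy_via_cast<T: Copy>(src: &[T], dst: &mut [MaybeUninit<T>]) {
    // Works because `MaybeUninit<T>` is `Copy` whenever `T` is.
    dst.copy_from_slice(uninit_slice(src));
}
```

A Copy type cannot contain an UnsafeCell, so the bound keeps this sketch away from the interior-mutability question raised earlier in the thread; the fully generic cast is what remains open.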

Yandros · November 11, 2020, 5:42pm

mgeier:

Aaaanyway, I found yet another "safe" way to achieve the copying/cloning. If we had a way to cast a &[T] into a &[MaybeUninit<T>] , something like:
fn uninit_slice<T>(s: &[T]) -> &[MaybeUninit<T>] {
    unsafe { transmute(s) }
}
... we could just use the already existing method copy_from_slice() : Rust Playground.

The same would also work with the existing method clone_from_slice() .

Would this make sense?

It looks a bit strange to do this:
dst.copy_from_slice(uninit_slice(&src));
Could the argument be somehow automatically coerced from a &[T] into a &[MaybeUninit<T>] ?

Or could there be an impl From<&[T]> for &[MaybeUninit<T>] ?

RalfJung · November 11, 2020, 10:00pm

That is an interesting proposal, I have not considered this before. As you noted this requires some guarantee that &MU<T> may not be written to, regardless of the T. I think my main concern here is less whether we want that guarantee, and more how to precisely document it... this would be part of the "safety invariant for the shared typestate" of MU, and while I know pretty well how to make that safety invariant formally precise, I have no good ideas for making it precise in a way that is comprehensible without studying modern concurrent separation logics...

(I view this as orthogonal to the question of whether Stacked Borrows considers MU and UnsafeCell to commute. After all, the question here was not "does this unsafe code trigger UB", the question was about exposing &MU<T> to safe code, so at that point the safety invariant matters way more than the validity invariant.)

mgeier · November 13, 2020, 10:15am

TBH, I don't understand any of the technical mumbo-jumbo in the previous two entries, but I assume this doesn't have any influence on my current options. In a future version of the language I might have more options, though. I can't add anything to this discussion, so I'll leave that to the language design professionals.

I would like to summarize the answers, as far as I understand them, to my original question:

Is there a way to copy &[T] into &mut [MaybeUninit<T>] without unsafe?

- with the standard library: yes, but only element-by-element and IMHO in a somewhat convoluted way (see also post #8 by alice in this thread):
```
assert_eq!(other_slice.len(), uninit_slice.len());
for (from, to) in other_slice.iter().zip(uninit_slice) {
    *to = MaybeUninit::new(*from);
}
```
- with an external crate: yes, for example with uninit::Out::copy_from_slice(), which can be used like this:
```
uninit_slice.as_out().copy_from_slice(other_slice);
```
- with a hand-written (hopefully safe) wrapper, like the one I've tried myself in "Add CopyToUninit extension trait" (Pull Request #13 in mgeier/rtrb on GitHub), which can be used like this:
```
other_slice.copy_to_uninit(uninit_slice);
```
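
For readers without access to the pull request, an extension trait of this kind might look roughly as follows. This is a guess at the general shape, written with the safe loop from the first point; the actual code in the pull request may differ:

```rust
use std::mem::MaybeUninit;

/// Hypothetical extension trait; the real one in the linked pull request
/// may be implemented differently.
pub trait CopyToUninit<T> {
    /// Copies all elements into `dst`; panics on a length mismatch.
    fn copy_to_uninit(&self, dst: &mut [MaybeUninit<T>]);
}

impl<T: Copy> CopyToUninit<T> for [T] {
    fn copy_to_uninit(&self, dst: &mut [MaybeUninit<T>]) {
        assert_eq!(self.len(), dst.len(), "source and destination lengths differ");
        for (from, to) in self.iter().zip(dst) {
            *to = MaybeUninit::new(*from);
        }
    }
}
```

Because nothing is returned, this version contains no unsafe at all; marking the destination as initialized stays a separate step.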

Can someone please check whether my solution from the third point is sound?

Does anyone know any further answers?

I hope we can get a simple way to copy a whole slice at once with the standard library in some future version of Rust!

alice · November 13, 2020, 10:50am

Yes, your hand written wrapper is fine.

mgeier · December 25, 2020, 10:34am

I've just seen that the functions MaybeUninit::write_slice() and MaybeUninit::write_slice_cloned() have been added to nightly Rust in #79607 using the feature flag #![feature(maybe_uninit_write_slice)]. See also tracking issue #79995.

system · March 25, 2021, 10:34am

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.
