Is this Iterator::chunks impl unsound?

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=4a92717f4ea11aca8da9ae249db6b893

I'm aware that it doesn't run destructors when there aren't enough elements to fill the chunk, which is unwanted but technically safe.

I'm not sure about my use of the pointer since it looks like I'm dereferencing uninitalized data.
(Tried doing the same with [MaybeUninit<...>; N] but encountered some errors around transmuting of a const dependent type)

This is the unsound code.

ptr::write(&mut ((*ptr)[i]),

This code produces &mut T pointing the uninitialized memory. This is UB.

You can either entirely avoid references and use .offset(idx) instead, or utilize [MaybeUninit<T>; N] and transmute it, or stay safe with arrayvec crate.

1 Like

.offset needs *mut T, not *mut [T; N], right?
What's the best way to get the pointer to the first element, type-wise?
Just transmute?

You're definitely invoking UB with *ptr in ptr::write(&mut ((*ptr)[i]), […]).

You want to initialize each element separately. In that case, you need a *mut I::Item from *mut [I::Item; N] without creating an intermediate reference. You can simply cast the pointer with as in this case in order to get a pointer to the first element, that is also valid for writes to every other element in the uninitialized array. For pointers, instead of [i], you have add(i):

Calculates the offset from a pointer (convenience for .offset(count as isize) ).

count is in units of T; e.g., a count of 3 represents a pointer offset of 3 * size_of::<T>() bytes.

Your code would look like this:

[…]
let ptr = chunk.as_mut_ptr() as *mut Self::Item;

for i in 0..N {
    unsafe { ptr.add(i).write(self.iterator.next()?) };
}

[…]
1 Like

AFAIK, this is also a reasonable example for using the new ptr::addr_of_mut, just turning

ptr::write(&mut ((*ptr)[i]), …)

into

ptr::write(ptr::addr_of_mut!((*ptr)[i]), …)

or, using method syntax, into

ptr::addr_of_mut!((*ptr)[i]).write(…)

(This relies on the fact that …[i] indexing for arrays / slices doesn’t go through the Deref(Mut) trait but is a primitive operation.)

2 Likes

I also want to second the suggestion to use some existing safe API such as arrayvec.

fn next(&mut self) -> Option<Self::Item> {
    ArrayVec::from_iter(self.iterator.by_ref().take(N))
        .into_inner()
        .ok()
}

(playground)

It also immediately solves the dropping problem.

5 Likes

I wouldn't bet on addr_of_mut! converting the code into a sound operation in this context. There is no example of explicit dereferencing followed up by an indexing operation in the documentation. If the macro allowed the otherwise non-allowed syntax ptr[i] (no dereferencing going on here), then I'd be inclined to believe it's sound.

It might indeed not be documented anywhere, but I think that it’s sound. I know that standard library can do whatever it wants but the implementation of <[T]>::swap at least indicates that something like this should probably be sound, i.e. that primitive slice indexing (and thus probably also array indexing) is a reasonable “place” to be used in the addr_of_mut macro.

impl<T> [T] {
    /// Swaps two elements in the slice.
    ///
    /// # Arguments
    ///
    /// * a - The index of the first element
    /// * b - The index of the second element
    ///
    /// # Panics
    ///
    /// Panics if `a` or `b` are out of bounds.
    ///
    /// # Examples
    ///
    /// ```
    /// let mut v = ["a", "b", "c", "d"];
    /// v.swap(1, 3);
    /// assert!(v == ["a", "d", "c", "b"]);
    /// ```
    pub fn swap(&mut self, a: usize, b: usize) {
        // Can't take two mutable loans from one vector, so instead use raw pointers.
        let pa = ptr::addr_of_mut!(self[a]);
        let pb = ptr::addr_of_mut!(self[b]);
        // SAFETY: `pa` and `pb` have been created from safe mutable references and refer
        // to elements in the slice and therefore are guaranteed to be valid and aligned.
        // Note that accessing the elements behind `a` and `b` is checked and will
        // panic when out of bounds.
        unsafe {
            ptr::swap(pa, pb);
        }
    }
}

This case might appear significantly different because the slice is always initialized, but it does seem to rely on addr_of_mut to return a pointer based on the original &mut self reference; if intermediate references were involved then creation of pb would invalidate pa.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.