How to get value with lifetime it's referenced from?

I have a struct which references data of vertices in SoA layout:

pub struct Vertices<'src> {
    pub len: usize,
    pub u: &'src [f32],
    pub v: &'src [f32],
    pub ng_x: &'src [f32],
    pub ng_y: &'src [f32],
    pub ng_z: &'src [f32],
    pub p_x: &'src mut [f32],
    pub p_y: &'src mut [f32],
    pub p_z: &'src mut [f32],
}

And I want to create an mutable iterator to facilitate the access:

impl<'src> Vertices<'src> {
    pub fn iter_mut<'vert>(&'vert mut self) -> VerticesIterMut<'src, 'vert> {
        VerticesIterMut {
            inner: self,
            cur: 0,
        }
    }
}

pub struct VerticesIterMut<'src, 'vert> {
    inner: &'vert mut Vertices<'src>,
    cur: usize,
}

impl<'src, 'vert> Iterator for VerticesIterMut<'src, 'vert> {
    type Item = ([f32; 2], [f32; 3], [&'src mut f32; 3]);

    fn next(&mut self) -> Option<Self::Item> {
        if self.cur < self.inner.len {
            let u = [self.inner.u[self.cur], self.inner.v[self.cur]];
            let ng = [
                self.inner.ng_x[self.cur],
                self.inner.ng_y[self.cur],
                self.inner.ng_z[self.cur],
            ];
            // problem here, these three values will have lifetime from `&mut self` but not `src`
            let p = [
                &mut self.inner.p_x[self.cur],
                &mut self.inner.p_y[self.cur],
                &mut self.inner.p_z[self.cur],
            ];
            self.cur += 1;
            Some((u, ng, p))
        } else {
            None
        }
    }
}

Although I marked the returned reference &'src mut f32 to carry the 'src lifetime, but the compiler always complains that it carries the lifetime from &mut self, not the lifetime of data origin. If I change the signature of next, then it will not match the declaration of the trait method. How could I mitigate this?

1 Like

There’s two fundamental issues here.

Firstly, the iterator is a &'vert mut …struct containing… &'src mut [f32] and your item type promises &'src mut [f32] items. That isn’t possible (unless you want to mutate Vectices itself and give away the references in the p_x, p_y, p_z fields), because a &'short_lived mut &'long_lived mut T reference-to-reference can only give you a &'short_lived T access to the inner data.

The other problem is that the iterator will, during iteration, produce multiple references to the inner f32s which can co-exist during iteration (e.g. if all items are .collected), and while you aren’t producing any aliasing mutable references to the same items in the slices, this is only prevented due to run-time logic on the indices, i.e. because self.cur always increments and never overflows. The compiler cannot know this logic is correct, so the code cannot possibly compile like that.

Approaches to fix situations like this in Rust are always the two alternatives of

  • use unsafe code, or
  • use pre-existing safe abstractions.

The pre-existing safe abstractions are of course usually preferable because unsafe code can often easily have subtle soundness bugs in the interface. The pre-existing safe abstraction would be the mutable iterator on slices from the standard library. On the other hand, the downside of using existing iterators of course is that that would duplicate the iteration “index”. These iterators don’t actually work with true “indices” but rather directly increment pointers to the current element, but still, the iterator would become a bit larger, and the result is, probably (?), not the absolute optimum that could be achieved with unsafe code. Compared to your current approach there’s the (presumably significant) advantage that the index-checks from indexing operators can be avoided using standard library iterators.

So let’s try consistently using those iterators, and we’ll get something like

pub struct Vertices<'a> {
    pub len: usize,
    pub u: &'a [f32],
    pub v: &'a [f32],
    pub ng_x: &'a [f32],
    pub ng_y: &'a [f32],
    pub ng_z: &'a [f32],
    pub p_x: &'a mut [f32],
    pub p_y: &'a mut [f32],
    pub p_z: &'a mut [f32],
}

impl Vertices<'_> {
    pub fn iter_mut(&mut self) -> VerticesIterMut<'_> {
        VerticesIterMut {
            u_it: self.u.iter(),
            v_it: self.v.iter(),
            ng_x_it: self.ng_x.iter(),
            ng_y_it: self.ng_y.iter(),
            ng_z_it: self.ng_z.iter(),
            p_x_it: self.p_x.iter_mut(),
            p_y_it: self.p_y.iter_mut(),
            p_z_it: self.p_z.iter_mut(),
        }
    }
}

// I liked the short lifetime names,
// but this `'a` is like `'vert` in your original code
// so the `Item` type below also uses `'vert`, not `'src`
pub struct VerticesIterMut<'a> {
    u_it: std::slice::Iter<'a, f32>,
    v_it: std::slice::Iter<'a, f32>,
    ng_x_it: std::slice::Iter<'a, f32>,
    ng_y_it: std::slice::Iter<'a, f32>,
    ng_z_it: std::slice::Iter<'a, f32>,
    p_x_it: std::slice::IterMut<'a, f32>,
    p_y_it: std::slice::IterMut<'a, f32>,
    p_z_it: std::slice::IterMut<'a, f32>,
}

impl<'a> Iterator for VerticesIterMut<'a> {
    type Item = ([f32; 2], [f32; 3], [&'a mut f32; 3]);

    fn next(&mut self) -> Option<Self::Item> {
        Some((
            [*self.u_it.next()?, *self.v_it.next()?],
            [
                *self.ng_x_it.next()?,
                *self.ng_y_it.next()?,
                *self.ng_z_it.next()?,
            ],
            [
                self.p_x_it.next()?,
                self.p_y_it.next()?,
                self.p_z_it.next()?,
            ],
        ))
    }
}

Glancing back at the number of early-return opportunities from the .next calls, there certainly are possibly good improvements to possibly be made here if performance really matters; maybe even some using safe code, and probably a lot more opportunity using unsafe.


Edit: I’m noticing that I haven’t used the len field in the code above. There’s an argument to be had – if efficient code really matters – that the Vertices struct will contain way more copies of the “length” (one for each slice) than you might probably want to have, so there’s our first inefficiency already in the first place. The additional len field you have beyond that does hover appear to be quite clearly overkill. I assume, the code would often assume that all slice lengths are simply the same; if no unsafe code is used, this assumption could be enforced by the API, and would result in panics or logic errors if there was a bug that could result in different-length slices; if unsafe code is used, eliminating all these copies by using raw pointers would be a possible optimization in the representation or Vertices, too, though from what I remember from watching some videos online that shortly talked about SOA, there might be even more compact representations, if more assumptions are made how the different slices can be layed out relative to each other.

3 Likes

Thanks for the help! Following your explanation, I came up with the unsafe version:

impl<'src> Vertices<'src> {
    pub fn into_iter_mut(self) -> VerticesIterMut<'src> {
        VerticesIterMut {
            u: self.u.as_ptr(),
            v: self.v.as_ptr(),
            ng_x: self.ng_x.as_ptr(),
            ng_y: self.ng_y.as_ptr(),
            ng_z: self.ng_z.as_ptr(),
            p_x: self.p_x.as_mut_ptr(),
            p_y: self.p_y.as_mut_ptr(),
            p_z: self.p_z.as_mut_ptr(),
            cur: 0,
            len: self.len,
            marker: Default::default(),
        }
    }
}

pub struct VerticesIterMut<'src> {
    u: *const f32,
    v: *const f32,
    ng_x: *const f32,
    ng_y: *const f32,
    ng_z: *const f32,
    p_x: *mut f32,
    p_y: *mut f32,
    p_z: *mut f32,
    cur: usize,
    len: usize,
    marker: PhantomData<&'src mut Vertices<'src>>,
}

impl<'src> Iterator for VerticesIterMut<'src> {
    type Item = ([f32; 2], [f32; 3], [&'src mut f32; 3]);

    fn next(&mut self) -> Option<Self::Item> {
        if self.cur < self.len {
            unsafe {
                let u = *self.u.add(self.cur);
                let v = *self.v.add(self.cur);
                let ng_x = *self.ng_x.add(self.cur);
                let ng_y = *self.ng_y.add(self.cur);
                let ng_z = *self.ng_z.add(self.cur);
                let p_x = self.p_x.add(self.cur);
                let p_y = self.p_y.add(self.cur);
                let p_z = self.p_z.add(self.cur);
                self.cur += 1;
                Some((
                    [u, v],
                    [ng_x, ng_y, ng_z],
                    [&mut *p_x, &mut *p_y, &mut *p_z],
                ))
            }
        } else {
            None
        }
    }
}

I would probably use PhantomData<Vertices<'src>>, which adds the additional benefit of the iterator becoming covariant again. Other than that, this looks fine to me upon first inspection, if you make absolutely sure that all slices in Vertices always have length len (or larger), and it’s impossible to use the API of Vertices to violate this property.

2 Likes

This struct is used inside of a callback function from C/C++. It will give me the length and pointers for each field. I can optimized it to avoid repeating length in each slice (created from_raw_parts) as following. In this case, what about the returned [&mut f32; 3], here a lifetime is required for &mut f32, but pointers come from C/C++ side, how would I describe it here?

pub struct Vertices {
    len: usize,
    u: *const f32,
    v: *const f32,
    ng_x: *const f32,
    ng_y: *const f32,
    ng_z: *const f32,
    p_x: *mut f32,
    p_y: *mut f32,
    p_z: *mut f32,
}

impl Vertices {
    pub fn into_iter_mut(self) -> VerticesIterMut {
        VerticesIterMut {
            inner: self,
            cur: 0,
        }
    }
}

pub struct VerticesIterMut {
    inner: Vertices,
    cur: usize,
}

impl Iterator for VerticesIterMut {
    type Item = ([f32; 2], [f32; 3], [&mut f32; 3]);

    fn next(&mut self) -> Option<Self::Item> {
        if self.cur < self.inner.len {
            unsafe {
                let u = *self.inner.u.add(self.cur);
                let v = *self.inner.v.add(self.cur);
                let ng_x = *self.inner.ng_x.add(self.cur);
                let ng_y = *self.inner.ng_y.add(self.cur);
                let ng_z = *self.inner.ng_z.add(self.cur);
                let p_x = self.inner.p_x.add(self.cur);
                let p_y = self.inner.p_y.add(self.cur);
                let p_z = self.inner.p_z.add(self.cur);
                self.cur += 1;
                Some((
                    [u, v],
                    [ng_x, ng_y, ng_z],
                    [&mut *p_x, &mut *p_y, &mut *p_z],
                ))
            }
        } else {
            None
        }
    }
}

Given this description, the strict presumably still doesn't own its contents but access must be limited to the duration of the callback. Thus, for soundness, in the Rust bindings, the callback must have a higher-ranked signature such as e. g. the function pointer type for<'a> fn(Vertices<'a>), otherwise, fhe callback defined with safe Rust code would be able to leak the Vertices into global state and keep it there until after its dead, and then use-after-free it.

For adding a lifetime like this, you'll need a PhantomData in the Vertices struct as well. Something like e. g. PhantomData<&'a mut f32> would seem appropriate, though in this case, the precise type isn't too important because for references to (slices of) f32, there's no caveats with variance or auto traits.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.