Turning mutable grid access into a thread-safe grid access (stencils)

kornel · January 14, 2023, 3:22am

Alternatively, you could simplify PostStencilMut to have only public fields, no getters, and include regular reference to the mutable cell and the neighbors:

PosStencilMut<'a> {
   pub cell: &'a mut C,
   pub neighbor1: Option<&'a C>,
   pub neighbor2: Option<&'a C>,
}

and deal with the unsafe way of making these references inside the iterator. Iterator can use the lifetime of the data source, because it guarantees that next() always returns a new, unique reference to a different cell. This way there's no cell_mut to call twice to get the same &mut twice.

gabyx · January 14, 2023, 11:02am

Thanks you so much for your time and explanation, really appreciated!
I was wondering if there are compile-tests possible to check that such things do not compile, somehow with rust.

gabyx · January 14, 2023, 11:03am

Is there a reason you chose two neighbor member types instead of and array [&mut C; 2] ?
probably it will lock again the whole array when we extract only one ref. Is there simple array-data structure which does not do that?

H2CO3 · January 14, 2023, 11:45am

Write a compile_fail doc-test:

/// ```compile_fail
/// let mut stencil = Stencil::new();
/// let mut x = stencil.get_mut(0);
/// let mut y = stencil.get_mut(1); // cannot borrow mutably
/// ```

There's no such array, but you can use a 2-element tuple.

gabyx · January 14, 2023, 5:06pm

I implemented a solution as suggested, but my test highligths that test compiles which shouldnt
What is wrong with my implementation with the lifetimes? The drop(v) should make s expire resulting in a borrow error (??).

#![feature(result_option_inspect)]

use criterion::{
    criterion_group, criterion_main, AxisScale, BenchmarkId, Criterion, PlotConfiguration,
};
use itertools::Itertools;
use rayon::prelude::*;
use std::{marker::PhantomData, time::Duration};

type Index2 = nalgebra::Vector2<usize>;

#[macro_export]
macro_rules! idx {
    ($x:expr, $($y:expr),+ ) => {
        Index2::new($x, $($y),+)
    };
}

#[derive(Clone, Debug)]
pub struct C {
    pub i: usize,
    pub index: Index2,
}

#[derive(Clone, Debug)]
pub struct G {
    pub cells: Vec<C>,
    pub dim: Index2,
}

pub trait GetIndex {
    fn index(&self) -> Index2;
}

impl GetIndex for C {
    fn index(&self) -> Index2 {
        return self.index;
    }
}

pub struct PosStencilMut<'a, T> {
    cell: &'a mut T,
    neighbors: [Option<&'a mut T>; 2],
}

impl<'a, T> PosStencilMut<'a, T> {
    fn new(
        data: *mut T,
        dim: Index2,
        index: Index2,
        phantom: PhantomData<&'a mut T>,
    ) -> PosStencilMut<'a, T> {
        debug_assert!(dim > idx!(0, 0), "Wrong dimensions.");
        debug_assert!(index < dim, "Index out of bounds.");

        let cell = unsafe { data.add(index[0] + index[1] * dim[0]) };

        // Here the unsafe part happens.
        // Get two non-aliasing mutable references for the neighbors.
        return PosStencilMut {
            cell: unsafe { &mut *cell },
            neighbors: Self::get_neighbors::<'a>(cell, dim, index),
        };
    }

    #[inline(always)]
    fn get_neighbors<'cell>(cell: *mut T, dim: Index2, index: Index2) -> [Option<&'cell mut T>; 2] {
        assert!(dim.x >= 1 && dim.y >= 1);

        let mut nbs = [None, None];

        for dir in 0..2 {
            if index[dir] >= dim[dir] - 1 {
                // No neighbor possible.
                continue;
            }

            // For x-direction : offset = 1, for y-direction: offset = dim[0],
            // general: for n-direction: offset = dim[0]*dim[1]*...*dim[n-1]
            let offset = dim.iter().take(dir).fold(1, std::ops::Mul::mul);
            nbs[dir] = Some(unsafe { &mut *cell.add(offset) });
        }

        return nbs;
    }
}

pub trait PosStencil<'a, T: 'a> {
    fn positive_stencils_mut(
        &'a mut self,
        dim: Index2,
        min: Option<Index2>,
        max: Option<Index2>,
    ) -> Box<dyn Iterator<Item = PosStencilMut<'a, T>> + '_>;
}

impl<'a, T: 'a> PosStencil<'a, T> for Vec<T> {
    fn positive_stencils_mut(
        &'a mut self,
        dim: Index2,
        min: Option<Index2>,
        max: Option<Index2>,
    ) -> Box<dyn Iterator<Item = PosStencilMut<'a, T>> + '_> {
        return Box::new(pos_stencils_mut(self, dim, min, max));
    }
}

fn pos_stencils_mut<T>(
    data: &mut Vec<T>,
    dim: Index2,
    min: Option<Index2>,
    max: Option<Index2>,
) -> impl Iterator<Item = PosStencilMut<'_, T>> {
    assert!(dim > idx!(0, 0) && dim.iter().fold(1, std::ops::Mul::mul) == data.len());

    let min = min.unwrap_or(Index2::zeros());
    let max = max.unwrap_or(dim - idx!(1, 1));

    assert!(min >= Index2::zeros() && max < dim);

    return (min[0]..max[0])
        .step_by(2)
        .cartesian_product((min[1]..max[1]).step_by(2))
        .map(move |(i, j)| PosStencilMut::new(data.as_mut_ptr(), dim, idx!(i, j), PhantomData));
}

fn test() {
    let mut v = vec![1, 2, 3, 4];
    let s = PosStencilMut::new(v.as_mut_ptr(), idx!(2, 2), idx!(0, 0), PhantomData);
    drop(v); // This should invalidate the life-time of `s` but it does not???
    *s.cell += 3;
}

// Safety: `PosStencilMut` can only be created by `pos_stencils_mut`
// which guarantees non-aliased mutable references.
// Therefore it can safely be transferred to another thread.
unsafe impl<'a, T> Send for PosStencilMut<'a, T> {}

// For PosStencilMut to be Sync we have to enforce that you can't write to something stored
// in a &PosStencilMut while that same something could be read or written to from another &PosStencilMut.
// Since you need an &mut PosStencilMut to write to the pointer, and the borrow checker enforces that
// mutable references must be exclusive, there are no soundness issues making PosStencilMut sync either.
unsafe impl<'a, T> Sync for PosStencilMut<'a, T> {}

fn stencil_compute<'a>(stencil: &mut PosStencilMut<'a, C>) {
    stencil.cell.i += match stencil.neighbors[0].as_deref() {
        Some(c) => c.i,
        None => 0,
    } + match stencil.neighbors[1].as_deref() {
        Some(c) => c.i,
        None => 0,
    };

    if let Some(n) = stencil.neighbors[0].as_deref_mut() {
        n.i += stencil.cell.i;
    }
}

fn run_grid_parallel(n: usize, grid: &mut G) {
    // Should not compile.
    test();

    let mut stencils: Vec<_> = grid
        .cells
        .positive_stencils_mut(grid.dim, None, None)
        .collect();

    for _ in 0..n {
        stencils
            .par_iter_mut()
            .for_each(|stencil| stencil_compute(stencil))
    }
}

fn run_grid_single(n: usize, grid: &mut G) {
    let mut stencils: Vec<_> = pos_stencils_mut(&mut grid.cells, grid.dim, None, None).collect();

    for _ in 0..n {
        stencils
            .iter_mut()
            .for_each(|stencil| stencil_compute(stencil))
    }
}

fn criterion_benchmark(c: &mut Criterion) {
    let dim = idx!(100, 100);

    let cell_generator = (0..dim[0]).cartesian_product(0..dim[1]).map(|(i, j)| C {
        i: i + j,
        index: idx!(i, j),
    });

    let mut grid = G {
        cells: Vec::from_iter(cell_generator),
        dim,
    };

    let mut group = c.benchmark_group("Grid Single vs. Parallel");
    for i in [10, 40, 100, 500, 1000].iter() {
        group.bench_with_input(BenchmarkId::new("Single", i), i, |b, i| {
            b.iter(|| run_grid_parallel(*i, &mut grid))
        });

        group.bench_with_input(BenchmarkId::new("Parallel", i), i, |b, i| {
            b.iter(|| run_grid_single(*i, &mut grid))
        });
    }

    let plot_config = PlotConfiguration::default().summary_scale(AxisScale::Logarithmic);
    group.plot_config(plot_config);

    group.measurement_time(Duration::from_secs(20));
    group.finish();
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

H2CO3 · January 14, 2023, 5:09pm

I'm not sure what you are asking there. You (obviously) shouldn't just copy-paste my example into your own tests – it's merely an example, to show you the structure of the test. The actual code you have to write depends on what you want to test – I don't know that.

Furthermore, if your compile_fail test code fails to compile, then the test succeeds. Perhaps you though that it would fail in this scenario?

gabyx · January 14, 2023, 5:17pm

Its not about the test its about my implementation where there is something wrong with the lifetimes. Shouldnt *s.cell += 3 in test not be allowed?

H2CO3 · January 14, 2023, 5:44pm

Sorry, can you please paste the complete doc-test? I don't see it anywhere in this thread.

gabyx · January 14, 2023, 9:11pm

In the code in

fn test() {
    let mut v = vec![1, 2, 3, 4];
    let s = PosStencilMut::new(v.as_mut_ptr(), idx!(2, 2), idx!(0, 0), PhantomData);
    drop(v); // This should invalidate the life-time of `s` but it does not???
    *s.cell += 3;
}

My further question was more to understand why s 's lifetime is not bound to the borrow of v in v.as_mut_ptr.

I solved the issued by the following complete code which compiles where I eliminated the PosStencilMut::new methode which was kind of bogus (?):

use criterion::{
    criterion_group, criterion_main, AxisScale, BenchmarkId, Criterion, PlotConfiguration,
};
use itertools::Itertools;
use rayon::prelude::*;
use std::time::Duration;
type Index2 = nalgebra::Vector2<usize>;

#[macro_export]
macro_rules! idx {
    ($x:expr, $($y:expr),+ ) => {
        Index2::new($x, $($y),+)
    };
}

#[derive(Clone, Debug)]
pub struct C {
    pub i: usize,
    pub index: Index2,
}

#[derive(Clone, Debug)]
pub struct G {
    pub cells: Vec<C>,
    pub dim: Index2,
}

pub struct PosStencilMut<'a, T> {
    cell: &'a mut T,
    neighbors: [Option<&'a mut T>; 2],
}

impl<'a, T> PosStencilMut<'a, T> {
    #[inline(always)]
    fn get_neighbors<'cell>(cell: *mut T, dim: Index2, index: Index2) -> [Option<&'cell mut T>; 2] {
        assert!(dim.x >= 1 && dim.y >= 1);

        let mut nbs = [None, None];

        for dir in 0..2 {
            if index[dir] >= dim[dir] - 1 {
                // No neighbor possible.
                continue;
            }

            // For x-direction : offset = 1, for y-direction: offset = dim[0],
            // general: for n-direction: offset = dim[0]*dim[1]*...*dim[n-1]
            let offset = dim.iter().take(dir).fold(1, std::ops::Mul::mul);
            nbs[dir] = Some(unsafe { &mut *cell.add(offset) });
        }

        return nbs;
    }
}

pub trait PosStencil<'a, T: 'a> {
    fn positive_stencils_mut(
        &'a mut self,
        dim: Index2,
        min: Option<Index2>,
        max: Option<Index2>,
    ) -> Box<dyn Iterator<Item = PosStencilMut<'a, T>> + '_>;
}

impl<'a, T: 'a> PosStencil<'a, T> for Vec<T> {
    fn positive_stencils_mut(
        &'a mut self,
        dim: Index2,
        min: Option<Index2>,
        max: Option<Index2>,
    ) -> Box<dyn Iterator<Item = PosStencilMut<'a, T>> + '_> {
        return Box::new(positive_stencils_mut(self.as_mut_slice(), dim, min, max));
    }
}

fn positive_stencils_mut<T>(
    data: &mut [T],
    dim: Index2,
    min: Option<Index2>,
    max: Option<Index2>,
) -> impl Iterator<Item = PosStencilMut<'_, T>> {
    assert!(
        dim > idx!(0, 0) && dim.iter().fold(1, std::ops::Mul::mul) == data.len(),
        "Wrong dimensions."
    );

    let min = min.unwrap_or(Index2::zeros());
    let max = max.unwrap_or(dim - idx!(1, 1));

    assert!(min >= Index2::zeros() && max < dim);

    return (min[0]..max[0])
        .step_by(2)
        .cartesian_product((min[1]..max[1]).step_by(2))
        .map(move |(i, j)| {
            let index = idx!(i, j);
            let cell: *mut T = &mut data[index[0] + index[1] * dim[0]];

            // Here the unsafe part happens.
            // Get two non-aliasing mutable references for the neighbors.
            return PosStencilMut {
                cell: unsafe { &mut *cell },
                neighbors: PosStencilMut::get_neighbors::<'_>(cell, dim, index),
            };
        });
}

fn test() {
    let mut v = vec![1, 2, 3, 4];
    let s = positive_stencils_mut(v.as_mut_slice(), idx!(2, 2), None, None)
        .next()
        .unwrap();
    // drop(v); // This should invalidate the life-time of `s` but it does not???
    *s.cell += 3;
}

// Safety: `PosStencilMut` can only be created by `pos_stencils_mut`
// which guarantees non-aliased mutable references.
// Therefore it can safely be transferred to another thread.
unsafe impl<'a, T> Send for PosStencilMut<'a, T> {}

// For PosStencilMut to be Sync we have to enforce that you can't write to something stored
// in a &PosStencilMut while that same something could be read or written to from another &PosStencilMut.
// Since you need an &mut PosStencilMut to write to the pointer, and the borrow checker enforces that
// mutable references must be exclusive, there are no soundness issues making PosStencilMut sync either.
unsafe impl<'a, T> Sync for PosStencilMut<'a, T> {}

#[inline(always)]
fn stencil_compute(stencil: &mut PosStencilMut<'_, C>) {
    stencil.cell.i += match stencil.neighbors[0].as_deref() {
        Some(c) => c.i,
        None => 0,
    } + match stencil.neighbors[1].as_deref() {
        Some(c) => c.i,
        None => 0,
    };

    if let Some(n) = stencil.neighbors[0].as_deref_mut() {
        n.i += stencil.cell.i;
    }
}

fn run_grid_parallel(n: usize, grid: &mut G) {
    let mut stencils: Vec<_> =
        positive_stencils_mut(grid.cells.as_mut_slice(), grid.dim, None, None).collect();

    for _ in 0..n {
        stencils
            .par_iter_mut()
            .for_each(|stencil| stencil_compute(stencil))
    }
}

fn run_grid_single(n: usize, grid: &mut G) {
    let mut stencils: Vec<_> = grid
        .cells
        .positive_stencils_mut(grid.dim, None, None)
        .collect();

    for _ in 0..n {
        stencils
            .iter_mut()
            .for_each(|stencil| stencil_compute(stencil))
    }
}

fn criterion_benchmark(c: &mut Criterion) {
    let dim = idx!(100, 100);

    let cell_generator = (0..dim[0]).cartesian_product(0..dim[1]).map(|(i, j)| C {
        i: i + j,
        index: idx!(i, j),
    });

    let mut grid = G {
        cells: Vec::from_iter(cell_generator),
        dim,
    };

    test();

    let mut group = c.benchmark_group("Grid Single vs. Parallel");
    for i in [10, 40, 100, 500, 1000].iter() {
        group.bench_with_input(BenchmarkId::new("Single", i), i, |b, i| {
            b.iter(|| run_grid_parallel(*i, &mut grid))
        });

        group.bench_with_input(BenchmarkId::new("Parallel", i), i, |b, i| {
            b.iter(|| run_grid_single(*i, &mut grid))
        });
    }

    let plot_config = PlotConfiguration::default().summary_scale(AxisScale::Logarithmic);
    group.plot_config(plot_config);

    group.measurement_time(Duration::from_secs(20));
    group.finish();
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

quinedot · January 14, 2023, 10:43pm

Let's look at the function signatures -- those are the contract between caller and callee, including lifetimes.

// Problematic
impl<'a, T> PosStencilMut<'a, T> {
    fn new(
        data: *mut T,
        dim: Index2,
        index: Index2,
        phantom: PhantomData<&'a mut T>,
    ) -> PosStencilMut<'a, T> {

If I call PosStencilMut::<'_, i32>::new(v.as_mut_ptr(), ...), where does the 'a come from? It can't come from any of the inputs, as none of those have lifetimes. *mut T in particular has no lifetime -- by using raw pointers, you've taken on the responsibility to uphold Rust's borrowing rules yourself. The call to v.as_mut_ptr() created a borrow only long enough to create the lifetimeless *mut. There's no way to "pass on" the lifetime and have it end up in your PosStencilMut.

The answer to the question is that caller gets to choose the lifetime ('a), and in this case, the compiler picked a lifetime long enough to last through all your use cases... including beyond the drop. There was nothing tying the lifetime to v because, as per the function signature, that lifetime came from out of nowhere. There are no constraints on what it could be.

Lifetimes that come out of nowhere are a red flag. You could have also written this:

let s = PosStencilMut::<'static, _>::new(v.as_mut_ptr(), idx!(2, 2), idx!(0, 0), PhantomData);

Elsewhere, when you dereference the *mut in unsafe -- that code is unsound, because there's nothing guaranteeing that the pointer isn't dangling, or that your exlusive borrow has been invalidated in some other way (e.g. that there's an observing shared reference now).

The things you have to uphold yourself when using unsafe to mutate through a *mut are documented here. Using unsafe doesn't turn off Rust's guarantees or invariants, it puts the onus of proving that they are upheld on you. You're telling the compiler "don't worry, I got this", and it will trust that you do.

So if you want to pursue this approach, you'll have to figure a way to carry the appropriate lifetime around, or to otherwise uphold the safety requirements.

In contrast, we have

// Less surprising
fn positive_stencils_mut<T>(
    data: &mut [T],
    dim: Index2,
    min: Option<Index2>,
    max: Option<Index2>,
) -> impl Iterator<Item = PosStencilMut<'_, T>> {

And in this case, as per the lifetime elision rules, the returned iterator lifetime did not come out of nowhere -- it is tied to the input &mut [T]. This API informs the compiler that the borrows must be tied together.

quinedot · January 14, 2023, 11:55pm

I'm still curious about your exact algorithm. I'm seeing the stencil_compute and it looks like for a given sencil

+---+---+     +---+---+
| a | u |     | b | x |
+---+---+ ==> +---+---+
| v |         | y |
+---+         +---+

We have

NORMAL CASE                BOTTOM EDGE           RIGHT EDGE    SOLO CORNER
b = a + u + v              b = a + u             b = a + v     b = a
x = u + b = a + 2u + v     x = u + b = a + 2u
y = v                                            y = v

Or in ordered mutating form

NORMAL CASE                BOTTOM EDGE           RIGHT EDGE    SOLO CORNER
a += x + y                 a += x                a += y        
x += a                     x += a

Correct?

Next, since your iterators step by two in both dimension, it seems like every stencil is completely independent for a given iteration:

Stencil `iter.nth(i)` is labeled `i`
+---+---+---+---+---+---+---+---+---+
| 0 | 0 | 1 | 1 | 2 | 2 | 3 | 3 | 4 |
+---+---+---+---+---+---+---+---+---+
| 0 |   | 1 |   | 2 |   | 3 |   | 4 |
+---+---+---+---+---+---+---+---+---+
| 5 | 5 | 6 | 6 | 7 | 7 | 8 | 8 | 9 |
+---+---+---+---+---+---+---+---+---+
| 5 |   | 6 |   | 7 |   | 8 |   | 9 |
+---+---+---+---+---+---+---+---+---+

Correct?

If those are both correct and you don't need odd offsets or other interoperability, I'd say just store a Vec of stencils (2x2 cells). But let's assume you need odd offsets, or just want to keep the typical data layout: it's still quite approachable with safe code!

Split the entirety of data into slices covering two rows (chunks(2 * width))
- Split each of those in half (top and bottom -- the latter need not be mut even)
- And discard any leading offset you need to from top and bottom
  - Take chunks of size 2 from top and bottom and zip them together
  - These are your sencils

You may want to use chunks_exact and handle the edge cases separately.

Playground.

gabyx · January 15, 2023, 9:13am

@quinedot : Good stuff!
Your assumptions are almost correct, altough the computation I used in the compute_stencil
was arbitrary: what need to compute is:

For the divergence (only a needs to be mutable) it seems your iterator chaining for the full stencil could be nice. One needs to know I need to shift this stencil by its width so 4 times, should not be a problem, if I have boundary cells I never access for the sake of having only one FullStencil. The update calculation in 2) needs full mutability of cells in the stencil. Because the code at the end is really cumbersome because all the logic is in the boundary handling, I really want to strive for only one stencil, one type.

I am just wondering, your solution is certainly better, however is it right to say that my solution
could be approached if one needs other fancier non-aliased mutable splits, altough here as you showed its approachable with iterators.

Using unsafe doesn't turn off Rust's guarantees or invariants, it puts the onus of proving that they are upheld on you.

I am really fascinated that even tough I did not uphold the invariants, rust still chose a longer liftime, but was that a coincidence? and in some circumstances the runtime could just crash because of a segfault instead of gracfully continuing (because the life time of my v was magically extended) or a n other words if using unsafe and not upholding the invariants rust doesnt guarantee that memory lifetime is correct = UB ?

quinedot · January 15, 2023, 11:35pm

The non-overlapping stencils is the larger point; you can adapt the example to be fully mutable or what-have-you.

My instinct is to pull the boundary checks out of the iteration loop. If you have a 2,001 x 2,001 grid, that's 1,000,000 full stencils and 2,000 partial stencils. Maybe the optimizer can remove those checks, but if not, handling the full stencils separately saves you 1,000,000 avoidable checks per plane iteration.

But you could always leave it for potential future optimization and have something like

// Full: `[tl, tr], Some([bl, br])`
// Row: `[tl, tr], None`
// Col: `[tl], Some([bl])`
// Corner (optional?): `[tl], None`
struct Stencil<'a> {
    top: &'a mut [T],
    bottom: Option<&'a mut [T]>
}

Although maybe an actual custom iterator an an enum would be better. Either way you'll need something other than zip.

Miri doesn't like your compiling version (Tools Miri), I think because you keep hitting &mut data after creating previous *muts, and invalidating those raw pointers. So long as you're dealing with non-overlapping regions, I think you could just start with a *mut to the entire slice and then use pointer arithmetic on that to produce your per-cell *mut.

That is, make one *mut to everything, and then all mutable access after that is an ancestor of that *mut. Making a &mut invalidates everything below that place in the tree of borrows. So it's fine at the leaves, but not above that.

Miri seems happier after making that change.

However, if there's a safe way to do things, especially if it doesn't involve much or any performance hit, go with the safe way. Getting aliasing right in unsafe Rust is very tricky.

I don't know how to give a concise reply to the final unsafe question, so I'll put that in a separate post.

quinedot · January 16, 2023, 12:58am

I don't know how to give a concise answer here, so apologies in advance for the incoming wall of text.

I think the actual core of your question is "why did test compile with the drop(v) in it?" To attempt and answer this, I'm going to go a bit more into what the borrow checker actually does.

To try and make sure we don't talk past each other, let me first note that

Lifetimes are the output of a static (compile time) analysis, and are erased by run time
As such, there's no such thing as "extending a lifetime" really
Liveness scopes aren't lifetimes, though it's true a reference to an object must have a lifetime less than the liveness scope of the object. So v doesn't have a lifetime per se.
The borrow checker is more of a constraint solver than a lifetime-assigner

Roughly how it works today is that, within a given function, it looks at any code involving a lifetime, and calculates a set of constraints. For example,

// N.b. this is a `&mut` using function, not `*mut`, e.g.
//   `some_borrowing_function<'a>(&'a mut [i32]) -> Thing<'a>`, aka
//   `some_borrowing_function(&mut [i32]) -> Thing<'_>`
// Note how the lifetimes are connected
fn test() {
    let mut v = vec![1];                     // L1
    let s = some_borrowing_function(&mut v); // L2: Takes 'v, returns 's
    drop(v);                                 // L3
    *s.cell += 3;                            // L4: 's used here
}

This could generate the constraints that

'v: 's (the borrow in s cannot outlive the borrow of v that it came from)
- Both must be alive at L2 for the assignment
's must be alive at L4 where it is used
's must be alive at L3 for continuity from L2
Altogether, 'v and 's must be alive at L2, L3, L4
- 'v due to the v: 's constraint

Once it's created all these "borrow must be alive (valid)" constraints, it looks at how each line of code is accessing or mutating data, and checks for conflicts. If it doesn't find any conflicts, the program is accepted.

It this case it would see that v gets dropped at L3, but the borrow 'v must be live at L3, and this is a conflict.

How about when you have an unbounded lifetime? ^[1]

// N.b. `some_unbounded_function<'a>(_: *mut [i32]) -> Thing<'a>` or so
// Note how there's no longer an input lifetime to connect to
fn test() {
    let mut v = vec![1];             // L1
    let s = some_unbounded_function( // L2a
        v.as_mut_ptr()               // L2b: 'v created to call method
    );                               // L2c: returns unbounded 's
    drop(v);                         // L3
    *s.cell += 3;                    // L4: 's used here
}

This could generate the constraints that

'v and 's must be alive at L2
- But there is no 'v: 's constraint! as_ptr_mut returns a lifetimeless *mut _
's must be alive at L4 where it is used
's must be alive at L3 for continuity from L2
Altogether, 's must be alive at L2, L3, L4 and 'v must be alive at L2

v gets dropped at L3 still, but that's fine for this analysis -- there was no constraint that 'v would be alive there, and there's nothing about 's that says it relies on v.

So it's not that "'v got extended" past the drop(v), it's that without the 'v: 's constraint, there's nothing requiring that 'v be valid past L2. Note that the full signature of get_mut_ptr is

pub const fn as_mut_ptr<'a>(&'a mut self) -> *mut T

The borrow only has to last until the *mut T is returned; there is no output lifetime to result in some constraint. 'v won't be used any more, and the actual input to some_unbounded_function -- the *mut -- has no lifetime to connect to the output lifetime.

What was the root of the problem in the problematic version of test? In ~~some_unbounded_function~~ PosStencilMut::new, you used a *mut to create a &mut with ^[2] unbounded lifetime, which in turn allowed creating a reference that outlived the drop(v). The function was unsound -- possible of causing UB when called from safe code. It could have been an unsafe function with the same qualifications as changing *mut into &mut. ^[3] Then it would have been test's responsibility.

But encoding the actual qualifications in the ABI by maintaining the 'v: 's connection/constraint is even better.

UB is insidious -- once you hit UB, all bets are of. UB can even "time travel" -- if you're unconditionally going to hit UB, parts of your program you consider to logically come before that can misbehave too. So once you have UB, there are no guarantees. That includes memory safety guarantees.

Maybe it's semantics, but I wouldn't say "if you use unsafe incorrectly, Rust might calculate lifetimes incorrectly". This case was more like "if your API doesn't convey the necessary constraints for soundness/defined behavior, the borrow checker can't catch any violations." ^[4] The result may be unsoundness or UB.

More generally, using unsafe without upholding the invariants may result in code that's definitely UB or code that's unsound (UB only if called in a certain way). Rust doesn't generally catch unsafe mistakes as dipping into unsafe is telling the compiler "I know you can't prove it, but trust me -- all the safety invariants are in tact after this."

Looking back, let me file a nit against myself: your new function wasn't technically unbounded because of the PhantomData parameter. But the PhantomData construction was itself unbounded, so the effect was completely the same. I'll just keep considering that function unbounded for this discussion. ↩︎
(effectively) ↩︎
Modulo fixing the Miri errors discussed in the last post. ↩︎
Though it's true that it is only possible to not convey the necessary constraints with unsafe somewhere. ↩︎

gabyx · January 16, 2023, 5:49pm

@quinedot : Thanks a lot for this long and even more technical detailed answer than I could wish for!

So to summarized: The example in the unboudnded lifetime case produces UB and I assumed the call to *s.cell += 3 will produce a segfault, but the line just passed fine, because its UB and the memory obvisouly was still belonging to my process (?) and I did not pass this boundary, but I can be sure this L4 is an access violation because v was destroyed. Please correct me if I am wrong.

quinedot · January 16, 2023, 7:54pm

Both examples were UB due to the thing Miri complained about if nothing else. But let's assume that was fixed in both first.

I think the example without an unbounded lifetime is probably ok (but I didn't do an in depth review). Miri is okay with it and the construction lifetimes make sense to me anyway.

The example with an unbounded lifetime and a drop is definitely UB due to accessing something that was dropped.

If you removed the drop, it might merely be unsound. If additionally the unsoundness was all behind private functions that didn't trigger UB, then the program as a whole might be well defined! (But I did even less review of this case so consider that wild speculation.)

Without any documentation of the necessary safety requirements or why a given unsafe block is valid, I'd still consider a locally unsound but globally sound module problematic (probably the unsoundness was accidental and as such there's a good chance for it to become UB or be publicly exposed in a later version), but sometimes the unsafe abstraction boundary is intentionally at the module or library level and not just the function level.

Check out the methods and trait implementations of binary_heap::PeakMut for example; Drop and DerefMut and pop all have to work together, but combined they provide a sound data type (which furthermore protects the ordering invariant of the type). It would be easy to create a PeakMut with an invalid original_len within this module, but impossible outside of it.

(There's probably even better examples, but this is one that came to mind.)

Nit on "produce a segfault": That may be the most likely scenario for a drop-then-read like here, but UB has no guarantees. Maybe it just turns that function into a no-op. Maybe it scribbles memory in a non-violating manner and the program completes, but not usefully. Etc.

gabyx · January 19, 2023, 11:11pm

Thanks for all the help so far. Really appreciated.

Another question with rayon and Iterators turned up:
I have now a compiling (, not sure if everything is logically correct but leave that aside ;))
stencil together with the iteration and its apply function:

impl Grid {
    fn apply_pos_stencils<T>(&mut self, func: T)
    where
        T: Fn(&mut PosStencilMut<Cell>) + Send + Sync,
    {
        const OFFSETS: [Index2; 4] = [idx!(0, 0), idx!(1, 0), idx!(0, 1), idx!(1, 1)];

        for offset in OFFSETS.iter() {
            let mut stencils: Vec<_> = positive_stencils_mut(
                self.cells.as_mut_slice(),
                self.dim,
                Some(idx!(1, 1)),
                Some(self.dim - idx!(1, 1)),
                Some(*offset),
            )
            .collect();

            stencils.par_iter_mut().for_each(&func);
        }
    }

So far I needed to collect all stencil lightweight reference objects PosStencilMut into a Vec, probably due to the fact that I do not fullfill the iterator requirement in rayon (?).
I could not even preallocate that Vec because the borrowchecker complains something about
that I cannot fill it dynamically during looping (prob. due to producing multiple &mut probably (if I would add them which I obvisouly cant.)

let mut stencils: Vec<PosStencilMut<Cell>> = Vec::with_capacity(self.cells.len());

        for offset in OFFSETS.iter() {
            stencils = positive_stencils_mut(

error[E0499]: cannot borrow `self.cells` as mutable more than once at a time
   --> src/scene/grid.rs:322:17
    |
321 |             stencils = positive_stencils_mut(
    |             -------- first borrow might be used here, when `stencils` is dropped and runs the `Drop` code for type `Vec`
322 |                 self.cells.as_mut_slice(),
    |                 ^^^^^^^^^^^^^^^^^^^^^^^^^ `self.cells` was mutably borrowed here in the previous iteration of the loop

The question remains how I would get rid of the dynamic memory allocation in Vec.
Also one stencils value comprises all &mut of the whole grid, so I am basically unable to prestore these lists somewhere (because then I would have 4x all &mut ;-).
Is the only way to make it better parallelizabel, to implement the ParallelIteratortrait or so inrayon which seems a bit complicated?

quinedot · January 20, 2023, 12:43am

You'll probably have to adjust your iterator implementation to use rayon's iterators and traits (and add some Send and maybe Sync bounds).

Here's what it looks like when I adjust one of my earlier examples.

gabyx · January 20, 2023, 11:35am

Thanks, that worked nicely:

gabyx · January 22, 2023, 1:45pm

Unfortunately the whole implementation is really slow, even slower than in single core. Not really sure what the problem is, I guess there is lots of overhead when running:

github.com

gabyx/RustoFluid/blob/main/src/scene/grid.rs#L294


      
          fn solve_incompressibility(
              &mut self,
              log: &Logger,
              dt: Scalar,
              iterations: u64,
              density: Scalar,
              execution_mode: ExecutionMode,
          ) {
              match execution_mode {
                  ExecutionMode::Parallel => {
                      self.solve_incompressibility_parallel(log, dt, iterations, density, false);
                  }
                  ExecutionMode::ParallelUnsafe => {
                      self.solve_incompressibility_parallel(log, dt, iterations, density, true);
                  }
                  ExecutionMode::Single => {
                      self.solve_incompressibility_sequential(log, dt, iterations, density);
                  }
              }
              self.compute_stats(&log);
          }

even tough I first distribute all values and the run the iterative loop RustoFluid/grid.rs at ebbfa742e8f2a08a1edf069edbc9347597cb48c4 · gabyx/RustoFluid · GitHub
which runs in each loop 4 times a parallel stencil run.

Topic		Replies	Views
Parallel non-blocking access to vector elements? help	5	519	October 19, 2023
Is there a multi-thread version of std::cell::RefCell? help	13	752	June 22, 2023
Which container are thread-safe and can return mutable reference? help	21	1552	January 3, 2020
Alternative to Mutexs, When code is sound without them	4	442	February 20, 2023
Can I use a Mutex<()> to avoid conflicts when accessing an UnsafeCell? help	18	1076	January 6, 2023

Turning mutable grid access into a thread-safe grid access (stencils)

Related Topics