Pin and Unsafe usage code review

Hi, I have publish compose-rt crate which help user to manage node tree in a way similar to Jetpack Compose Runtime. One usage is build declarative GUI framework.

I have two point not sure in the implementation, and would like your review and comment.

  1. usage of Pin, is it needed?
    compose-rt/composer.rs at main · cksac/compose-rt · GitHub
  2. not sure soundness of one unsafe in the this crate
    compose-rt/composer.rs at main · cksac/compose-rt · GitHub

Considering that Unpin is a super-trait of ComposeNode, putting Pin on it is a no-op. As for the unsafe, it's quite confusing. Are you trying to go from *mut T to &mut T? The way to do that is unsafe { &mut *ptr }.

3 Likes

thanks for your time to review. I have removed the the Pin and changed to unsafe { &mut *ptr } .

for the soundness which I am not sure, Line213-Line238

summary below
given:

self is type Composer, self.tape = Vec<Slot<Box<dyn ComposeNode>>>

let cached: Option<&mut ComposeNode> = self.tape.get_mut(i).map(..get node from slot via unsafe to end mut brrorw of self..);
if Some(node) = cached {
children_closure(self) # closure take &mut self, wihch will growth self.tape after index i
update_closure(node) # closure take &mut ComposeNode
}

Is it safe to pass node to update_closure if self.tape growth beyond its capacity in children_closure and reallocation?

Your code is unsound. As a minimal example, it causes this safe program to produce a segmentation fault when run:

use compose_rt::{Composer, Recomposer};
use std::mem;

fn main() {
    let group = |cx: &mut Composer, cached: bool| {
        cx.group(
            |_| "Hello, World!",
            |cx| {
                if cached {
                    let mut recomp = Recomposer::new();
                    mem::swap(cx, recomp.composer());
                }
            },
            |_| false,
            |node| assert_eq!(*node, "Hello, World!"),
            |_| {},
        );
    };
    let mut recomp = Recomposer::new();
    group(recomp.composer(), false);
    recomp.finalize();
    group(recomp.composer(), true);
}

This is why unsafe casts should never be used to extend the lifetime of references. Without the pointer cast, the compiler gives an error about this:

error[E0499]: cannot borrow `*self` as mutable more than once at a time
   --> /home/lm978/compose-rt/compose-rt/src/composer.rs:243:42
    |
222 |         let cached = self.tape.get_mut(cursor).map(|s| {
    |                      ------------------------- first mutable borrow occurs here
...
243 |                         let c = children(self);
    |                                          ^^^^ second mutable borrow occurs here
244 |                         apply_children(node, c);
    |                                        ---- first borrow later used here

(In fact, this compiler error is what enabled me to design the example above: it shows exactly where the issue occurs.) Therefore, I would recommend taking the negligible performance hit from a RefCell, to prevent errors like this. unsafe blocks should only be used when one is absolutely sure that the preconditions can never be violated; the ideal scenario is when the unsafety is contained within a single function that does not call anything else.

1 Like

Instead of preventing user from calling mem:swap in compile time.
I added an runtime check

lett id = self.id
user_closure(self)
assert!(id == self.id, "Composer changed");

what do you think?

That still doesn't work.

use compose_rt::{Composer, Recomposer};
use std::mem;

fn main() {
    let group = |cx: &mut Composer, cached: bool| {
        cx.group(
            |_| "Hello, World!",
            |cx| {
                if cached {
                    let mut recomp = Recomposer::new();
                    mem::swap(cx, recomp.cx());
                    recomp.finalize();
                    recomp.finalize();
                    mem::swap(cx, recomp.cx());
                }
            },
            |_| false,
            |node| assert_eq!(*node, "Hello, World!"),
            |_| {},
        );
    };
    let mut recomp = Recomposer::new();
    group(recomp.cx(), false);
    recomp.finalize();
    group(recomp.cx(), true);
}

Any use of the data reference following the self reference is considered undefined behavior. Also, the children callback can add slots to the tape vector. Expanding this vector past its capacity moves the Slots contained in it, including the Boxes they store. It is considered invalid to hold a reference to the contents of a moved Box, which is what your library is currently doing. Really, I'd encourage you to replace the unsafe block with something else.

Unfortunately that still won't work.

Using the reference to self and giving it to the user function invalidates any other borrows from it. This means that your cached variable is invalid after that and you would have to go and get it from the Vec again.

Second try to restrict the API and extra composing flag runtime check, Recomposer.compose

pub fn compose<F, T>(&mut self, func: F) -> T
where
F: FnOnce(&mut Composer) -> T,
{
let id = self.composer.id;
self.composer.composing = true;
let t = func(&mut self.composer);
assert!(id == self.composer.id, "Composer changed");
self.finalize();
t
}

For the Vec capacity over and reallocated issue, which is my initial concerns, To address this, I consider using some Arena crate, with only allocate new pages instead of moving the existing Vec.

the ability of invalidates others is mainly restrict by the Composer's API, now it will only allow it to growth the Vec it owns.

The library is still unsound despite the extra assertions.

use compose_rt::{Composer, Recomposer};
use std::{
    mem,
    panic::{self, AssertUnwindSafe},
};

fn main() {
    let group = |cx: &mut Composer, cached: bool| {
        cx.group(
            |_| "Hello, world!",
            |cx1| {
                if cached {
                    panic::set_hook(Box::new(|_| {}));
                    let mut swap = |recomposer: &mut Recomposer| {
                        panic::catch_unwind(AssertUnwindSafe(|| {
                            recomposer.compose(|cx2| mem::swap(cx1, cx2))
                        }))
                        .unwrap_err();
                    };
                    let mut recomposer = Recomposer::new();
                    swap(&mut recomposer);
                    recomposer.compose(|_| {});
                    recomposer.compose(|_| {});
                    swap(&mut recomposer);
                    Box::leak(Box::new([0usize, 1usize]));
                }
            },
            |_| false,
            |node| assert_eq!(*node, "Hello, world!"),
            |_| {},
        );
    };
    let mut recomposer = Recomposer::new();
    recomposer.compose(|cx| group(cx, false));
    recomposer.compose(|cx| group(cx, true));
}

Just regenerate the data reference from self every time you need it, perhaps by defining a new method on Composer. The performance cost is just a pair of dereferences and a zero-cost expect, which is completely negligible on all systems. Replacing the unsafe block with a repeated lookup would remove all unsoundness from your library. Due to how Rust is defined, even if you don't change both self and data at the same time, your unsafe block as written will ALWAYS result in undefined behavior.

Very appreciate your time and review. I have managed to remove the unsafe using the trick.

let (head_slots, tail_slots) = self.tape.split_at_mut(child_start);
fix unsafe · cksac/compose-rt@4ff90e1 · GitHub

now your code will not get segfault, but still has exception if print in catch hook

panic::set_hook(Box::new(|p| {println!("{:?}", p}));

trace log

[2022-03-16T04:36:56Z TRACE compose_rt::composer] + 0: &str | TypeId { t: 13307641874416792075 } | compose-rt/examples/poc.rs:14:12 |*
[2022-03-16T04:36:56Z TRACE compose_rt::composer] C 0: &str | TypeId { t: 13307641874416792075 } | compose-rt/examples/poc.rs:14:12 |*
PanicInfo { payload: Any { .. }, message: Some(Composer changed), location: Location { file: "/Users/cksac/codespace/compose-rt/compose-rt/src/recomposer.rs", line: 45, col: 9 }, can_unwind: true }
PanicInfo { payload: Any { .. }, message: Some(Composer changed), location: Location { file: "/Users/cksac/codespace/compose-rt/compose-rt/src/recomposer.rs", line: 45, col: 9 }, can_unwind: true }
PanicInfo { payload: Any { .. }, message: Some(Composer changed), location: Location { file: "compose-rt/examples/poc.rs", line: 14, col: 12 }, can_unwind: true }

Now, does it mean I don't need to worry the Vec reallocate issue?

Yes, you no longer need to worry about Vec reallocation. Users can still cause unexpected behavior by doing something dumb (like my false AssertUnwindSafe above), but at worst it will cause the code to panic or produce unexpected data. But note that programs are allowed to continue after catching a panic; libraries must always account for this when holding unsafe data through user code. This is known as panic safety. But even though you always want to avoid memory unsoundness, the user should not expect to receive meaningful data from the Composer following a panic.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.