Can chaining self-referential structs be done soundly?

I'm the author of self_cell.

A month ago I received this issue Possible to have a lifetime in owner? · Issue #9 · Voultapher/self_cell · GitHub and still I don't feel confident enough about dropping the requirement that owner has to have a static lifetime. While I can't come up with an unsound use case, I have this nagging feeling I'm missing something. I'd appreciate your opinions and input regarding this topic.

Assuming the full struct is also annotated with the lifetime, it seems ok with me.

Be aware that your code hits undefined behavior if both the owner and dependent are zero-sized types since core::alloc::alloc requires the layout to have a non-zero size. Consider using Box for allocations.

Thanks for pointing that out! I initially tried to use Box but couldn't figure out how to correctly do this partially initialized struct pattern. Just pushed a commit https://github.com/Voultapher/self_cell/commit/bb2f502fa0bae2318987e7e9a57e5a9a83e8ac25 that fixes it. Previously for soundness bugs I yanked all affected versions and published a new major. However this seems a like a rather esoteric edge case. Maybe it would be fine to just publish 0.9.1 with the fix. What do you think?

You can use the box to allocate uninitialized memory in the following way:

type JoinedCell<'a> = $crate::unsafe_self_cell::JoinedCell<$Owner, $Dependent<'a>>;

let b: Box<MaybeUninit<JoinedCell>> = $crate::alloc::boxed::Box::new(MaybeUninit::uninit());
let mut joined_ptr: *mut JoinedCell = $crate::alloc::boxed::Box::into_raw(b).cast();

let owner_ptr = core::ptr::addr_of_mut!(joined.owner);
let dependent_ptr = core::ptr::addr_of_mut!(joined.dependent);

// Move owner into newly allocated space.
owner_ptr.write(owner);

// Initialize dependent with owner reference in final place.
dependent_ptr.write(dependent_builder(&*owner_ptr));

Self {
    unsafe_self_cell: $crate::unsafe_self_cell::UnsafeSelfCell::new(
        joined_void_ptr: NonNull::new_unchecked(joined_ptr.cast()),
    ),
}

To deallocate it, you just use Box::from_raw to create a Box<JoinedCell> and drop the box. Note that this runs the destructor of the value. To not runt the destructor but just deallocate, you would create a Box<MaybeUninit<JoinedCell>> instead.

1 Like

Also, you have memory leaks if the provided user-code panics.

That's known and documented self_cell in self_cell - Rust. Given that catch_unwind may not catch all panics I decided it be better for users to know about this limitation. I even provide an example how to avoid leaks https://github.com/Voultapher/self_cell/tree/main/examples/no_leak_panic

The only panics that it can't catch are those that abort the process.

1 Like

Even so, I'd prefer the user to be aware of the catch_unwind and write it themselves. As I understand it, there can also be a performance penalty for catch_unwind which I'd rather the user decide if they want it.

You can do it without catch_unwind using a destructor. This has no runtime cost for code that doesn't panic.

2 Likes

Please elaborate.

When something panics, the destructors of local variables run. That destructor can perform cleanup.

This is known as the drop guard pattern. I can try to find some examples of it.

Correct me if I'm wrong, at the point I call user code I have a partially initialized struct, it wouldn't be safe to call it's destructor, right?

Check out this file: reusable-box-future/lib.rs at main · oblique/reusable-box-future · GitHub

In particular the set_same_layout method.

1 Like

Do I understand correctly, I could create a repr(transparent) type over JoinedCell that has a drop impl which only drops the owner?

In your case I would probably define the following:

struct DropGuard {
    ptr: *mut JoinedCell,
    should_deallocate: bool,
    should_drop_owner: bool,
}

impl Drop for DropGuard {
    fn drop(&mut self) {
        if self.should_drop_owner {
            let owner_ptr = core::ptr::addr_of_mut!(self.ptr.owner);
            core::ptr::drop_in_place(owner_ptr);
        }
        
        if self.should_deallocate {
            drop(Box::from_raw(self.ptr));
        }
    }
}

Then modify the booleans as appropriate throughout the method. On success, you just set both booleans to false and take the pointer.

Looks good, why would should_deallocate be optional? Currently NonNull::new handles allocation failure with a panic, and we don't have to clean up because the allocation failed right?

If nothing panics, you don't want it to deallocate the memory when it goes out of scope at the end of your method.

Also I'll have to get around to writing benchmarks now. Will report back.

Isn't that the same bool then?