ClosureCell: Design Feedback


ClosureCell is basically a Cell+. It does everything Cell does + it adds the following feature:
pub unsafe auto trait ClosureCellSafe {}

impl<T> !ClosureCellSafe for ClosureCell<T> where T: ?std::Sized {}

impl<T> ClosureCell<T>
where
    T: ?std::Sized + ClosureCellSafe,
{
    #[inline]
    pub fn with_inner(&self, f: impl std::FnOnce(&mut T) + ClosureCellSafe) {
        f(unsafe { &mut *self.value.get() });
    }
}

While being slightly restrictive on what T it works on and what variables can be captured, this method has the benefit of making it possible to get a mutable reference to the interior value without requiring unsafe on the user side.

Now, that I'm basically done, I've discovered several alternative ways to design the API.

  1. Store a Cell<T> and delegate methods
  2. Store a Cell<T> and impl Deref + DerefMut (requires Cell<T> to be !ClosureCellSafe
  3. Store Cell<T> and make it publically accessible

Maybe, there's yet another way to solve this, that didn't come to my mind. Which option would you choose and why?

Sorry, your crate is unsound.

use closure_cell::ClosureCell;

thread_local! {
    pub static FOO: ClosureCell<Vec<i32>> = ClosureCell::new(vec![1, 2, 3]);
}

fn clear_the_vec() {
    FOO.with(|cell| {
        cell.set(Vec::new());
    });
}

fn main() {
    FOO.with(|cell| {
        cell.with_inner(|mut_ref| {
            let slice = &mut_ref[..];
            clear_the_vec();
            println!("{:?}", slice); // prints deallocated memory
        });
    });
}

The auto-trait is neat, but unfortunately it doesn't protect against globals.

1 Like

Noooooo! :cry:

I thought I finally had the solution. Clearly, Rust needs a marker trait for local-only functions.

Yep, to achieve with_ref for Cell globals are annoying; a hacky way to do it would be to, in top of your auto-trait idea, to also use a macro that requires the function be const fn-compatible:

pub unsafe auto trait ClosureCellSafe {}

impl<T> !ClosureCellSafe for ClosureCell<T> where T: ?Sized {}

impl<T> ClosureCell<T>
where
    T: ?Sized + ClosureCellSafe,
{
    #[inline]
    pub unsafe fn with_inner(&self, f: impl FnOnce(&mut T) + ClosureCellSafe) {
        f(&mut *self.value.get());
    }
}

#[macro_export]
macro_rules! closure_cell_with_inner {(
    $at_cell:expr,
    |$mut_ref:ident : $T:ty| $body:expr $(,)?
) => (match $at_cell { at_cell => {
    const _: () = {
        const fn _check ($mut_ref: $T)
        {
            let _ = $body;
        }
    };

    let f = |$mut_ref: $T| $body;
    unsafe {
        $crate::ClosureCell::with_inner(at_cell, f)
    }
}})}

Obviously requiring that the body be const fn compatible will over-constraint the possibilities, but it's already a bit better than no compatibility at all :sweat_smile:

@Yandros Note that it does mean that no heap allocations can be done due to the const semantics. Which I consider constraining to the point that it's worth asking "what would/could this be used for, given the restrictions?".

For me the main motivation is to be able to check some trivial (const-compatible) things, such as .is_some(), without needing to replace twice the value:

fn main ()
{
    let cell = ClosureCell::new(None);
    dbg!(closure_cell_with_inner!(
        &cell,
        |mut_ref: &mut Option<i32>| mut_ref.is_none(),
    ));
}

For instance, I have compared the LLVM IR of these two functions:

#[no_mangle] pub
fn foo (cell: &'_ ClosureCell<Option<NonZeroU32>>)
  -> bool
{
    closure_cell_with_inner!(
        cell,
        |it: &mut Option<NonZeroU32>| it.is_none(),
    )
}

#[no_mangle] pub
fn bar (cell: &'_ ::core::cell::Cell<Option<NonZeroU32>>)
  -> bool
{
    let prev = cell.replace(None);
    let ret = prev.is_none();
    cell.set(prev);
    ret
}

and this yields:

; Function Attrs: norecurse nounwind nonlazybind readonly uwtable
define zeroext i1 @foo(i32* nocapture readonly align 4 dereferenceable(4) %cell) unnamed_addr #0 personality i32 (i32, i32, i64, %"unwind::libunwind::_Unwind_Exception"*, %"unwind::libunwind::_Unwind_Context"*)* @rust_eh_personality {
start:
  %_8.val.i = load i32, i32* %cell, align 4
  %.not.i.not.i.i.i = icmp eq i32 %_8.val.i, 0
  ret i1 %.not.i.not.i.i.i
}

; Function Attrs: nounwind nonlazybind uwtable
define zeroext i1 @bar(i32* nocapture align 4 dereferenceable(4) %cell) unnamed_addr #1 personality i32 (i32, i32, i64, %"unwind::libunwind::_Unwind_Exception"*, %"unwind::libunwind::_Unwind_Context"*)* @rust_eh_personality {
start:
  %x.val.i.i.i.i = load i32, i32* %cell, align 4
  %.not.i.not.i = icmp eq i32 %x.val.i.i.i.i, 0
  ret i1 %.not.i.not.i
}

We can see that Rust, in and of itself, already compiles those two functions to the same "logic" (bitwise-read of the discriminant, here, a comparison against 0), so this whole thing may be overkill, but we do notice that the emitted LLVM IR does state that foo does not mutate the pointee (readonly annotation) whereas this is not guaranteed for bar().


Indeed, my suspicion was right: that readonly can matter quite a lot. Let's compare, the assembly this time, of:



#[no_mangle] pub static mut SIDE_EFFECT: Option<NonZeroU32> = None;
    
#[no_mangle] pub
unsafe
fn uses_foo (cell: &'_ ClosureCell<Option<NonZeroU32>>)
  -> Option<NonZeroU32>
{
    let prev = cell.get();
    SIDE_EFFECT = prev;
    foo(&cell);
    cell.get()
}

#[no_mangle] pub
unsafe
fn uses_bar (cell: &'_ ::core::cell::Cell<Option<NonZeroU32>>)
  -> Option<NonZeroU32>
{
    let prev = cell.get();
    SIDE_EFFECT = prev;
    bar(&cell);
    cell.get()
}

(I was basically trying to test whether the second cell.get() call gets elided by reusing the first prev read)

We get:

uses_foo:                               # @uses_foo
# %bb.0:
	movl	(%rdi), %eax
	movq	SIDE_EFFECT@GOTPCREL(%rip), %rcx
	movl	%eax, (%rcx)
	movl	(%rdi), %eax
	retq
                                        # -- End function

uses_bar:                               # @uses_bar
# %bb.0:
	subq	$16, %rsp
	movl	(%rdi), %eax
	movq	SIDE_EFFECT@GOTPCREL(%rip), %rcx
	movl	%eax, (%rcx)
	movq	%rdi, (%rsp)
	movl	(%rdi), %eax
	movl	$0, (%rdi)
	movq	%rsp, %rcx
	movq	%rcx, 8(%rsp)
	leaq	8(%rsp), %rcx
	#APP
	#NO_APP
	movq	(%rsp), %rcx
	movl	%eax, (%rcx)
	movl	(%rdi), %eax
	addq	$16, %rsp
	retq

which is basically showing uses_foo has been optimized not not call foo (since nothing is done with the return value), whereas uses_bar has inlined the call to bar. That can be made obvious by #[inline(never)]-ing foo and bar, then yielding:

uses_foo:                               # @uses_foo
# %bb.0:
	movl	(%rdi), %eax
	movq	SIDE_EFFECT@GOTPCREL(%rip), %rcx
	movl	%eax, (%rcx)
	movl	(%rdi), %eax
	retq
                                        # -- End function

uses_bar:                               # @uses_bar
# %bb.0:
	pushq	%rbx
	movq	%rdi, %rbx
	movl	(%rdi), %eax
	movq	SIDE_EFFECT@GOTPCREL(%rip), %rcx
	movl	%eax, (%rcx)
	callq	*bar@GOTPCREL(%rip)   # <--- FUNCTION CALL
	movl	(%rbx), %eax
	popq	%rbx
	retq