Passing mutable raw pointer into callback

In the context of programming with window callbacks on Window, I've fallen into the following situation:

I've a struct which I can control:

struct Foo {
  // some custom fields
}

and a callback function whose the content is also controllable, but its type is fixed. This function will be called multiple times (with different parameters) by the system. There is only a single moment where param is controllable, it is where msg is MSG_INIT.

unsafe extern "system" fn callback(msg: usize, param: isize) {
  // some custom implementation
}

In the function, I need access/modify an instance of Foo (actually some fields Foo are initialized using information that the system passes to the callback). I've come to the following code:


let mut foo = ...some Foo...

extern "system" fn callback(msg: usize, param: isize) {
  static STORED_FOO_ADDR: OnceCell<isize>

  match msg {
    // this message is passed first, once by the system
    // and this is the single moment where I can control the param,
    // param will be a raw pointer to foo
    MSG_INIT => {
      // store address of foo
      STORED_FOO_ADDR.set(param).unwrap();
      // then use it as a &mut Foo
      let foo = unsafe {
        (param as *mut Foo).as_mut().unwrap()
      };
      // use foo
      ...
    }

    MSG_READ => {
      // extract the raw pointer from the stored pointer 
      let foo = unsafe {
        (*STORED_FOO_ADDR.get_unchecked().unwrap() as *const Foo).as_ref().unwrap()
      };
      // and use foo as &Foo
      ...
    }

    MSG_WRITE => {
      // extract the raw pointer from the stored pointer
      let foo = unsafe {
        (*STORED_FOO_ADDR.get_unchecked().unwrap() as *mut Foo).as_mut().unwrap()
      };
     // and use foo a &mut Foo
      ...
    }
  }
}

// finally, use callback and foo
system_use_callback(callback, &mut foo as *mut _ as isize);

In summary, I've stored a raw pointer *mut Foo to foo, and cast back it as either &mut Foo or &Foo depending where I need to change the content of foo or not.

While the code runs, I'm still worry about it. When I cast the raw pointer as &mut Foo, then use it, the memory location of the owner may be changed.

MSG_WRITE => {
  // extract the raw pointer from the stored pointer
  let foo = unsafe {
    (*STORED_FOO_ADDR.get_unchecked().unwrap() as *mut Foo).as_mut().unwrap()
  };
  // and use foo as &mut Foo
  ...
}

The next time when I cast the raw pointer as &Foo,

MSG_READ => {
  // extract the raw pointer from the stored pointer
  let foo = unsafe {
    (*STORED_FOO_ADDR.get_unchecked().unwrap() as *const Foo).as_ref().unwrap()
  };
  // and use foo as &Foo
  ...
}

the program may crash, or worst I may have another object which has nothing in common.

Is this case possible. If yes, how can I refactoring the code?.

This is a problem that is very common in futures (although it's without FFI).

Have a look at Box::pin (or Pin in general), which will prevent Foo from moving.

I don't think the pin would help here. Pin doesn't provide/guarantee anything on its own in unsafe context, I guess it only adds up its own UB condition into already error prone ffi code.

2 Likes

Does param need to be an actual pointer, or can it be an index into a static Mutex<Vec<Foo>> instead (or similar)?

This is what I think I would do for such a problem.

use lazy_static::lazy_static; // 1.4.0
use std::sync::{Arc, RwLock, Mutex, Once};

struct Foo(u32);
impl Foo {
    pub fn do_a(&self) -> u32 { self.0 }
    pub fn do_b(&mut self) { self.0 += 1; }
}

extern "system" fn callback(msg: u32, param: usize) {
    static FOO_INIT: Once = Once::new();
    lazy_static!{
        static ref FOO: RwLock<Option<Arc<RwLock<Foo>>>> = RwLock::new(None);
    }
    FOO_INIT.call_once(|| {
        let mut w = FOO.write().unwrap();
        w.replace(unsafe { Arc::from_raw(param as *const _) });
    });
    let foo = FOO.read().unwrap();
    let foo = foo.as_ref().unwrap();

    match msg {
        1 => {
            let r = foo.read().unwrap();
            dbg!(r.do_a());
        },
        2 => {
            let mut w = foo.write().unwrap();
            w.do_b();
        }
        _ => {},
    }
}

fn main() {
    let foo = Arc::new(RwLock::new(Foo(0)));
    let foo = unsafe { Arc::into_raw(foo) } as usize;
    callback(1, foo);
    callback(2, 0);
    callback(1, 0);
}

2 Likes

It takes time to digest your solution, thank you. As far as I understand, you've used :

  • Arc so that the instance of Foo has fixed memory location, and

  • RwLock to get mutable reference to the instance

Is it possible to avoid RwLock (and accept some unsafe), since there is only single thread here?

If only callback needs access to Foo, you can use Box instead of Arc.
RefCell can be used instead of RwLock for interior mutability in single threaded programs.

1 Like

This simplifies the problem by avoiding a lot of potential problematic situations that arise in multi-threaded code. So I'd suggest the following code:

use ::core::{cell::RefCell, mem};

#[allow(nonstandard_style)]
mod Msg {
    pub type T = u32;
    pub const Init: T = /* ... */ 0;
    pub const Read: T = 1;
    pub const Write: T = 2;
}

trait OnCallback : 'static {
    fn on_read (self: &'_ Self);
    fn on_write (self: &'_ mut Self);
}

unsafe // Safety: if `msg == MSG_INIT`, param must be `Box<T>`
extern "system"
fn callback<T : OnCallback> (msg: u32, param: usize)
{
    thread_local! {
        static STATE: RefCell<Option<Box<dyn OnCallback>>> = None.into();
    }
    STATE.with(|state| match msg {
        | Msg::Init => {
            state.borrow_mut().replace(unsafe {
                // Safety: contract of the function
                mem::transmute::<_, Box<T>>(param)
            });
        },
        | Msg::Read => {
            state.borrow().as_ref().unwrap().on_read();
        },
        | Msg::Write => {
            state.borrow_mut().as_mut().unwrap().on_write();
        },
        | _ => unreachable!(),
    });
}

fn split_for_cb<T : OnCallback> (it: T)
  -> (
        unsafe extern "system" fn(msg: u32, param: usize),
        usize,
    )
{
    (
        callback::<T>,
        Box::into_raw(Box::new(it)) as usize,
    )
}

impl OnCallback for i32 {
    fn on_read (self: &'_ i32)
    {
        println!("{}", *self);
    }

    fn on_write (self: &'_ mut i32)
    {
        *self += 27;
    }
}
fn main ()
{
    let foo = 42_i32;
    let (cb, param) = split_for_cb(foo);
    unsafe {
        cb(Msg::Init, param);
        cb(Msg::Read, 0);
        cb(Msg::Write, 0);
        cb(Msg::Read, 0);
    }
}
1 Like

Thank you @Yandros. Since I need also to use Foo outside the callback, then Arc should be used (instead of Box)?

Yes, Arc<RwLock<Foo>> or Rc<RefCell<Foo>> would be needed (and tweak OnCallback to take into account the fact that the pointee is now always shared:

trait OnCallback : 'static {
    fn on_read (self: &'_ Self);
-   fn on_write (self: &'_ mut Self);
+   fn on_write (self: &'_ Self);
}

// ...
- unsafe // Safety: if `msg == MSG_INIT`, param must be `Box<T>`
+ unsafe // Safety: if `msg == MSG_INIT`, param must be `Rc<T>`

-   STATE: RefCell<Option<Box<dyn OnCallback>>> ...
+   STATE: RefCell<Option<Rc<dyn OnCallback>>> ...
// ...
        | Msg::Write => {
-           state.borrow_mut().as_mut().unwrap().on_write();
+           state.borrow().as_ref().unwrap().on_write();

and

impl OnCallback for RefCell<i32> {
    fn on_read (self: &'_ RefCell<i32>)
    {
        println!("{}", *self.borrow());
    }

    fn on_write (self: &'_ RefCell<i32>)
    {
        *self.borrow_mut() += 27;
    }
}

fn split_for_cb<T : OnCallback> (it: Rc<T>)
  -> (
        unsafe extern "system" fn(msg: u32, param: usize),
        usize,
    )
{
    (
        callback::<T>,
        Rc::into_raw(it) as usize,
    )
}

fn main ()
{
    let foo = Rc::new(RefCell::new(42_i32));
    let (cb, param) = split_for_cb(Rc::clone(&foo));
  • Having on_write() take a &Self may look surprising but at least it lets the type decide how it wants to handle the shared mutability (e.g., in the example with i32, I could have optimized the code into using a Cell rather than a RefCell since I know I would only be using an integer). Or more generally, it would allow to have more fine-grained shared mutability.

  • But you can just keep the trait "intuitive" and have on_write(&mut self), by always wrapping the "thing" (e.g. i32): adapt the cb and the static to also carry an internal RefCell (i.e., ... RefCell<Option<Rc<RefCell<T>>>>), and then use .borrow().as_ref().unwrap().borrow_mut().on_write() on the Msg::Write branch.

All this ends up being a bit more cumbersome, but the resulting code ends up being far more robust (the only place able to trigger UB is if someone calls the cb with Msg::Init and a param that is not a Rc::into_raw pointer).

2 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.