Maybe this is a problem of converting `RwLock<()>` into compile-time checks

I don't know how to pick the correct title, since the question is a little bit complex.

The well-known statistical software R has a mechanism for dealing with FFI: you can allocate R objects in your FFI code, but when you communicate with R, you must ensure that all allocated objects are protected. Since R is mainly a single-threaded interpreter, there is no need to consider multi-threading problems:

#[no_mangle] // exporting R code, *mut u8 is actually a `SEXP` opaque type.
extern fn foo(_protected_elem:*mut u8)->*mut u8 { // the return value should not be protected.
    let unprotected = unsafe {Rf_alloc(...)}; // alloc objects, may trigger R gc and then recycle all unnecessary objects.
    // let _ = unsafe {Rf_alloc(...)}; // this alloc might trigger R gc and thus invalidate `unprotected`.
    let protected = unsafe {PROTECT(unprotected)}; // protect `unprotected`.
    let unprotected = unsafe {Rf_alloc(...)}; // this alloc is fine since the allocated object is protected.
}

To avoid calling Rf_alloc carelessly by accident, code like this works:

struct Guard;
impl Guard {
    fn alloc<'a>(&'a mut self)->Unprotected<'a>{...}
    fn protect<'a>(item: Unprotected<'a>)->Protected<*mut u8> {...}
    fn return<'a>(item: Protected<*mut u8>)->*mut u8 {...} // pseudocode: `return` is a keyword, so real code needs a name such as `r#return`
}
#[no_mangle] // exporting R code, *mut u8 is actually a `SEXP` opaque type.
extern fn foo(_protected_elem:*mut u8)->*mut u8 { // the return value should not be protected.
    let mut guard = Guard;
    let unprotected = guard.alloc(...); // alloc objects, may trigger R gc and then recycle all unnecessary objects.
    // let _ = guard.alloc(...); // cannot borrow guard since guard is borrowed
    let protected = Guard::protect(unprotected); // protect `unprotected`, thus the borrow of guard is released
    let unprotected = guard.alloc(...); // this alloc is fine since the allocated object is protected.
}
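The borrow trick above can be tried without R at all. The following is a minimal sketch under stated assumptions: `Handle` is a mock integer standing in for `SEXP`, and `Guard::new`, `Unprotected`, `Protected` and `demo` are hypothetical names that only illustrate the lifetime relationship; no real R call is made.

```rust
use std::marker::PhantomData;

/// Mock handle standing in for R's opaque `SEXP` pointer (assumption for this sketch).
pub type Handle = usize;

/// Hypothetical token: every allocation borrows it mutably.
pub struct Guard {
    next: Handle,
}

/// An unprotected value keeps the guard mutably borrowed for `'a`.
pub struct Unprotected<'a> {
    handle: Handle,
    _borrow: PhantomData<&'a mut Guard>,
}

/// A protected value no longer borrows the guard.
pub struct Protected {
    pub handle: Handle,
}

impl Guard {
    pub fn new() -> Self {
        Guard { next: 0 }
    }

    /// Mock allocation; real code would call into R here.
    pub fn alloc(&mut self) -> Unprotected<'_> {
        self.next += 1;
        Unprotected { handle: self.next, _borrow: PhantomData }
    }

    /// Consumes the unprotected value, which ends the borrow of the guard.
    /// Real code would call `PROTECT` here.
    pub fn protect(item: Unprotected<'_>) -> Protected {
        Protected { handle: item.handle }
    }
}

pub fn demo() -> (Handle, Handle) {
    let mut guard = Guard::new();
    let first = guard.alloc();
    // let second = guard.alloc(); // rejected (E0499): `guard` is still mutably borrowed by `first`
    let first = Guard::protect(first); // the borrow of `guard` ends here
    let second = guard.alloc(); // accepted: no unprotected value is alive
    let second = Guard::protect(second);
    (first.handle, second.handle)
}
```

The `PhantomData<&'a mut Guard>` field is what ties an unprotected value to the guard; nothing is stored at run time.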

My question is this: a Guard can be constructed many times, so two different guards can exist at the same time, and the protection may then fail:

#[no_mangle] // exporting R code, *mut u8 is actually a `SEXP` opaque type.
extern fn foo(_protected_elem:*mut u8)->*mut u8 { // the return value should not be protected.
    let mut guard = Guard;
    let unprotected = guard.alloc(...);
    let mut guard = Guard;
    let unprotected2 = guard.alloc(...); // unprotected might be recycled here
}

To avoid the duplication, maybe a global RwLock could be used:

pub static GUARD: RwLock<Guard> = RwLock::new(Guard{_marker:()}); // use a marker to ensure `Guard` cannot be constructed in any other way
// ...
let mut guard=GUARD.write().unwrap(); // enough.

Here the guard can only be initialized once, so all the unprotected results are handled correctly. The cost is that we have to lock the RwLock in every function, possibly several times if another exported function is called:

#[no_mangle]
extern fn foo()->*mut u8{
    let mut guard=GUARD.write().unwrap();
    // do something
    drop(guard); // since bar needs the guard
    let sexp = bar();
    let mut guard=GUARD.write().unwrap();
    // do something
    Guard::return(sexp)
}
#[no_mangle]
extern fn bar()->*mut u8{
    let mut guard=GUARD.write().unwrap();
    // do something
    Guard::return(...)
}

Here a new problem occurs: if the guard is not dropped before bar is called, bar will deadlock or panic, because the write lock is already held on the same thread.
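The nested-locking hazard can be observed safely with `try_write`, which reports a held lock instead of blocking. This is only a sketch: `DEMO_LOCK`, `inner`, `nested_call_while_locked` and `nested_call_after_drop` are hypothetical names, and a plain `RwLock<()>` stands in for the guard.

```rust
use std::sync::{RwLock, TryLockError};

// Stand-in for the global guard lock (hypothetical name).
static DEMO_LOCK: RwLock<()> = RwLock::new(());

/// Plays the role of `bar`: reports whether the lock was free when it was called.
fn inner() -> bool {
    // The temporary write guard, if one is obtained, is released at the end of this statement.
    !matches!(DEMO_LOCK.try_write(), Err(TryLockError::WouldBlock))
}

/// Plays the role of a `foo` that forgets to drop its guard before calling `bar`.
pub fn nested_call_while_locked() -> bool {
    let _held = DEMO_LOCK.write().unwrap();
    inner() // the lock is still held, so `inner` sees it as busy
}

/// Plays the role of a `foo` that drops its guard first.
pub fn nested_call_after_drop() -> bool {
    let held = DEMO_LOCK.write().unwrap();
    drop(held);
    inner() // the lock is free again
}
```

With a blocking `write()` in place of `try_write()`, the first case would not return normally on a single thread.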

Maybe we could write something like

#[no_mangle]
extern fn bar(mut guard:Guard)->*mut u8{
    // do something
    Guard::return(...)
}

but I'm not sure whether it is valid to put a ZST directly in the FFI signature. (Moreover, Guard is not FFI-safe.)
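One way to keep the guard out of the FFI signature, sketched here under assumptions, is to split each exported function into a thin wrapper and a plain Rust helper that borrows the caller's guard. The names `foo_impl` and `bar_impl`, the mock `Handle` type and the counter inside `Guard` are all hypothetical; the sketch also assumes that Rust code always calls the `*_impl` helpers and never the exported wrappers.

```rust
/// Mock handle standing in for `SEXP` (assumption for this sketch).
pub type Handle = usize;

/// Hypothetical guard; the counter only makes the mock observable.
pub struct Guard {
    allocated: usize,
}

impl Guard {
    fn alloc(&mut self) -> Handle {
        self.allocated += 1;
        self.allocated
    }
}

/// Plain Rust helper: borrows the caller's guard and never crosses the FFI boundary.
fn bar_impl(guard: &mut Guard) -> Handle {
    guard.alloc()
}

/// Plain Rust helper that calls `bar_impl` with the same guard: no lock, no drop.
fn foo_impl(guard: &mut Guard) -> Handle {
    let _before = guard.alloc();
    let _from_bar = bar_impl(guard);
    guard.alloc()
}

// Thin wrappers: the only places where a guard is created.
// Real code would also attach the export attribute used elsewhere in this thread.
pub extern "C" fn bar() -> Handle {
    let mut guard = Guard { allocated: 0 };
    bar_impl(&mut guard)
}

pub extern "C" fn foo() -> Handle {
    let mut guard = Guard { allocated: 0 };
    foo_impl(&mut guard)
}
```

This does not stop someone from creating a second guard by hand; it only removes the need to pass one through the exported signature.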

Is there a recommended way to perform such a borrow check?

This really looks like multiple problems to me, and the whole Guard system seems to be really ill-designed:

  • why is it at all possible to construct multiple Guards if there should only ever be a single instance of it?
  • if some functions require it, why can they currently be called without an instance of it?
  • the questions of threading vs. concurrency are being mixed up. You assert that the interpreter is single-threaded, yet you are using an RwLock, which can deadlock if acquired twice on a single thread, so it's not even right to use it to begin with. Concurrent (shared) mutation is troublesome even in the absence of threads, so this looks to me like it needs a complete ground-up redesign.

What are the actual requirements (from the R interpreter's side) on FFI functions? Do they need to be thread-safe? Re-entrant? Neither?

Because there has to be some path that yields a Guard, and I have no idea how to prevent that path from being taken multiple times.
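For reference, one possible way to make that path yield at most one live Guard is a per-thread flag. Note that this is a run-time check, not the compile-time check asked for, and it is only a sketch: `GUARD_TAKEN`, `Guard::acquire` and `demo` are hypothetical names.

```rust
use std::cell::Cell;
use std::marker::PhantomData;

thread_local! {
    // True while a `Guard` is alive on this thread (hypothetical flag).
    static GUARD_TAKEN: Cell<bool> = Cell::new(false);
}

/// Zero-sized token. The private field blocks construction outside this module,
/// and the raw-pointer marker keeps the token from being sent to another thread.
pub struct Guard {
    _not_send: PhantomData<*const ()>,
}

impl Guard {
    /// Returns `None` if a guard already exists on this thread.
    pub fn acquire() -> Option<Guard> {
        GUARD_TAKEN.with(|taken| {
            if taken.replace(true) {
                None
            } else {
                Some(Guard { _not_send: PhantomData })
            }
        })
    }
}

impl Drop for Guard {
    fn drop(&mut self) {
        // Releasing the guard makes `acquire` succeed again.
        GUARD_TAKEN.with(|taken| taken.set(false));
    }
}

pub fn demo() -> (bool, bool, bool) {
    let first = Guard::acquire();
    let second = Guard::acquire(); // refused: `first` is still alive
    let outcome = (first.is_some(), second.is_some());
    drop(first);
    let third = Guard::acquire(); // allowed again after the drop
    (outcome.0, outcome.1, third.is_some())
}
```

A plain `Cell<bool>` is enough here because the flag is per thread, so no lock is involved and a nested `acquire` returns `None` rather than blocking.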

The Guard struct is actually an abstract ZST that carries no data at all.
The restriction comes from the R documentation, not from the R FFI itself, so the Guard struct is not needed to call R functions.

Actually, there is no need to consider threading or concurrency problems. R is single-threaded software.

This is why I asked the question:)

R uses garbage collection, so before any R function is called (except protect), all variables in use must be protected. This is the only restriction.

Since R is mainly a single-threaded program, there is no need to consider multi-threading problems.
