Pinning memory to pass through the FFI

Hi!

I have run into the same issue on several occasions when calling C code from Rust. Some APIs take a pointer as input, and the function can return before the pointer is done being used, leaving it up to the user of the C API to ensure that the pointer is not invalidated in the meantime.

A made-up example of such a case would be the following asynchronous I/O API:

extern "C" {
    /// Queues `buf` for asynchronous writing and
    /// returns immediately.
    /// The memory location of buf must not change while
    /// it is being written.
    fn write(buf: *const u8, len: usize);
    /// Blocks until the asynchronous write is finished.
    fn block();
}

A Rust wrapper of that API would be:

use std::io::Write;

pub struct AsynchronousWriter {
    buffer: Vec<u8>,
}

impl Write for AsynchronousWriter {

    fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
        unsafe {
            block();
        }
        self.buffer.clear();
        // `copy_from_slice` would panic here: after `clear()` the lengths differ.
        self.buffer.extend_from_slice(buf);
        unsafe {
            write(self.buffer.as_ptr(), self.buffer.len());
        }
        Ok(buf.len())
    }

    fn flush(&mut self) -> std::io::Result<()> {
        unsafe {
            block();
        }
        Ok(())
    }
}

However, here we would like to guarantee that the memory location of the AsynchronousWriter.buffer contents cannot change. I thought of making it a Pin<Vec<u8>>, but [u8] is Unpin, which, if I understood correctly, means that I would not get any guarantee out of my Pin. So I am a bit confused: is Pin the right tool for the job? If yes, how would one use it? If not, how would one guarantee a stable memory location for pointers given to a C API in general?

In this particular case you should merely avoid touching the vector until the write succeeds. A vector already guarantees that its buffer does not move unless you call a method that causes it to reallocate.
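
To illustrate that point, here is a small sketch (the helper names are made up for the example). It checks that moving a Vec leaves its heap buffer where it is, and that clear() followed by pushes that stay within the existing capacity does not reallocate. Anything that grows the vector past its capacity may move the buffer, so that is what has to be avoided while the C side still holds the pointer.

```rust
/// Takes the vector by value (a move) and reports where its heap buffer is.
fn buffer_address_after_move(v: Vec<u8>) -> *const u8 {
    v.as_ptr()
}

/// Returns the buffer address seen before and after moving the `Vec`.
fn pointer_across_moves() -> (*const u8, *const u8) {
    let v: Vec<u8> = vec![1, 2, 3];
    let before = v.as_ptr();
    // Moving a `Vec` only copies its pointer, capacity and length;
    // the heap allocation itself is not relocated.
    let after = buffer_address_after_move(v);
    (before, after)
}

/// Returns true if clearing and refilling within capacity kept the buffer in place.
fn clear_keeps_allocation() -> bool {
    let mut v: Vec<u8> = Vec::with_capacity(16);
    v.push(1);
    let before = v.as_ptr();
    v.clear(); // does not change the allocated capacity
    for byte in 0..8u8 {
        v.push(byte); // capacity is sufficient, so no reallocation happens
    }
    before == v.as_ptr()
}
```

The pointers are only compared, never dereferenced, so no unsafe code is needed for the check.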

Good question; you have multiple approaches here. I'll also assume that if you deallocate the memory before calling block() it's UB as well.

Which yields the following constraints:

  • Memory in the heap

    It should not be moved (easy, because it is in the heap), nor deallocated before calling block(). Such things are easy to express as API restrictions. While Pin could help here, it's not needed, since we are not dealing with generic types. We just need our own wrapper type, and given the "dealloc" constraint, it will be an ownership-based API.

    use std::mem;

    pub
    struct AutoBlockBytes(Vec<u8>);
    
    pub
    fn write (bytes: Vec<u8>)
      -> AutoBlockBytes
    {
        let auto_block = AutoBlockBytes(bytes);
    
        impl Drop for AutoBlockBytes {
            fn drop (self: &'_ mut AutoBlockBytes)
            {
                unsafe { block(); }
            }
        }
    
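        // Calls the FFI `write` declared earlier (in real code, keep the FFI
        // declarations in their own module so they do not clash with this wrapper).
        unsafe {
            write(auto_block.0.as_ptr(), auto_block.0.len());
        }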
    
        auto_block
    }
    
    impl AutoBlockBytes {
        pub
        fn block (self: AutoBlockBytes)
          -> Vec<u8> // get the buffer back
        {
            unsafe { block(); }
            mem::take(&mut mem::ManuallyDrop::new(self).0)
        }
    }
    

    This would be the core primitive wrapping the unsafe / error-prone FFI functions into an impossible-to-misuse, and thus non-unsafe, API: even if the AutoBlockBytes is leaked so that block() is never called, the associated buffer is then never freed or touched again, hence avoiding any issue.

    • This API should not be touched, except, if you really want it, for a Deref impl and/or replacing the Vec<u8> with Box<[u8]>.

    From there, you could indeed feature your AsynchronousWriter, with some internal enum state holding either the Vec<u8> or the AutoBlockBytes, should you want to reuse a Vec while moving away from the ownership-based API for the sake of ergonomics.
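
    A rough sketch of what such an enum-based AsynchronousWriter could look like follows. To keep it runnable on its own, the C functions are replaced by hypothetical Rust stand-ins (ffi_write, ffi_block and the PENDING counter are assumptions of this example, not part of the real API), and the constructor is called start instead of write to avoid a name clash. Treat it as one possible shape, not the definitive implementation.

```rust
use std::io::{self, Write};
use std::mem;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical stand-ins for the C functions, so the sketch runs by itself.
// Real code would declare them in an `extern "C"` block instead.
static PENDING: AtomicUsize = AtomicUsize::new(0);
unsafe fn ffi_write(_buf: *const u8, _len: usize) {
    PENDING.fetch_add(1, Ordering::SeqCst);
}
unsafe fn ffi_block() {
    PENDING.store(0, Ordering::SeqCst);
}

/// Owns a buffer that the C side may still be reading.
pub struct AutoBlockBytes(Vec<u8>);

impl Drop for AutoBlockBytes {
    fn drop(&mut self) {
        unsafe { ffi_block() }
    }
}

impl AutoBlockBytes {
    /// Hands the buffer to the (mock) C side and keeps it alive.
    pub fn start(bytes: Vec<u8>) -> Self {
        let this = AutoBlockBytes(bytes);
        unsafe { ffi_write(this.0.as_ptr(), this.0.len()) }
        this
    }

    /// Waits for the write to finish and gives the buffer back.
    pub fn block(self) -> Vec<u8> {
        unsafe { ffi_block() }
        let mut this = mem::ManuallyDrop::new(self);
        mem::take(&mut this.0)
    }
}

enum State {
    Idle(Vec<u8>),
    Writing(AutoBlockBytes),
}

pub struct AsynchronousWriter {
    state: State,
}

impl AsynchronousWriter {
    pub fn new() -> Self {
        Self { state: State::Idle(Vec::new()) }
    }

    /// Waits for any pending write and returns the reusable buffer.
    fn reclaim(&mut self) -> Vec<u8> {
        match mem::replace(&mut self.state, State::Idle(Vec::new())) {
            State::Idle(buffer) => buffer,
            State::Writing(pending) => pending.block(),
        }
    }
}

impl Write for AsynchronousWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let mut buffer = self.reclaim();
        buffer.clear();
        buffer.extend_from_slice(buf); // may reallocate, but nothing is pending now
        self.state = State::Writing(AutoBlockBytes::start(buffer));
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        let buffer = self.reclaim();
        self.state = State::Idle(buffer);
        Ok(())
    }
}
```

    The buffer is only ever mutated in the Idle state, and dropping the writer while a write is pending goes through the Drop impl of AutoBlockBytes, which blocks first.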

  • Memory in the stack

    This one is a bit more subtle, since Drop glue cannot be relied on because of leaks. In that case, the solution is to use a scoped API:

    fn while_writing<R> (
        input: &'_ [u8],
        concurrent_code: impl FnOnce() -> R,
    ) -> R
    {
        unsafe {
            write(input.as_ptr(), input.len());
        }
        ::unwind_safe::with_state(())
            .try_eval(|&mut ()| concurrent_code())
            .finally(|()| unsafe { block(); })
    }
    
  • To be used as:

    while_writing(buf, /* do */ || {
        ...
    }) // <- point of "auto-flush" / block
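
    If you would rather not pull in an extra crate, a std-only variant can be sketched with a drop guard that lives entirely inside the function. This is a substitution for the unwind_safe-based version, not the same code: the guard never escapes to the caller, so it cannot be leaked, and it runs on normal return as well as on unwinding. The ffi_write / ffi_block functions and the BLOCK_CALLS counter are hypothetical stand-ins for the C API so that the example runs by itself.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical stand-ins for the C functions; `BLOCK_CALLS` only exists so
// that the behaviour can be observed in this example.
static BLOCK_CALLS: AtomicUsize = AtomicUsize::new(0);
unsafe fn ffi_write(_buf: *const u8, _len: usize) {}
unsafe fn ffi_block() {
    BLOCK_CALLS.fetch_add(1, Ordering::SeqCst);
}

pub fn while_writing<R>(input: &[u8], concurrent_code: impl FnOnce() -> R) -> R {
    // Local guard type: the caller never sees it, so it cannot be forgotten.
    struct BlockOnDrop;
    impl Drop for BlockOnDrop {
        fn drop(&mut self) {
            unsafe { ffi_block() }
        }
    }

    unsafe { ffi_write(input.as_ptr(), input.len()) }
    let _guard = BlockOnDrop;
    concurrent_code()
    // `_guard` is dropped here, whether the closure returned or panicked,
    // so `input` stays borrowed until the (mock) write has been waited on.
}
```

    Note that this relies on unwinding: with panic = "abort" the process ends instead, which also means the buffer is never observed in an invalid state.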
    

There is one caveat to both approaches, however: what about concurrent calls?

while_writing(&[42], || {
    while_writing(&[27], || {
         // here we have called `write(&[42]); write(&[27]);`
    }); // `block()`
}); // `block()`

If this is not okay, then neither the write(..) -> AutoBlockBytes function above, nor while_writing should be free functions; instead, they should be &mut self-based methods of a singleton (search for the singleton pattern for more info).
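
A minimal sketch of that singleton idea, under the assumption that only one write may be in flight at a time. The Writer type, its take constructor and the TAKEN flag are names invented for this example, and the C functions are again replaced by hypothetical stand-ins. Because while_writing takes &mut self, the nested call from the snippet above no longer compiles.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical stand-ins for the C functions, so the sketch runs by itself.
unsafe fn ffi_write(_buf: *const u8, _len: usize) {}
unsafe fn ffi_block() {}

static TAKEN: AtomicBool = AtomicBool::new(false);

/// Unique handle to the C writer: at most one exists at any time.
pub struct Writer {
    _private: (), // prevents construction outside of `take`
}

impl Writer {
    /// Returns the handle, or `None` while another one is still alive.
    pub fn take() -> Option<Writer> {
        if TAKEN.swap(true, Ordering::AcqRel) {
            None
        } else {
            Some(Writer { _private: () })
        }
    }

    pub fn while_writing<R>(
        &mut self,
        input: &[u8],
        concurrent_code: impl FnOnce() -> R,
    ) -> R {
        struct BlockOnDrop;
        impl Drop for BlockOnDrop {
            fn drop(&mut self) {
                unsafe { ffi_block() }
            }
        }

        unsafe { ffi_write(input.as_ptr(), input.len()) }
        let _guard = BlockOnDrop;
        concurrent_code()
    }
}

impl Drop for Writer {
    fn drop(&mut self) {
        // No write can be pending here: `while_writing` blocks before returning.
        TAKEN.store(false, Ordering::Release);
    }
}

// Rejected by the borrow checker, since `writer` is already mutably borrowed:
//
// writer.while_writing(&[42], || {
//     writer.while_writing(&[27], || {});
// });
```

The same &mut self treatment would apply to the ownership-based write(..) -> AutoBlockBytes function, with the returned value borrowing the singleton for as long as the write is pending.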
