Safer_ffi question: how to make c_char into char rather than int8_t

Apologies for the potential mis-targeting of this question. I know some safer_ffi folks are here. Sorry to everybody else.

I am wondering how I get a C header that looks like this:

typedef struct callbacks {
    size_t (*display)(void const *, char *, size_t);
} callbacks_t;

as opposed to this:

typedef struct callbacks {
    size_t (*display)(void const *, int8_t *, size_t);
} callbacks_t;

Here is the Rust code:

#[ffi_export]
#[derive_ReprC]
#[repr(C)]
pub struct callbacks {
    display: extern "C" fn(*const c_void, *mut c_char, usize) -> usize,
}

Unfortunately

#[ffi_export]
#[derive_ReprC]
#[repr(C)]
pub struct callbacks {
    display: extern "C" fn(*const c_void, safer_ffi::char_p::char_p_raw, usize) -> usize,
}

produces a const buffer pointer

typedef struct callbacks {
    size_t (*display)(void const *, char const *, size_t);
} callbacks_t;

Thanks for any ideas.

BTW: it's amazing that, other than this one thing, I've managed to completely eliminate all unsafe from a C API with a lot of surface area using safer_ffi. (I know it's still a free-for-all on the C side, but at least Rust won't do anything it shouldn't.)

For the first part, I suspect it's not possible. The signedness of char is not fixed. On some platforms char is signed, on others it's unsigned. That sort of thing is not tolerable in Rust, so c_char has to be either int8 or uint8. Apparently the author went with signed.

But I probably misunderstood the question.
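To make the signedness point concrete, here is a minimal sketch using only the standard library (std::ffi::c_char, stable since Rust 1.64). The helper names c_char_is_signed and c_char_size are made up for illustration. It only reports what the current target does; it says nothing about what safer_ffi emits in headers.

```rust
use std::ffi::c_char;

/// Returns true when `c_char` is a signed 8-bit integer on the current target.
/// (Hypothetical helper name, for illustration only.)
pub fn c_char_is_signed() -> bool {
    // `c_char` is a type alias for either `i8` or `u8`,
    // so its minimum value is either -128 or 0.
    (c_char::MIN as i16) < 0
}

/// Size of `c_char` in bytes; it is one byte on every target Rust supports.
pub fn c_char_size() -> usize {
    std::mem::size_of::<c_char>()
}
```

Calling c_char_is_signed() typically gives a different answer on x86_64 Linux than on aarch64 Linux, which is why a header generator that only sees the underlying integer type cannot recover plain char.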


You are likely right. Except I am confused by the fact that using the safer_ffi const char* abstraction (char_p_raw in safer_ffi::char_p) produces char const *. I just can't find a way to make it non-const.

Rust's char is a 4-byte Unicode code point. C's char is a single byte. It's not correct to pass a pointer to one when the other was expected. (You can't read an array of 4-byte code points and get the same result as if you read an array of bytes, nor does it work vice versa.)
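A small sketch of that size mismatch, using only the standard library. The function byte_lengths is a hypothetical helper, not part of any crate mentioned in this thread.

```rust
use std::mem::size_of;

/// For a given text, returns (bytes needed as UTF-8, bytes needed as an
/// array of Rust `char`s). Hypothetical helper, for illustration only.
pub fn byte_lengths(text: &str) -> (usize, usize) {
    // `str::len` counts UTF-8 bytes, which is what a C `char *` would see.
    let as_utf8 = text.len();
    // Every Rust `char` occupies 4 bytes regardless of the code point.
    let as_chars = text.chars().count() * size_of::<char>();
    (as_utf8, as_chars)
}
```

For the five-character string "h\u{e9}llo", the UTF-8 form is 6 bytes while the same text stored as Rust chars takes 20 bytes, so the two layouts cannot be read interchangeably.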


You can leave it as const. Constness of raw pointers in Rust basically doesn't matter except for variance when they are inside a user-defined type. You can just cast the const pointer to a mut pointer when needed.
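A minimal sketch of that cast, using only the standard library. The callback name display mirrors the struct field in the question, and fill is a hypothetical caller added for illustration. The sketch assumes the pointer really was derived from writable memory; writing through a pointer that originated from a shared reference would be undefined behaviour.

```rust
use std::ffi::{c_char, c_void};

/// Callback whose signature matches a header declaring `char const *`.
/// The caller must guarantee that `buf` points to `len` writable bytes.
pub unsafe extern "C" fn display(
    _ctx: *const c_void,
    buf: *const c_char,
    len: usize,
) -> usize {
    let text = b"hello";
    let n = text.len().min(len);
    // Cast away constness; for raw pointers this is a plain `as` cast.
    let out = buf as *mut c_char;
    // SAFETY: the caller promised `len` writable bytes behind `buf`,
    // and `n <= len`, `n <= text.len()`.
    unsafe { std::ptr::copy_nonoverlapping(text.as_ptr().cast::<c_char>(), out, n) };
    n
}

/// Hypothetical safe caller: hands a mutable buffer to the callback.
pub fn fill(buf: &mut [u8]) -> usize {
    // The pointer is derived from `&mut [u8]`, so writing through it is allowed.
    let ptr = buf.as_mut_ptr() as *const c_char;
    // SAFETY: `ptr` is valid for writes of `buf.len()` bytes.
    unsafe { display(std::ptr::null(), ptr, buf.len()) }
}
```

This keeps the Rust side working, but it does nothing about the generated C header still advertising a const pointer.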


I'm not using Rust's char primitive anywhere. On the Rust side, I'm using c_char from std::ffi. I'm wondering about the safer_ffi header generation process that translates that into a C header, where I'm hoping I can get it to use char.

Everything is happy on the Rust side. But when the C header says the callback takes const it causes the C compiler to throw a warning - not to mention being a bad experience to tell users to write to a pointer with a const type.


I don't see a way to do it either.

I get the impression that safer_ffi is meant to be C-calls-Rust-in-very-C-ish-ways. Caller allocates, like what I think you're doing, is nearly always the right choice but that is not very C-ish (think strdup). They even have an example (string_concat) that's as classic C as it could possibly be. (As a side note, the drop call in string_concat::free_string is unsafe.)

You could try making a contribution to the library. Adding char_p_mut looks like the work would be a lot of copy-then-paste-then-small-modifications.


Perhaps I don't understand the philosophy of safer_ffi. I was hoping to attention-fish @Yandros or somebody else who has familiarity with the particulars and the vision of safer_ffi.

I opened an issue: "Need a way to get a mutable char* argument in a callback function" (Issue #162, getditto/safer_ffi on GitHub). I'd be happy to code up something like char_p_mut as you suggest, but I want to know if the PR would be accepted before I do it.

Thanks for all the thoughts & replies.


Sorry for the late reply, I didn't see this thread at the time (btw, anybody is welcome to directly ping me in this forum when talking about safer-ffi 🙂); and while I had afterwards noticed the GH issue, I have been too busy to properly tackle it (I suspect I may have to add extra raw APIs to attend to your current need).

Context

A few remarks to set some context:

  • the c_char of the stdlib/libc, much like c_int, and much unlike usize, is a type alias for another primitive type depending on the platform. I dislike such a design quite a bit, to be honest: imagine if we had had type usize = u32/u64; depending on the platform!

  • safer-ffi thus defines an internal c_char new type (new struct), which is a transparent wrapper around an 8-bit integer, but which, when translated to C headers, yields char rather than int8_t or uint8_t (but byte for C#).

  • safer-ffi emits C-compatible headers of otherwise "Rust idioms". That is, for instance, it considers that "stuff" is either:

    • owned (e.g., heap-allocated),
    • borrowed exclusively-and-thus-mutably,
    • or borrowed in a shared-fashion and "thus"[1] immutably.

Hence, for instance, the existence of c_slice::{Ref,Mut,Box}<u8> to represent, respectively, &[u8], &mut [u8], and Box<[u8]>.

However, in the case of str, both in the Rusty wide pointer case (str::Ref and str::Box respectively), and in the C thin pointer case (char_p::Ref and char_p::Box respectively), I have semi-deliberately omitted the &mut case.

  • The reason for this is that these types represent UTF-8 encoded strings (with the extra constraint of the NUL terminator in the char_p cases), and mutating these is footgun-prone, given how (Rust) chars / Unicode code points have varying-width UTF-8 encodings.

    • For instance, in Rust, seeing a &mut str is extremely rare.

But I could get behind the idea of the C user only wanting to deal with ASCII strings, which, a fortiori, are UTF-8 strings, and mutable access to ASCII strings is less error-prone (e.g., a &mut [AsciiByte] kind of API).

Or, unsafely, exposing a &mut str much like Rust's stdlib does.
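The footgun described above can be sketched with the standard library alone. Both function names below are hypothetical helpers for illustration; neither is part of safer-ffi.

```rust
/// Overwrites a single byte of `text` (in a copy) and reports whether the
/// result is still valid UTF-8. Hypothetical helper, for illustration only.
pub fn overwrite_and_check(text: &str, index: usize, byte: u8) -> bool {
    let mut bytes = text.as_bytes().to_vec();
    bytes[index] = byte;
    std::str::from_utf8(&bytes).is_ok()
}

/// ASCII-only in-place mutation: safe on `&mut str` because it never
/// touches the bytes of multi-byte code points.
pub fn shout(text: &mut str) {
    text.make_ascii_uppercase();
}
```

With the input "h\u{e9}llo", replacing byte 0 keeps the string valid, whereas replacing byte 1 (the first half of the two-byte encoding of U+00E9) leaves a stray continuation byte and the buffer is no longer UTF-8. The ASCII-only shout leaves the non-ASCII character untouched, which is the kind of restricted mutation an AsciiByte-style API would allow.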


To answer the OP

Be that as it may, my answer to the OP would thus be:

  • usually, *mut c_char is used to represent an owned C string, and in safer-ffi, this would be a char_p::Box;

    • I could also get behind the idea of defining a char_p::Malloc type, so that release of the pointer is done with free rather than with GlobalAlloc::dealloc(ptr, strlen(p)) (which is what char_p::Box does, and which may not match free unless the GlobalAlloc is registered as malloc/free).

      #[derive_ReprC]
      #[repr(transparent)]
      pub struct char_p_malloc(
          ptr::NonNullOwned<::safer_ffi::c_char>,
      );
      
      impl char_p_malloc {
          pub fn new(s: &str) -> Result<Self, InvalidNulTerminator> {
              if … // check against truncation
              let ptr =
                  // `strndup` expects a `*const c_char`, hence the cast from `*const u8`.
                  ptr::NonNull::new(unsafe { ::libc::strndup(s.as_ptr().cast(), s.len()) })
                      .expect("`strndup` returned NULL")
              ;
              Ok(Self(ptr::NonNullOwned(ptr, PhantomData)))
          }
      }
      
      // and/or `Deref`.
      impl char_p_malloc {
          pub fn as_ref(&self) -> char_p::Ref<'_> {
              char_p::Ref((*self.0).into())
          }
      }
      
      impl Drop for char_p_malloc {
          fn drop(&mut self) {
              unsafe { ::libc::free(self.0.as_mut_ptr().cast()) }
          }
      }
      
  • but if you do need a C equivalent of *mut c_char (which correctly uses char in the C headers), but which does not represent ownership over the given C string, then I could expose a low-level/unsafe type to do so;

    • (or this AsciiByte abstraction idea)

For instance, I should very much expose the aforementioned c_char type that safer-ffi internally uses (so that, worst case, with some unsafe, you can roll your own *mut safer_ffi::c_char yourself).
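As a rough sketch of what "rolling your own" could look like, the following uses only the standard library. CChar is a hypothetical stand-in for safer-ffi's internal c_char newtype, not its actual definition, and Callbacks, write_ok and demo are likewise made-up names. Header generation is not shown, and whether safer-ffi's derive would accept such a type is not claimed here.

```rust
use std::ffi::c_void;

/// Hypothetical stand-in for a `c_char` newtype: a transparent wrapper
/// around one byte, so it has exactly the layout of `u8`.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CChar(pub u8);

/// Callback table shaped like the one in the question, but taking a
/// mutable pointer to the newtype instead of `*mut std::ffi::c_char`.
#[repr(C)]
pub struct Callbacks {
    pub display: unsafe extern "C" fn(*const c_void, *mut CChar, usize) -> usize,
}

/// Example callback: writes "ok" (truncated to `len`) into the buffer.
unsafe extern "C" fn write_ok(_ctx: *const c_void, buf: *mut CChar, len: usize) -> usize {
    let msg = b"ok";
    let n = msg.len().min(len);
    for (i, &b) in msg[..n].iter().enumerate() {
        // SAFETY: the caller guarantees `len` writable elements and `i < n <= len`.
        unsafe { buf.add(i).write(CChar(b)) };
    }
    n
}

/// Hypothetical caller: passes a byte buffer through the callback table.
pub fn demo(buf: &mut [u8]) -> usize {
    let cbs = Callbacks { display: write_ok };
    // `repr(transparent)` makes `*mut u8` -> `*mut CChar` a layout-preserving cast.
    let ptr = buf.as_mut_ptr().cast::<CChar>();
    // SAFETY: `ptr` is valid for writes of `buf.len()` one-byte elements.
    unsafe { (cbs.display)(std::ptr::null(), ptr, buf.len()) }
}
```

The unsafe blocks are exactly the cost mentioned above: the newtype fixes the C-side spelling in principle, but the pointer itself stays raw.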


Aside about char_p::Raw

  • By the way, char_p::Raw is not intended to be used as a *mut c_char, but rather, as a char_p::Ref<'erased>, since for technical reasons[2], extern "C" fn (char_p::Ref<'_>) does not implement ReprC despite the impl<A: …> ReprC for extern "C" fn(A).

    Thence the need to internally use unsafe extern "C" fn(char_p::Raw), which could soundly be called with a s: char_p::Ref<'_> by doing s.into().

    Nowadays, though, the expected way to handle this limitation is to write:

    #[derive_ReprC]
    #[repr(transparent)]
    pub struct MyHigherOrderCb /* = */ {
        pub c_fn: extern "C" fn(char_p::Ref<'_>),
    }
    

    to let the derive manually implement ReprC for this newtype wrapper type (thereby removing the need for unsafe).


  1. interior/shared mutability is not within the design of primitives; worst case scenario somebody could end up with a *const for a shared-but-mutable thing. In that case, the C usage ought to know they can cast the const-ness away, while sending the "needs attention" signal about this not being mutable due to exclusive access, so that some care ought to be taken ↩︎

  2. there exists no single type A which matches char_p::Ref<'any> as 'any varies. ↩︎

