Removing the libc calls

pub struct Jit64_Memory {
    addr: *mut u8,
    size: usize,
    offset: usize,}

impl Jit64_Memory {
    pub fn new(num_pages: usize) -> Jit64_Memory {
        let size: usize = num_pages * G.page_size;
        let addr: *mut u8;
        unsafe {
            let mut raw_addr: *mut libc::c_void = std::mem::uninitialized();
            libc::posix_memalign(&mut raw_addr, G.page_size, size); // allocate aligned to page size
            libc::mprotect(raw_addr, size, libc::PROT_READ | libc::PROT_WRITE); // read write
            libc::memset(raw_addr, 0xc3, size); // return addr everywhere
            addr = std::mem::transmute(raw_addr);}
        Jit64_Memory { addr, size, offset: 0}}}

G.page_size is 4096. Context: we are allocating some pages where we put raw x86_64 machine code then later make the page executable and run it.

Question: is there a way to do this without the libc dependency? I am on Linux x86_64 and do not care about other platforms.

Sure, you can just inline the PROT_ constants and function definitions from the libc crate.

Also note that your usage of mem::uninitialized() is UB, you might want to use MaybeUninit instead.

I'm not familiar with this. Is what you are suggesting equivalent to: (1) libc has a bunch of extern "C" defs, (2) copy over the ones I am using?

posix_memalign can be replaced with std::alloc::Global.alloc. memset can use slice::fill I think.
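
A rough sketch of that idea, assuming the stable free functions std::alloc::alloc and std::alloc::dealloc with a Layout rather than the unstable Global allocator. The names PAGE_SIZE and Pages are made up for illustration (PAGE_SIZE stands in for G.page_size). It fills the buffer with ptr::write_bytes instead of slice::fill, because forming a slice over freshly allocated, uninitialized bytes is questionable. This only covers allocation and filling; changing page protection still needs mprotect.

```rust
use std::alloc::{alloc, dealloc, Layout};

// Hypothetical stand-in for `G.page_size` from the original post.
const PAGE_SIZE: usize = 4096;

/// Page-aligned buffer that frees itself on drop (illustrative name).
struct Pages {
    addr: *mut u8,
    layout: Layout,
}

impl Pages {
    /// Allocates `num_pages` pages and fills them with 0xc3 (x86_64 `ret`).
    fn new(num_pages: usize) -> Option<Pages> {
        let size = num_pages.checked_mul(PAGE_SIZE)?;
        if size == 0 {
            return None; // `alloc` must not be called with a zero-sized layout
        }
        let layout = Layout::from_size_align(size, PAGE_SIZE).ok()?;
        // SAFETY: `layout` has a non-zero size.
        let addr = unsafe { alloc(layout) };
        if addr.is_null() {
            return None;
        }
        // SAFETY: `addr` is valid for writes of `size` bytes.
        unsafe { std::ptr::write_bytes(addr, 0xc3, size) };
        Some(Pages { addr, layout })
    }

    fn as_ptr(&self) -> *mut u8 {
        self.addr
    }

    fn len(&self) -> usize {
        self.layout.size()
    }
}

impl Drop for Pages {
    fn drop(&mut self) {
        // SAFETY: `addr` came from `alloc` with exactly this layout.
        unsafe { dealloc(self.addr, self.layout) };
    }
}
```

Carrying the Layout inside the struct matters because dealloc has to be given the same layout that was passed to alloc.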

Yes. Note that even on Linux, glibc and musl may use different definitions for the functions. You should also probably add a compile_error!() if the target doesn't match what you expect.
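
For illustration, here is roughly what inlining the declarations could look like. This is a sketch under assumptions: the PROT_ values are the ones Linux uses (1, 2, 4), the helper make_read_write is a made-up name, and the guard only checks for Linux (you could tighten it with target_arch or target_env). It relies on std already linking the system C library on Linux, so the symbols resolve without the libc crate. The `unsafe extern` form needs a reasonably recent compiler; on older ones a plain `extern "C"` block does the same job.

```rust
use std::os::raw::{c_int, c_void};

// Refuse to build anywhere these hand-copied definitions were not checked.
#[cfg(not(target_os = "linux"))]
compile_error!("hand-written bindings below were only checked against Linux");

// Values used by Linux; copied by hand instead of taken from the libc crate.
const PROT_READ: c_int = 1;
const PROT_WRITE: c_int = 2;
#[allow(dead_code)]
const PROT_EXEC: c_int = 4; // for the later "make it executable" step

unsafe extern "C" {
    fn posix_memalign(memptr: *mut *mut c_void, align: usize, size: usize) -> c_int;
    fn mprotect(addr: *mut c_void, len: usize, prot: c_int) -> c_int;
    fn free(ptr: *mut c_void);
}

/// Marks `len` bytes starting at `addr` readable and writable (illustrative helper).
/// Returns true on success.
///
/// # Safety
/// `addr` must be page-aligned and the range must belong to the caller.
unsafe fn make_read_write(addr: *mut c_void, len: usize) -> bool {
    // mprotect returns 0 on success and -1 on failure.
    unsafe { mprotect(addr, len, PROT_READ | PROT_WRITE) == 0 }
}
```

The trade-off is the one raised in this thread: nothing checks these signatures against the real C headers, so any mistake in them is yours to find.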

Does this fix the UB?

pub struct Jit64_Memory {
    addr: *mut u8,
    size: usize,
    offset: usize,}

impl Jit64_Memory {
    pub fn new(num_pages: usize) -> Jit64_Memory {
        let size: usize = num_pages * G.page_size;
        let addr: *mut u8;
        unsafe {
            let mut raw_addr: MaybeUninit<*mut libc::c_void> = std::mem::uninitialized();
            libc::posix_memalign(raw_addr.as_mut_ptr(), G.page_size, size); // allocate aligned to page size
            libc::mprotect(raw_addr.assume_init(), size, libc::PROT_READ | libc::PROT_WRITE); // read write
            libc::memset(raw_addr.assume_init(), 0xc3, size); // return addr everywhere
            addr = std::mem::transmute(raw_addr);}
        Jit64_Memory { addr, size, offset: 0}}}

(not sure if the usage of assume_init and as_mut_ptr are correct)

Given that libc appears to just be wrappers, is there any real benefit for me in removing libc as a dependency? It seems I would just duplicate the work of writing bindings, which is error prone.

Rust's allocators don't expose mprotect, though, and a JIT needs that.

I was about to ask you that! I don't think libc is costing you much, just a little bit of build time. And if your program as a whole uses many libraries, it's quite likely one of them will pull in libc anyway. So unless you have strong evidence to the contrary, I'd say it's not worth avoiding libc.

Nope: from what I remember, std::mem::uninitialized is always insta-UB, no matter the context.

If you wanted to use MaybeUninit, you should write

let mut raw_addr: MaybeUninit<*mut libc::c_void> = MaybeUninit::uninit();
libc::posix_memalign(raw_addr.as_mut_ptr(), G.page_size, size);
// Also needs error handling for `posix_memalign`, etc...
let raw_addr = raw_addr.assume_init();

However, reading the documentation of posix_memalign, I think the initial value of raw_addr does not matter, so a simple

let mut raw_addr: *mut libc::c_void = std::ptr::null_mut();
libc::posix_memalign(&mut raw_addr, ...);

should be good?
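
A sketch of the null-initialized variant with the error handling filled in. The function name alloc_aligned is hypothetical, and posix_memalign and free are declared by hand here only to keep the snippet self-contained; with the libc crate you would call libc::posix_memalign instead. posix_memalign returns 0 on success and an error number otherwise, so the pointer is only looked at after that check.

```rust
use std::os::raw::{c_int, c_void};

unsafe extern "C" {
    fn posix_memalign(memptr: *mut *mut c_void, align: usize, size: usize) -> c_int;
    #[allow(dead_code)]
    fn free(ptr: *mut c_void);
}

/// Allocates `size` bytes aligned to `align` (illustrative helper).
/// On failure returns the error number reported by `posix_memalign`.
fn alloc_aligned(align: usize, size: usize) -> Result<*mut u8, c_int> {
    // Starting from null means there is never an uninitialized value around.
    let mut raw_addr: *mut c_void = std::ptr::null_mut();
    // SAFETY: `&mut raw_addr` is a valid place for the callee to store a pointer.
    let rc = unsafe { posix_memalign(&mut raw_addr, align, size) };
    if rc != 0 {
        return Err(rc); // do not touch `raw_addr` on failure
    }
    Ok(raw_addr as *mut u8)
}
```

The returned pointer is later released with free, cast back to *mut c_void.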

As far as I know mem::uninitialized() is fine when undef is a valid value for the type. MaybeUninit is one of the few types for which undef is a valid value.

Ohhh, sweet

std::mem::uninitialized is still deprecated though, so even in this case where it'd be sound I'd prefer MaybeUninit::uninit.

Thanks for everyone's explanations. Looks like "UB awareness" is learned one scar/skeleton at a time. 🙂

It is the only one.

While it is valid, I don't see why you need it. MaybeUninit exists to skip initialization that is expensive but redundant. Here, you can just initialize the pointer with null.

Also, the usual pattern is to initialize the pointer, then shadow the MaybeUninit with the .assume_init()ed value, so you don't need multiple .assume_init(). For example:

        let addr = unsafe {
            let mut raw_addr = MaybeUninit::uninit();
            libc::posix_memalign(raw_addr.as_mut_ptr(), G.page_size, size); // allocate aligned to page size
            let raw_addr = raw_addr.assume_init(); // shadow the MaybeUninit with the initialized pointer
            libc::mprotect(raw_addr, size, libc::PROT_READ | libc::PROT_WRITE); // read write
            libc::memset(raw_addr, 0xc3, size); // fill with `ret` (0xc3) everywhere
            raw_addr as *mut u8
        };

I agree that in this particular case, we can just use std::ptr::null_mut(). I have been playing around with MaybeUninit as I'm trying to understand its power/limitations in full.

That's not entirely correct, there are other types where undef is a valid value:

union MyType { // basically how `MaybeUninit` is actually defined
    a: (),
    b: &'static str,
}

MaybeUninit::<MyType>::uninit().assume_init(); // valid since MyType::a doesn't need any valid bytes
MaybeUninit::<[MaybeUninit<u8>; 5]>::uninit().assume_init(); // valid since every byte is wrapped in `MaybeUninit`
MaybeUninit::<()>::uninit().assume_init(); // valid since zero-size types have no bytes that could be uninit
// Important exception:
// MaybeUninit::<!>::uninit().assume_init(); // not valid even though `!` is a zero-sized type, it is uninhabited -- it's not valid to ever create a value of an uninhabited type through any means.
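
To show where the array case is actually useful, here is a small sketch of element-by-element initialization. The function name squares and the choice of u32 are made up for illustration. The outer assume_init is fine because an array of MaybeUninit needs no initialized bytes; the final transmute is only sound because every element has been written by then.

```rust
use std::mem::MaybeUninit;

/// Builds [0, 1, 4, 9, 16] without first zeroing the array (illustrative).
fn squares() -> [u32; 5] {
    // SAFETY: an array of `MaybeUninit<u32>` is valid with all bytes uninitialized.
    let mut buf: [MaybeUninit<u32>; 5] = unsafe { MaybeUninit::uninit().assume_init() };

    for (i, slot) in buf.iter_mut().enumerate() {
        slot.write((i * i) as u32); // writing never reads the old (uninit) value
    }

    // SAFETY: every element was written above, and `MaybeUninit<u32>` has the
    // same size and layout as `u32`.
    unsafe { std::mem::transmute::<[MaybeUninit<u32>; 5], [u32; 5]>(buf) }
}
```

For five integers this buys nothing over `[0u32; 5]`; the pattern pays off for large buffers or element types that are expensive to construct twice.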