Need Help Translating C into Rust

Hello, I am working on a little project trying to rewrite the MINIX 3 scheduler in Rust and I need help.

I know that ideally using raw pointers should be avoided, but I am not sure how to do that here.
First, obviously there are no C-like macros, so I replaced all #ifdefs with constants. Another issue is that there is no standard library on MINIX yet, so things like assert! won’t work. I am working on getting the standard library to work there, but no luck yet.

All structs and typedefs from C were converted using rust-bindgen and placed into a separate mod bindings.

Here is one of the functions of the scheduling algorithm in C: https://pastebin.com/Wt55h1Sx
Here is what I have so far in Rust: https://pastebin.com/40p4YaA9

I would appreciate any help!

Thanks,
Andrew

The code is using a linked list. Linked lists require shared mutable ownership, which is risky and not allowed by the borrow checker, so you’ll have to use either unsafe raw pointers or Rc<RefCell<…>> for references to list elements.

Or better, change the algorithm. Get rid of the linked lists completely and use some other container instead, perhaps VecDeque or BinaryHeap.
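As a rough illustration of that suggestion, here is a minimal sketch of run queues built from VecDeque instead of linked lists. The names RunQueues, enqueue and pick, the queue count of 16, the use of a plain usize as a process id, and the rule that a lower queue index means higher priority are all assumptions made for this example, not taken from the MINIX sources. It also needs std (or at least alloc).

```rust
use std::collections::VecDeque;

/// Assumed number of priority queues (hypothetical value for this sketch).
pub const NR_SCHED_QUEUES: usize = 16;

/// One FIFO queue of process ids per priority level.
pub struct RunQueues {
    queues: Vec<VecDeque<usize>>,
}

impl RunQueues {
    pub fn new() -> Self {
        Self {
            queues: (0..NR_SCHED_QUEUES).map(|_| VecDeque::new()).collect(),
        }
    }

    /// Appends `pid` to the tail of its priority queue.
    /// Returns false if `priority` is out of range.
    pub fn enqueue(&mut self, priority: usize, pid: usize) -> bool {
        match self.queues.get_mut(priority) {
            Some(q) => {
                q.push_back(pid);
                true
            }
            None => false,
        }
    }

    /// Removes and returns the head of the first non-empty queue,
    /// scanning from index 0 upwards (assumed: lower index = higher priority).
    pub fn pick(&mut self) -> Option<usize> {
        self.queues.iter_mut().find_map(|q| q.pop_front())
    }
}
```

Because each queue owns its elements, no shared mutable pointers are needed and the borrow checker has nothing to object to.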

Thank you so much for your quick response. VecDeque sounds like a great replacement for the linked lists. It seems like I would need the standard library for that to work though, is that correct?

Ah, yes, it’d need std :frowning:

The std implementation is quite big due to it being so flexible, but if you only need to add one element, you might be able to roll your own.
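To give an idea of what "rolling your own" could look like, below is a sketch of a fixed-capacity FIFO queue that uses only core, so it needs neither std nor an allocator. The type name RingQueue and its methods are made up for this example, and the capacity is fixed at compile time, which may or may not suit a scheduler.

```rust
/// Fixed-capacity FIFO queue backed by an array (no `std`, no `alloc`).
pub struct RingQueue<T: Copy, const N: usize> {
    buf: [Option<T>; N],
    head: usize, // index of the oldest element
    len: usize,  // number of stored elements
}

impl<T: Copy, const N: usize> RingQueue<T, N> {
    pub fn new() -> Self {
        Self { buf: [None; N], head: 0, len: 0 }
    }

    /// Adds an item at the back; hands the item back if the queue is full.
    pub fn push_back(&mut self, item: T) -> Result<(), T> {
        if self.len == N {
            return Err(item);
        }
        let tail = (self.head + self.len) % N;
        self.buf[tail] = Some(item);
        self.len += 1;
        Ok(())
    }

    /// Removes and returns the oldest item, if any.
    pub fn pop_front(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        let item = self.buf[self.head].take();
        self.head = (self.head + 1) % N;
        self.len -= 1;
        item
    }
}
```

Returning the rejected item from push_back avoids panicking, which matters in an environment where a panic handler may not exist yet.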

Hey, so I am finally trying to actually code this up. The problem is I need to call a single function written in Rust from C. The function written in Rust has a call to a C function inside…I was wondering if I am moving in the right direction. Also, what is the correct way to handle -> operators in Rust? If I have a pointer to a structure, can I use something like “->” to access a member of the struct that the pointer is pointing to?

Any help is appreciated!

extern "C" {
fn get_cpu_var(cpu : c_uint, name : *proc) -> *mut proc;
}

pub fn enqueue(rp: *mut proc, run_q_head : *proc, run_q_tail : *proc) {

let q: i32;
let rdy_head: *mut proc;
let rdy_tail: *mut proc;

unsafe {
    q = rp->p_priority;
    rdy_head = get_cpu_var(rp->p_cpu, run_q_head);
    rdy_tail = get_cpu_var(rp->p_cpu, run_q_tail);
}

}

In Rust there is no -> operator; a raw pointer must be dereferenced explicitly inside an unsafe block, like so:

struct Bar {
    data: usize
}
fn foo() {
    let x: *mut Bar = //
    unsafe {
        (*x).data = 20;
    }
}

Also, to expose it to a C interface, you need to turn your exported function definition into the following:

#[no_mangle]
pub unsafe extern "C" fn enqueue(rp: *mut proc, run_q_head: *mut proc, run_q_tail: *mut proc) { /* ... */ }

Marking the function unsafe is optional, but FFI is almost always unsafe, so it usually belongs in the declaration.


Note that the following in your code:

unsafe {
    q = (*rp).p_priority;
    rdy_head = get_cpu_var((*rp).p_cpu, run_q_head);
    rdy_tail = get_cpu_var((*rp).p_cpu, run_q_tail);
}

Will copy p_cpu and p_priority out of the struct, because they are Copy. If they were not Copy, this would be a move out of a raw pointer, which the compiler rejects.

Thank you for a quick and a detailed answer!!

Is it possible to borrow p_priority and p_cpu when accessing data through a dereferenced raw pointer? I don’t think I need to take ownership of them in this case.

Another thing I am struggling with, is there a way to check if the value a pointer is pointing to is null? If I understand correctly, raw pointers CAN be null? In other words, how should something like this be handled?

if (!rdy_head[q]) { … }

Hi! Sorry for the late reply, you can take a reference to your data and turn it into a pointer like so (repeating this idea for wherever you want to get a pointer):

q = &mut (*rp).p_priority as *mut _;

(Use &mut with *mut _, or & with *const _, depending on the type of q.)
And to check if a given pointer p is null, you can use .is_null().
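As a small sketch of the borrowing idea, the functions below take a reference or a pointer to a single field without copying the whole struct. The struct Proc with its two fields and the helper names priority_ptr and bump_priority are assumptions for illustration; the real layout would come from the bindgen output.

```rust
/// Stand-in for the bindgen-generated struct (fields assumed for this sketch).
#[repr(C)]
pub struct Proc {
    pub p_priority: i32,
    pub p_cpu: u32,
}

/// Returns a raw pointer to the `p_priority` field.
///
/// Safety: `rp` must be non-null and point to a live `Proc`.
pub unsafe fn priority_ptr(rp: *mut Proc) -> *mut i32 {
    unsafe { &mut (*rp).p_priority as *mut i32 }
}

/// Borrows the field mutably and modifies it in place.
///
/// Safety: `rp` must be non-null, point to a live `Proc`, and not be
/// accessed through any other pointer while this function runs.
pub unsafe fn bump_priority(rp: *mut Proc) {
    let prio: &mut i32 = unsafe { &mut (*rp).p_priority };
    *prio += 1;
}
```

Neither function takes ownership of anything: the field stays where it is and only its address is handed around.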

That’s not totally correct. To be specific, for VecDeque you only need alloc, which has been stabilized recently :partying_face:.

I mention this because I think that @andrew_l is working on an embedded device (am I right?).
It’s not possible (in most cases) for an embedded device without an OS to use std, but alloc “only” needs an allocator, which can be implemented simply (e.g. by using the provided malloc/free).
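A minimal sketch of such an allocator, assuming the platform's C library exports malloc and free with their usual signatures. The type name CAllocator and the alignment limit MAX_MALLOC_ALIGN are assumptions for this example; requests with a stricter alignment are simply refused here rather than handled.

```rust
use core::alloc::{GlobalAlloc, Layout};
use core::ffi::c_void;

// C allocator entry points, assumed to be provided by the platform's libc.
// (`unsafe extern` needs Rust 1.82+; older compilers take plain `extern "C"`.)
unsafe extern "C" {
    fn malloc(size: usize) -> *mut c_void;
    fn free(ptr: *mut c_void);
}

/// Assumption: malloc returns memory aligned to at least two machine words.
const MAX_MALLOC_ALIGN: usize = 2 * core::mem::size_of::<usize>();

/// Allocator that forwards every request to the C heap.
pub struct CAllocator;

unsafe impl GlobalAlloc for CAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Refuse alignments that plain malloc is not assumed to satisfy.
        if layout.align() > MAX_MALLOC_ALIGN {
            return core::ptr::null_mut();
        }
        unsafe { malloc(layout.size()) as *mut u8 }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, _layout: Layout) {
        unsafe { free(ptr as *mut c_void) }
    }
}

// In a no_std crate that uses `alloc`, this would be registered with:
// #[global_allocator]
// static GLOBAL: CAllocator = CAllocator;
```

Once an allocator like this is registered, collections from alloc such as VecDeque become usable without std.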

No, it’s not an embedded device. I am trying to rewrite the scheduler of MINIX 3 in Rust, and there is no Rust std available for MINIX, so I have to implement everything myself :frowning:

I just realized that this seems to mean I can’t even use .is_null().
So if I get a null pointer from C, how can I check for that?

let rdy_head: *mut *mut proc;
rdy_head = get_cpu_var(&(*rp).p_cpu, run_q_head);

get_cpu_var is a function defined in C.

Also, thank you for the insight on alloc, that’s good to know, because I am considering doing some work for embedded devices later!

std re-exports core, so primitive functions such as .is_null() usually come from core and work without std, even if the docs show them under std.
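For instance, the following sketch refers only to core paths, so it should compile the same way in a no_std crate. The helper names empty_head and is_empty_queue are invented for the example.

```rust
/// Produces a null queue head, using `core` rather than `std`.
pub fn empty_head<T>() -> *mut T {
    core::ptr::null_mut()
}

/// `is_null` is an inherent method on raw pointers and lives in `core`.
pub fn is_empty_queue<T>(head: *mut T) -> bool {
    head.is_null()
}
```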

I suggest using Option<&mut proc> instead of *mut proc (i.e., a pointer coming from C is either null or valid; C sadly cannot enforce this, but in practice it is the convention). This way Rust guards against forgotten NULL checks right from the start:

#![no_std]
#![feature(lang_items, core_intrinsics)]
/// Quick "on the fly" setup for a no_std environment, do not rely on this part
mod no_std_setup {
    use ::core::{
        intrinsics,
        panic::PanicInfo,
    };
    #[lang = "eh_personality"] #[no_mangle]
    pub extern fn rust_eh_personality () {}
    #[lang = "eh_unwind_resume"] #[no_mangle]
    pub extern fn rust_eh_unwind_resume () {}
    #[lang = "panic_impl"] #[no_mangle]
    pub extern fn rust_begin_panic (info: &PanicInfo) -> ! { unsafe { intrinsics::abort() } }
}

use ::libc::{
    c_uint,
    c_int,
};

#[allow(non_camel_case_types)]
#[repr(C)] // SAME DECLARATION ORDER AS IN C
pub struct proc_t {
    p_priority: c_int,
    p_cpu: c_uint,
}

type Ptr<'a, T> = Option<&'a mut T>;

extern "C" {
    fn get_cpu_var (cpu: c_uint, name: Ptr<proc_t>) -> Ptr<proc_t>;
}

#[no_mangle]
pub unsafe extern "C"
fn enqueue (
    rp: Ptr<proc_t>,
    run_q_head: Ptr<proc_t>,
    run_q_tail: Ptr<proc_t>,
)
{
    if let Some(rp) = rp {
        let q: c_int = rp.p_priority;
        let rdy_head = get_cpu_var(rp.p_cpu, run_q_head);
        let rdy_tail = get_cpu_var(rp.p_cpu, run_q_tail);
        // ...
    }
}
  • For more information about wrapping C (nullable) pointers in Option<P> where P is one of the many Rust non-nullable pointers, see this other thread: Storing C callbacks in Rust

To verify that this does the correct NULL checking, when compiled and disassembled in x64, we get:

# getting a dummy libbar.a static library to simulate C giving us get_cpu_var()
$ echo 'void * get_cpu_var (unsigned int cpu, void * name) { return NULL; }' | cc -o bar.o -c -xc - && ar rcs libbar.a bar.o

# compiling our no_std Rust lib.rs linking against libbar.a (optimizing for size for disassembly readability ;))
$ cargo rustc -- -C opt-level=z -L. -lbar

# disassembling our resulting function with gdb
$ gdb -q --batch -ex "disassemble enqueue" target/debug/libfoo.so

Dump of assembler code for function enqueue:
   0x0000000000000550 <+0>: 	test   %rdi,%rdi
   0x0000000000000553 <+3>: 	jz     0x57b <enqueue+43>
   0x0000000000000555 <+5>: 	push   %r15
   0x0000000000000557 <+7>: 	push   %r14
   0x0000000000000559 <+9>: 	push   %rbx
   0x000000000000055a <+10>:	mov    %rdx,%r14
   0x000000000000055d <+13>:	mov    %rdi,%rbx
   0x0000000000000560 <+16>:	mov    0x4(%rdi),%edi
   0x0000000000000563 <+19>:	lea    0x12(%rip),%r15        # 0x57c <get_cpu_var>
   0x000000000000056a <+26>:	callq  *%r15
   0x000000000000056d <+29>:	mov    0x4(%rbx),%edi
   0x0000000000000570 <+32>:	mov    %r14,%rsi
   0x0000000000000573 <+35>:	callq  *%r15
   0x0000000000000576 <+38>:	pop    %rbx
   0x0000000000000577 <+39>:	pop    %r14
   0x0000000000000579 <+41>:	pop    %r15
   0x000000000000057b <+43>:	retq   
End of assembler dump.

We do get the NULL check at lines <+0>, <+3>, and C’s ->p_cpu dereference at line <+16>


Relevant lines in Cargo.toml

[lib]
name = "foo"
crate-type = ["cdylib"]

[dependencies]
libc = { version = "0.2.53", default-features = false }

Putting Option<&mut T> directly into function signatures seems like overkill. You can use <*mut T>::as_mut.

To be honest, I’m not sure if I’ve ever dereferenced a raw pointer using the * operator. The functions in std::ptr (which are now methods) are safer, and they have always been enough for me.
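A short sketch of the as_mut approach: the raw pointer stays in the signature and is converted to an Option at the point of use. The struct Proc and the function priority_of are placeholders for this example.

```rust
/// Stand-in for the bindgen-generated struct (fields assumed for this sketch).
#[repr(C)]
pub struct Proc {
    pub p_priority: i32,
    pub p_cpu: u32,
}

/// Reads the priority, or returns None for a null pointer.
///
/// Safety: `rp` must be null or point to a live, unaliased `Proc`.
pub unsafe fn priority_of(rp: *mut Proc) -> Option<i32> {
    // as_mut() yields None for null, Some(&mut Proc) otherwise.
    let rp: &mut Proc = unsafe { rp.as_mut() }?;
    Some(rp.p_priority)
}
```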

Thank you everyone, lots of great information in this thread. Thanks for the detailed explanation. Most of my code is compiling now, the last piece I am not sure about is this:

C code:
struct proc **rdy_head;

if (!rdy_head[q]) {
rdy_head[q] = rdy_tail[q] = rp;
}

How can I access the data the pointer points to in Rust?

The two most prominent ways of accessing data in a structure defined in C are:

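  1. Send Rust a function to access the data:
//c
struct Foo {
    int x;
};
int *access_x(struct Foo *data) {
    return &data->x;
}
/* init is implemented on the Rust side */
void init(int *(*getx)(struct Foo *));
int main(void) {
    init(access_x);
    return 0;
}
//rust
pub static mut GETX: Option<extern "C" fn(*const ()) -> *const i32> = None;
#[no_mangle]
pub extern "C" fn init(fun: extern "C" fn(*const ()) -> *const i32) {
    // The callback must be wrapped in Some to match the Option type.
    unsafe { GETX = Some(fun) }
}
fn bar() {
    let myfoo = //
    // Reading a static mut is unsafe; the fn pointer itself is Copy.
    let pointer_to_data = if let Some(f) = unsafe { GETX } { f(myfoo) } else { panic!() };
}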
  2. Or, just define your structures to have the same representation in memory:
//c
struct Foo {
    int x;
};
struct Foo *getFoo(void) { /**/ }
//rust
#[repr(C)]
struct Foo {
    x: i32
}
extern "C" {
    fn getFoo() -> *const Foo;
}
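For the rdy_head[q] snippet asked about above, a direct raw-pointer translation might look like the sketch below. It assumes rdy_head and rdy_tail are tables of process pointers indexed by priority, as in the C code; the reduced Proc struct and the function name enqueue_if_empty are made up for the example. Indexing with [q] in C becomes .add(q) followed by a dereference.

```rust
/// Stand-in for the bindgen-generated struct (only one field shown).
#[repr(C)]
pub struct Proc {
    pub p_priority: i32,
}

/// Rust counterpart of:
///     if (!rdy_head[q]) { rdy_head[q] = rdy_tail[q] = rp; }
/// Returns true if the queue was empty and `rp` became its only entry.
///
/// Safety: `rp` must point to a live `Proc` with a non-negative priority,
/// and both tables must have more than `(*rp).p_priority` slots.
pub unsafe fn enqueue_if_empty(
    rdy_head: *mut *mut Proc,
    rdy_tail: *mut *mut Proc,
    rp: *mut Proc,
) -> bool {
    unsafe {
        let q = (*rp).p_priority as usize;
        let head_slot = rdy_head.add(q); // address of rdy_head[q]
        let tail_slot = rdy_tail.add(q); // address of rdy_tail[q]
        if (*head_slot).is_null() {
            *head_slot = rp;
            *tail_slot = rp;
            true
        } else {
            false
        }
    }
}
```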

I guess it is a matter of preference. My reasoning is the following:

  • Some people, especially beginners, do not know about .as_mut(), and may be tempted to dereference raw pointers like they do in C / C++, usually guarding these raw dereferences with a null check, but this is error-prone.

  • Thus the only sensible way to dereference a raw pointer is to convert it to Option<& [mut] _> by calling .as_mut() (or .as_ref() for a *const _) on it;

  • Since the fn as_mut (*mut T) -> Option<&mut T> transformation is just a cast / transmute under the hood, given enum layout optimization and & [mut] _ being non-null, it can be automagically done for all function parameters by casting / transmuting extern function parameters like I did.

  • This not only removes the need to call .as_mut() when wanting to work with these functions, it also makes it impossible to dereference (anywhere in the function) the given pointer when it is NULL.

Instead of tempting people with forbidden apples, let’s get rid of the forbidden tree altogether.
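The layout claim in the list above can be checked with a small sketch like this one. The struct Proc and both function names are placeholders; the transmute is only there to demonstrate that a null pointer has the same bit pattern as None, not as something to write in real code.

```rust
use core::mem::{size_of, transmute};
use core::ptr;

/// Stand-in for the bindgen-generated struct (fields assumed for this sketch).
#[repr(C)]
pub struct Proc {
    pub p_priority: i32,
    pub p_cpu: u32,
}

/// Option<&mut T> uses the null value as its None, so it adds no space.
pub fn option_ref_is_pointer_sized() -> bool {
    size_of::<Option<&mut Proc>>() == size_of::<*mut Proc>()
}

/// A null raw pointer reinterpreted as Option<&mut Proc> reads as None.
pub fn null_transmutes_to_none() -> bool {
    let raw: *mut Proc = ptr::null_mut();
    // Valid only because the null bit pattern is exactly None here.
    let opt: Option<&mut Proc> = unsafe { transmute(raw) };
    opt.is_none()
}
```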

That sounds fair.

To be honest, a big part of my reasoning against putting it in the signature is that putting it in the signature gives it an unbound lifetime (very dangerous!). But after thinking more about it, that same problem also applies to as_ref().

(I guess the only notable difference is that, in the case of as_ref, there is documentation to remind you of the danger of unbound lifetimes!)
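One way to narrow an unbound lifetime is to tie the result to some value that is known to outlive the pointee. The helper below is purely hypothetical (the name as_ref_bounded and the anchor parameter are invented for this sketch), and it only helps if the caller picks an anchor that really does not outlive the data.

```rust
/// Converts a raw pointer to an optional reference whose lifetime is
/// limited to that of `_anchor`, instead of being chosen freely.
///
/// Safety: `p` must be null or point to a value that stays valid and
/// is not mutated for as long as `_anchor` is borrowed.
pub unsafe fn as_ref_bounded<'a, A: ?Sized, T>(
    _anchor: &'a A,
    p: *const T,
) -> Option<&'a T> {
    unsafe { p.as_ref() }
}
```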
