Hello, I am working on a little project trying to rewrite the MINIX 3 scheduler in Rust and I need help.
I know that ideally using raw pointers should be avoided, but I am not sure how to do that here.
First, obviously there are no C-like macros, so I replaced all #ifdefs with constants. Another issue is that there is no standard library on MINIX yet, so things like assert! won't work. I am working on getting the standard library to work there, but no luck yet.
All structs and typedefs from C were converted using rust-bindgen and placed into a separate mod bindings.
The code is using a linked list. Linked lists require shared mutable ownership, which is risky and not allowed by the borrow checker, so you'll have to use either unsafe raw pointers or Rc<RefCell<…>> for references to list elements.
Or better, change the algorithm. Get rid of the linked lists completely and use some other container instead, perhaps VecDeque or BinaryHeap.
Thank you so much for your quick response. VecDeque sounds like a great replacement for the linked lists. It seems like I would need the standard library for that to work though, is that correct?
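For what it's worth, VecDeque is defined in the alloc crate and only re-exported by std, so it should be usable from a no_std crate once a global allocator is available. Below is a minimal sketch of per-priority run queues built on it. The names Pid, NR_SCHED_QUEUES, and RunQueues are made up for illustration, and the queue count of 16 is an assumption, not taken from the MINIX headers.

```rust
// Sketch only: `Pid`, `NR_SCHED_QUEUES`, and `RunQueues` are illustrative names.
extern crate alloc;

use alloc::collections::VecDeque;

/// Hypothetical process identifier; stands in for a pointer to `struct proc`.
pub type Pid = u32;

/// Assumed number of priority levels (the real value comes from the C headers).
pub const NR_SCHED_QUEUES: usize = 16;

/// One FIFO queue per priority; index 0 is treated as the highest priority.
pub struct RunQueues {
    queues: [VecDeque<Pid>; NR_SCHED_QUEUES],
}

impl RunQueues {
    /// Creates an empty set of queues.
    pub fn new() -> Self {
        RunQueues {
            queues: core::array::from_fn(|_| VecDeque::new()),
        }
    }

    /// Appends a process at the tail of its priority queue.
    /// Returns false if `priority` is out of range.
    pub fn enqueue(&mut self, priority: usize, pid: Pid) -> bool {
        match self.queues.get_mut(priority) {
            Some(q) => {
                q.push_back(pid);
                true
            }
            None => false,
        }
    }

    /// Removes and returns the head of the first non-empty queue, if any.
    pub fn pick(&mut self) -> Option<Pid> {
        self.queues.iter_mut().find_map(|q| q.pop_front())
    }
}
```

With this shape there are no head and tail pointers to keep in sync: push_back plays the role of the tail update and pop_front the role of the head update.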
Hey, so I am finally trying to actually code this up. The problem is I need to call a single function written in Rust from C. The function written in Rust has a call to a C function inside...I was wondering if I am moving in the right direction. Also, what is the correct way to handle -> operators in Rust? If I have a pointer to a structure, can I use something like "->" to access a member of the struct that the pointer is pointing to?
Is it possible to borrow p_priority and p_cpu when accessing data through a dereferenced raw pointer? I don't think I need to take ownership of them in this case.
Another thing I am struggling with, is there a way to check if the value a pointer is pointing to is null? If I understand correctly, raw pointers CAN be null? In other words, how should something like this be handled?
Hi! Sorry for the late reply, you can take a reference to your data and turn it into a pointer like so (repeating this idea for wherever you want to get a pointer):
q = &(*rp).p_priority as *[mut/const] _;
(You can use *mut _ or *const _ depending on the type of q)
And to check if a given pointer p is null, you can use .is_null().
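To make the "->" question concrete, here is a small sketch of the two patterns mentioned above: reading a field through a raw pointer, and taking a pointer to a field. Proc is a hand-written stand-in for the bindgen-generated struct, and the function names are invented for the example.

```rust
// Sketch only: `Proc` stands in for the bindgen-generated `proc` struct.
#[repr(C)]
pub struct Proc {
    pub p_priority: i32,
    pub p_cpu: u32,
}

/// Rough equivalent of the C expression `rp->p_priority`.
///
/// # Safety
/// `rp` must be non-null, aligned, and point to a live `Proc`.
pub unsafe fn priority_of(rp: *const Proc) -> i32 {
    // Rust has no `->`; dereference explicitly, then use `.`.
    unsafe { (*rp).p_priority }
}

/// Rough equivalent of the C expression `&rp->p_cpu`.
///
/// # Safety
/// Same requirements as `priority_of`, and nothing else may be
/// borrowing the `Proc` while the temporary reference exists.
pub unsafe fn cpu_field_ptr(rp: *mut Proc) -> *mut u32 {
    // Borrow the field, then turn the borrow into a raw pointer.
    unsafe { &mut (*rp).p_cpu as *mut u32 }
}
```

Both functions are marked unsafe because the compiler cannot check that the pointer is valid; the caller has to guarantee it.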
I'm telling this, because I think that @andrew_l is working on an embedded device (am I right?).
It's not possible (in most cases) for an embedded device without an OS to use std, but alloc "only" needs an allocator, which can be implemented simply (e.g. by using the provided malloc/free).
No, it's not an embedded device. I am trying to rewrite the scheduler of MINIX 3 in Rust, and there is no Rust std available for MINIX, so I have to implement everything myself.
I also just realized that it looks like I can't even use .is_null().
So what if I get a pointer to null from C, how can I check that?
let rdy_head: *mut *mut proc;
rdy_head = get_cpu_var(&(*rp).p_cpu, run_q_head);
get_cpu_var is a function defined in C.
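As far as I know, is_null is an inherent method on raw pointers provided by core, so it should not depend on std. The sketch below uses only core paths; lookup is a hypothetical stand-in for a C function that may return NULL, and read_or is an invented helper name.

```rust
/// Hypothetical stand-in for a C function that may return NULL.
pub fn lookup(found: bool, slot: &mut i32) -> *mut i32 {
    if found {
        slot as *mut i32
    } else {
        core::ptr::null_mut()
    }
}

/// Mirrors the C idiom `p ? *p : fallback`.
///
/// # Safety
/// If `p` is non-null it must point to a valid, aligned `i32`.
pub unsafe fn read_or(p: *mut i32, fallback: i32) -> i32 {
    if p.is_null() {
        fallback
    } else {
        // Only reached for non-null pointers.
        unsafe { *p }
    }
}
```

If the method call fails to compile in a no_std build, the error message is probably worth posting, since the cause is likely something other than the missing standard library.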
Also, thank you for the insight on alloc, that's good to know, because I am considering doing some work for embedded devices later!
I suggest using Option<&mut proc> instead of *mut proc (i.e., a pointer coming from C is either null or valid; sadly C cannot really enforce this, but in practice it is the convention). This way Rust guards against forgetting about NULL pointers right from the start:
#![no_std]
#![feature(lang_items, core_intrinsics)]
/// Quick "on the fly" setup for a no_std environment, do not rely on this part
mod no_std_setup {
    use ::core::{
        intrinsics,
        panic::PanicInfo,
    };
    #[lang = "eh_personality"] #[no_mangle]
    pub extern fn rust_eh_personality () {}
    #[lang = "eh_unwind_resume"] #[no_mangle]
    pub extern fn rust_eh_unwind_resume () {}
    #[lang = "panic_impl"] #[no_mangle]
    pub extern fn rust_begin_panic (info: &PanicInfo) -> ! { unsafe { intrinsics::abort() } }
}
use ::libc::{
    c_uint,
    c_int,
};
#[allow(non_camel_case_types)]
#[repr(C)] // fields must be declared in the same order as in C
pub struct proc_t {
    p_priority: c_int,
    p_cpu: c_uint,
}
type Ptr<'a, T> = Option<&'a mut T>;
extern "C" {
    fn get_cpu_var (cpu: c_uint, name: Ptr<proc_t>) -> Ptr<proc_t>;
}
#[no_mangle]
pub unsafe extern "C"
fn enqueue (
    rp: Ptr<proc_t>,
    run_q_head: Ptr<proc_t>,
    run_q_tail: Ptr<proc_t>,
)
{
    if let Some(rp) = rp {
        let q: c_int = rp.p_priority;
        let rdy_head = get_cpu_var(rp.p_cpu, run_q_head);
        let rdy_tail = get_cpu_var(rp.p_cpu, run_q_tail);
        // ...
    }
}
For more information about wrapping C (nullable) pointers in Option<P> where P is one of the many Rust non-nullable pointers, see this other thread: Storing C callbacks in Rust - #6 by Yandros
To verify that this does the correct NULL checking, when compiled and disassembled in x64, we get:
# getting a dummy libbar.a static library to simulate C giving us get_cpu_var()
$ echo 'void * get_cpu_var (unsigned int cpu, void * name) { return NULL; }' | cc -o bar.o -c -xc - && ar rcs libbar.a bar.o
# compiling our no_std Rust lib.rs linking against libbar.a (optimizing for size for disassembly readability ;))
$ cargo rustc -- -C opt-level=z -L. -lbar
# disassembling our resulting function with gdb
$ gdb -q --batch -ex "disassemble enqueue" target/debug/libfoo.so
Dump of assembler code for function enqueue:
0x0000000000000550 <+0>: test %rdi,%rdi
0x0000000000000553 <+3>: jz 0x57b <enqueue+43>
0x0000000000000555 <+5>: push %r15
0x0000000000000557 <+7>: push %r14
0x0000000000000559 <+9>: push %rbx
0x000000000000055a <+10>: mov %rdx,%r14
0x000000000000055d <+13>: mov %rdi,%rbx
0x0000000000000560 <+16>: mov 0x4(%rdi),%edi
0x0000000000000563 <+19>: lea 0x12(%rip),%r15 # 0x57c <get_cpu_var>
0x000000000000056a <+26>: callq *%r15
0x000000000000056d <+29>: mov 0x4(%rbx),%edi
0x0000000000000570 <+32>: mov %r14,%rsi
0x0000000000000573 <+35>: callq *%r15
0x0000000000000576 <+38>: pop %rbx
0x0000000000000577 <+39>: pop %r14
0x0000000000000579 <+41>: pop %r15
0x000000000000057b <+43>: retq
End of assembler dump.
We do get the NULL check at offsets <+0> and <+3>, and C's ->p_cpu dereference at offset <+16>.
Relevant lines in Cargo.toml
[lib]
name = "foo"
crate-type = ["cdylib"]
[dependencies]
libc = { version = "0.2.53", default-features = false }
Putting Option<&mut T> directly into function signatures seems like overkill. You can use <*mut T>::as_mut.
To be honest, I'm not sure if I've ever dereferenced a raw pointer using the * operator. The functions in std::ptr (which are now methods) are safer, and they have always been enough for me.
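For comparison, here is a sketch of the as_mut approach, where the signature keeps the raw pointer and the conversion happens inside the function. Proc is a stand-in for the bindgen-generated struct and bump_priority is an invented example function.

```rust
// Sketch only: `Proc` stands in for the bindgen-generated `proc` struct.
#[repr(C)]
pub struct Proc {
    pub p_priority: i32,
    pub p_cpu: u32,
}

/// Increments `p_priority` when `rp` is non-null and reports whether it did.
///
/// # Safety
/// If `rp` is non-null it must point to a live `Proc` that nothing else
/// is accessing for the duration of the call.
pub unsafe fn bump_priority(rp: *mut Proc) -> bool {
    // `as_mut` yields None for NULL and Some(&mut Proc) otherwise.
    match unsafe { rp.as_mut() } {
        Some(p) => {
            p.p_priority += 1;
            true
        }
        None => false,
    }
}
```

The null case has to be handled to get at the reference, which is the same guarantee the Option-in-signature version gives, just one line later.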
Thank you everyone, lots of great information in this thread. Thanks for the detailed explanation. Most of my code is compiling now, the last piece I am not sure about is this:
C code:
struct proc **rdy_head;
if (!rdy_head[q]) {
rdy_head[q] = rdy_tail[q] = rp;
}
How should accessing the data a pointer points to be handled in Rust?
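One possible translation of that C snippet, keeping the raw-pointer arrays as they are, is to index with add and dereference the resulting slot. This is only a sketch: Proc stands in for the bindgen-generated struct, enqueue_if_empty is an invented name, and it assumes q has already been converted to usize and is within bounds.

```rust
// Sketch only: `Proc` stands in for the bindgen-generated `proc` struct.
#[repr(C)]
pub struct Proc {
    pub p_priority: i32,
    pub p_cpu: u32,
}

/// Rough equivalent of:
///     if (!rdy_head[q]) { rdy_head[q] = rdy_tail[q] = rp; }
/// Returns true when the queue was empty and `rp` was stored.
///
/// # Safety
/// `rdy_head` and `rdy_tail` must each point to an array holding at
/// least `q + 1` pointers.
pub unsafe fn enqueue_if_empty(
    rdy_head: *mut *mut Proc,
    rdy_tail: *mut *mut Proc,
    q: usize,
    rp: *mut Proc,
) -> bool {
    unsafe {
        // `ptr.add(q)` is the address of element q, like `&ptr[q]` in C.
        let head = rdy_head.add(q);
        let tail = rdy_tail.add(q);
        if (*head).is_null() {
            *head = rp;
            *tail = rp;
            true
        } else {
            false
        }
    }
}
```

If the array length is known on the Rust side, building a slice with core::slice::from_raw_parts_mut and indexing it normally would add bounds checking on top of this.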
I guess it is a matter of preference. My reasoning is the following:
Some people, especially beginners, do not know about .as_mut(), and may be tempted to dereference raw pointers like they do in C / C++, usually guarding these raw dereferences with a null check, but this is error-prone.
Thus the only sensible way to dereference a raw pointer is to convert it to Option<& [mut] _> by calling .as_mut() (or .as_ref() for a *const _) on it.
Since the fn as_mut (*mut T) -> Option<&mut T> transformation is just a cast / transmute under the hood, given enum layout optimization and & [mut] _ being non-null, it can be done automagically for all function parameters by casting / transmuting extern function parameters like I did.
This not only removes the need to call .as_mut() when wanting to work with these functions, it also makes it impossible to dereference (anywhere in the function) the given pointer when it is NULL.
Instead of tempting people with forbidden apples, let's get rid of the forbidden tree altogether.
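The layout claim behind this argument can be checked with a few lines. The sketch below compares the sizes of Option<&mut Proc> and *mut Proc and reinterprets a raw pointer directly, with NULL becoming None. Proc is a stand-in struct and both function names are invented for the example.

```rust
use core::mem::{size_of, transmute};

// Sketch only: `Proc` stands in for the bindgen-generated `proc` struct.
#[repr(C)]
pub struct Proc {
    pub p_priority: i32,
    pub p_cpu: u32,
}

/// True when `Option<&mut Proc>` occupies exactly one pointer.
pub fn option_ref_is_pointer_sized() -> bool {
    size_of::<Option<&mut Proc>>() == size_of::<*mut Proc>()
}

/// Reinterprets a raw pointer as `Option<&mut Proc>`; NULL becomes None.
///
/// # Safety
/// A non-null `p` must point to a live, unaliased `Proc`. The lifetime
/// `'a` is chosen by the caller and is not tied to anything, so the
/// compiler will not stop the reference from outliving the pointee.
pub unsafe fn from_c<'a>(p: *mut Proc) -> Option<&'a mut Proc> {
    unsafe { transmute::<*mut Proc, Option<&'a mut Proc>>(p) }
}
```

In practice rp.as_mut() does the same job without the transmute; the point here is only that no conversion code is needed at the FFI boundary.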
To be honest, a big part of my reasoning against putting it in the signature is that putting it in the signature gives it an unbound lifetime (very dangerous!). But after thinking more about it, that same problem also applies to as_ref().
(I guess the only notable difference is that, in the case of as_ref, there is documentation to remind you of the danger of unbound lifetimes!)