Totally with you there.
I'm a bit worried about my use of unsafe. When I do unsafe things in unsafe functions, my compiler warns me I should also wrap those unsafe things in unsafe. For example, if I have this:
pub unsafe fn len(ptr: *const Vector<T>) -> usize {
    (*ptr).len
}
I get this warning:
warning[E0133]: dereference of raw pointer is unsafe and requires unsafe block
  --> src/vector.rs:64:9
   |
64 |         (*ptr).len
   |         ^^^^^^ dereference of raw pointer
So I wrap that in unsafe:
pub unsafe fn len_2(ptr: *const Vector<T>) -> usize {
    unsafe { (*ptr).len }
}
But now the unsafe on the function is redundant. So I remove it. Which is nice because the caller now does not have to use unsafe.
But this breaks the "Every function is unsafe" rule.
Tried to implement the guessing game from the Rust book but it only works on nightly due to the use of str::from_utf8_unchecked():
impl<T> Vector<T> {
    unsafe fn to_slice(&self) -> &[T] {
        return slice::from_raw_parts(self.ptr, self.len);
    }
}

unsafe fn getline() -> Vector::<u8> {
    let mut c: u8 = b'\0';
    let mut line: Vector::<u8> = Vector::<u8>::new();
    loop {
        if c == b'\n' {
            break;
        }
        c = libc::getchar() as u8;
        line.push(c);
    }
    return line;
}

#[no_mangle]
unsafe extern "C" fn main(_argc: i32, _argv: *mut *mut u8) -> i32 {
    libc::printf("Guess the number!\n\0".as_ptr());
    let secret_number = (libc::rand() % 100) + 1;
    loop {
        libc::printf("Please input your guess.\n\0".as_ptr());
        let guess: i32 = match str::from_utf8_unchecked(getline().to_slice()).trim().parse() {
            Ok(num) => num,
            Err(_) => continue,
        };
        libc::printf("You guessed: %d\n\0".as_ptr(), guess);
        match guess.cmp(&secret_number) {
            cmp::Ordering::Less => libc::printf("Too small!\n\0".as_ptr()),
            cmp::Ordering::Greater => libc::printf("Too big!\n\0".as_ptr()),
            cmp::Ordering::Equal => {
                libc::printf("You win!\n\0".as_ptr());
                break;
            },
        };
    }
    return 0;
}
Edit: I just realised that this code breaks the rule about no references, only raw pointers. Time to rewrite the str module and Ord trait in the C/Rust fashion.
Ord trait implemented in C/Rust:
pub trait OrdCRust: Eq + PartialOrd<Self> {
    unsafe fn cmp_crust(&self, other: *mut Self) -> cmp::Ordering;
}

impl OrdCRust for i32 {
    unsafe fn cmp_crust(&self, other: *mut i32) -> cmp::Ordering {
        if *self < *other {
            return cmp::Ordering::Less
        } else if *self > *other {
            return cmp::Ordering::Greater
        } else {
            return cmp::Ordering::Equal
        }
    }
}

#[no_mangle]
unsafe extern "C" fn main(_argc: i32, _argv: *mut *mut u8) -> i32 {
    libc::printf("Guess the number!\n\0".as_ptr());
    let mut secret_number = (libc::rand() % 100) + 1;
    loop {
        libc::printf("Please input your guess.\n\0".as_ptr());
        let guess: i32 = match str::from_utf8_unchecked(getline().to_slice()).trim().parse() {
            Ok(num) => num,
            Err(_) => continue,
        };
        libc::printf("You guessed: %d\n\0".as_ptr(), guess);
        match guess.cmp_crust(ptr::from_mut(&mut secret_number)) {
            cmp::Ordering::Less => libc::printf("Too small!\n\0".as_ptr()),
            cmp::Ordering::Greater => libc::printf("Too big!\n\0".as_ptr()),
            cmp::Ordering::Equal => {
                libc::printf("You win!\n\0".as_ptr());
                break;
            },
        };
    }
    return 0;
}
For the str module, we have str::from_raw_parts() in nightly gated by
#![feature(str_from_raw_parts)]
We want something that is like str::from_raw_parts() but returns a raw pointer to a string instead of a reference to a string. That way we can directly convert C strings to Rust's str, instead of having to deal with slice and str references.
Alternatively, we can directly create a C/Rust analogue of the String module using a dynamic array of u8s, and then define the unsafe trim() and unsafe parse() functions in the C/Rust String module.
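For what it's worth, something close to the raw-pointer variant of str::from_raw_parts() described above can be sketched on stable Rust. This is only a sketch: str_ptr_from_raw_parts and parse_i32 are names made up for illustration, not standard-library functions. It relies on the fact that a *const [u8] can be cast to a *const str with as.

```rust
use std::ptr;

/// Hypothetical helper (not part of std): builds a raw `*const str` from a
/// data pointer and a byte length without creating a reference.
/// Creating the pointer is safe; dereferencing it is not.
pub fn str_ptr_from_raw_parts(data: *const u8, len: usize) -> *const str {
    // `*const [u8]` and `*const str` carry the same metadata (a length),
    // so a plain `as` cast converts between them.
    ptr::slice_from_raw_parts(data, len) as *const str
}

/// Hypothetical helper: parses a decimal `i32` out of `len` bytes at `data`.
///
/// # Safety
/// `data` must point to `len` readable bytes of valid UTF-8 that stay alive
/// for the duration of the call.
pub unsafe fn parse_i32(data: *const u8, len: usize) -> Option<i32> {
    let s: *const str = str_ptr_from_raw_parts(data, len);
    // SAFETY: guaranteed by the caller (see the `# Safety` section).
    let text: &str = unsafe { &*s };
    text.trim().parse().ok()
}
```

Note that a reference still shows up inside parse_i32, because trim() and parse() are defined on &str. Getting rid of it entirely would need the C/Rust reimplementations of those functions mentioned above.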
It isn't. unsafe on the function does not mean "we allow unsafe operations in the body"; it means "calling this function requires upholding some additional invariants" (here, that the *const Vector<T> points to a valid value). That doesn't change just because you use an unsafe block inside the function.
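A minimal sketch of how the two meanings sit side by side. The Vector layout and the demo_len function are assumptions made up for this example, not the code from the thread:

```rust
/// Assumed layout, for illustration only.
pub struct Vector<T> {
    pub ptr: *mut T,
    pub len: usize,
}

impl<T> Vector<T> {
    /// Returns the number of elements.
    ///
    /// # Safety
    /// `this` must point to a live, initialized `Vector<T>`.
    /// (`unsafe fn` = "the caller has to uphold this contract".)
    pub unsafe fn len(this: *const Vector<T>) -> usize {
        // SAFETY: the caller promised that `this` is valid.
        // (`unsafe {}` = "I checked that this operation is fine here".)
        unsafe { (*this).len }
    }
}

/// Hypothetical safe caller: it discharges the contract itself.
pub fn demo_len() -> usize {
    let v = Vector::<u8> { ptr: std::ptr::null_mut(), len: 3 };
    // SAFETY: the pointer is derived from `v`, which is alive right here.
    unsafe { Vector::len(&v) }
}
```

The function stays unsafe even though its body has its own unsafe block, because nothing inside the body can check that the pointer it was handed is valid.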
You don't need to do &mut place as *mut _; you can do &raw mut place (previously known as ptr::addr_of_mut!) to directly create a pointer instead.
This actually has semantic impact. (Doing &mut as *mut twice will invalidate accesses through the first pointer, but &raw mut twice will create two mut pointers with the same provenance scope.)
But… do be careful. I'm not sure whether this will cause temporary lifetime extension to kick in or not.
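A small sketch of the "two pointers to the same place" case. It uses the ptr::addr_of_mut! macro spelling so that it also builds on toolchains older than 1.82; the function name is invented for the example:

```rust
use std::ptr;

/// Writes through two raw pointers to the same local and returns the result.
pub fn two_raw_pointers() -> i32 {
    let mut place = 0_i32;
    // `ptr::addr_of_mut!(place)` is the macro spelling of `&raw mut place`.
    // No intermediate `&mut` reference is created, so taking the second
    // pointer does not invalidate the first one.
    let a: *mut i32 = ptr::addr_of_mut!(place);
    let b: *mut i32 = ptr::addr_of_mut!(place);
    // SAFETY: both pointers refer to the live local `place`, and nothing
    // else touches it while they are in use.
    unsafe {
        *a += 1;
        *b += 10;
        *a += 100; // `a` is still usable after the write through `b`
    }
    place
}
```

Writing the same thing with two separate &mut place as *mut _ casts and then going back to the first pointer is the pattern the post warns about, so it is deliberately not shown as working code.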
This is because the default Rust style still has the majority of an unsafe fn body as safe code, with unsafe operations being rare enough to merit being highlighted. In a C/Rust style, it's perfectly reasonable to allow unsafe_op_in_unsafe_fn, since that assumption doesn't hold there.
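As a sketch of what allowing the lint looks like in practice, here it is scoped to one module (the module name crust and the sum function are made up for the example):

```rust
// Outer attribute on a module: everything inside is written "C style",
// so unsafe operations in `unsafe fn` bodies need no extra `unsafe {}` block.
#[allow(unsafe_op_in_unsafe_fn)]
pub mod crust {
    /// Sums `len` integers starting at `data`.
    ///
    /// # Safety
    /// `data` must point to `len` readable, initialized `i32` values.
    pub unsafe fn sum(data: *const i32, len: usize) -> i32 {
        let mut total = 0;
        let mut i = 0;
        while i < len {
            // Pointer arithmetic and dereference with no inner block.
            total += *data.add(i);
            i += 1;
        }
        total
    }
}
```

Callers outside the module still have to write unsafe { crust::sum(...) }; the attribute only affects what the compiler says about the function bodies.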
Did you know....
It turns out that one can use raw pointers as the self parameter in method implementations. Like so:
#![feature(arbitrary_self_types_pointers)]
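struct S { value: u32 }
impl S {
    fn get_value(self: *mut Self) -> u32 {
        unsafe { self.as_mut().unwrap().value }
    }
    fn as_mut_ptr(&mut self) -> *mut S {
        self as *mut S
    }
}
fn f() -> u32 {
    // Bind the value first: a temporary would be dropped at the end of the
    // `let` statement, leaving the pointer dangling.
    let mut s = S { value: 1 };
    let p = s.as_mut_ptr();
    p.get_value()
}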
Very interesting. Another step to eliminating references in Crust.
Back in the day, I had this idea of my "dream language": It was basically C, but with C++ namespaces and generics and collections.
This sort of looks and feels like what I wanted back then.
You might have gotten confused by the change that makes the unsafe_op_in_unsafe_fn lint warn by default in the 2024 edition.
Interesting. Oddly though I could not change this:
let c_ptr = &mut Conn::new() as *mut _;
to this:
let c_ptr = &raw mut Conn::new();
Because:
error[E0745]: cannot take address of a temporary
That's "temporary lifetime extension" rearing its confusing head again. The short version is that let a = &b; is syntactically magic and is effectively rewritten to let a = b; let a = &a;, but pretty much any other form, including implicit borrows or, apparently, using &raw instead, doesn't get that treatment.
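One way around it, sketched under the assumption that Conn is an ordinary struct (the id field and the function name are invented here), is to do the rewrite by hand and give the value a named home before taking its address:

```rust
use std::ptr;

/// Stand-in for the `Conn` type from the post; its field is invented.
pub struct Conn {
    pub id: u32,
}

impl Conn {
    pub fn new() -> Conn {
        Conn { id: 7 }
    }
}

pub fn conn_id_via_raw_pointer() -> u32 {
    // A named local is a place whose address can be taken, and it lives
    // until the end of this function.
    let mut conn = Conn::new();
    // Macro spelling of `&raw mut conn` (the operator needs Rust 1.82+).
    let c_ptr: *mut Conn = ptr::addr_of_mut!(conn);
    // SAFETY: `conn` is still alive and is only accessed through `c_ptr`.
    unsafe {
        (*c_ptr).id += 1;
        (*c_ptr).id
    }
}
```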
Well, I thought I was not confused but now I think I am. The Crust idea starts out with the rule "All functions are unsafe", with the intention, I guess, that the Crust programmer can do almost whatever they want anywhere they want, just like in C. (God bless their souls.)
But now the Rust compiler comes along and tells me I should put unsafe blocks around unsafe operations within a function. And then I find I don't actually need unsafe on the functions, even if they do take raw pointers as parameters.
All of this sounds great to me: it minimises verbiage by not repeating unsafe everywhere, and it clearly denotes specific unsafe operations rather than offering blanket coverage.
So now the compiler is happy, clippy is happy, Miri is happy, the code runs. But I have this nagging feeling, given that the program is held together by pointers and prayers, that I don't have enough unsafe!
This may be a misconception. It is unsafe to call a function that takes a raw pointer if calling it with any arbitrary pointer (e.g., an invalid one) will cause UB. If the function can accept any pointer and correctly determine its validity, then it could be made safe to call.
The broad rule is that if the caller must uphold an invariant that cannot be statically (or dynamically) checked, the function is unsafe.
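A rough sketch of that rule, with both function names invented for the example: one function that can be safe because it never trusts the pointer, and one that has to be unsafe because it does.

```rust
use std::mem;

/// Safe to call with any pointer: it only inspects the address and never
/// dereferences it, so there is no invariant for the caller to uphold.
pub fn looks_usable(ptr: *const u32) -> bool {
    !ptr.is_null() && (ptr as usize) % mem::align_of::<u32>() == 0
}

/// Has to be `unsafe`: no check inside the function can prove that `ptr`
/// points to a live `u32`, so the caller must guarantee it.
///
/// # Safety
/// `ptr` must be non-null, aligned, and point to an initialized `u32`.
pub unsafe fn read_u32(ptr: *const u32) -> u32 {
    // SAFETY: guaranteed by the caller.
    unsafe { *ptr }
}
```

Passing the looks_usable check does not make read_u32 sound to call; a non-null, aligned pointer can still be dangling.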
But if you are just doing bizarre throw-away crap, none of this matters. Write whatever you want, break things.
Well, what I said is actually true. The compiler does not require me to annotate functions that take raw pointers as unsafe.
That is what I would expect of course.
Thing is the function may be perfectly well behaved when given a valid raw pointer. But it's the caller that needs to create or obtain that pointer. The safety checks need to be done there, in the caller.
Actually that is no different from a normal safe function taking references. It's the caller that needs to ensure those references are not bogus. The safety checks need to be done there, in the caller.
Ouch! I like to think of it as an art form
You're not wrong. No one ever said art had to be in good taste.
Anyone writing Rust for a serious purpose takes its principles seriously (at least I hope so). That includes demarcating functions as unsafe when the caller must uphold pre- or post-condition invariants. I disagree that this is "no different from a normal safe function taking references". Safe code cannot create invalid references, eliminating any possibility that the caller could be at fault for providing one.
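To illustrate the difference, here is a sketch with an invented Buffer type: the reference-taking function can be safe because the type system already vouches for its argument, while a bogus raw pointer can be produced without writing unsafe at all.

```rust
/// Invented type for the example.
pub struct Buffer {
    pub len: usize,
}

/// # Safety
/// `ptr` must point to a live `Buffer`.
pub unsafe fn len_raw(ptr: *const Buffer) -> usize {
    // SAFETY: guaranteed by the caller.
    unsafe { (*ptr).len }
}

/// Safe: the caller cannot get this wrong without using `unsafe` somewhere.
pub fn len(buf: &Buffer) -> usize {
    // SAFETY: a `&Buffer` obtained in safe code points to a live `Buffer`.
    unsafe { len_raw(buf) }
}

/// Entirely safe code, yet the result must never be passed to `len_raw`.
pub fn make_bogus_pointer() -> *const Buffer {
    0x10_usize as *const Buffer
}
```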
You do get a clippy lint (not_unsafe_ptr_arg_deref) for taking a pointer parameter on a safe function, but only if you deref it and it's a public function. Remarkably constrained of clippy!
To make it extra clear: this is exactly what unsafe fn means, that the caller must uphold some extra rules in order to maintain validity of the resulting program.
Distinguishing that meaning from the unsafe block meaning of “allow me to do unsafe things; because I promise not to do anything invalid” is exactly the purpose of the unsafe_op_in_unsafe_fn lint. If you follow the lint guidance, you always say "trust me" and "trust the caller" independently, which can be of significant benefit when using both safe and "trust me" code in a "trust the caller" context.
But if you're writing C with Rust syntax, just #[allow] the lint and make all your functions unsafe.
I'm glad you said that. I thought I was imagining things. But for sure I can have a method:
fn capacity(self: *const Vector<T>) -> usize {
    unsafe { (*self).capacity }
}
Which is tested in a test module. There are no dire warnings about the lack of unsafe, only about an unused method. It seems testing a thing does not count as using it.
Things make more sense now that I found out why I got no warnings about missing unsafe on my methods that take raw "this" pointers.