Customs around building a safe API

Hi there,
I created the dart_sys crate about four months ago. It exposes raw bindings to the Dart native extensions API.

I've since spent about two months exploring Dart and playing around with it, and I now feel comfortable using it.

My current plan is to create safe, easy-to-use Rust bindings to the API, but I am having trouble deciding on the design of one part: the Dart_Handle.

The Dart_Handle is used as a handle to a Dart object on the heap, and is updated automatically to refer to the correct object. About half of the API interacts with Dart_Handles, so this is not something I want to leave as is.

The thing is that the Dart documentation claims I will never be given an invalid handle, and it makes it seem that if I pass in an invalid handle, the API will try to deal with it itself and return a handle to an exception.

But I am not sure how far I can trust their functions to behave that way, and in fact I'm not sure I could design the API so that it's impossible to put it into an invalid state.

The issue is that I am now unsure whether I should make my constructor unsafe, or the individual functions unsafe. Since a Rust user is supposed to never be able to trigger UB through a non-unsafe function, one could say that making the constructor unsafe and the rest of the functions safe would break that rule. But the alternative would be unwieldy, since my safe API would suddenly become a struct with a safe constructor and 100 unsafe functions.

I.e., it's this:

// Unverified since it may be an exception handle.
struct UnverifiedDartHandle {
    handle: ffi::Dart_Handle // The raw handle
}
impl UnverifiedDartHandle {
    pub fn new(handle: ffi::Dart_Handle) -> Self { /**/ }
    pub unsafe fn is_error(&self) -> bool { /**/ }
}

versus this:

// Unverified since it may be an exception handle.
struct UnverifiedDartHandle {
    handle: ffi::Dart_Handle // The raw handle
}
impl UnverifiedDartHandle {
    pub unsafe fn new(handle: ffi::Dart_Handle) -> Self { /**/ }
    pub fn is_error(&self) -> bool { /**/ }
}

I am wondering what the recommended course of action would be.

Thank you, and have a nice day!

Just a few more thoughts I've had since:

If I do the following:

let my_vec = unsafe {
    Vec::<u8>::from_raw_parts(
        std::ptr::null_mut::<u8>(), // from_raw_parts takes a *mut T
        3,
        4,
    )
};
my_vec[0];

I will have successfully invoked UB from safe code, because I broke state in my unsafe code.

Similarly, if I have the following *:

let my_handle = std::ptr::null_mut::<_>();
let safe_handle = unsafe { UnverifiedHandle::new(my_handle) };
safe_handle.is_error();

Then I may or may not have just invoked UB, but that's because I broke state in unsafe code.

* Note that Dart_Handle is a raw pointer type:
    pub type Dart_Handle = *mut _Dart_Handle;

But then I could instead mirror the raw pointer APIs, where I can write the following in safe code:

let my_ptr = 1234 as *const u8;

But I then have to use unsafe to use said pointer:

unsafe {
    let my_val = *my_ptr; //Oh no! UB
}

In other words, pointers are safe to construct, just not to use.

I don't see how that is true. There's a big honkin' unsafe in the very snippet you posted. It doesn't matter that my_vec[0] is where the debugger points when the code crashes. You still had to write unsafe to summon the nasal demons. The reach of unsafe is not limited to the syntactic block itself. It extends to whatever abstraction boundary you define.

In other, simpler words, it's not true that "Rust can only crash in unsafe code", and it's not required (and is impossible) to write code that guarantees that a crash can only ever occur inside unsafe code.
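To illustrate the point about abstraction boundaries, here is a minimal sketch. `SmallBuf` and `demo_small_buf` are hypothetical names invented for this example and do not come from dart_sys or any other crate. The single `unsafe` block is only sound because every function in the module keeps `len` within bounds, and the private field stops outside code from breaking that.

```rust
mod small_buf {
    /// Invariant: `len <= 4`. The fields are private, so only code in this
    /// module can change them.
    pub struct SmallBuf {
        data: [u8; 4],
        len: usize,
    }

    impl SmallBuf {
        pub fn new() -> Self {
            SmallBuf { data: [0; 4], len: 0 }
        }

        /// Safe to call: it refuses to push past the capacity, so the
        /// invariant cannot be broken from outside the module.
        pub fn push(&mut self, byte: u8) -> bool {
            if self.len == self.data.len() {
                return false;
            }
            self.data[self.len] = byte;
            self.len += 1;
            true
        }

        pub fn last(&self) -> Option<u8> {
            if self.len == 0 {
                return None;
            }
            // SAFETY: relies on `len <= 4`, which `new` and `push` maintain.
            Some(unsafe { *self.data.get_unchecked(self.len - 1) })
        }
    }
}

fn demo_small_buf() -> (Option<u8>, bool) {
    let mut buf = small_buf::SmallBuf::new();
    for b in 1..=4 {
        buf.push(b);
    }
    // The fifth push is rejected instead of corrupting `len`.
    let overflow_accepted = buf.push(5);
    (buf.last(), overflow_accepted)
}
```

If `push` had a bug that let `len` grow past 4, the out-of-bounds read would happen in `last`, yet the mistake would sit in code that contains no `unsafe` at all. The whole module is the boundary that has to be audited.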

One way to think of it is this: The documentation on Vec::from_raw_parts lays out various rules that, among other things, make it UB to call it with a null pointer. So the UB is invoked immediately at the call, because the call violates the function's documented safety contract.

Now what does UB mean? It means that the compiler can compile your program to do anything it wants, and "anything it wants" includes crashing on the my_vec[0] line.

I see what you are both talking about; it makes sense. I wanted to provide something more like the following:

let my_vec: Vec<u8> = unsafe {
    // Potentially unsafe construction of a Vec
};
//Then, in safe code
my_vec[0];

Where I wanted to point out that if I trigger UB, it could occur in the safe code because I broke an invariant in the unsafe code.

This relates to my problem like so: if I have a Dart handle, then to create an UnverifiedHandle, I must make sure that the handle I have is valid. I could follow the Vec example and force the programmer to ensure that the handle is valid by making them use unsafe code when creating the UnverifiedHandle. Or I could force them to ensure that it's valid when they try to use said handle, as with std::mem::MaybeUninit, where construction is safe but consuming the value (for example with assume_init) requires unsafe.
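For comparison, a minimal sketch of the MaybeUninit pattern mentioned above: construction needs no `unsafe`, and the obligation is only discharged at the point where the value is read out. The function name `demo_maybe_uninit` is made up for this example.

```rust
use std::mem::MaybeUninit;

fn demo_maybe_uninit() -> u32 {
    // Construction is safe: nothing is promised about the contents yet.
    let mut slot: MaybeUninit<u32> = MaybeUninit::uninit();

    // Storing a value through `write` is safe as well.
    slot.write(42);

    // Reading the value out is the unsafe step.
    // SAFETY: `slot` was initialized by the `write` call above.
    unsafe { slot.assume_init() }
}
```

This design fits MaybeUninit because holding an uninitialized value is a legitimate intermediate state. Whether the same holds for a handle wrapper depends on whether an invalid handle is ever a state worth representing.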

This suggests that the constructor itself must be unsafe, with a very loud description of its invariants and that breaking those invariants will produce undefined behavior.
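A rough sketch of what that could look like, under loud assumptions: `MockObject` and `RawHandle` are stand-ins invented so the example compiles on its own (the real `_Dart_Handle` is opaque and owned by the Dart VM), and a real `is_error` would presumably call into the Dart API rather than read a field.

```rust
/// Stand-in for the object behind a handle. Hypothetical: it exists only
/// so this sketch is self-contained.
pub struct MockObject {
    pub is_error: bool,
}

/// Mirrors the shape of `pub type Dart_Handle = *mut _Dart_Handle;`.
pub type RawHandle = *mut MockObject;

pub struct UnverifiedHandle {
    handle: RawHandle,
}

impl UnverifiedHandle {
    /// Wraps a raw handle.
    ///
    /// # Safety
    ///
    /// `handle` must be non-null and must point to a live object for as
    /// long as the returned value is used. Violating this is undefined
    /// behavior, even if the crash shows up later in a safe method.
    pub unsafe fn new(handle: RawHandle) -> Self {
        UnverifiedHandle { handle }
    }

    /// Safe to call, because `new` already made the caller promise validity.
    pub fn is_error(&self) -> bool {
        // SAFETY: guaranteed by the contract of `new`.
        unsafe { (*self.handle).is_error }
    }
}

fn demo_handle() -> (bool, bool) {
    let mut ok = MockObject { is_error: false };
    let mut err = MockObject { is_error: true };
    // SAFETY: both pointers come from live locals that outlive the handles.
    let (a, b) = unsafe {
        (UnverifiedHandle::new(&mut ok), UnverifiedHandle::new(&mut err))
    };
    (a.is_error(), b.is_error())
}
```

With this shape, the caller writes `unsafe` once at construction, next to the `# Safety` contract, and the remaining methods stay safe.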

Does this imply that there is a valid use case for constructing UnverifiedHandle with garbage? In other words, what are the invariants you are trying to uphold with this type? MaybeUninit has a pretty clear use case: the value may be uninitialized at construction time and must be initialized later.

No, I don't see why there would be a reason for creating an UnverifiedHandle with garbage.

This effectively decides it then, thank you!