Unsafe C function that changes more than its direct parameter

I am wondering about the side-effects of a C function that modifies more than what it is directly passed.

Consider the following structures that C code will work with (simplified versions of the actual ones generated by bindgen):

#[derive(Clone, Copy, Default)]
struct Plane(u32);
struct Buffer {
    num_planes: u32,
    planes: *mut Plane,
}

A Buffer holds a number of Planes, described by a C array pointed to inside the Buffer structure.

Now say I want to call an C function that will fill both Buffer and Planes:

let mut planes: [Plane; 8] = Default::default();
let mut buffer = Buffer {
    num_planes: planes.len() as u32,
    // Hold my beer.
    planes: &mut planes[0] as *mut Plane,
};

// query_buffer_ioctl() may modify planes as well!
unsafe { query_buffer_ioctl(&mut buffer); }

The C code will fill the planes pointer to by buffer. Now the above code looks quite fishy to me: the compiler has no reason to believe that planes is modified by the C function and could try to optimize future reads away ("it's initialized to 0s, so it's still all 0s").

Worse, I'm not even sure the compiler has a reason to think that planes need to exist by the time the C function is called.

Could someone who understand this better suggest a safer version of this code?

A pointer (&mut buffer) to a pointer (&mut planes[0]) to the (first element of the) array is passed to C.
Looks valid to me.

However i would probably write planes.as_mut_ptr() instead of &mut planes[0] as *mut Plane.

2 Likes

Agreed on using as_mut_ptr.

Also, the unsafe block should encompass all unsafety such that any edits outside the unsafe block don't break any invariants. In this case, I think the entire let mut buffer statement should be inside the unsafe block. Currently, you could insert a (safe) mutable access to planes after the creation of buffer, but before the use of buffer in the ioctl. That could result in a data race.

As to how the compiler can prove that it needs to keep planes around? The same question applies when using C. I don't think you need to worry about this. Note that lifetimes and borrowing are a Rust construct, and most the optimizations are done in LLVM. By the time your code gets to LLVM, everything is pointers, just like it would be in C. All the ownership and lifetime information is gone then. Since you took a pointer to planes, presumably the compiler knows not to throw it out.

1 Like

Strong disagree with this. All safe code inside a module which contains any unsafe code should always be suspect - we don't need extra unsafe blocks to denote this. Keeping the unsafe blocks small lets us focus on exactly what the compiler would otherwise complain about, and comments can tell us just as well about what other safe code and fields needs to matter.

Some safe code it isn't even possible to mark as unsafe - like struct definitions, yet they matter just as much in many cases. I'd argue that this is why we should care much more about any code in a module which contains unsafe code, not just the unsafe code itself.

4 Likes

If you create a raw pointer with &mut planes[0] as *mut Plane, then that raw pointer is only allowed to access the first element, and it would be UB to access the other seven elements with the pointer. Using as_mut_ptr does not have this issue. This is because a raw pointer created from a reference can only access what you could have accessed through the reference, and a reference to the first element can't modify the other seven elements. You can see this by picking MIRI under the tools in this playground.

The compiler wont optimize your reads away unless you create a reference to the array between creating the raw pointer and passing the raw pointer to C.

// this is ok
let ptr = planes.as_mut_ptr();
call_c_func(ptr);
// this is not ok
let ptr = planes.as_mut_ptr();
let r = &mut planes[0]; // this invalidates the pointer
call_c_func(ptr);
5 Likes

Thanks for all the insightful answers! As you have probably guessed I am still learning the ropes of system programming using Rust and this is super useful. I didn't know about MIRI and will be including it into my testing routine.

I was also thinking of maybe using Pin to make sure the plane data does not move away (thus invalidating buffer's inner pointer), but maybe that's overthinking it?

Using Pin might overcomplicate things because the invariants it provides aren't necessarily the invariants that you expect. Pin is... subtle.

I've been playing around with unsafe Rust for several years now and would consider myself quite an advanced user, and even I got tricked by thinking (the equivalent of) Pin<&mut Plane> would guarantee the data doesn't move. To actually guarantee it can't be moved you need to make sure Plane: !Unpin, which is only possible in stable Rust by adding a std::marker::PhantomPinned to the type.

I'd agree with @Michael-F-Bryan - it's kind of overkill. Plus, it might lead you to even more UB!

Specifically, if you create a Pin of a value, that value must never again move, ever. So if your plan is to initialize the array with the C function, then return it, and you pin the array, you can't actually return it - since that would be moving the array. If you Pin it once, you guarantee that it will never move again.

The array is supposed to be very short lived and never exit the function, so that behavior should be ok. On the other hand it is so short-lived (create array; call ioctl; convert array to more appropriate structure) that it indeed appears overkill to have it pinned. Maybe I'm just overthinking it.

Besides, the way pin enforces that values are not moved again is by forcing people to use unsafe to construct a Pin<&mut TheValue>, thus having them promise not to move it. The real purpose is to allow one library to promise another library that the first library wont move some value, allowing the second library to rely on that.

You don't gain anything when there isn't that kind of library/module distinction.

2 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.