Converting between references and *c_void

I spent a little while yesterday figuring out how to convert between Rust references and C-style FFI void* pointers, and I thought it worth writing a few notes on my experiences. There were a few wrinkles which tripped me up.

First of all, I struggled to find a concise summary of how to do this. Eventually I ended up with two simple wrapper functions which do the job:

unsafe fn voidp_to_ref<'a, T>(p: *const c_void) -> &'a T
{
    unsafe { &*(p as *const T) }
}

fn ref_to_voidp<T>(r: &T) -> *const c_void
{
    r as *const T as *const c_void
}
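
A minimal round-trip sketch of how the two wrappers might be used together. The helper round_trip is hypothetical (it is not from the original post), and the safety comment states the assumption the caller has to uphold:

```rust
use std::ffi::c_void;

/// Safety: `p` must be non-null, aligned for `T`, and point to a live `T`
/// that stays valid (and is not mutated) for the whole of `'a`.
unsafe fn voidp_to_ref<'a, T>(p: *const c_void) -> &'a T {
    unsafe { &*(p as *const T) }
}

fn ref_to_voidp<T>(r: &T) -> *const c_void {
    r as *const T as *const c_void
}

/// Hypothetical helper: sends a reference through `*const c_void` and back.
fn round_trip(value: &u32) -> u32 {
    let p = ref_to_voidp(value);
    // Safety: `p` was just derived from a live `&u32`, which outlives `back`.
    let back: &u32 = unsafe { voidp_to_ref(p) };
    *back
}
```

Calling round_trip(&42) should hand back the same value, since the pointer never leaves the scope in which the original reference is alive.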

My observations in getting there, in no particular order.

  • Actually one of the unsafe keywords needs to go, but I'm not sure which is the right one to lose. I find it odd that an unsafe function automatically marks its entire implementation as unsafe; I'd rather focus my use of unsafe where it's important.

  • Speaking of focusing unsafe, this threw me for a bit: I know that the dereference *(...) is the truly unsafe part, so I wrote the following instead (and not wrapped inside a function, so the object lifetime was less managed):

    &unsafe{*(p as *const T)}
    

    Turns out that this has completely different semantics (returning a reference to a copy of *p instead)! Wow. Are there other cases where the presence or absence of unsafe makes a semantic difference to code that otherwise compiles? Or is it just this particular &* glyph that needs to be left unbroken?

  • Finally, the double cast in ref_to_voidp was a surprise, though I can see why it's happening. The compiler messages didn't help me when I tried the direct cast r as *const c_void: the compiler just says "E0606 casting ... is invalid"; for quite a while I thought I was on completely the wrong track.

My main problem was probably finding my way around the documentation, but it's hard to find a concise summary of the bits I'm after at any particular moment!

By the way, what is the inferred lifetime 'a from &*p? I think I have no idea... unless it's simply "as long as required" (which I guess might as well be 'static)!

The inferred lifetime in voidp_to_ref can indeed be anything, including 'static. And yes, it's unfortunate that marking a function unsafe makes the contents not require unsafe blocks.

I think this is a property of blocks in general, not just unsafe {}! &{*x} differs in the same way that &unsafe{*x} does. Having a block surrounding an expression forces whatever the expression evaluates to to be moved (or copied, for a Copy type), because while an expression can result in a place, a block must produce a value.
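
One way to see the difference is to compare addresses. This is only an illustrative sketch; the function name addresses is made up for the example:

```rust
/// Returns (address of the referent, address via `&*x`, address via `&{ *x }`).
fn addresses(x: &u64) -> (usize, usize, usize) {
    let original = x as *const u64 as usize;

    // `&*x` reborrows the same place, so the address is unchanged.
    let reborrowed: &u64 = &*x;

    // `{ *x }` must produce a value, so the `u64` is copied into a temporary
    // and the reference points at that temporary instead.
    let copied: &u64 = &{ *x };

    (
        original,
        reborrowed as *const u64 as usize,
        copied as *const u64 as usize,
    )
}
```

The first two addresses should be identical, while the third belongs to a separate temporary.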

Oh, that makes sense. So &{*x} forces a copy (to get the *x value out of the block), whereas &*x is a sort of short circuit no-op (except for side effects like type conversion triggered by the dereference, I guess)?

I'd better watch where I put my blocks now...

Yep! *x kind of produces a value, but that value has a place associated with it, so & can just grab that place. Kind of similar to how &struct.field doesn't move out of struct.field, only grabs its place.

I usually don't worry about it, as it's pretty rare to produce non-value values from blocks, and the compiler will (in most instances) error out if this causes bad behavior.

The only place it can really get concerning is if you do &mut { ... } with some Copy value, and accidentally mutate the copy.
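
A small sketch of that pitfall, with a hypothetical function name chosen for the example. The first mutation lands on a temporary copy, the second on the variable itself:

```rust
fn mutate_copy_vs_place() -> (i32, i32) {
    let a = 1;
    let mut b = 1;
    {
        // The block yields a copy of `a`; `r` points at that temporary,
        // which is why `a` does not even need to be `mut`.
        let r = &mut { a };
        *r += 10;
    }
    {
        // No block: `r` points at `b` itself.
        let r = &mut b;
        *r += 10;
    }
    (a, b)
}
```

Here a keeps its original value and only b is updated.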

There is no doubt about which one has to stay: it must not be possible to call a function from non-unsafe code (which is what you get when the function itself is not an unsafe fn) and yet cause Undefined Behavior. So, since voidp_to_ref(ptr::null()) is UB, voidp_to_ref must be marked unsafe.

The fact that you lose the unsafe { ... } requirement within the function body is an oversight / mistake that the language is working on; at the time of writing, it looks like this is going to be changed in the 2021 edition. In the meantime, you can use the #[require_unsafe_in_body] attribute (a third-party procedural macro, not part of the standard library) on the function:

#[require_unsafe_in_body]
/// Safety: `p` must be a non-NULL, well-aligned,
/// readable pointer to an immutable `T`; and all of this must
/// hold for the lifetime `'a`.
unsafe
fn voidp_ref<'a, T : 'a> (p: *const c_void) -> &'a T
{
    unsafe {
        // Safety: the narrow contract of the function guarantees
        // this is sound.
        &*p.cast()
    }
}
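
As an alternative sketch that needs no extra crate: reasonably recent compilers ship a built-in lint called unsafe_op_in_unsafe_fn, which can be denied on a single function. The example below assumes such a toolchain, and read_through_void is a hypothetical caller added for illustration:

```rust
use std::ffi::c_void;

/// Safety: `p` must be non-null, well-aligned, and point to a `T` that
/// stays valid and unmodified for the whole of `'a`.
#[deny(unsafe_op_in_unsafe_fn)]
unsafe fn voidp_to_ref<'a, T: 'a>(p: *const c_void) -> &'a T {
    // With the lint denied, deleting this inner block is a compile error,
    // so the unsafe operation stays explicitly scoped.
    unsafe { &*p.cast::<T>() }
}

/// Hypothetical caller, added only to exercise the function.
fn read_through_void(x: &i16) -> i16 {
    let p = x as *const i16 as *const c_void;
    // Safety: `p` comes from a live reference to an `i16`.
    let r: &i16 = unsafe { voidp_to_ref(p) };
    *r
}
```

The outer unsafe still marks the function as unsafe to call; the inner block is what the lint insists on.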

Yes, () and {} are not exactly the same: the former only groups operands to make the order of evaluation explicit, whereas the latter forces an evaluation to a value expression (~rvalue in C++ terms).

The * operator creates a place expression (the place the address points to), from which we can:

  • take a reference with the & operator,

  • upgrade to a value expression when it is Copy (this is what happens when using braces { }), which can itself be evaluated as a place expression by generating a (local) temporary with that value, which the & operator operates on.

In other words, by doing {} when the type is Copy, you force it to become a value expression by Copy. Then, by doing & on it, you force the value to become a place expression by creation of an ephemeral local / stack value, so as to make it have an address.

It is indeed "unfortunate" that unsafe defines a block and thus implies the latter semantics.

At the end of the day, it is far simpler to just see &* as a single operation: that of upgrading a raw pointer to a Rust reference. That upgrade implies many invariants (so, when in doubt, it is better to avoid having such unconstrained functions), and thus the unsafe { } block must contain both sigils inside.

  • The unbounded lifetime is quite easy to misuse. Most such functions in Rust are thus of the form:

    unsafe
    fn voidp_ref<'a, T : 'a> (&p: &'a *const c_void) -> &'a T
    {
        unsafe { &*p.cast() }
    }
    

    This way when you call it on some local pointer you don't get a lifetime that can escape the function.
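
A hedged usage sketch of the bounded form. The caller sum_via_void is hypothetical, and the turbofish on cast is spelled out only for readability:

```rust
use std::ffi::c_void;

/// Safety: `*p` must be non-null, well-aligned, and point to a valid `T`
/// for as long as the returned reference is used.
unsafe fn voidp_ref<'a, T: 'a>(&p: &'a *const c_void) -> &'a T {
    unsafe { &*p.cast::<T>() }
}

/// Hypothetical caller: the returned reference is tied to the borrow of `p`.
fn sum_via_void(values: &[u8; 3]) -> u32 {
    let p: *const c_void = (values as *const [u8; 3]).cast();
    // Safety: `p` was derived from a live `&[u8; 3]`.
    let arr: &[u8; 3] = unsafe { voidp_ref(&p) };
    // Returning `arr` itself from this function would be rejected by the
    // borrow checker, because it borrows from the local variable `p`.
    arr.iter().map(|&b| u32::from(b)).sum()
}
```

The design trade-off is that the reference cannot outlive the pointer variable it was created from, even when the pointee would actually live longer.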

Yes, references cannot be cast, only raw pointers can. But a Rust reference can be coerced to its downgraded "raw" form. So, before being able to cast the pointer, you need to coerce the reference:

  • <ref> as *const _

  • let p: *const _ = <ref>; (implicit coercion, proves this is not a cast)

  • and my personal favorite (implicit coercion inlined within the successive cast):

    <*const _>::cast::<T>(r)
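
A short sketch of the first two spellings side by side, going from a reference to *const c_void. The function names are made up for the example:

```rust
use std::ffi::c_void;

/// Explicit `as` on the reference, followed by the pointer-to-pointer cast.
fn via_as<T>(r: &T) -> *const c_void {
    r as *const T as *const c_void
}

/// Implicit coercion at the `let`, followed by `cast`.
fn via_let<T>(r: &T) -> *const c_void {
    let p: *const T = r; // no `as` needed: this is a coercion, not a cast
    p.cast::<c_void>()
}

/// Checks that both routes produce the address of the referent.
fn all_agree(r: &u32) -> bool {
    let direct = r as *const u32 as usize;
    via_as(r) as usize == direct && via_let(r) as usize == direct
}
```

Both routes only change the pointer's type, so the resulting address should be the same in each case.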
    