What are smart pointers? (Part 2 with definition that compiles)

Of course SmartPointer::from_raw_parts needs to be unsafe, but that isn't what I meant.

SmartPointer::into_raw_parts is not unsafe (and doesn't need to be). But for SmartPointer to be useful, we must give another guarantee:

    // Sound implementations must return a pointer that's dereferenceable until `from_raw_parts` is invoked:
    fn into_raw_parts(this: Self) -> (*mut <Self as Deref>::Target, Self::ExtraMetadata);

So that this can be sound:

fn test_value<T>(smart_pointer: T)
where
    T: SmartPointer + Debug,
    <T as Deref>::Target: Debug,
{
    let (ptr, meta) = SmartPointer::into_raw_parts(smart_pointer);
    let inspect: &<T as Deref>::Target = unsafe { &*ptr };
    println!("Pointer dereference: {inspect:?}");
    let restored: T = unsafe { SmartPointer::from_raw_parts(ptr, meta) };
    println!("Restored smart pointer: {restored:?}");
}

… as long as all unsafe impl SmartPointer for /* … */ provide a proper implementation of into_raw_parts.

Compare with Arc::into_raw, which is also not unsafe. But Arc is a type and not a trait and thus we know that whatever Arc::into_raw produces can be fed back into Arc::from_raw (given the pointer is still valid and has the correct offset from the beginning of the Arc in memory).
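
For comparison, here is a minimal sketch of that round trip with the concrete type. The helper name `arc_round_trip` is made up for illustration; only `Arc::into_raw`, `Arc::from_raw` and `Arc::strong_count` come from std.

```rust
use std::sync::Arc;

/// Hypothetical helper: turns an `Arc` into a raw pointer, reads through it,
/// and rebuilds the `Arc` from the same pointer.
fn arc_round_trip() -> (i32, usize) {
    let original: Arc<i32> = Arc::new(42);

    // Safe: this only gives up ownership and hands back a raw pointer.
    let raw: *const i32 = Arc::into_raw(original);

    // SAFETY: `raw` came from `Arc::into_raw` and the `Arc` has not been
    // rebuilt or dropped yet, so the allocation is still alive.
    let seen = unsafe { *raw };

    // SAFETY: `raw` is exactly the pointer returned by `Arc::into_raw`,
    // and it is converted back only once.
    let restored: Arc<i32> = unsafe { Arc::from_raw(raw) };

    (seen, Arc::strong_count(&restored))
}
```

Because `Arc` is a concrete type, the guarantee about `raw` comes from std's documentation rather than from a trait contract.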

When we deal with a trait, however, implementors of SmartPointer::into_raw_parts (which is safe!) could provide a broken implementation. By making the trait (not the method) unsafe, we can demand that sound code must provide a correct implementation of into_raw_parts.

Your "so that this can be sound" code, which importantly is the user of the trait, contains unsafe. That is exactly because you are trying to dereference the raw pointer you got from into_raw_parts() and because you are trying to call from_raw_parts(). Since dereferencing a raw pointer and calling from_raw_parts are always unsafe, there is no way to observe a dangling pointer without using unsafe, even if the SmartPointer trait is safe. Thus, the trait doesn't need to be unsafe in order to be sound.

I think you've got your responsibilities backwards. unsafe code isn't allowed to rely on traits being implemented correctly. It must be paranoid and defensive and anticipate that (at least) 3rd-party non-std code will contain wrong implementations.
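
A rough sketch of what "paranoid and defensive" can look like when the trait in question is a safe one. The trait `ReportedLen` and the types `Honest` and `Liar` are invented for this example and are not part of the discussion above.

```rust
/// Hypothetical *safe* trait: nothing forces implementations to be truthful.
trait ReportedLen {
    fn reported_len(&self) -> usize;
}

struct Honest(usize);
struct Liar;

impl ReportedLen for Honest {
    fn reported_len(&self) -> usize {
        self.0
    }
}

impl ReportedLen for Liar {
    // A wrong, but perfectly safe, implementation.
    fn reported_len(&self) -> usize {
        usize::MAX
    }
}

/// Returns the last byte according to the reported length.
/// Uses checked indexing, so a lying implementation yields `None`.
/// Replacing `get` with `get_unchecked` here would be unsound, because
/// the safe trait gives no guarantee about the reported length.
fn last_reported<L: ReportedLen>(data: &[u8], source: &L) -> Option<u8> {
    let last_index = source.reported_len().checked_sub(1)?;
    data.get(last_index).copied()
}
```

With `Honest(3)` and a three-byte slice this returns the last byte; with `Liar` it returns `None` instead of reading out of bounds.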


Consider:

// This implementation is unsound:
unsafe impl<T> SmartPointer for Arc<T> {
    type ExtraMetadata = ();
    fn into_raw_parts(this: Self) -> (*mut <Self as Deref>::Target, Self::ExtraMetadata) {
        let _ = this;
        (std::ptr::null_mut(), ())
    }
    unsafe fn from_raw_parts(ptr: *mut <Self as Deref>::Target, extra_meta: Self::ExtraMetadata) -> Self {
        // But this isn't unsound:
        panic!();
    }
}

(Playground)

If SmartPointer were not an unsafe trait, then the above implementation couldn't be considered unsound. When someone implements SmartPointer, we must be able to rely on the returned pointer fulfilling certain properties. This cannot be enforced by the compiler. Thus we should mark the trait as unsafe and lay down rules for how into_raw_parts should behave (we don't have to, but if we don't, our trait isn't very useful).

And indeed, it isn't – as long as you want to do anything with the returned raw pointer that would invoke UB due to it being null, you would have to write unsafe.

I didn't mean that you can't actually create a dangling pointer (that's trivial in safe code anyway, by casting align_of::<T>() to *const/mut T). The point is you can't do anything non-trivial with it without writing unsafe.
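
To make that concrete, a small sketch of creating such a pointer without any unsafe. The function names are made up; the cast itself is ordinary safe Rust.

```rust
use std::mem::align_of;

/// Hypothetical helper: builds a non-null, well-aligned, dangling pointer
/// in safe code by casting the alignment of `u64` to a pointer.
fn dangling_u64() -> *mut u64 {
    align_of::<u64>() as *mut u64
}

/// Everything here is safe: we only look at the address, never at the pointee.
fn describe_dangling() -> (bool, bool) {
    let ptr = dangling_u64();
    let is_null = ptr.is_null();
    let is_aligned = (ptr as usize) % align_of::<u64>() == 0;
    (is_null, is_aligned)
}
```

Reading or writing through the result would require an unsafe block, which is where the responsibility sits.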

Yes, we will always need unsafe to do something with the pointer; but if the trait isn't unsafe too, then it's impossible to soundly use unsafe later.

My idea was that the unsafe should be here:

unsafe trait SmartPointer: Deref {

So that this can be sound:

    let (ptr, meta) = SmartPointer::into_raw_parts(smart_pointer);
    let inspect: &<T as Deref>::Target = unsafe { &*ptr };

If the trait weren't marked unsafe, then we would have no way to soundly dereference the returned pointer (unless SmartPointer were a sealed or private trait).
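
For the sealed case, something like the following might do. This is only a sketch: the names `sealed::Sealed`, `SealedPointer`, `into_raw_ptr`, `from_raw_ptr` and `peek` are hypothetical, and metadata is left out to keep it short.

```rust
use std::ops::Deref;

mod sealed {
    // Code outside this crate cannot name this trait, so it cannot
    // implement `SealedPointer` either.
    pub trait Sealed {}
    impl<T> Sealed for Box<T> {}
}

/// Hypothetical safe (not `unsafe`) trait. Because it is sealed, the only
/// implementations are the ones written right here, and we can audit them.
pub trait SealedPointer: Deref + sealed::Sealed {
    fn into_raw_ptr(this: Self) -> *mut <Self as Deref>::Target;
    unsafe fn from_raw_ptr(ptr: *mut <Self as Deref>::Target) -> Self;
}

impl<T> SealedPointer for Box<T> {
    fn into_raw_ptr(this: Self) -> *mut T {
        Box::into_raw(this)
    }
    unsafe fn from_raw_ptr(ptr: *mut T) -> Self {
        // SAFETY: the caller promises `ptr` came from `into_raw_ptr`.
        unsafe { Box::from_raw(ptr) }
    }
}

/// Reads the target through the raw pointer, then restores the pointer.
pub fn peek<P>(pointer: P) -> (P::Target, P)
where
    P: SealedPointer,
    P::Target: Copy,
{
    let raw = P::into_raw_ptr(pointer);
    // SAFETY: every implementation of the sealed trait lives in this file
    // and returns a pointer that is valid until `from_raw_ptr` is called.
    let value = unsafe { *raw };
    // SAFETY: `raw` was just produced by `into_raw_ptr` and is used once.
    let restored = unsafe { P::from_raw_ptr(raw) };
    (value, restored)
}
```

The unsafe blocks in `peek` rely on knowing every implementation, which is only possible because nobody else can add one.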

Actually that's wrong. If we can store metadata, it's always possible to temporarily turn something into a raw pointer that's dereferenceable:

use std::ops::Deref;

unsafe trait SmartPointer: Deref {
    type ExtraMetadata;
    fn into_raw_parts(this: Self) -> (*mut <Self as Deref>::Target, Self::ExtraMetadata);
    unsafe fn from_raw_parts(ptr: *mut <Self as Deref>::Target, extra_meta: Self::ExtraMetadata) -> Self;
}

unsafe impl<T> SmartPointer for T
where
    T: Deref,
{
    type ExtraMetadata = Box<Self>;
    fn into_raw_parts(this: Self) -> (*mut <Self as Deref>::Target, Self::ExtraMetadata) {
        let boxed = Box::new(this);
        (&**boxed as *const _ as *mut _, boxed)
    }
    unsafe fn from_raw_parts(ptr: *mut <Self as Deref>::Target, extra_meta: Self::ExtraMetadata) -> Self {
        let _ = ptr;
        *extra_meta
    }
}

(Playground)

I agree now that the strict definition is more useful than the compromise that allows extra metadata. (Edit: Maybe not, see following post.) When using the strict definition, however, it's clear that Deref should also be implemented for some types that are not smart pointers (e.g. String). Following that, the documentation of Deref should (still) be updated.

But isn't there a difference between raw pointers and smart pointers other than whether they are "owning" or "borrowing"? Unlike a raw pointer, a smart pointer can be dereferenced in safe code. Thus I feel like many "smart pointers" should rather be named "smart references". And if we do so, we could just relax the overall definition anyway and allow anything that can be "dereferenced" (which isn't a plain reference) to be a smart reference.

Then the only question that would be remaining is when to implement Deref, i.e. when to make a type dereferenceable. And that is (more or less):

However, I deviate from @tczajka's view insofar as I would then call the type dereferenceable (or a "smart reference", except for the trivial case of &).

Mere wrappers generally shouldn't implement Deref. I think in particular the newtype pattern, for example, shouldn't implement Deref.

I see one way out to provide a more formal definition of "smart pointer" while allowing extra metadata. It would be to make extra demands in regard to implementation details of SmartPointer::into_raw_parts:

use std::ops::Deref;

// UNSAFE: Implementors must ensure that `SmartPointer::into_raw_parts.0`
// can be dereferenced at least until `SmartPointer::from_raw_parts` is
// used.
// Moreover, it should only be implemented if `into_raw_parts` is a cheap
// operation (i.e. doesn't allocate, for example).
unsafe trait SmartPointer: Deref {
    type ExtraMetadata;
    fn into_raw_parts(this: Self) -> (*mut <Self as Deref>::Target, Self::ExtraMetadata);
    unsafe fn from_raw_parts(ptr: *mut <Self as Deref>::Target, extra_meta: Self::ExtraMetadata) -> Self;
}
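
As a sketch of what implementations that follow these rules might look like, here are impls for `Box<T>` and `Arc<T>` with no extra metadata and no allocation. The trait is repeated so the block stands alone; the helper `peek_and_restore` is a made-up name for the generic usage discussed earlier.

```rust
use std::ops::Deref;
use std::sync::Arc;

unsafe trait SmartPointer: Deref {
    type ExtraMetadata;
    fn into_raw_parts(this: Self) -> (*mut <Self as Deref>::Target, Self::ExtraMetadata);
    unsafe fn from_raw_parts(ptr: *mut <Self as Deref>::Target, extra_meta: Self::ExtraMetadata) -> Self;
}

// SAFETY: `Box::into_raw` returns a pointer that stays valid until it is
// passed back to `Box::from_raw`. No allocation happens here.
unsafe impl<T> SmartPointer for Box<T> {
    type ExtraMetadata = ();
    fn into_raw_parts(this: Self) -> (*mut T, ()) {
        (Box::into_raw(this), ())
    }
    unsafe fn from_raw_parts(ptr: *mut T, _extra_meta: ()) -> Self {
        // SAFETY: the caller promises `ptr` came from `into_raw_parts`.
        unsafe { Box::from_raw(ptr) }
    }
}

// SAFETY: `Arc::into_raw` keeps the strong count, so the target stays alive
// until `Arc::from_raw` is called. The `*mut` is only there to match the
// trait signature; callers must not write through it.
unsafe impl<T> SmartPointer for Arc<T> {
    type ExtraMetadata = ();
    fn into_raw_parts(this: Self) -> (*mut T, ()) {
        (Arc::into_raw(this) as *mut T, ())
    }
    unsafe fn from_raw_parts(ptr: *mut T, _extra_meta: ()) -> Self {
        // SAFETY: the caller promises `ptr` came from `into_raw_parts`.
        unsafe { Arc::from_raw(ptr as *const T) }
    }
}

/// Hypothetical generic user: clones the target through the raw pointer,
/// then rebuilds the smart pointer.
fn peek_and_restore<P>(pointer: P) -> (P::Target, P)
where
    P: SmartPointer,
    P::Target: Clone,
{
    let (ptr, meta) = P::into_raw_parts(pointer);
    // SAFETY: the unsafe trait's contract says `ptr` is dereferenceable
    // until `from_raw_parts` is called.
    let target: &P::Target = unsafe { &*ptr };
    let copy: P::Target = target.clone();
    // SAFETY: `ptr` and `meta` come from the matching `into_raw_parts` call.
    let restored = unsafe { P::from_raw_parts(ptr, meta) };
    (copy, restored)
}
```

Both impls just delegate to the std conversions, so `into_raw_parts` stays cheap as the comment on the trait asks.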

You are probably right. It may still help to phrase some parts of something in terms of Rust for a better understanding. The updated definition above contains some "human information" when to implement the trait.

It's not so uncommon. For example, AsRef can be implemented for costly conversions, but it shouldn't be:

If you need to do a costly conversion it is better to implement From with type &T or write a custom function.

So if we did the following:

// This is sound, but would violate the semantics of `SmartPointer`:
unsafe impl<T> SmartPointer for T
where
    T: Deref,
{
    type ExtraMetadata = Box<Self>;
    fn into_raw_parts(this: Self) -> (*mut <Self as Deref>::Target, Self::ExtraMetadata) {
        let boxed = Box::new(this); // sound but violating semantics of `SmartPointer`
        (&**boxed as *const _ as *mut _, boxed)
    }
    unsafe fn from_raw_parts(ptr: *mut <Self as Deref>::Target, extra_meta: Self::ExtraMetadata) -> Self {
        let _ = ptr;
        *extra_meta
    }
}

(Playground)

It compiles, but would violate the intended semantics.

We could now say that:

  • A "smart pointer" is something that could implement the unsafe trait SmartPointer (as long as into_raw_parts could be made reasonably cheap; and of course the returned raw pointer is usable until the raw parts are converted back into the smart pointer).
  • A reference-like type is anything that implements Deref.
  • A "smart reference" is a reference-like type which isn't & or &mut.
  • Deref should be implemented when a reference to the type should be treated similarly as a reference to a target type (i.e. when/if we want deref-coercion).
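
To illustrate the last three points, here is a sketch of a "smart reference" in that sense: a guard type that is not a smart pointer under the strict definition, but implements Deref so that deref-coercion works. The type `ReadGuard` and the functions around it are hypothetical.

```rust
use std::cell::Cell;
use std::ops::Deref;

/// Hypothetical guard: borrows a target and counts how often guards
/// have been released.
struct ReadGuard<'a, T: ?Sized> {
    target: &'a T,
    releases: &'a Cell<u32>,
}

impl<'a, T: ?Sized> Deref for ReadGuard<'a, T> {
    type Target = T;
    fn deref(&self) -> &T {
        self.target
    }
}

impl<'a, T: ?Sized> Drop for ReadGuard<'a, T> {
    fn drop(&mut self) {
        self.releases.set(self.releases.get() + 1);
    }
}

/// Takes a plain `&str`; knows nothing about guards.
fn char_count(text: &str) -> usize {
    text.chars().count()
}

fn guarded_char_count() -> (usize, u32) {
    let releases = Cell::new(0);
    let owned = String::from("hello");

    let guard = ReadGuard { target: &owned, releases: &releases };
    // Deref-coercion: &ReadGuard<String> -> &String -> &str.
    let count = char_count(&guard);
    drop(guard);

    (count, releases.get())
}
```

A reference to the guard can be used wherever a reference to the target is expected, which is the deref-coercion criterion from the last bullet.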
