Why doesn't `Option` support `dyn Trait`?

RefCell is just an inline cell. However the Rc has a pointer access. So Rc around a Box means two indirections from the caller to reach the actor data. I think that would be acceptable workaround for the case where the actor is intentionally implementing a trait, but I don't want to pay it for all actors. I'm still figuring out whether I can make your Actor<Box<dyn Trait>> idea work. It needs a bit more experimentation. I hope I can get it to work.

1 Like

Okay, @Douglas's idea of putting the Box<dyn Trait> inside the actor does actually work quite well, and doesn't require any changes to Stakker. So I think this can work as a workaround. It requires some boilerplate, but it's all on the actor implementation side, so is transparent to the users of the group of actors that implement the trait.

As far as I can see what @trentj said explains best why Option, and enums in general, don't support ?Sized. So if it's just to allow layout optimisations, then there is no fundamental reason why I can't implement my own enum with unsafe code that's compatible with ?Sized. The problem is right now my crate has a "no-unsafe" feature which disables all unsafe, so as there isn't a safe way of doing it, that would break that feature. So I'm limited to the workarounds either entirely within or entirely outside the actor interface.

Thanks everyone for their suggestions. I'll post the workaround code later.

With a bit more work I got rid of the boilerplate in the Actor<Box<dyn Trait>> solution. In the example below I previously had an Animal actor which encapsulated the Box<dyn Trait> and forwarded calls. But I realized that Box will auto-deref if called, so I could get rid of the intermediate type. This works out much cleaner. Everything seems to work fine within the actor. I will just have to add a new macro to Stakker to make it easy to create these actors. So this is okay. Only these trait-based actors will pay the cost of the extra indirection.

use stakker::*;
use std::time::Instant;

// Something like this will go in Stakker
macro_rules! actor_of_trait {
    ($core:expr, $trait:ty, $type:ident :: $init:ident($($x:expr),* $(,)? ), $notify:expr) => {{
        let notify = $notify;
        let core = $core.access_core();
        let actor = ActorOwn::<$trait>::new(core, notify);
        call!([actor], <$type>::$init($($x),*));
        actor
    }};
}

// Trait definition
type Animal = Box<dyn AnimalTrait>;
trait AnimalTrait {
    fn sound(&mut self, cx: CX![Animal]);
}

struct Cat {
    miaow_count: usize,
}
impl Cat {
    fn init(_: CX![Animal]) -> Option<Animal> {
        Some(Box::new(Cat { miaow_count: 0 }))
    }
}
impl AnimalTrait for Cat {
    fn sound(&mut self, _: CX![Animal]) {
        self.miaow_count += 1;
        println!("Miaow {}", self.miaow_count);
    }
}

struct Dog {
    bark_count: usize,
}
impl Dog {
    fn init(_: CX![Animal]) -> Option<Animal> {
        Some(Box::new(Dog { bark_count: 0 }))
    }
}
impl AnimalTrait for Dog {
    fn sound(&mut self, _: CX![Animal]) {
        self.bark_count += 1;
        println!("Woof {}", self.bark_count);
    }
}

pub fn main() {
    let mut stakker = Stakker::new(Instant::now());
    let s = &mut stakker;

    let animal1 = actor_of_trait!(s, Animal, Cat::init(), ret_nop!());
    let animal2 = actor_of_trait!(s, Animal, Dog::init(), ret_nop!());

    let mut list: Vec<Actor<Animal>> = Vec::new();
    list.push(animal1.clone());
    list.push(animal2.clone());
    list.push(animal1.clone());
    list.push(animal1.clone());
    list.push(animal2.clone());

    for a in list {
        call!([a], sound());
    }

    s.run(Instant::now(), false);
}

I was still wondering whether I could implement an unsized enum in an external crate in unsafe (but stable) Rust, and it appears not. The problem is that I need to express to the compiler the size of my enum struct. Both enum and union do this magic, taking the largest size out of a list of types, but enum doesn't support unsized, and union only supports Copy types (on stable). So unless I can find some other way to tell the compiler the size of my type, I've hit a brick wall -- at least until the compiler gets better support for union -- unless someone else can think of a way.

Otherwise, I'll accept defeat and proceed with Actor<Box<dyn Trait>>.

(union seems rather incomplete/broken as described, as it gives no mechanism for dropping a value of one member and putting in a new value of another member. However I could do direct unsafe access to the memory, ignoring the union members, if only I could get union to sort out the size and alignment, which it won't unless the types are Copy.)

As you point out above, the compiler needs to know at compile time the size of the largest alternative in a tagged enum or untagged union so that it can allocate the maximum required amount of memory and compute any required field-access offsets. !sized does not provide the information required for such a compile-time allocation. Therefore the only way to support compile-time allocation, and thus use an enum or union, is to store the !sized component(s) on the heap at run-time and link them into the enum or union via sized references (e.g., box) that the compiler can allocate at compile-time.

Edit: In theory rustc could create a tagged enum or untagged union of DSTs (dynamically sized types) where only the last field of one or more of the alternative sub-structs was !sized, because such a construction would permit the compiler to compute all required field-offsets at compile-time. I don't know whether rustc today supports such a construction. Such a !sized enum or union DST could not be allocated on the stack, but could be stored on the heap.

1 Like

union already accepts ?Sized types, so that is clearly possible (as you said in your Edit). All ?Sized types already have to be created on the heap, by some code that actually knows their sizes, before the pointer has the magic "dyn coercion" happen to it which makes it a ?Sized type. So I think that part's already taken care of.

The problem is that the implementation of union is incomplete. If union just handled size and alignment and left all access to field members to std::ptr methods, that would be fine. As it is, I guess they couldn't figure out how to make dropping stuff sane, and added the Copy restriction. Really there is no way to make dropping sane on a union, because at drop time, you don't know which member is active. So the only solution is to pass this responsibility over to the unsafe coder who's using the union, to use drop_in_place as necessary. It's error-prone but there is no other solution that I can see.

1 Like

There’s some related discussion here: https://github.com/rust-lang/rfcs/issues/1151

Edit: Don't use this as it's unsound. See below. However this will be possible as soon as MaybeUninit supports ?Sized.

For future reference, I've put together a DST-friendly implementation of Option, which does support dyn Trait. So it can be done!

1 Like

That structure is UB to use with types that aren't valid to be all zeroes. You want to use MaybeUninit, not ManuallyDrop.

1 Like

Okay, I tried MaybeUninit but it doesn't support ?Sized. Also, it doesn't matter what those values are as they are not accessed unless active is true. I tried mem::uninitialized() but it claims to be deprecated, but I couldn't find a way to get MaybeUninit::uninit() to work in its place. mem::zeroed() seemed to do the job.

Is there some theoretical reason why this is UB, rather than a practical one? In the None case, the memory won't get dropped or accessed, so it seems to me that it doesn't matter what value it has. What compiler UB could this cause?

Edit: ManuallyDrop might not have quite the right API, but I don't think it's UB how I'm using it. It must be valid to have uninitialised data in ManuallyDrop, because that's how it is after you call ManuallyDrop::drop. Zeros are just a special case of uninitialised data. Or maybe I could just use mem::uninitialized() instead and put up with the warning.

That's not true; the act of calling mem::zeroed<&T>() (for example) is enough to invoke instant UB, even if the result is never used. The docs for ManuallyDrop even mention this:

That's precisely because MaybeUninit is designed to make it harder to invoke this particular kind of UB. Unfortunately it doesn't work in this case because MaybeUninit<T> doesn't have T: ?Sized. I'm not sure why not; maybe it's just an oversight.

No, there is a difference between uninitialized data and data that has been dropped. Even drop cannot zero the bits of a &T.

1 Like

Okay, so zeroing was a bad idea (according to all the theory, although a practical example of how this could go wrong in this case would make it clearer). But if I used ManuallyDrop::new(mem::uninitialized()), would that be okay?

UB is just shorthand for breaking the contract with the optimizer, so what could it optimize that would make this go wrong? I have to initialize (or not initialize) some memory that I will never access. No code touches that memory after it is written (or not written).

Edit: So, I've found the rule in the Rustonomicon that says I shouldn't produce any value which isn't valid. It all seems a bit woolly around ManuallyDrop, though, since it can be post-drop and an invalid value and still be moved around. So maybe it just needs someone to bless this particular use of ManuallyDrop. Otherwise really I need MaybeUninit to support ?Sized, but since it is a union, and unions don't support ?Sized (according to an error message) I'm stuck. So that's all pretty disappointing. I can't even use ptr::write, as I need something to reserve space in the struct, i.e. do size/alignment.

The answer is niche filling size optimizations.

It's easiest to demonstrate with the guaranteed non-null optimization of Option<&T>. This isn't represented as (bool, MaybeUninit<&T>), it's roughly represented as union { 0usize, &T }.

If someone does Option<jimuazu::Opt<&T>>, the compiler is perfectly allowed to say "hey, in the &T part of Opt<&T>, 0x00000000 is an invalid bit pattern that can never occur. Let's say that setting the &T to that is how we represent Option::None, so we don't need to make Option<jimuazu::Opt<&T>> any bigger than jimuazu::Opt<&T>."

This is incorrect. A dropped ManuallyDrop<T> is not an invalid value. It is a "logically moved from" value. The raw bits are still valid for type T. Drop::drop is still not allowed to put your value into an invalid state, just an unsafe one (i.e. one where safety invariants (of the type itself, not of its members!) are not met).


I'm pretty sure what you're trying to do isn't possible on the stack. But you're using an indirection, so you can theoretically do better by dropping down to manually allocating your own indirection. It would be annoyingly difficult and require re-implementing basically all of Rc, RefCell, and Option manually using just raw allocation, raw pointers, and UnsafeCell, but it would be possible. (And then you still wouldn't be able to coerce Actor<Struct> to Actor<dyn Trait> anyway because CoerceUnsized is a compiler-internal trait (and probably stuck that way).)

4 Likes

Thanks for explaining it. Okay, I see that part now.

I think it could be an invalid value, because the Rustonomicon says that a Box that is dangling is an invalid value. But I see what you are saying. It will be valid bit patterns for those types.

I'm a lot further along than you imagine. I have a type that's effectively a custom Rc of a base structure, a discriminant and a union (but not using union), with one branch of the union being unsized. I have solved the CoerceUnsized problem, because if you can get it in the form of a Box<something> you can do the coercion, and then turn it to a raw pointer to make the Rc. So I think I can do this, without UB (although I will double-check the Rustonomicon list). It's unfortunate about Opt, though, but I was only implementing that to test some of the ideas.

If I understand correctly, there are two different cases of "invalid values" (described here). In short, null pointer as &T is insta-UB, since references being non-null is a validity invariant; Box being dangling is UB only if it is accessed, since it pointing to valid memory is a safety invariant.

2 Likes

The maybe-uninit crate uses exactly the same approach that I did (an uninitialised ManuallyDrop instance), and so is also unsound behind a safe interface.

The maybe-uninit crate is a backport of std::mem::MaybeUninit for older Rust versions.

It is exactly the same implementation as std::mem::MaybeUninit.

It uses a union.

Edit: Nevermind, the linked scr in docs.rs confused me. I didn’t notice that clicking [src] on this page links to the standard library type but that one is only used with a certain compilation flag.

Edit2: But the README also says that the API is unsound before Rust 1.35 where that implementation is used.

1 Like

Well you can kind of see how it happened and understand the author's dilemma, but it is still a huge footgun if someone wraps an Option around it, and it is a dependency of crates such as crossbeam. The unsoundness will only appear when testing against earlier Rust versions.

The list is certainly quite short and as far as I can tell crossbeam doesn’t have the maybe-uninit-dependency anymore on master, so that’ll go down to maybe-uninit being almost entirely unused in the near future.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.