Help moving out of Box<dyn trait> or safe transmuting traits

I have a trait that defines a state to be executed

pub trait State {
    type Output;

    fn execute(self) -> Result<Self::Output, Box<dyn std::error::Error>>;
}

This trait is meant to be called by a composable chain of states, like this

  // this is called like this A.and_then(|_| B).and_then(|_| C).execute()
  fn and_then<Next, F>(self, map_fn: F) -> AndThen<Self, Next, F>
  where
      Self: State + Sized,
      Next: State,
      F: FnOnce(Self::Output) -> Next,
  {
      AndThen {
          previous: self,
          map_fn,
          _marker: Default::default(),
      }
  }

pub struct AndThen<Prev, Next, F> {
    previous: Prev,
    map_fn: F,
    _marker: PhantomData<Next>,
}

impl<Prev, Next, F> State for AndThen<Prev, Next, F>
where
    Prev: State,
    Next: State,
    F: FnOnce(Prev::Output) -> Next,
{
    type Output = Next::Output;

    fn execute(self) -> Result<Self::Output, Box<dyn std::error::Error>>
    where
        Self: Sized,
    {
        let previous_output = self.previous.execute()?;
        let next_task = (self.map_fn)(previous_output);
        next_task.execute()
    }
}

The composition works fine when I have known states for the execution flow, the problem starts when I try to add dynamic states

    // The branch function returns a Box<dyn State>, prev.branch(|_| if cond { Box::new(B) } else { Box::new(C) })
    fn branch<Out, F>(self, branch_fn: F) -> Branch<Self, F>
    where
        Self: State + Sized,
        F: FnOnce(Self::Output) -> Box<dyn State<Output = Out>>,
    {
        Branch {
            previous: self,
            branch_fn,
        }
    }

pub struct Branch<Prev, F> {
    previous: Prev,
    branch_fn: F,
}

impl<Prev, Out, F> State for Branch<Prev, F>
where
    Prev: State,
    F: FnOnce(Prev::Output) -> Box<dyn State<Output = Out>>,
{
    type Output = Out;

    fn execute(self) -> Result<Self::Output, Box<dyn std::error::Error>>
    where
        Self: Sized,
    {
        let previous_output = self.previous.execute()?;
        let next_task = (self.branch_fn)(previous_output);

        // This fails because it can't move out of Box<dyn State>
        next_task.execute()
    }
}

I can work around by introducing a new trait and using mem::transmute, but I guess it works by chance and it is probably UB

trait BoxedState {
    type Output;
    fn execute(self: Box<Self>) -> Result<Self::Output, Box<dyn std::error::Error>>;
}

impl<Prev, Out, F> State for Branch<Prev, F>
where
    Prev: State,
    F: FnOnce(Prev::Output) -> Box<dyn State<Output = Out>>,
{
    type Output = Out;

    fn execute(self) -> Result<Self::Output, Box<dyn std::error::Error>>
    where
        Self: Sized,
    {
        let previous_output = self.previous.execute()?;
        let next_task = (self.branch_fn)(previous_output);

        // Now the call works, with the unsafe transmute
        let next_task: Box<dyn BoxedState<Output = Out>> =
            unsafe { std::mem::transmute(next_task) };

        next_task.execute()
    }
}

Is there a way to implement that "move out of dyn trait" without changing the trait to be self: Box<Self>?
Or is there a proper way to implement the trasmute/cast between the traits?

Here is a link with the full example in the Rust Playground

Thanks in advance

This is definitely UB, and I'm surprised it works at all. It's not the kind of UB where you're just violating some compiler invariants and may be punished by the optimizer. It's the kind where the thing you wrote makes no sense at all.

A Box<T> is basically a raw pointer *mut T to some heap-allocated data, together with resource semantics. For T: Sized (i.e. the size is known at compile time, which is true for most types), a raw pointer is just that - a data pointer, in the usual C sense. For unsized T (or, more properly, dynamically sized), a raw pointer consists of a data pointer and some extra metadata. For trait object dyn Trait, that metadata is a pointer to a statically known vtable, containing the pointers to the implementations of methods of Trait for some specific type T. Note that the vtable is different for different types implementing the same trait, as well as for different traits.

mem::transmute just blindly reinterprets the bits of the target type as the destination type. If you transmute trait objects, it means that the old data pointer is reinterpreted as the new data pointer, and the old vtable pointer is reinterpreted as the vtable pointer for the new trait. In general, for unrelated traits, this makes no sense at all: the methods are entirely different, and even if they have the same signatures, they may have different order. You case seems to accidentally have both methods be ABI-compatible, and in the same place. This will almost never happen.

Strictly speaking, the transformation you wrote (Box<dyn A> to Box<dyn B>) is impossible to do, because a trait object has no information about the type which is contained within. Correspondingly, you don't even know whether *Box<dyn A> implements B, much less can get the correct vtable.

Bonus: it is UB to transmute generic types with default representation. You can't transmute Foo<A> to Foo<B>, even if A and B are safe to transmute. This is because the instantiations of the same generic type with different concrete type parameters can have entirely different memory layout. You need to use type-specific conversion functions, if provided. For example, for Vec<T> these would be Vec::into_raw_parts and Vec::from_raw_parts, together with proper pointer casts. For Box<T>, you should use Box::into_raw and Box::from_raw, again with pointer casts.

I believe Box<T> is intended to be layout-equivalent to *mut T, at least with the default global allocator. But I couldn't find anything guaranteeing it in the docs, and using the proper functions isn't hard anyway, so just don't do box-to-box transmutes.


So, how can one solve your problem? There are multiple possible approaches, depending on your needs. I'll list a few.

  • You can add explicit methods to your traits, which would do the desired Box<dyn A> to Box<dyn B> conversion. Since the trait's implementation knows the specific implementing type, it can trivially give the desired new box. I.e. you need a signature fn (Box<Self>) -> Box<dyn Target>.

  • If the set of implementing types is fixed and known to you, you can make your traits subtraits of Any, and use dynamic downcasting at conversion site to learn the boxed type.

  • You can just side-step the problem. Change the signature of State::execute to be fn(&mut self) -> .... It's not as pretty, and can cause you problems when you try to move contained data into the output state, but it solves the trait object issue.

  • Similarly, do you really need dynamically typed state machines? Maybe you do, but you should really consider whether you're just complicating stuff for no good reason.

  • If both workarounds above don't apply, changing the signature to Box<Self> may be indeed the proper solution. After all, that's what you intend to use it with.

There may be other solutions, including something obvious which I've missed, but that's all that I can see at the moment.

4 Likes

Incidentally, Miri (under Tools, top-right) is always a good first-stop for "is this sound?" questions.

4 Likes

This is definitely UB, and I'm surprised it works at all. It's not the kind of UB where you're just violating some compiler invariants and may be punished by the optimizer. It's the kind where the thing you wrote makes no sense at all.

I am also surprised, everything I read so far tells me this shouldn't work. My guess is it works in the example because the example is small and it accidentally did the right thing, but it would probably fail catastrophically with code that have proper State implementations.

I believe Box<T> is intended to be layout-equivalent to *mut T, at least with the default global allocator. But I couldn't find anything guaranteeing it in the docs, and using the proper functions isn't hard anyway, so just don't do box-to-box transmutes.

I have tried to use the into_raw and from_raw functions, but I couldn't find out the proper way to write it, I'll dig a little bit more just for the sake of curiosity


Currently I'm solving the scenario by writing and enum and implementing the State trait for the enum, but it hides the flow inside the enum implementation, so I'm testing different implementations.

Just to have an idea

// Today I would implement State or this
pub enum TransferType {
    FullFile,
    Partial,
    NoTransfer,
}

// I can return the enum from the previous step and just call it
FileTransferProcess::new(..)
    .and_then(|_| QueryTransferType::new(..))
    .and_then(|transfer_type| transfer_type) // The flow is hidden
    .execute() 

// I wanted to have something like this
FileTransferProcess::new(..)
    .and_then(|_| QueryTransferType::new(..))
    .and_then_branch(|transfer_type| match transfer_type {
         TransferType::FullFile => Box::new(FullTransfer::new(..)
             .and_then(|_| AckTransfer::new(..)))
         TransferType::FullFile => Box::new(QueryRequiredBlocks::new(..)
             .and_then(|blocks| TransferBlocks::new(blocks))             
             .and_then(|_| AckTransfer::new(..)))
         TransferType::NoTransfer => Box::new(AckTransfer::new(..))
     }) // The flow is explicit
    .execute() 

I have a different implementation where everything is Box<dyn State>, but it is kind of ugly, so I am trying a different approach.

Anyway, I really appreciate the help, thanks a lot!

So long as T: Sized, a Box is guaranteed to be represented as a single pointer and is also ABI-compatible with C pointers (i.e. the C type T*).

But yeah, don't do box-to-box transmutes.

2 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.