Getting an `Rc`, from a field within an `Rc`. How unsafe is this?

nrxus · January 23, 2020, 8:42pm

I want to write code that looks like this:

use std::rc::Rc;

struct ParentFoo(ChildFoo);
impl ParentFoo {
    fn by_ref(&self) -> i32 {
        self.0.by_ref()
    }
    
    fn by_mut_ref(&mut self) -> i32 {
        self.0.by_mut_ref()
    }
    
    fn by_rc(self: Rc<Self>) -> i32 {
        self.0.by_rc()
    }
}

struct ChildFoo {
    a: i32,
}

impl ChildFoo {
    fn by_ref(&self) -> i32 {
        /* some code */
    }
    
    fn by_mut_ref(&mut self) -> i32 {
        /* some code */
    }
    
    fn by_rc(self: Rc<Self>) -> i32 {
        /* some code */
    }
}

Link to playground

This works great for &self and &mut self (and for self, not shown in the code), but not for Rc, since I can't get an Rc form of the child field from the parent field.

So I hacked this together. It feels super unsafe, but I am convincing myself that it's not. I wanted to check in with you all to see if there is a reason why this is actually wildly wrong and I should never ever ever do it.

use std::rc::Rc;

struct ParentFoo(ChildFoo);
impl ParentFoo {    
    fn by_rc(self: Rc<Self>) -> i32 {
        unsafe {
            let rcd = Rc::from_raw(&self.0 as *const ChildFoo);
            let result = rcd.clone().by_rc();
            // do not drop what is inside the Rc because it is not actually owned by it.
            std::mem::forget(rcd);
            result
        }
    }
}

struct ChildFoo {
    a: i32,
}

impl ChildFoo {
    fn by_rc(self: Rc<Self>) -> i32 {
        // what is in here does not matter for my use case
        self.a
    }
}

fn main() {
    let foo = Rc::new(ParentFoo(ChildFoo { a: 3 })).by_rc();
    assert_eq!(foo, 3);
}

Link to playground

For context, what I really want is the ability to have a parent struct and a child struct that both share the same function signatures, with the parent struct always proxying to the child. I can go more in-depth as to why for more context if anyone asks. Ideally I would also be able to do Box<Self>, Pin<Self> but I figured Rc<Self> and Arc<Self> might be good starting points.

mbrubeck · January 23, 2020, 9:02pm

I'm not 100% confident that this is possible to do soundly, but I think it might be. However, you probably at least need to ensure that the child struct is the only field in the parent struct, and that the parent struct is #[repr(transparent)]. Otherwise ChildFoo::by_rc could clone the Rc and then one of those clones could later be used to drop its allocation with the wrong layout.

Also, rcd might be dropped during unwinding if there is a panic during ChildFoo::by_rc. You should call forget before calling ChildFoo::by_rc:

let rcd = Rc::from_raw(&self.0 as *const ChildFoo);
let child_rc = rcd.clone();
// do not drop what is inside the Rc because it is not actually owned by it.
std::mem::forget(rcd);
child_rc.by_rc()

or use Rc::into_raw to consume the original Rc before converting it:

let child_rc = Rc::from_raw(Rc::into_raw(self) as *const ChildFoo);
child_rc.by_rc()
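
A minimal sketch of the Rc::into_raw variant, assuming ParentFoo is a #[repr(transparent)] newtype whose only field is ChildFoo, as suggested above. Under that assumption the two types share size and alignment, which is what Rc::from_raw relies on; without it the cast should not be considered sound. The main function is only there for illustration.

```rust
use std::rc::Rc;

struct ChildFoo {
    a: i32,
}

impl ChildFoo {
    fn by_rc(self: Rc<Self>) -> i32 {
        self.a
    }
}

// Assumption: the parent is a transparent wrapper with exactly one field.
#[repr(transparent)]
struct ParentFoo(ChildFoo);

impl ParentFoo {
    fn by_rc(self: Rc<Self>) -> i32 {
        // Consume the original Rc so no second owner is created.
        let raw: *const ParentFoo = Rc::into_raw(self);
        // Relies on ParentFoo and ChildFoo having identical layout.
        let child: Rc<ChildFoo> = unsafe { Rc::from_raw(raw as *const ChildFoo) };
        child.by_rc()
    }
}

fn main() {
    let parent = Rc::new(ParentFoo(ChildFoo { a: 3 }));
    assert_eq!(parent.by_rc(), 3);
}
```

Because the original Rc is consumed rather than duplicated, there is nothing to forget and the reference count stays balanced.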

nrxus · January 23, 2020, 9:11pm

I don't understand why I need to make sure the parent is #[repr(transparent)] and that it is the only field.

I am not converting Rc<ParentFoo> into Rc<ChildFoo>, but rather Rc<ParentFoo>.0 into Rc<ChildFoo>. Rc<ParentFoo>.0 is a ChildFoo, but it can only be obtained through a reference and not as an Rc.

In my real use case ParentFoo is an enum of either ChildFoo or something else that fakes ChildFoo (I am building a mocking library), so unfortunately ChildFoo is not its "only" field.

Good call about doing the forget first though, that would have been not good.

mbrubeck · January 23, 2020, 9:49pm

The docs for Rc::from_raw require:

The raw pointer must have been previously returned by a call to a Rc::into_raw.

This is very important, because Rc<ParentFoo> points to a memory allocation that contains an RcBox, which is laid out in memory like this:

+-------+-------+-----------+
| usize | usize | ParentFoo |
+-------+-------+-----------+

When you pass a *const ParentFoo to Rc::from_raw, it decrements the pointer by an appropriate amount to find the address of the RcBox. It then returns an Rc pointing to the start of the RcBox, where the reference counts live.

If you pass in a pointer that does not point to the start of the ParentFoo (or whose alignment does not match ParentFoo), then it will return an Rc that points to some memory that does not have the reference counts at the expected location. If a recipient of that faulty Rc then tries to read or write those reference counts, it will end up touching some other memory location that could contain anything:

impl ChildFoo {
    fn by_rc(self: Rc<Self>) -> i32 {
        self.clone(); // This will increment some random location in memory.
        0
    }
}
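
To see how a field pointer can miss the start of the parent, here is a small sketch that measures the distance using only safe code. The ParentFoo with a leading tag field and #[repr(C)] is a hypothetical layout chosen so the offset is predictable, and field_offset is an illustrative helper, not something from the thread.

```rust
use std::rc::Rc;

struct ChildFoo {
    a: i32,
}

// Hypothetical parent: #[repr(C)] keeps the fields in declaration order,
// so `child` sits after the 8-byte `tag`.
#[repr(C)]
struct ParentFoo {
    tag: u64,
    child: ChildFoo,
}

// Distance in bytes between the start of the parent and the child field.
fn field_offset(parent: &Rc<ParentFoo>) -> usize {
    let parent_addr = Rc::as_ptr(parent) as usize;
    let child_addr = &parent.child as *const ChildFoo as usize;
    child_addr - parent_addr
}

fn main() {
    let parent = Rc::new(ParentFoo { tag: 0, child: ChildFoo { a: 1 } });
    // The child does not start where the parent starts, so Rc::from_raw
    // would look for the reference counts in the wrong place.
    assert_eq!(field_offset(&parent), 8);
    println!("tag = {}, a = {}", parent.tag, parent.child.a);
}
```

With a nonzero offset, the counts that Rc::from_raw expects just before the pointer are really the bytes of the parent's other fields.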

mbrubeck · January 23, 2020, 9:50pm

Even if your ChildFoo happens to share the same address and alignment as its ParentFoo, you're still in trouble if it isn't also the same size. The problem is creating an Rc<ChildFoo> that points to memory owned by an Rc<ParentFoo>. The Rc<ChildFoo> could outlive the Rc<ParentFoo>, for example if it gets passed to something that stores it in a static variable:

use std::cell::RefCell;
use std::rc::Rc;

thread_local! {
    static CACHE: RefCell<Option<Rc<ChildFoo>>> = Default::default();
}

impl ChildFoo {
    fn by_rc(self: Rc<Self>) -> i32 {
        CACHE.with(|cache| {
            *cache.borrow_mut() = Some(self.clone());
        });
        self.a
    }
}

If this cached Rc<ChildFoo> gets dropped later (after all Rc<ParentFoo> that point to the same memory are dropped), it will try to deallocate the memory allocated by the original Rc<ParentFoo>. If the layouts of ChildFoo and ParentFoo are different, this is undefined behavior.
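
As a rough illustration of the layout mismatch, the sketch below compares std::alloc::Layout values for a child, a larger parent, and a transparent wrapper. The extra field and the Wrapper type are hypothetical. The real allocation also holds the reference counts, so these are not the exact layouts Rc uses, but a difference in the payload layout is enough to show why the deallocation would not match.

```rust
use std::alloc::Layout;

#[allow(dead_code)]
struct ChildFoo {
    a: i32,
}

// Hypothetical parent with an extra field, so its layout differs from the child's.
#[allow(dead_code)]
struct ParentFoo {
    child: ChildFoo,
    extra: u64,
}

// Hypothetical transparent wrapper: it shares the child's size and alignment.
#[allow(dead_code)]
#[repr(transparent)]
struct Wrapper(ChildFoo);

fn main() {
    let child = Layout::new::<ChildFoo>();
    let parent = Layout::new::<ParentFoo>();
    let wrapper = Layout::new::<Wrapper>();

    // Freeing memory with a layout other than the one it was allocated with
    // is undefined behavior, so these two must not be mixed up.
    assert_ne!(parent, child);
    // The transparent wrapper is the constrained case where the layouts agree.
    assert_eq!(wrapper, child);
    println!("child: {:?}, parent: {:?}, wrapper: {:?}", child, parent, wrapper);
}
```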

mbrubeck · January 23, 2020, 10:06pm

Just one more point: I want to emphasize that this isn't just a problem for "badly behaved" implementations of ChildFoo. Even a very basic ChildFoo::by_rc can trigger undefined behavior here:

impl ChildFoo {
    fn by_rc(self: Rc<Self>) -> i32 {
        0
        // `self` gets dropped here, decrementing the reference count.
        // If it doesn't point to a valid `RcBox<ChildFoo>` then this may corrupt random memory.
    }
}
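
The count traffic is easy to observe with a valid Rc. The sketch below is only a demonstration on a well-formed Rc<ChildFoo>; counts_around_call is a hypothetical helper name.

```rust
use std::rc::Rc;

struct ChildFoo {
    a: i32,
}

impl ChildFoo {
    fn by_rc(self: Rc<Self>) -> i32 {
        self.a
        // `self` is dropped here, which decrements the strong count.
    }
}

// Returns the strong count before and after handing one clone to by_rc.
fn counts_around_call() -> (usize, usize) {
    let original = Rc::new(ChildFoo { a: 3 });
    let passed = Rc::clone(&original); // writes to the strong count
    let before = Rc::strong_count(&original);
    passed.by_rc(); // dropping `passed` inside writes to it again
    let after = Rc::strong_count(&original);
    (before, after)
}

fn main() {
    assert_eq!(counts_around_call(), (2, 1));
}
```

Every one of those writes goes to wherever the Rc believes its counts live, which is why even this trivial method is a problem for an Rc built from the wrong pointer.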

cuviper · January 23, 2020, 10:08pm

mbrubeck:

This is very important, because Rc<ParentFoo> points to a memory allocation that contains an RcBox, which is laid out in memory like this:

+-------+-------+-----------+
| usize | usize | ParentFoo |
+-------+-------+-----------+

Also notable: that layout only holds true because it's defined as RcBox<T: ?Sized>. Otherwise, with a sized T, the fields could be reordered as the compiler wishes. Even as-is, I think there's no guarantee which of those usize fields in memory is the strong or weak count.

nrxus · January 23, 2020, 10:16pm

So given that what I want is more like:

enum Foo {
    Real(RealFoo),
    Fake(/* something does not matter*/),
}

impl Foo {
    fn by_rc(self: Rc<Self>) -> i32 {
        match self {
            Real(r) => /* somehow call for RealFoo::by_rc() */,
            Fake(f) => /* something does not matter */
        }
    }
}

struct RealFoo {
    /* fields */
}

impl RealFoo {
    fn by_rc(self: Rc<Self>) -> i32 {
        /* some real implementation */
    }
}

there is no sound way to do this because Foo and RealFoo are not the same size. Am I getting that right?

mbrubeck · January 23, 2020, 10:20pm

Right. Types like Rc/Arc/Box that manage memory can't generally be synthesized or cast like this except in extremely constrained ways. In general, "ownership-agnostic" code in Rust is tricky, so it might be a good idea to go into the higher-level reasons that you want this code structure.

nrxus · January 23, 2020, 10:47pm

So what I am working on is a mocking library for Rust that generates mocks at compile time based on structs and their methods.

In a very TL;DR hand-wavy version, I am basically morphing a given struct to mock into an enum of either a fake or real implementation, saving the real struct definition elsewhere, and then morphing all the methods to match on whether they are fake or real. If they are real, then the method should simply proxy to the real struct and real methods. This means that the "maybe fake" enum needs to have the exact same signature for each function as the real implementation and be able to call the real implementation if need be.

This actually works pretty great for methods with &self, &mut self and self. Given that P is a parent struct and F is a field that it owns, I can go from &P -> &F, &mut P -> &mut F, and P -> F. Unfortunately it seems like I cannot go from Rc<P> -> Rc<F> (or other arbitrary self types).
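
A minimal sketch of the three projections that do work, using the P and F names from the paragraph above. The method names as_child, as_child_mut and into_child are made up for illustration.

```rust
struct F {
    a: i32,
}

struct P(F);

impl P {
    // &P -> &F
    fn as_child(&self) -> &F {
        &self.0
    }

    // &mut P -> &mut F
    fn as_child_mut(&mut self) -> &mut F {
        &mut self.0
    }

    // P -> F
    fn into_child(self) -> F {
        self.0
    }
}

fn main() {
    let mut p = P(F { a: 1 });
    assert_eq!(p.as_child().a, 1);
    p.as_child_mut().a = 2;
    assert_eq!(p.into_child().a, 2);
}
```

There is no equally simple fourth method from Rc<P> to Rc<F>, because an Rc owns its whole allocation and a field is only part of it.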

system · April 22, 2020, 10:47pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
