Safe hiding of lifetime parameters

I wonder whether the following code using mem::transmute is safe.

A snippet using mem::transmute (on play.rust-lang.org).
use std::mem;
use std::sync::Arc;

#[derive(Debug)]
struct B {}

#[derive(Debug)]
struct C {
    b: B,
}

#[derive(Debug)]
struct A<'b> {
    b: &'b B,
}

#[derive(Debug)]
struct SafeContainer {
    c: Arc<C>,
    unsafe_a: A<'static>,
}

impl SafeContainer {
    fn new<'b>(c: &'b Arc<C>, a: A<'b>) -> Self {
        SafeContainer {
            c: c.clone(),
            unsafe_a: unsafe { mem::transmute(a) },
        }
    }

    fn a<'b>(&'b self) -> &A<'b> {
        unsafe { mem::transmute(&self.unsafe_a) }
    }
}

pub fn main() {
    let c = Arc::new(C { b: B {} });
    let a = A { b: &c.b };

    let container = SafeContainer::new(&c, a);

    drop(c);

    // do something with `container.a()`...
    println!("{:?}", container.a());

    drop(container);
}

I am trying to write Python bindings to a Rust crate of mine using PyO3. To this end, I have to somehow encapsulate structs with a lifetime parameter inside a Python object whose corresponding struct must not have any lifetime parameters. So, I had the idea to extend the lifetime to 'static using mem::transmute and to make sure that I keep an Arc owning the data around.

Is this a sound idea or is there a better way of doing this?

No, it is not safe: Rust Playground
You can make it safe like so: Rust Playground (in SafeContainer::new, you must borrow from C in order to be safe)
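The linked playground is not reproduced here. As a rough illustration of "borrow from C inside new", here is a minimal sketch, not the code from the reply. It assumes a hypothetical closure parameter `make_a` that can only build its `A` from the `C` it is handed (or from 'static data), so the caller cannot pass in an `A` that borrows from some unrelated, shorter-lived value.

```rust
use std::mem;
use std::sync::Arc;

#[derive(Debug)]
struct B {}

#[derive(Debug)]
struct C {
    b: B,
}

#[derive(Debug)]
struct A<'b> {
    b: &'b B,
}

#[derive(Debug)]
struct SafeContainer {
    c: Arc<C>,
    unsafe_a: A<'static>,
}

impl SafeContainer {
    // Hypothetical signature (an assumption, not from the thread): the
    // higher-ranked bound forces the returned `A` to borrow from the `C`
    // passed to the closure, or from `'static` data.
    fn new(c: Arc<C>, make_a: impl for<'b> FnOnce(&'b C) -> A<'b>) -> Self {
        let a = make_a(&*c);
        // SAFETY (assumed invariant): `a` points into the heap allocation
        // owned by `c`, which is stored next to it and never replaced.
        let unsafe_a = unsafe { mem::transmute::<A<'_>, A<'static>>(a) };
        SafeContainer { c, unsafe_a }
    }

    // `A` is covariant in its lifetime, so shortening `'static` to `'b`
    // needs no transmute.
    fn a<'b>(&'b self) -> &'b A<'b> {
        &self.unsafe_a
    }
}

fn main() {
    let c = Arc::new(C { b: B {} });
    let container = SafeContainer::new(Arc::clone(&c), |c| A { b: &c.b });
    drop(c); // the container still holds its own clone of the Arc
    println!("{:?}", container.a());
}
```

The accessor only ever hands out `A<'b>` tied to `&self`, so the `'static` lifetime never escapes the container.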

Thanks. I still have a hard time wrapping my head around ownership and borrowing.

Can I also borrow from another clone of the Arc the struct A borrows from? Maybe my example was a bit over-simplified, let's try again: Rust Playground. Now C provides a function compute_multiple_as to compute a vector of As which borrow from C. I would like to separate these As from the original Arc<C> by safely wrapping them inside SafeContainer.

As long as you can guarantee that compute_multiple_as always borrows from self, it is safe. SafeContainer relies on SafeContainer.unsafe_a being borrowed from *SafeContainer.c; if this is ever not the case, your API is unsound. Otherwise it's safe.
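A minimal sketch of that invariant, under assumptions: the body of `compute_multiple_as`, the `id` field and the `as_slice` accessor are hypothetical stand-ins, since the playground code is not shown here. The container calls `compute_multiple_as` on its own `Arc<C>`, so every stored `A` borrows from the allocation the container keeps alive.

```rust
use std::mem;
use std::sync::Arc;

#[derive(Debug)]
struct B {
    id: u32,
}

#[derive(Debug)]
struct C {
    bs: Vec<B>,
}

#[derive(Debug)]
struct A<'b> {
    b: &'b B,
}

impl C {
    // Hypothetical stand-in for the thread's function. The elided
    // lifetime ties every returned `A` to `&self`.
    fn compute_multiple_as(&self) -> Vec<A<'_>> {
        self.bs.iter().map(|b| A { b }).collect()
    }
}

struct SafeContainer {
    // Declared first, so the borrowing values are dropped before the owner.
    unsafe_as: Vec<A<'static>>,
    c: Arc<C>,
}

impl SafeContainer {
    fn new(c: Arc<C>) -> Self {
        let computed = c.compute_multiple_as();
        // SAFETY (assumed invariant): every `A` in `computed` borrows from
        // `*c`, and `c` is stored alongside and never replaced.
        let unsafe_as =
            unsafe { mem::transmute::<Vec<A<'_>>, Vec<A<'static>>>(computed) };
        SafeContainer { unsafe_as, c }
    }

    // Hand the values out with a lifetime tied to `&self` only.
    fn as_slice<'b>(&'b self) -> &'b [A<'b>] {
        self.unsafe_as.as_slice()
    }
}

fn main() {
    let c = Arc::new(C { bs: vec![B { id: 1 }, B { id: 2 }] });
    let container = SafeContainer::new(Arc::clone(&c));
    drop(c); // the original handle is gone; the container keeps `C` alive
    for a in container.as_slice() {
        println!("{:?}", a);
    }
}
```

If `compute_multiple_as` ever returned values that do not borrow from `self` or from 'static data, the signature would no longer compile as written, but nothing stops a different function with a looser signature from being swapped in, so the safety comment still has to be maintained by hand.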

Btw, I used Box in B to easily detect memory corruption. Ideally, you will want to run your unsafe code through Miri. On the playground you can do so by clicking Miri under Tools. You can also run Miri locally (it needs a nightly toolchain) by installing it with rustup component add miri and running it with cargo miri run or cargo miri test.

Thank you very much!

Can I somehow ensure this on the callee side such that I get a compile error in case compute_multiple_as changes its behavior?

Not really, the function signature as provided is as close as you can get. You're playing with some rather subtle unsafe code and Rust doesn't (read: can't) provide enough tools to validate this at compile time.

Sorry for asking yet another question. Don’t I have to swap the order of the fields c and unsafe_a in SafeContainer to avoid (potential) UB? Otherwise, when dropping a SafeContainer, c gets dropped first (potentially) leading to a dangling reference in unsafe_a which is only dropped afterwards. Thus, when drop on A is executed, the dangling reference may lead to UB (depending on the implementation of Drop on A, presumably, the default drop will just discard the dangling reference).

It's fine if unsafe_a dangles on drop. This also happens in safe code:

fn main() {
    let mut x = None;
    let y = vec!["0"];
    x = Some(&y);
}

Here, y gets dropped before x, so when x drops it contains a dangling reference. But that's fine as long as the reference is not used on drop. Note: this is only fine when implicitly dropping things.

Any attempt to use the value after drop results in an error: Rust Playground

Ask away, I love answering questions!

Ask away, I love answering questions!

Good to know! :+1::slight_smile:

Thanks for the clarification. What I meant was something along the lines of: Rust Playground. Depending on whether and how A implements Drop, this might induce UB. So, for the example I provided, it is safe but if I do not know whether and how A implements Drop I should probably swap both fields. Thanks again for the trick with Box for checking UB. :slight_smile:

Yes, adding a Drop impl to A will induce UB if it accesses the reference. That's why my last example fails to compile.

I missed this when I first read it. It's fragile to rely on the drop order of fields. Instead, use ManuallyDrop to explicitly specify the drop order.

use std::mem::ManuallyDrop;
use std::sync::Arc;

// `A` and `B` are the types from the first snippet.
#[derive(Debug)]
pub struct SafeContainer {
    data: Arc<B>,
    unsafe_a: ManuallyDrop<A<'static>>,
}

impl Drop for SafeContainer {
    fn drop(&mut self) {
        // Safety: fields are dropped only after `Drop::drop` returns, and
        // `unsafe_a` is never touched again after this call.
        // It's fragile to rely on the drop order of fields for safety.
        // Instead we force `unsafe_a` to be dropped first, while `data`
        // is still alive, so that it doesn't access a potentially
        // dangling reference on drop.
        unsafe { ManuallyDrop::drop(&mut self.unsafe_a) }
    }
}

What does “is fragile” mean? The documentation of ManuallyDrop seems to suggest reordering the fields of a struct to guarantee a certain drop order, rather than using ManuallyDrop.

Technically, the borrowing data structure is also under my control (although located in another crate), and I could find some way to make it so that it does not borrow. Would it be wise to rewrite the code such that it no longer borrows?

Here is a more detailed description of the actual problem I am trying to solve. The code I am writing is part of my Ph.D. thesis. So, in case this gets too technical and domain-specific, just ignore it. In my project, I use this for a data structure representing a transition in a Markov Decision Process (MDP) augmented with variables (basically a finite automaton where each edge probabilistically ranges over multiple successor states and you can write to global variables when taking a transition). As the MDP results from multiple interacting automata, each transition is composed of multiple edges of different automata. Now, these edges contain additional information, for instance, assignments to global variables. To this end, a Transition stores a boxed array of these edges to have the required information available should the user decide to query the successor states a transition leads to. Transitions are computed by taking the cartesian product over the edges of the automata.

When I say it's fragile, I mean that a later refactor or other changes could change the safety of the code. For example, reordering fields is normally benign, but may introduce unsafety here.

Ok, in that case just document that the field order shouldn't change because it could induce unsafety.
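To illustrate the field-order approach, here is a small sketch that records the order in which fields are dropped. The `Noisy` type, the `Log` alias and `drop_order` are hypothetical helpers for the demonstration only. Rust drops struct fields in declaration order, so declaring `unsafe_a` before `c` means the borrowing field goes first.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared log that each value appends its name to when it is dropped.
type Log = Rc<RefCell<Vec<&'static str>>>;

// Hypothetical helper type: stands in for a field with a `Drop` impl.
struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

struct Container {
    // NOTE: the field order matters. Fields are dropped in declaration
    // order, so `unsafe_a` (the borrower) must stay above `c` (the owner).
    unsafe_a: Noisy,
    c: Noisy,
}

// Builds a container, drops it, and returns the observed drop order.
fn drop_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let container = Container {
        unsafe_a: Noisy { name: "unsafe_a", log: Rc::clone(&log) },
        c: Noisy { name: "c", log: Rc::clone(&log) },
    };
    drop(container);
    let order = log.borrow().clone();
    order
}

fn main() {
    println!("{:?}", drop_order());
}
```

A comment like the one on `Container` is the documentation the reply asks for: it tells a later refactor that swapping the two fields changes the drop order.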

So if I understand you correctly, you are trying to cache a potentially expensive computation by storing references to CompiledEdge, but since they are always borrowed from an Arc it should be possible to get rid of the lifetime parameter. And that's where you were at the start of this thread.

Then this should be fine. But I have two other solutions:

  • Another, safer option would be to refcount the CompiledEdge, but that may be costly.
  • You can put CompiledEdge into an arena, and then store the references it gives you. But this is a more invasive change. https://crates.io/crates/typed-arena
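The first option might be sketched as follows. This is a sketch under assumptions: `CompiledEdge`, `Transition` and `Automaton` are hypothetical stand-ins with made-up fields, since the real types are not shown in the thread. Each transition holds `Arc<CompiledEdge>` handles instead of references, so it carries no lifetime parameter.

```rust
use std::sync::Arc;

// Hypothetical stand-in for the real edge type.
#[derive(Debug)]
struct CompiledEdge {
    target: usize,
}

// No lifetime parameter: the transition co-owns its edges.
#[derive(Debug, Clone)]
struct Transition {
    edges: Box<[Arc<CompiledEdge>]>,
}

// Hypothetical owner of the compiled edges.
struct Automaton {
    edges: Vec<Arc<CompiledEdge>>,
}

impl Automaton {
    // Builds a transition from the edges at the given indices.
    // Each `Arc::clone` only bumps a reference count.
    fn transition(&self, indices: &[usize]) -> Transition {
        Transition {
            edges: indices
                .iter()
                .map(|&i| Arc::clone(&self.edges[i]))
                .collect(),
        }
    }
}

fn main() {
    let automaton = Automaton {
        edges: vec![
            Arc::new(CompiledEdge { target: 1 }),
            Arc::new(CompiledEdge { target: 2 }),
        ],
    };
    let transition = automaton.transition(&[1, 0]);
    drop(automaton); // the transition keeps its edges alive on its own
    println!("{:?}", transition);
}
```

The cost is one atomic increment per edge when a transition is built and one decrement when it is dropped, which is the overhead the reply warns about.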

I will do that. :+1:

Yes, the structs (Transitions) with references to CompiledEdges are actually just intermediate results. Usually, the caller then decides to explore a subset of these structs further. While doing so it has to keep a reference to the owner of the CompiledEdges anyway. Finally, this process yields a struct which does not contain any references anymore. This works really nicely for the Rust part of my code, but as soon as I want to expose these intermediate results to Python so that Python code can decide which transitions to consider further, I run into these lifetime issues.

As far as I understand, Arena still gives me references which means that I have to use some kind of lifetime parameter. The problem really is that I cannot expose structs with lifetime parameters to Python because the lifetime of Python objects is out of Rust's control. Using an Arc to refcount the CompiledEdges would solve the problem—unfortunately, in my application domain, this would mean cloning several Arcs multiple tens/hundreds of millions of times. :see_no_evil:

I guess I will just leave it like that for now. Thank you very much for your advice and time.

Cloning an Arc is just bumping a refcount, so that shouldn't be a problem, but if you need to clone the underlying data that could turn out to be problematic.
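The difference can be seen in a short sketch. The helper `clone_handle_and_data` is hypothetical and exists only for the demonstration. Cloning the Arc yields a second handle to the same allocation, while cloning the data behind it allocates and copies.

```rust
use std::sync::Arc;

// Returns a new handle to the same allocation and a deep copy of the data.
fn clone_handle_and_data(data: &Arc<Vec<u64>>) -> (Arc<Vec<u64>>, Vec<u64>) {
    // Cheap: increments the strong count, no data is copied.
    let handle = Arc::clone(data);
    // Potentially expensive: allocates a new Vec and copies every element.
    let deep_copy: Vec<u64> = (**data).clone();
    (handle, deep_copy)
}

fn main() {
    let data: Arc<Vec<u64>> = Arc::new((0..1_000).collect());
    let (handle, deep_copy) = clone_handle_and_data(&data);

    println!("same allocation: {}", Arc::ptr_eq(&data, &handle));
    println!("strong count: {}", Arc::strong_count(&data));
    println!("deep copy len: {}", deep_copy.len());
}
```

The refcount bump is an atomic operation, so it is cheap but not free. Whether hundreds of millions of them matter is something only a benchmark of the actual workload can answer.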

A related discussion about using field order to control drop order: Need for controlling drop order of fields - language design - Rust Internals
