Ouroboros, a crate for making self-referential structs

I recently had a need to create a self-referencing struct in my project. I found the rental crate very helpful, but with it no longer being maintained I thought I would take a shot at making a similar crate with a more streamlined and flexible API. The product of this effort is ouroboros, and programming with it looks something like this:

use ouroboros::self_referencing;

#[self_referencing]
struct MyStruct {
    int_data: Box<i32>,
    float_data: Box<f32>,
    #[borrows(int_data)]
    int_reference: &'this i32,
    #[borrows(mut float_data)]
    float_reference: &'this mut f32,
}

fn main() {
    let mut my_value = MyStructBuilder {
        int_data: Box::new(42),
        float_data: Box::new(3.14),
        int_reference_builder: |int_data: &i32| int_data,
        float_reference_builder: |float_data: &mut f32| float_data,
    }.build();

    // Prints 42
    println!("{:?}", my_value.use_int_data_contents(
        |int_data| *int_data)
    );
    // Prints 3.14
    println!("{:?}", my_value.use_float_reference(
        |float_reference| **float_reference)
    );
    // Sets the value of float_data to 84.0
    my_value.use_all_fields_mut(|fields| {
        **fields.float_reference = 
            (**fields.int_reference as f32) * 2.0;
    });

    // We can hold on to this reference...
    let int_ref = my_value.use_int_reference(|int_ref| *int_ref);
    println!("{:?}", *int_ref);
    // As long as the struct is still alive.
    drop(my_value);
    // This will cause an error!
    // println!("{:?}", *int_ref);
}
14 Likes

Isn't writing mut in mut float_data redundant, because you should be able to infer this from mut in &'this mut f32?

If you ever plan to make a blog post about it, I'd be interested to see the macro-expanded code to get a grasp what problem the crate actually solves. AFAICT, its main feature is to take the responsibility for writing correct unsafe code from the user by encapsulating it in a macro, that only has to be checked once for undefined behavior and is then guaranteed to never emit undefined behavior by being used in a safe and sound context.

Yeah, I would really like to see the expanded code this generates.

Nice initiative! I'd be interested in showing a bit the kind of expanded code / code the attribute writes for us, especially when self-referential patterns come into play, since it is easy to make it unsound :sweat_smile:


Two suggestions:

  • What about using 'self for the self-referential lifetime? Since it is reserved you will be guaranteed that it does not collide with struct-specific lifetimes: MyStruct<'this> { ... }.

  • I've skimmed the proc-macro code, and you currently handle most errors by panic!-king. A good improvement for higher quality proc-macros is to, instead, use ::syn::Errors:

    1. Once a problematic element has been identified, let's say some ident: Ident, you construct an error pointing at it, with your custom error message:

      return Err(Error::new_spanned(&ident, "<error message>"));
      
      • If you don't have any spanned element to use, you can use:
        Error::new(Span::call_site(), "<error message>")
    2. Propagate these Results all the way up to the exported #[proc_macro] pub fn using ?

    3. From that outer function, you unwrap the error using .to_compile_error().into(), which will be equivalent to having done:

      ::quote::quote_spanned!(ident.span() =>
          compile_error! { "<error message>" }
      ).into()
      

      That is, emitting code that invokes the compile_error! macro.

    This leads to beautiful error messages, whereby your macro "calmly" handles malformed input instead of panicking :slight_smile:

6 Likes

Thanks for the feedback! That error handling looks really nice, I'll go ahead and implement that. I'm hesitant to change the lifetime to 'self because there is some code that uses 'this as an actual template parameter.

Speaking of that smooth segue, here's a quick explanation of what the expanded code does:

  1. The order of all the fields in the struct is reversed so that references are dropped before the fields they reference.
  2. 'this is replaced with 'static.
  3. Data can only be accessed through use_* functions. These functions accept a closure. In the example, use_float_reference accepts something like for<'this> FnOnce(&'outer_borrow &'this f32). Using the for<'this> part means that the closure has to work for any possible value of 'this, which prevents the closure from storing the reference somewhere else and breaking safety guarantees. The closure can also return a value that will then be returned by the use_* function. As far as I can tell this return value is bounded by how long self was borrowed for, so this code correctly refuses to compile:
let self_ref_struct = make_self_ref_struct();
let float_data_ref = self_ref_struct.use_float_reference(|float_reference| *float_reference);
drop(self_ref_struct);
println!("{:?}", float_data_ref);

Here's the full code that the example I gave expanded into. I don't have the best understanding of how lifetimes work in all the corner cases, so I would greatly appreciate any review of the code:

Expanded Code

This code uses a helper function which looks like this:

    pub unsafe fn stable_deref_and_strip_lifetime<T: StableDeref + 'static>(
        data: &T,
    ) -> &'static T::Target {
        &*((&**data) as *const _)
    }
mod ouroboros_impl_documentation_example {
    use super::*;
    /// The example in the readme and the documentation.
    pub struct DocumentationExample {
        #[doc(hidden)]
        float_reference: &'static mut f32,
        #[doc(hidden)]
        int_reference: &'static i32,
        #[doc(hidden)]
        float_data: Box<f32>,
        #[doc(hidden)]
        int_data: Box<i32>,
    }
    ///A more verbose but stable way to construct self-referencing structs. It is comparable to using `StructName { field1: value1, field2: value2 }` rather than `StructName::new(value1, value2)`. This has the dual benefit of making your code both easier to refactor and more readable. Call [`build()`](Self::build) to construct the actual struct. The fields of this struct should be used as follows:
    ///
    ///| Field | Suggested Use |
    ///| --- | --- |
    ///| `int_data` | Directly pass in the value this field should contain |
    ///| `float_data` | Directly pass in the value this field should contain |
    ///| `int_reference_builder` | Use a function or closure: `(int_data: &_) -> int_reference: _` |
    ///| `float_reference_builder` | Use a function or closure: `(float_data: &mut _) -> float_reference: _` |
    pub struct DocumentationExampleBuilder<
        IntReferenceBuilder_: for<'this> FnOnce(&'this <Box<i32> as ::std::ops::Deref>::Target) -> &'this i32,
        FloatReferenceBuilder_: for<'this> FnOnce(&'this mut <Box<f32> as ::std::ops::Deref>::Target) -> &'this mut f32,
    > {
        pub int_data: Box<i32>,
        pub float_data: Box<f32>,
        pub int_reference_builder: IntReferenceBuilder_,
        pub float_reference_builder: FloatReferenceBuilder_,
    }
    impl<
            IntReferenceBuilder_: for<'this> FnOnce(&'this <Box<i32> as ::std::ops::Deref>::Target) -> &'this i32,
            FloatReferenceBuilder_: for<'this> FnOnce(&'this mut <Box<f32> as ::std::ops::Deref>::Target) -> &'this mut f32,
        > DocumentationExampleBuilder<IntReferenceBuilder_, FloatReferenceBuilder_>
    {
        ///Calls [`DocumentationExample::new()`](DocumentationExample::new) using the provided values. This is preferrable over calling `new()` directly for the reasons listed above.
        pub fn build(self) -> DocumentationExample {
            DocumentationExample::new(
                self.int_data,
                self.float_data,
                self.int_reference_builder,
                self.float_reference_builder,
            )
        }
    }
    ///A more verbose but stable way to construct self-referencing structs. It is comparable to using `StructName { field1: value1, field2: value2 }` rather than `StructName::new(value1, value2)`. This has the dual benefit of makin your code both easier to refactor and more readable. Call [`try_build()`](Self::try_build) or [`try_build_or_recover()`](Self::try_build_or_recover) to construct the actual struct. The fields of this struct should be used as follows:
    ///
    ///| Field | Suggested Use |
    ///| --- | --- |
    ///| `int_data` | Directly pass in the value this field should contain |
    ///| `float_data` | Directly pass in the value this field should contain |
    ///| `int_reference_builder` | Use a function or closure: `(int_data: &_) -> Result<int_reference: _, Error_>` |
    ///| `float_reference_builder` | Use a function or closure: `(float_data: &mut _) -> Result<float_reference: _, Error_>` |
    pub struct DocumentationExampleTryBuilder<
        IntReferenceBuilder_: for<'this> FnOnce(
            &'this <Box<i32> as ::std::ops::Deref>::Target,
        ) -> Result<&'this i32, Error_>,
        FloatReferenceBuilder_: for<'this> FnOnce(
            &'this mut <Box<f32> as ::std::ops::Deref>::Target,
        ) -> Result<&'this mut f32, Error_>,
        Error_,
    > {
        pub int_data: Box<i32>,
        pub float_data: Box<f32>,
        pub int_reference_builder: IntReferenceBuilder_,
        pub float_reference_builder: FloatReferenceBuilder_,
    }
    impl<
            IntReferenceBuilder_: for<'this> FnOnce(
                &'this <Box<i32> as ::std::ops::Deref>::Target,
            ) -> Result<&'this i32, Error_>,
            FloatReferenceBuilder_: for<'this> FnOnce(
                &'this mut <Box<f32> as ::std::ops::Deref>::Target,
            ) -> Result<&'this mut f32, Error_>,
            Error_,
        > DocumentationExampleTryBuilder<IntReferenceBuilder_, FloatReferenceBuilder_, Error_>
    {
        ///Calls [`DocumentationExample::try_new()`](DocumentationExample::try_new) using the provided values. This is preferrable over calling `try_new()` directly for the reasons listed above.
        pub fn try_build(self) -> Result<DocumentationExample, Error_> {
            DocumentationExample::try_new(
                self.int_data,
                self.float_data,
                self.int_reference_builder,
                self.float_reference_builder,
            )
        }
        ///Calls [`DocumentationExample::try_new_or_recover()`](DocumentationExample::try_new_or_recover) using the provided values. This is preferrable over calling `try_new_or_recover()` directly for the reasons listed above.
        pub fn try_build_or_recover(self) -> Result<DocumentationExample, (Error_, Heads)> {
            DocumentationExample::try_new_or_recover(
                self.int_data,
                self.float_data,
                self.int_reference_builder,
                self.float_reference_builder,
            )
        }
    }
    ///A struct for holding immutable references to all [tail and immutably borrowed fields](https://docs.rs/ouroboros/latest/ouroboros/attr.self_referencing.html#definitions) in an instance of [`DocumentationExample`](DocumentationExample).
    pub struct BorrowedFields<'outer_borrow, 'this> {
        pub float_reference: &'outer_borrow &'this mut f32,
        pub int_reference: &'outer_borrow &'this i32,
        pub int_data_contents: &'outer_borrow <Box<i32> as ::std::ops::Deref>::Target,
    }
    ///A struct for holding mutable references to all [tail fields](https://docs.rs/ouroboros/latest/ouroboros/attr.self_referencing.html#definitions) in an instance of [`DocumentationExample`](DocumentationExample).
    pub struct BorrowedMutFields<'outer_borrow, 'this> {
        pub float_reference: &'outer_borrow mut &'this mut f32,
        pub int_reference: &'outer_borrow mut &'this i32,
    }
    ///A struct which contains only the [head fields](https://docs.rs/ouroboros/latest/ouroboros/attr.self_referencing.html#definitions) of [`DocumentationExample`](DocumentationExample).
    pub struct Heads {
        pub float_data: Box<f32>,
        pub int_data: Box<i32>,
    }
    impl DocumentationExample {
        ///Constructs a new instance of this self-referential struct. (See also [`DocumentationExampleBuilder::build()`](DocumentationExampleBuilder::build)). Each argument is a field of the new struct. Fields that refer to other fields inside the struct are initialized using functions instead of directly passing their value. The arguments are as follows:
        ///
        ///| Argument | Suggested Use |
        ///| --- | --- |
        ///| `int_data` | Directly pass in the value this field should contain |
        ///| `float_data` | Directly pass in the value this field should contain |
        ///| `int_reference_builder` | Use a function or closure: `(int_data: &_) -> int_reference: _` |
        ///| `float_reference_builder` | Use a function or closure: `(float_data: &mut _) -> float_reference: _` |
        pub fn new(
            int_data: Box<i32>,
            mut float_data: Box<f32>,
            int_reference_builder: impl for<'this> FnOnce(
                &'this <Box<i32> as ::std::ops::Deref>::Target,
            ) -> &'this i32,
            float_reference_builder: impl for<'this> FnOnce(
                &'this mut <Box<f32> as ::std::ops::Deref>::Target,
            ) -> &'this mut f32,
        ) -> Self {
            let int_data_illegal_static_reference =
                unsafe { ::ouroboros::macro_help::stable_deref_and_strip_lifetime(&int_data) };
            let float_data_illegal_static_reference = unsafe {
                ::ouroboros::macro_help::stable_deref_and_strip_lifetime_mut(&mut float_data)
            };
            let int_reference = int_reference_builder(int_data_illegal_static_reference);
            let float_reference = float_reference_builder(float_data_illegal_static_reference);
            Self {
                int_data,
                float_data,
                int_reference,
                float_reference,
            }
        }
        ///(See also [`DocumentationExampleTryBuilder::try_build()`](DocumentationExampleTryBuilder::try_build).) Like [`new`](Self::new), but builders for [self-referencing fields](https://docs.rs/ouroboros/latest/ouroboros/attr.self_referencing.html#definitions) can return results. If any of them fail, `Err` is returned. If all of them succeed, `Ok` is returned. The arguments are as follows:
        ///
        ///| Argument | Suggested Use |
        ///| --- | --- |
        ///| `int_data` | Directly pass in the value this field should contain |
        ///| `float_data` | Directly pass in the value this field should contain |
        ///| `int_reference_builder` | Use a function or closure: `(int_data: &_) -> Result<int_reference: _, Error_>` |
        ///| `float_reference_builder` | Use a function or closure: `(float_data: &mut _) -> Result<float_reference: _, Error_>` |
        pub fn try_new<Error_>(
            int_data: Box<i32>,
            mut float_data: Box<f32>,
            int_reference_builder: impl for<'this> FnOnce(
                &'this <Box<i32> as ::std::ops::Deref>::Target,
            ) -> Result<&'this i32, Error_>,
            float_reference_builder: impl for<'this> FnOnce(
                &'this mut <Box<f32> as ::std::ops::Deref>::Target,
            )
                -> Result<&'this mut f32, Error_>,
        ) -> ::std::result::Result<Self, Error_> {
            let int_data_illegal_static_reference =
                unsafe { ::ouroboros::macro_help::stable_deref_and_strip_lifetime(&int_data) };
            let float_data_illegal_static_reference = unsafe {
                ::ouroboros::macro_help::stable_deref_and_strip_lifetime_mut(&mut float_data)
            };
            let int_reference = int_reference_builder(int_data_illegal_static_reference)?;
            let float_reference = float_reference_builder(float_data_illegal_static_reference)?;
            ::std::result::Result::Ok(Self {
                int_data,
                float_data,
                int_reference,
                float_reference,
            })
        }
        ///(See also [`DocumentationExampleTryBuilder::try_build_or_recover()`](DocumentationExampleTryBuilder::try_build_or_recover).) Like [`try_new`](Self::try_new), but all [head fields](https://docs.rs/ouroboros/latest/ouroboros/attr.self_referencing.html#definitions) are returned in the case of an error. The arguments are as follows:
        ///
        ///| Argument | Suggested Use |
        ///| --- | --- |
        ///| `int_data` | Directly pass in the value this field should contain |
        ///| `float_data` | Directly pass in the value this field should contain |
        ///| `int_reference_builder` | Use a function or closure: `(int_data: &_) -> Result<int_reference: _, Error_>` |
        ///| `float_reference_builder` | Use a function or closure: `(float_data: &mut _) -> Result<float_reference: _, Error_>` |
        pub fn try_new_or_recover<Error_>(
            int_data: Box<i32>,
            mut float_data: Box<f32>,
            int_reference_builder: impl for<'this> FnOnce(
                &'this <Box<i32> as ::std::ops::Deref>::Target,
            ) -> Result<&'this i32, Error_>,
            float_reference_builder: impl for<'this> FnOnce(
                &'this mut <Box<f32> as ::std::ops::Deref>::Target,
            )
                -> Result<&'this mut f32, Error_>,
        ) -> ::std::result::Result<Self, (Error_, Heads)> {
            let int_data_illegal_static_reference =
                unsafe { ::ouroboros::macro_help::stable_deref_and_strip_lifetime(&int_data) };
            let float_data_illegal_static_reference = unsafe {
                ::ouroboros::macro_help::stable_deref_and_strip_lifetime_mut(&mut float_data)
            };
            let int_reference = match int_reference_builder(int_data_illegal_static_reference) {
                ::std::result::Result::Ok(value) => value,
                ::std::result::Result::Err(err) => {
                    return ::std::result::Result::Err((
                        err,
                        Heads {
                            int_data,
                            float_data,
                        },
                    ))
                }
            };
            let float_reference = match float_reference_builder(float_data_illegal_static_reference)
            {
                ::std::result::Result::Ok(value) => value,
                ::std::result::Result::Err(err) => {
                    return ::std::result::Result::Err((
                        err,
                        Heads {
                            int_data,
                            float_data,
                        },
                    ))
                }
            };
            ::std::result::Result::Ok(Self {
                int_data,
                float_data,
                int_reference,
                float_reference,
            })
        }
        ///Provides limited immutable access to the contents of `int_data`. This method was generated because `int_data` is immutably borrowed by other fields.
        pub fn use_int_data_contents<'outer_borrow, ReturnType>(
            &'outer_borrow self,
            user: impl for<'this> FnOnce(
                &'outer_borrow <Box<i32> as ::std::ops::Deref>::Target,
            ) -> ReturnType,
        ) -> ReturnType {
            user(&*self.int_data)
        }
        ///Provides an immutable reference to `int_reference`. This method was generated because `int_reference` is a [tail field](https://docs.rs/ouroboros/latest/ouroboros/attr.self_referencing.html#definitions).
        pub fn use_int_reference<'outer_borrow, ReturnType>(
            &'outer_borrow self,
            user: impl for<'this> FnOnce(&'outer_borrow &'this i32) -> ReturnType,
        ) -> ReturnType {
            user(&self.int_reference)
        }
        ///Provides a mutable reference to `int_reference`. This method was generated because `int_reference` is a [tail field](https://docs.rs/ouroboros/latest/ouroboros/attr.self_referencing.html#definitions).
        pub fn use_int_reference_mut<'outer_borrow, ReturnType>(
            &'outer_borrow mut self,
            user: impl for<'this> FnOnce(&'outer_borrow mut &'this i32) -> ReturnType,
        ) -> ReturnType {
            user(&mut self.int_reference)
        }
        ///Provides an immutable reference to `float_reference`. This method was generated because `float_reference` is a [tail field](https://docs.rs/ouroboros/latest/ouroboros/attr.self_referencing.html#definitions).
        pub fn use_float_reference<'outer_borrow, ReturnType>(
            &'outer_borrow self,
            user: impl for<'this> FnOnce(&'outer_borrow &'this mut f32) -> ReturnType,
        ) -> ReturnType {
            user(&self.float_reference)
        }
        ///Provides a mutable reference to `float_reference`. This method was generated because `float_reference` is a [tail field](https://docs.rs/ouroboros/latest/ouroboros/attr.self_referencing.html#definitions).
        pub fn use_float_reference_mut<'outer_borrow, ReturnType>(
            &'outer_borrow mut self,
            user: impl for<'this> FnOnce(&'outer_borrow mut &'this mut f32) -> ReturnType,
        ) -> ReturnType {
            user(&mut self.float_reference)
        }
        ///This method provides immutable references to all [tail and immutably borrowed fields](https://docs.rs/ouroboros/latest/ouroboros/attr.self_referencing.html#definitions).
        pub fn use_all_fields<'outer_borrow, ReturnType>(
            &'outer_borrow self,
            user: impl for<'this> FnOnce(BorrowedFields<'outer_borrow, 'this>) -> ReturnType,
        ) -> ReturnType {
            user(BorrowedFields {
                float_reference: &self.float_reference,
                int_reference: &self.int_reference,
                int_data_contents: &*self.int_data,
            })
        }
        ///This method provides mutable references to all [tail fields](https://docs.rs/ouroboros/latest/ouroboros/attr.self_referencing.html#definitions).
        pub fn use_all_fields_mut<'outer_borrow, ReturnType>(
            &'outer_borrow mut self,
            user: impl for<'this> FnOnce(BorrowedMutFields<'outer_borrow, 'this>) -> ReturnType,
        ) -> ReturnType {
            user(BorrowedMutFields {
                float_reference: &mut self.float_reference,
                int_reference: &mut self.int_reference,
            })
        }
        pub fn into_heads(self) -> Heads {
            drop(self.float_reference);
            drop(self.int_reference);
            let float_data = self.float_data;
            let int_data = self.int_data;
            Heads {
                float_data,
                int_data,
            }
        }
    }
}
pub use ouroboros_impl_documentation_example::DocumentationExample;
pub use ouroboros_impl_documentation_example::DocumentationExampleBuilder;
pub use ouroboros_impl_documentation_example::DocumentationExampleTryBuilder;

While I've only shown examples where the self-referencing part is a literal reference, I designed the code so that it can work on anything that is self-referencing. For example, I am using a library which has a Context struct which can generate a Module<'ctx> struct. The builder for that would look like |context: &'this Context| context.build_module("name"). In that case it is not clear whether the reference needs to be mutable or immutable, so I opted to leave it explicit.

Finally, I wanted to mention I did encounter an issue with my implementation. It does not seem to work with chains, I.E. you cannot have a struct where C references B references A. I don't understand what is causing the issue so I've opened another thread to get some help on it, I would appreciate any help you all could give.

**quick edit: I wanted to point out that the generated code provides no way to access float_data directly, since it is already mutably borrowed by float_reference. In contrast, int_data can be borrowed immutably with borrow_int_data_contents since it has only been borrowed immutably by int_reference.

1 Like

I tried to run the example under miri, but it reported undefined behaviour.

Click to see full error
error: Undefined Behavior: trying to reborrow for SharedReadOnly, but parent tag <4250> does not have an appropriate item in the borrow stack
  --> src/main.rs:8:5
   |
8  |     int_reference: &'this i32,
   |     ^^^^^^^^^^^^^ trying to reborrow for SharedReadOnly, but parent tag <4250> does not have an appropriate item in the borrow stack
   |
   = help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
   = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
           
   = note: inside `ouroboros_impl_my_struct::MyStruct::new::<[closure@src/main.rs:17:32: 17:57], [closure@src/main.rs:18:34: 18:67]>` at src/main.rs:8:5
note: inside `ouroboros_impl_my_struct::MyStructBuilder::<[closure@src/main.rs:17:32: 17:57], [closure@src/main.rs:18:34: 18:67]>::build` at src/main.rs:3:1
  --> src/main.rs:3:1
   |
3  | #[self_referencing]
   | ^^^^^^^^^^^^^^^^^^^
note: inside `main` at src/main.rs:14:24
  --> src/main.rs:14:24
   |
14 |       let mut my_value = MyStructBuilder {
   |  ________________________^
15 | |         int_data: Box::new(42),
16 | |         float_data: Box::new(3.14),
17 | |         int_reference_builder: |int_data: &i32| int_data,
18 | |         float_reference_builder: |float_data: &mut f32| float_data,
19 | |     }.build();
   | |_____________^
   = note: inside `<fn() as std::ops::FnOnce<()>>::call_once - shim(fn())` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227:5
   = note: inside `std::sys_common::backtrace::__rust_begin_short_backtrace::<fn(), ()>` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:137:18
   = note: inside closure at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:66:18
   = note: inside `std::ops::function::impls::<impl std::ops::FnOnce<()> for &dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe>::call_once` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:259:13
   = note: inside `std::panicking::r#try::do_call::<&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe, i32>` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:381:40
   = note: inside `std::panicking::r#try::<i32, &dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe>` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:345:19
   = note: inside `std::panic::catch_unwind::<&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe, i32>` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:382:14
   = note: inside `std::rt::lang_start_internal` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:51:25
   = note: inside `std::rt::lang_start::<()>` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:65:5
   = note: this error originates in an attribute macro (in Nightly builds, run with -Z macro-backtrace for more info)

I suspect it's due to the use of Box for the borrowed fields, since every access to a box (including moves of the containing struct) assert that the box has unique ownership of its contents. You will likely have to write your own aliasable box type like this to get around it:

use std::ptr::NonNull;

struct AliasableBox<T: ?Sized> {
    ptr: NonNull<T>,
}

impl<T> AliasableBox<T> {
    fn new(t: T) -> Self {
        Self::from_box(Box::new(t))
    }
}

impl<T: ?Sized> AliasableBox<T> {
    fn from_box(b: Box<T>) -> Self {
        unsafe {
            let ptr = Box::into_raw(b);
            Self {
                ptr: NonNull::new_unchecked(ptr),
            }
        }
    }
}

impl<T: ?Sized> Drop for AliasableBox<T> {
    fn drop(&mut self) {
        unsafe {
            drop(Box::from_raw(self.ptr.as_ptr()));
        }
    }
}
1 Like

Could you elaborate more on what the error entails? I am having trouble understanding what is safe with AliasableBox that is unsafe with Box.

The various pointer types have some assumptions about the thing it points at, and these assumptions are used for optimizing code. Since they're used by the optimizer, you can get invalid optimizations if you violate the assumptions.

In our case, the assumption is that Box (and &mut) assert exclusive access to the target. This exclusivity is very strong to the extent that every time you touch the pointer in any way, it kicks in, marking every other pointer to that thing invalid — touching such an invalidated pointer later would be undefined behaviour.

It's a bit hard to tell from my error, but I believe it went wrong when the value containing the box was returned from the constructor, touching both the Box and your other reference to it. Since this results in asserting exclusivity while also asserting the existence of some other reference, you cause UB.

My AliasableBox is different because it uses a raw pointer, and raw pointers are not considered "used" in the same way when moved around. They're only considered used when you actually read or write to it.

To understand the miri error message, you will need to read about Stacked Borrows, which explains the terminology about tags and borrow stacks.

2 Likes

Thanks for your help, your explanation and the linked post were very informative. It looks like the problem can also be solved by creating an uninitialized Self, ptr::write()ing the boxes into it, then later getting the references from there. My understanding is that since the box is moved before the pointer is taken, it never gets moved while the reference exists. I passed the resulting struct around between a few functions and interacted with the fields and Miri reported no errors, so I think I will go with that approach. Speaking of which, Miri is really cool, I hadn't heard of it before. I'm going to add some automated tests for it to the crate. Thanks again for your help!

I'll admit, I first suspected this would be easy to break, but after testing it a bit I was surprised at how robust it was. You really put some thought into making it sound. :+1:

Also it has a nice catchy name. :+1:

My only complaint would be that I think some of the methods are named oddly. The with prefix is common in the sense you use use. Also I'm not sure if the distinction between use_FIELD and use_FIELD_contents is meaningful given that tail fields and immutably-borrowed fields are mutually exclusive.

name-related bikeshedding

My preference would be along these lines:

  • use_FIELDwith_FIELD
  • use_FIELD_mutwith_FIELD_mut
  • use_FIELD_contentswith_FIELD (has the same semantics, but can't conflict with the other one because they are not both available on the same fields, unless I'm overlooking something)
  • use_all_fieldswith (no suffix means it can't conflict with a real field named e.g. all_fields)
  • use_all_fields_mutwith_mut (also can't conflict because fields can't be named mut)

(I don't know how I feel about self_referencing for the macro itself. It's a little verbose, but it is descriptive. I might be tempted to name it ouroboros just to reinforce that this is the main point of the crate named the same thing, but maybe that's worse overall. Wouldn't keep me from using the crate, certainly.)

You might consider adding into_FIELD methods for each of the "head" fields which would be effectively just shorthand for into_heads().FIELD. Actually now that I mention that it seems rather unnecessary. Still, maybe a nice-to-have?

4 Likes

Thank you all for the help and suggestions! I've bumped the crate to 0.2.0 with the following changes:

  • Fixed constructors triggering Miri errors.
    • New commits are automatically run through Miri to check for errors.
  • use_FIELD changed to with_FIELD, use_all_fields changed to with
    • @trentj Reading your post made me realize how common this was in other crates.
    • I opted to keep the _contents nomenclature because I feel it improves clarity by indicating it provides &<STRUCT_TYPE as Deref>::Target rather than &STRUCT_TYPE.
    • I also kept the macro name, I feel it would help anyone seeing it without knowing what the crate does.
  • Much nicer error reporting, in many places highlighting the exact cause of the error.
  • Invoking the macro with #[self_referencing(chain_hack)] will activate a workaround to make structs with reference chains work, the only drawback being that borrowed fields must be Box<T>s.
5 Likes

Looking at title i thought its crate for consensuses protocol for Cardano :slight_smile:

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.