Soundness review for Selfie, my personal self-referential struct library

For the past couple of months I've been working on Selfie, a small self-referential struct library for use in some of my personal projects. These require audio realtime-safety (and therefore no extra allocations), which is the reason why I went ahead and made my own library instead of using some existing libraries, which either allocate or do not support custom referential types (or have soundness issues).

After a small amount of use, a decent MIRI-covered test suite, and some research into other self-referential libraries' soundness issues, I'm confident enough to say that I haven't found any soundness issues with my implementation. But I'm far from an expert on the subject, so I would like to get as much review and scrutiny as possible from this community to try and find soundness issues before I promote it or use it in larger projects. :slight_smile:

The implementation is only around ~400 LOC (~1K with tests and examples). It's also fully documented, with code examples both in the docs and the Readme.

Thanks in advance for your time!

Selfie::with_referential_mut is unsound, it allows to write a <R as RefType<'s>>::Ref, where 's comes from &'s mut self, which is shorter than the lifetime of the Selfie. This allows to write an &'short str to a Selfie<String, Ref<str>> and then read it from the Selfie after the 'short lifetime has ended.

use core::pin::Pin;
use selfie::{refs::Ref, Selfie};

fn main() {
	let data: Pin<String> = Pin::new("Hello, world!".to_owned());
	let mut selfie: Selfie<String, Ref<str>> = Selfie::new(data, |s| &*s);

	{
		let new_string = String::from("foo");
		selfie.with_referential_mut(|s| *s = &new_string);
	}
	
	println!("{}", selfie.referential());
}
5 Likes

I don't understand why Pin is used.

I see potential soundness problems with Selfie::with_referential and Selfie::referential; I'll try to come up with demonstrative examples when I got more time. Haven't looked as SelfieMut yet. Feel free to ping me in a few days in case I forget this. Edit: Well, in the meantime @SkiFire13 has already come up with decent examples and provided good explanation, so I won't need to anymore :smiling_face:.

Right @steffahn, Selfie::with_referential and Selfie::referential have the same problem as Selfie::with_referential_mut when the reference type is invariant (Edit: the problem with Selfie::with_referential_mut is also due to invariant lifetimes but it doesn't need internal mutability to exploit it):

use core::cell::Cell;
use core::pin::Pin;
use selfie::{refs::RefType, Selfie};

struct MyCell<'a>(Cell<&'a str>);
struct MyCellStandIn;

impl<'a> RefType<'a> for MyCellStandIn {
  type Ref = MyCell<'a>;
}

fn main() {
	let data: Pin<String> = Pin::new("Hello, world!".to_owned());
	let selfie: Selfie<String, MyCellStandIn> = Selfie::new(data, |s| MyCell(Cell::new(&s)));

	{
		let new_string = String::from("foo");
		selfie.with_referential(|s| s.0.set(&new_string));
	}
	
	// Can't use `selfie.referential()` because `Cell<&str>: !Copy` 
	println!("{}", selfie.with_referential(|s| s.0.get()));
}
use core::cell::Cell;
use core::pin::Pin;
use selfie::{refs::RefType, Selfie};

struct DoubleRefStandIn;

impl<'a> RefType<'a> for DoubleRefStandIn {
  type Ref = &'a Cell<&'a str>;
}

fn main() {
	let data: Pin<String> = Pin::new("Hello, world!".to_owned());
	let selfie: Selfie<String, DoubleRefStandIn> = Selfie::new(data, |s| &*Box::leak(Box::new(Cell::new(&*s))));

	{
		let new_string = String::from("foo");
		selfie.referential().set(&new_string);
	}
	
	println!("{}", selfie.referential().get());
}
2 Likes

Just a couple of notes on how these problems could be solved:

  • with_referential and with_referential_mut's closure could swap the 's and 'this signature (i.e. F: for<'this> FnOnce(&'s mut <R as RefType<'this>>::Ref) -> T). This should "hide" the self-referential lifetime thanks to the use of 'this while the reference can easily have 's lifetime since it's actually a reference into &'s self. This is also how ouroboros does it.
  • referential should require <R as RefType<'s>>::Ref to be covariant in 's. This could be done with an additional trait that provides a fn upcast<'a: 'b, 'b>(this: <Self as RefType<'a>>::Ref) -> <Self as RefType<'b>>::Ref and call that method to create the value returned by referential.
2 Likes

Well that was fast! Thank you! :smile:

Ah, right! I was always thinking about the lifetimes being to short, and forgot about lifetimes being too long. :sweat_smile:

I have applied your fix to the latest main version on my repo, and also added an extra compile-fail test from your code! ^^

Using this solution, what would the signature of referential() look like? I'm assuming it can't be the same as the current one (i.e. fn referential<'s>(&'s self) -> <R as RefType<'s>>::Ref), otherwise it would also be unsound with your Cell example.

However, referential was just meant as a quick shortcut for .with_referential(|r| *r), it doesn't expose any functionality itself. I have removed it for now, but it's not a big deal and I don't think keeping it around is worth adding a whole other trait for external types to implement.

I was under the impression that Pin was made for those cases where the data behind the pointer P must not be moved, which is the case for self-referential structs, so using Pin seemed natural to me. Am I using it wrong? Or do you mean that it isn't useful in that context?

Pin won't do anything anyway unless the target type does not implement Unpin. So in particular, examples such as Pin<String> are useless.

The "doesn't move" property is ensured by StableDeref already, combined with the fact that the API doesn't give the user any way to modify the pointer/container type itself. It's also a different one from Pin, which is concerned with data not being moved until its destructed, whereas you only care about not moving the data until it's no longer borrowed.

The only useful thing about the Pin would be for SelfieMut where the constructor also gets a pinned pointer, so there's some pinning guarantees that user code could rely on. That's probably a rather niche case though, and in the common case, the Pin will be annoying and needs to be worked around. I'd consider variants of Selfie that offer Pin a feature that only seems necessary if there's any demand for it. It's also only a performance optimization: code needing pinned data could always use a Box<Pin<PointerType>> instead, yes a double indirection, but at least it works.

2 Likes

Ah, I didn't realize that distinction, in that case yeah enforcing the pointer to be pinned is useless. I have removed it in a new branch on my repo, however MIRI started reporting stacked borrows violations in the cascading SelfieMut test case once I did that, and I'm not too sure why. I'm gonna have to investigate this before I publish all of the changes from above.

Ah, that might be because Pin has some magical special-casing at the moment for another soundness issue that comes up for many self-referencing primitives: that if you have e. g. a struct (Box<T>, &T), where the &T is a pointer into the box, then taking a &mut reference to (or e. g. perhaps also moving) the whole struct will claim exclusive access to the Box<T> and invalidate the reference.

Crates like ouroboros solve this by using a pointer instead of a Box, in the form of aliasable::boxed::AliasableBox - Rust; I personally have the goal of eventually making an alternative to StableDeref that can work around this issue by converting by some associated type in problematic cases (unless something like that still exists). The Rust language also has the goal of solving this problem eventually as it's kind-of necessary for something like futures to work soundly reliable without Pin hacks. Some recent discussions on that topic I've seen were here (I mean, I think that kind of change is relevant for the futures issue... I don't fully remember if that discussion mentioned that or not, certainly had other use cases, too. But here's another place of discussion that's definitely relevant to the topic.)

1 Like

My idea was to add this trait:

trait CovariantRefType: for<'a> RefType<'a> {
    fn upcast<'s, 'a>(this: &'s <Self as RefType<'a>>::Ref) -> &'s <Self as RefType<'s>>::Ref;
}

and then referential would become:

    pub fn referential<'s>(&'s self) -> <R as RefType<'s>>::Ref
    where
        <R as RefType<'s>>::Ref: Copy,
        R: CovariantRefType,
    {
        self.with_referential(|r| *R::upcast(r))
    }

The idea was delegating the covariant conversion to the user. However while playing around with this I discovered thatCovariantRefType is not possible to implement for some types due to the for<'a> RefType<'a> bound, which is not satisfied for types like selfie::refs::Ref<T> due to needing the T: 'a bound. It would work if the CovariantRefType implementation required a T: 'static bound though. Rust Playground

Users should still be able to do .with_referential(|r| *r) (or even .with_referential(|r| r), if rustc is smart enough to notice the subtype coercion) for types that allow it though, so this is just an ergonomic issue.

Ah, thanks for the explanation!

From my understanding of all this, it seems that Box<T> (and most std pointer types) have a noalias guarantee applied to them by the compiler, but the compiler special-cases Pin by removing the noalias guarantee from the inner pointer type, so that Pin<Box<T>> isn't noalias. Is that correct?

Assuming my understanding is indeed correct, I can't follow ouroboros' example since Selfie doesn't use Boxes at all to begin with. In the future, an alternative StableDeref with an Aliasable associated type like you mentioned could work as well.

For now, I don't mind the ergonomic issue of the current Pin solution in Selfie, so I guess Pins are in to stay for the time being. However, I have some more questions about Pin's behavior in this case:

  • Is using Pin to remove noalias guarantees like I do in Selfie actually sound, or is it actually just making MIRI miss the issue?
  • If it is sound, is there a risk of future breakage when the Rust team might reintroduce noalias guarantees for Pin<P> later down the line?

Or in other words, do you see soundness/safety issues to the current approach in my library that could become problematic for widespread use?

Thanks again for your time and wisdom on the subject. :smile:

Ah, yes, I see what you mean now. ^^

As you said, .referential() is just a little helper for the few cases where the reference type is Copy, so I don't think it's worth all this trouble just to keep it. .with_referential(|r| *r) is ugly but it works, if it wasn't obvious enough from the whole RefType thing, my library doesn't quite have clean-looking code as a main concern. :sweat_smile:

Right, I think that’s the right question to ask. I don’t know the answer with certainty off the top of my head, but if I had to guess, I would assume the current approach around Pin is just a temporary solution to make execution async fn-generated futures sound, until there’s a different, dedicated solution to removing aliasing guarantees for the purpose of self-referential datatypes (like async fn or async { … } futures) that those language-built-in datatypes then could switch to using as-well.

It's just a way you can make miri shut up. There's no guarantee that its sound.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.