How unsafe is my reference type?

Yes, it seems like that could be a problem. I wonder how prevalent that is? Perhaps using a positive, auto-derived trait instead?

pub Trait SafeForget;

That is by default auto-derived for all types. Then mem::forget becomes:

pub fn forget<T>( t : T ) -> () where T : SafeForget {...}

And your ScDropper type becomes:

pub Trait ScDropper : !SafeForget { ... }

I think that would be 100% backwards compatible.

Whoops, this still requires updates when mem::forget is called in a generic context (I think) so, it solves nothing. :frowning:

This can be solved with some advanced lifetime trick. The key point is make your dropper logically borrow the setted reference, so dropper cannot outlive it.

struct Dropper<'a, T> {
    ptr: *mut *const T, // replace it with your actual dropper code
    _marker: PhantomData<&'a T>,
}

impl<T> Sc<T> {
    pub fn set<'a>(&self, v: &'a T) -> Dropper<'a, T> {
        self.data.set(v);
        Dropper { ptr: self.get_ptr(), _marker: PhantomData }
    }
}

PhantomData is zero-sized type so in runtime it doesn't have any value. But during compile time, borrow checker treat it as its type parameter, in this case &'a T, so it perform lifetime checking as usual. So if the dropper is leaked then the compiler throws error that the dropper is outlived from its original value, which is exactly what you want. Awesome?

1 Like

How does this prevent "mem::forget" being called on Dropper? Or, will this somehow still complain even though "mem::forget" does it's work in an unsafe block? I'm not understanding how this works.

Actually, yes, it does make sense. That's a neat trick! Sounds like "Problem Solved"?

As pointed out below, this doesn't actually work. mem::forget consumes its value so, from the perspective of the compiler, there is no reference still alive that needs to end.

mem::forget means the argument will not be dropped, so that value will be live until this process is killed. So compiler will throw reference outlives the original value error.

1 Like

You also have to forbid putting it in Rcs or Arcs transitively then as well and also forbid putting it in a ManuallyDrop as well I guess

mem::forget consumes it's argument and just doesn't run any drop code when it dies, so once you have passed the dropper to mem::forget it is gone and your program cannot access it anymore but its drop was never run so the Sc will never be emptied and the dangling reference will still be available to use.

Importantly mem::forget does not mean the value will live until the process is killed, semantically whatever you passed to it is gone just as any other value to move into a function in rust, the only difference is that the drop code is never called.

1 Like

I'm afraid that is going to be very difficult to implement.. Since the great 'Leakpocalyse' of 20162015, Leaking memory (i.e. never running Drop) is considered memory safe.
This was the least-bad way out of a very difficult situation. It proved unsoundness in the rust core, because there are simply too many ways to (accidentally or not) leak memory: In the very least, mem::forget itself, and cycles of RC/ARC came up. (edit: upon re-reading Huonw: thread deadlock too)

It basically taught us that it's a bad idea to rely on Drop-impls for memory safety, as Drop's guarantees aren't as strong as we'd like.
Java has the same problem with it's destructors, which promise a "best effort" to run, but keep caveats like Out-of-Memory, JVM shutdowns, and never being garbage-collected. It's the prime reason interface Closeable exists.

See also:

From the different github discussions I've seen that suggest new OIBIT/auto-traits, there is a massive reluctance to add more auto-traits to the rust core, because (as your edit about generic contexts shows) each new auto-trait complicates all generic downstream code.
Since std has a quite a lot of downstream, the threshold for adding new ones seems to be right about at "maybe, if its the only way, after years of design work, to remedy pre-existing unsoundness"

Exactly; this is a lifetime problem through and through (tying the lifetime of the reference to the lifetime of the dropper), and this is exactly the kind of scenario for which PhantomData was designed (be sure to check that doc-page for excellent showcase of full Unicode emoji support in Rust sourcefiles + RustDoc).

edit: oh man, 2015? has it really been that long already! Also: added link after reminder from @trentj

4 Likes

After a quick experiment, I figured out that my assumption was wrong. When value is mem::forget()ed borrow checker thinks this value is consumed, not leaked.

Now I think that Sc::get() cannot be implemented in safe way without the guarantee that the dropper always be dropped. What you're trying to do is dynamically bound external value's lifetime into Sc, which is not supported by rust.

Experiment: Rust Playground
println!() macros are commented out, otherwise this will attempts to print invalid utf8 sequence and playground sandbox will be terminated.

1 Like

I think this is the test you need to pass:

let sc = Sc::new();
{
    let s = String::from("foo");
    let dropper = sc.set(&s);
    mem::forget(dropper);
}
assert_eq!(sc.get(), None); // or alternative with `sc.visit(...)`

i.e., you can't rely on mem::forget being unsafe to call. Sc::visit doesn't fix this problem.

In addition to @juleskers's links, this blog post is a good overview of Leakpocalypse 2015. Many solutions were proposed, along with some that looked very similar to UnsafeForget. There are no types that today are UnsafeForget, but there are a bunch of types that would have to have !UnsafeForget bounds applied to them. You can leak something by putting it in an Rc cycle, so Rc<T> would presumably have a T: !UnsafeForget bound. And that means anything that deals with Rc generically has to also add that bound. Box::leak is safe, so it would also need a where T: !UnsafeForget bound, and so on.

The API you've outlined can't be implemented soundly, but it's possible you could make an API that would work by passing closures, similar to the way thread::scoped solved its problem. I think you would probably need to allocate something, though (not necessarily the whole T). The same blog post I linked above might give you some ideas.

4 Likes

Ah yes, how could I forget the PPYP-solution! Even the name itself caused a minor outcry! :smiley:
(I was a fan of his "pre-pit your peaches" compromise, but it seems he's kept to the poo-version too)

I've added it to my list, for completeness sake. Thanks!

3 Likes

Thank you for all the very informative links! So, reading the replies here, my thoughts are the following:

  • The Sc::get() is definitely unsafe, since it returns a reference that is only bound to the lifetime of Sc and not of dropper. Using the Sc::visit method (that I renamed to map because it looks exactly like Option::map) should be safe in that regard, so I consider that that problem solved.
  • The leakpocalypse is real and Sc will be unsafe forever if it relies on the destructor of the Dropper instance to be run. I don't have a solution to that problem.
    • I don't know if a "PPYP" scheme could work, because I'm don't know what I could leak to mitigate the problem if Dropper is leaked. Well, I do know, that would be the pointed to object, but then I would probably have to allocate it, which kind of defeats the purpose.
    • I also don't know if a "scope-based" solution would work, because it would require to open a new scope when calling Sc::set(), so basically having Sc::set() taking a closure, which I feel would limit the use case where I call Sc::set() and Sc::get() from different parts of the program (eg set when registering an observer, get when notifying the observer of a change). Or maybe there's something I didn't understand here?

As a side note, I'm really surprised that Rc is allowed to leak cycles of structs that are not 'static. This really "feels" like a bug to me, and certainly something that compromises the mental model I had of non 'static structs before.

Callbacks are a specific case of the whole 'cyclic references are hard in Rust' topic.
It really messes up some paradigms we're used to, and there are multiple people looking into other paradigms, for example for GUI's (where callback button actions are pervasive).

The PPYP idea is to pre-leak while still in a known good state, and then do the 'risky' thing, and then clean up.
Unfortunately, in your case the risky thing is the very first step:Sc::get. The only safe "pre-pitted" state would be to always return None :expressionless:

I hope you're not discouraged by the information I've given you, it is a truly hard problem!

I personally think that the scoping solutions are the only sound way forward.
Just thinking about it outside-in:

  • we have a variable, and this variable intrinsically has a scope
  • we wish to give away a reference to this variable
  • the only places this reference is valid, is inside its scope
  • the only way to guarantee it is never given away is either:
    • statically verify it isn't, during compilation (i.e. lifetimes)
    • 'somehow' make the scope an explicit thing during runtime

the only 'somehow' I'm currently aware of, are closures..

3 Likes

You're articulating very well why I can't use the PPYP pattern.

I've been thinking, could my problem be solved by making my Dropper struct immovable after it is returned from the Sc::set() method? That way I wouldn't be able to move it in a Rc nor in mem::forget?

I believe the Pin RFC covers how I could do that, but I'm not sure to understand how that would work exactly?

1 Like

Note that mem::forget itself is nothing special, just:

pub fn forget<T>(t: T) {
    ManuallyDrop::new(t);
}

I think you're right that being immovable would ensure a drop, but I'm not sure how you could enforce that after a return, since even the return itself involves moving.

The way stuff like thread locals and scoped threads have been solved is by forcing an inner calling scope. You never give the thing-that-must-be-dropped to an untrusted caller. Instead, you might have something like a with accessor:

let sc = Sc::new();
assert_eq!(sc.get(), None); // Nothing in Sc, initially
{
    let s = String::from("foo");
    // set a reference to s in Sc, for the duration of the closure
    sc.with(&s, || {
        assert_eq!(sc.get(), Some(&s)); // access the reference
    });
    assert_eq!(sc.get(), None); // After `with` returns, nothing in Sc again
}
assert_eq!(sc.get(), None); // After s is dead, still nothing in Sc

This is somewhat like what you were doing with visit, but without exposing any dropper.

1 Like

Does it? If I understand correctly, Pin is supposed to prevent the memory address from changing.
As far as I understood it, this doesn't prevent you from 'moving' ownership. (They're not talking about the same form of moving).

If we can still give ownership away, we can still pass it to mem::forget, and we're still at square one. No?

Or do you mean a reference to a Pin has a longer lifetime? SC would have to be !Unpin in that case, promoting it to 'static lifetime, and solving the problem that way, at the cost of allocating.

1 Like

OK, it took me some time, but I believe I may have something that would work and be safe without using closures (save for the visit/map method of Sc).

The solution I found involves three objects, not two like the previous solution.

The first object is Sc, and it is completely unchanged:

pub struct Sc<T>(Cell<Option<*const T>>);

The second object is our dropper, which is simpler than ever:

struct Dropper<'sc, T: 'sc> {
    sc: &'sc Sc<T>,
}

impl<'sc, T> Drop for Dropper<'sc, T> {
    fn drop(&mut self) {
        self.sc.0.set(None)
    }
}

(Dropper is not public anymore in this design)

Then, I introduce a third object that I call Locker:

pub struct Locker<'sc, 'auto, T: 'sc + 'auto> {
    sc: Option<Dropper<'sc, T>>,
    marker: Marker,
    autoref: Option<&'auto Marker>,
}

This object is self-referential, Locker::autoref will eventually reference Locker::marker.

Then, I replace the Sc::set method with the following Locker::lock method:

impl<'sc, 'auto, T> Locker<'sc, 'auto, T> {
    pub fn new() -> Self {
        Self {
            sc: None,
            marker: Marker,
            autoref: None,
        }
    }

    pub fn lock(&'auto mut self, val: &'auto T, sc: &'sc Sc<T>) {
        let ptr = val as *const T;
        sc.0.set(Some(ptr));
        self.sc = Some(Dropper { sc });
        self.autoref = Some(&self.marker);
    }
}

Because Locker becomes a self-borrowing struct after a call to Locker::lock, it is not possible to call mem::forget() on it (it is borrowed).

(for clarity, here is Sc's full impl block):


impl<T> Sc<T> {
    pub fn new() -> Self {
        Self { 0: Cell::new(None) }
    }

    unsafe fn get(&self) -> Option<&T> {
        self.0.get().map(|ptr| mem::transmute(ptr))
    }

    pub fn map<U, F: Fn(&T) -> U>(&self, f: F) -> Option<U> {
        unsafe { self.get().map(f) }
    }

    pub fn is_none(&self) -> bool {
        unsafe { self.get().is_none() }
    }
}

You can then use Locker in the following way:

fn it_works() {
    let sc = Sc::new();
    assert!(sc.is_none());
    {
        let s = String::from("foo");
        {
            let mut locker = Locker::new();
            locker.lock(&s, &sc);
            // cannot mem::forget(locker) here, because locker borrows itself
            // cannot mem::forget(s) either, because it is borrowed by locker
            assert!(!sc.is_none());
        }
        assert!(sc.is_none());
        {
            let mut locker = Locker::new();
            locker.lock(&s, &sc);
            assert!(!sc.is_none());
        }
        assert!(sc.is_none());
    }
    assert!(sc.is_none());
}

For convenience, I also provide a Wrapper type:

pub struct Wrapper<'sc, 'auto, T: 'sc + 'auto> {
    locker: Locker<'sc, 'auto, T>,
    data: T,
}

impl<'sc, 'auto, T> Wrapper<'sc, 'auto, T> {
    pub fn new(data: T) -> Self {
        Self {
            locker: Locker::new(),
            data,
        }
    }

    pub fn lock(this: &'auto mut Self, sc: &'sc Sc<T>) {
        this.locker.lock(&this.data, sc);
    }
}

// also impl Deref<Target=T> and DerefMut for Wrapper<T>

This enables the following:

#[test]
fn with_wrapper() {
    let sc = Sc::new();
    assert!(sc.is_none());
    {
        let mut s = Wrapper::new(String::from("bar"));
        {
            Wrapper::lock(&mut s, &sc);
            assert!(!sc.is_none());
        }
        // actually here we are still locked because a wrapper automatically locks for its whole lifetime
        assert!(!sc.is_none());
    }
    assert!(sc.is_none());
}

I pushed this as a branch on my repository (I only updated the Observer example, because I broke Sc with trait objects with this release).

What do you think?

2 Likes

Please use reborrow to convert pointer to reference, as document suggests.

Anyway, does it really work? For me it looks like Locker cannot be dropped after all as it's potential borrowed, and struct's fields are live longer than the outer struct. But if it works, awesome! It really is an interesting approach, and definitely worth trying.

1 Like

This is really clever. :sunglasses::+1:

I'm almost convinced at this point the API is safe. In other words, I've exhausted the ways I could think of to break it and none of them will work. There might be some way to make it cleaner with Pin, but I still don't understand that stuff well enough.

I agree that you should dereference instead of mem::transmuteing, and would add that Sc::is_none can be written safely and more simply as self.0.get().is_none(). Although they probably compile to the same thing after optimization.

Really interesting work.

1 Like

Oops, I'm still learning all of this. Thank you for the correction, it is very welcome :slight_smile:. I'll be updating my code soon.

The code I posted (which you can find in my repository) compiles and the test I posted are passing (meaning that sc.is_none() becomes true as soon as the locker gets out of scope).
I also tested that you cannot call mem::forget on the Locker object after having called lock. Also, I just tested, and you cannot have a Locker instance that has a lifetime greater than the T it refers to, because when you call Locker::lock, it attempts to borrow T for the entire lifetime of Locker, and so it fails to compile (which is what we want, I suppose).

Regarding the drop code, note that the Locker itself does not implement Drop. Attempting to implement Drop directly on Locker was actually my first attempt, but it failed because since Locker is autoborrowed by Locker::lock, it couldn't be borrowed mutable by the drop function.

Having one of its fields implement Drop works, because the fields are dropped right after the end of the Locker's lifetime, and at this point it is no longer borrowed :slight_smile:. I wish this trick of delegating the drop to a field for perma-borrowing struct was documented somewhere, would have saved me a bit of time!

I guess, a thing I need to do to gain confidence that this is safe is to try and put Locker in a Rc cycle, and then lock it. If this proves impossible, then I'll be happy I guess :grinning:.

@trentj: Thank you! I wonder if this type would be useful to store callbacks, like was discussed earlier in the thread? I really need to add Trait objects support back soon.

EDIT: I pushed the suggested changes, thank to both of you :slight_smile:

1 Like

Still broken :expressionless:.

#[test]
fn oopsie() {
    let sc = Sc::new();
    assert!(sc.is_none());
    {
        let s = String::from("foo");
        assert!(sc.is_none());
        {
            let locker = Box::new(Locker::new());
            let locker = Box::leak(locker);
            locker.lock(&s, &sc);
            assert!(!sc.is_none());
        }
    }
    sc.map(|x| println!("{}", x));
    assert!(sc.is_none());
 }

Here we are creating the locker in a Box, then getting a leaked reference to it by calling Box::leak, and finally locking the leaked reference with Locker::lock.
Interestingly, the Wrapper<T> doesn't suffer from this problem, because since it also owns the T, leaking the locker leaks the T at the same time, meaning calling Sc::map is never unsafe when the Sc is locker to a Wrapper<T>. So, that's a thing, I guess? :sweat:

Other than that, I really don't know how I could prevent a user to put a locker in a Box and then Box::leak'ing it. Any idea of how I could do that? A possibility would be to produce an autoborrowed struct when calling Locker::lock, but I don't know how to return such autoborrowed struct.

2 Likes

Awwh.

Well, it was fun while it lasted.

I don't think the Wrapper is as useful as the non-Wrapper version, because you have to move the T into it, which offsets the advantage of not allocating. Perhaps it could be useful when allocating isn't an option, though.

On the bright side you can probably simplify the API now that Locker isn't useful on its own.