TrashPool to retain temporaries past the end of a scope

Yes, just say that it is bounded by each of them individually.

In that case, I'm pretty sure this would work and I believe it would be sound... If only I could figure out how to transmute away the lifetime of my input variable, inside the function, when I don't know its size.

Feels like progress, but it could be another blind alley. As somebody on this forum said a few months ago (sadly I can't recall who, nor find the quote): "Rust is my favorite text-based dungeon exploration game."

impl<T: ?Sized> Trashable for T {}
pub trait Trashable {}

pub struct TrashPool {
    pool : RefCell<Vec<Box<dyn Trashable +'static>>>
}

impl TrashPool {
    pub fn new() -> TrashPool {
        TrashPool{pool : RefCell::new(vec![])}
    }
    pub fn dump<'c, 'b:'c, 'a:'c, T:Trashable + 'a>(&'b self, input : T) -> &'c T {

        let mut pool = self.pool.borrow_mut();

        let boxed_input = Box::new(input);

        let transmuted_box = unsafe {std::mem::transmute(boxed_input)};
//        let transmuted_box = unsafe {(&boxed_input as *const Box<T> as *const Box<dyn Trashable>) }; //transmute away any lifetime knowledge
//        let transmuted_box = unsafe {(boxed_input as Box<dyn Trashable>) }; //transmute away any lifetime knowledge

        pool.push(transmuted_box);
        let trash_ref = &**pool.last().unwrap();

        unsafe { &*(trash_ref as *const dyn Trashable as *const T) }
    }   
}

Thanks again! You, and everyone on this forum, are awesome!

It looks very cool. Removing the lifetime on the type is not sound, and the following example demonstrates that.

fn main() {
    let pool = TrashPool::new();
    let s = String::from("Some text");
    let obs_handle = pool.dump(Observer(&s));
    // The String is dropped, but the Observer that will read it on drop is still in the pool.
    // The borrow checker accepts this because `obs_handle` is not used anywhere
    // after `s` or the pool goes out of scope.
    drop(s);
    drop(pool);
}

(playground link -- broken code version)

(I had to modify the box/transmute code, since the posted version did not compile.)
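The example above uses an `Observer` type that isn't shown in the post. Here is a minimal sketch of what it might look like, assuming a tuple struct over `&String` whose destructor reads the borrowed string. The definition in the playground may differ, and `observe_len` is a hypothetical helper added only to show sound usage.

```rust
/// Hypothetical stand-in for the `Observer` used in the example above.
pub struct Observer<'a>(pub &'a String);

impl<'a> Drop for Observer<'a> {
    fn drop(&mut self) {
        // The destructor reads through the borrow. If the String had already
        // been freed at this point, this would be a use-after-free.
        println!("observer sees: {}", self.0);
    }
}

/// Sound usage: `obs` is declared after `s`, so it is dropped first and its
/// destructor still sees a live String.
pub fn observe_len() -> usize {
    let s = String::from("Some text");
    let obs = Observer(&s);
    obs.0.len()
}
```

The point of the broken example is that a pool storing `Box<dyn Trashable + 'static>` can run this destructor after `s` is gone, because the lifetime that would have prevented it was transmuted away.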


I wrote a new version that simplifies the lifetimes for you. You don't gain anything from all those lifetime parameters; this new way is actually 100% equivalent, and it doesn't use any trick to tweak the lifetimes of stored objects. I haven't fully researched whether this is a final, sound version, but the remaining unsafe block only prolongs the life of the reference to the value in the box (playground link). As you can see, you're only allowed to store values that outlive the pool itself.

(The example shows that the previous breaking code does not compile anymore. You'll have to tweak the order in which the variables are defined and dropped for it to compile.)


Now in the simplified scheme one could think that this would be good, but it's also wrong:

pub fn dump<T: Trashable + 'a>(&self, input: T) -> &'a T {

Yes, 'a is how long we can keep values of type T around, if we find a storage location that lasts that long. 'a is not our only limiting factor here. The Box is in the TrashPool, and the reference is invalid if the box is invalid. So we need to borrow from the TrashPool, that's our lower bound of the lifetime.

Correct:

pub fn dump<T: Trashable + 'a>(&self, input: T) -> &T {
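For context, here is a rough sketch of how the corrected signature might sit in a pool that carries the `'a` parameter on the struct. This is my reconstruction of the idea, not the code from the playground link, and I have not audited the remaining unsafe block for soundness.

```rust
use std::cell::RefCell;

pub trait Trashable {}
impl<T: ?Sized> Trashable for T {}

/// `'a` is a lower bound on how long every stored value is valid.
pub struct TrashPool<'a> {
    pool: RefCell<Vec<Box<dyn Trashable + 'a>>>,
}

impl<'a> TrashPool<'a> {
    pub fn new() -> Self {
        TrashPool { pool: RefCell::new(Vec::new()) }
    }

    /// The returned reference borrows from `&self`, not from `'a`: it is only
    /// valid for as long as the pool (and therefore the box) is alive.
    pub fn dump<T: Trashable + 'a>(&self, input: T) -> &T {
        let boxed = Box::new(input);
        // Address of the heap allocation; it stays put when the Vec grows.
        let ptr: *const T = &*boxed;
        self.pool.borrow_mut().push(boxed);
        // Assumption: nothing is ever removed from the pool before the pool
        // itself is dropped, so the allocation outlives the `&self` borrow.
        unsafe { &*ptr }
    }
}
```

With this shape, a borrowed value has to be declared before the pool (for example `let s = String::from("Some text");` followed by `let pool = TrashPool::new();`), so that the pool is dropped first.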

Thanks for looking at it, @bluss! And also thanks for fixing my busted transmute.

Sadly, my reason for wanting to keep the pool's storage with the 'static lifetime was to support this use case, which your safe version now can't handle.

pub struct Container {
    data : RefCell<RefCell<Vec<u32>>>
}
impl Container {
    pub fn borrow_iterator<'b, 'a : 'b>(&'a self, trash_pool : &'b TrashPool) -> impl Iterator<Item=&'b u32> {
        let inner_cell_ref = trash_pool.dump(self.data.borrow());
        let vec_ref = trash_pool.dump(inner_cell_ref.borrow());
        vec_ref.iter()
    }
}

But you're totally right about the Drop trait. I was thinking of dumping an item into the pool as conceptually the same as almost-dropping it, with the returned ref being the only way to access the object from that point onward. So I figured that when the returned ref was dropped, the object in the pool was effectively dead too and that was all the lifetime bounding that was needed. I hadn't considered that a custom drop method would allow an object to come back from beyond the grave!
:ghost: :fearful:

It seems that creating a separate lifetime for each item in the pool is the only safe way to fix this, and I don't know of a way to do that in the language at present. Is there an RFC for this?

I have a vague feeling something may be doable with runtime checking - in a similar way to the mechanics behind Rc<>. So far, all my attempts to use Rc<> and Weak<> are flawed, but the underlying mechanism that makes Rc<> work might allow TrashPool to work too.

On a different note, I am so grateful for all the time and effort from everyone on this thread. @alice, @daboross, @Michael-F-Bryan, @bluss. Do you have a favorite project in need of sponsorship or a Patreon or anything? I'm just an individual, not a company with vast resources, but I'd like to give a little something back in recognition of all the time and effort you have put in.

Thank you!

I think I've finally got it! Fight :fire: with :fire: as they say. Fight the Drop trait using the Drop trait! The key is to make the reference passed back be an object implementing Deref, so it can invalidate the reference in the pool when it's dropped.

Here's the code. It could definitely be made a little cleaner, but I was excited to post.

use std::cell::RefCell;
use std::mem;
use std::ops::Deref;

impl<T: ?Sized> Trashable for T {}
pub trait Trashable {}

pub struct TrashRef<'a, 'b, T> {
    obj_ref : &'a T,
    pool : &'b TrashPool,
    pool_index : usize
}
impl<'a, 'b, T> Drop for TrashRef<'a, 'b, T> {

    fn drop(&mut self) {
        self.pool.invalidate(self.pool_index);
    }
}
impl<'a, 'b, T> Deref for TrashRef<'a, 'b, T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        &self.obj_ref
    }
}

//Note: a safe clone, and a map function might need to be implemented to avoid serious inconvenience
// working with TrashRef<T>s, but that's for later.

//On the upside, if I implemented the reference tracking needed to safely implement Clone on TrashRef,
// I could 100% copy the semantics of the AutoReleasePool from ObjC and Swift.

pub struct TrashPool {
    pool : RefCell<Vec<Box<dyn Trashable + 'static>>>
}

impl TrashPool {
    pub fn new() -> TrashPool {
        TrashPool{pool : RefCell::new(vec![])}
    }

    //lifetimes provided for clarity, although some of them may be elided for improved readability later
    pub fn dump<'c, 'b:'c, 'a:'c, T:Trashable + 'a>(&'b self, input : T) -> TrashRef<'c, 'b, T> {

        let mut pool = self.pool.borrow_mut();

        let boxed_input = Box::new(input) as Box<dyn Trashable>;
        let transmuted_box = unsafe {std::mem::transmute(boxed_input)}; //Transmute away the lifetime on the input
        
        pool.push(transmuted_box);
        let trash_ref = &**pool.last().unwrap();
        let obj_ref = unsafe { &*(trash_ref as *const dyn Trashable as *const T) }; //Coerce the type back to the input type

        TrashRef{
            obj_ref : obj_ref,
            pool : self,
            pool_index : pool.len()-1,
        }
    }

    //Private, should only be called by TrashRef::drop() or UB!
    fn invalidate(&self, idx : usize) {

        let mut pool = self.pool.borrow_mut();
        // Swap in a placeholder and drop the old value right away.
        drop(mem::replace(&mut pool[idx], Box::new(0)));
    }
}

It passes @daboross's test by allowing dependent owned references:

    let pool = TrashPool::new();
    let owned_string = "1234567890".to_string();
    let temp_str = pool.dump(owned_string.as_str());
    let substring = pool.dump(&temp_str[2..5]);
    println!("{}", *substring);

And it passes @bluss's test by not allowing this to compile:

    let pool = TrashPool::new();
    let s = String::from("Some text");
    let obs_handle = pool.dump(Observer(&s));
    drop(s);
    drop(pool);

And it does the right thing for a modified version of @bluss's test, when I drop the obs_handle before dropping s.

    let pool = TrashPool::new();
    let s = String::from("Some text");
    let obs_handle = pool.dump(Observer(&s));
    drop(obs_handle);
    drop(s);
    drop(pool);

I hope I haven't overlooked anything else. Thanks again everyone!

Very good. That does indeed prevent a lot of the problems we have had so far. Unfortunately it is still unsound, as it allows postponing destructors:

struct PrintStrOnDrop<'a> {
    to_print: &'a str,
}
impl<'a> Drop for PrintStrOnDrop<'a> {
    fn drop(&mut self) {
        println!("{}", self.to_print);
    }
}

fn main() {
    let a = "Hello world!".to_string();
    
    let trash = TrashPool::new();
    
    let on_drop = PrintStrOnDrop { to_print: &a };
    mem::forget(trash.dump(on_drop));
    
    drop(a);
    drop(trash);
}

playground

Whaaaaaaat! :exploding_head:

What's the point of mem::forget() in the language? (Rhetorical question. I don't really need an answer. I'm just frustrated.)

I wish there was an Unforgettable trait that won't allow an object implementing it to be forgotten.

:frowning_face:

Thank you @alice for pointing out the issue.

I think it should be possible to fix this if we leak any values left in the TrashPool when their TrashRef has been leaked. Maybe we could wrap the boxes, or their contents, in a ManuallyDrop? Something like

pub struct TrashPool {
    pool : RefCell<Vec<Box<ManuallyDrop<dyn Trashable + 'static>>>>
}

together with associated changes in invalidate, which would call ManuallyDrop::into_inner to really drop the value, should fix it.
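To make the leak-instead-of-drop idea concrete, here is a rough sketch that wraps each box in `std::mem::ManuallyDrop`, so the pool never runs a destructor on its own. It is a sketch under assumptions, not a vetted design: the names `release`, `Bump`, and `demo` are made up for illustration, the wrapper goes around the box rather than its contents, and I have not checked it under Miri (in particular, whether keeping a pointer into a box that is later moved is acceptable under Rust's aliasing rules).

```rust
use std::cell::{Cell, RefCell};
use std::mem::{self, ManuallyDrop};
use std::ops::Deref;

pub trait Trashable {}
impl<T: ?Sized> Trashable for T {}

pub struct TrashPool {
    // Dropping the Vec frees its buffer but never runs the destructors of
    // the boxed values: whatever is still in here gets leaked.
    pool: RefCell<Vec<ManuallyDrop<Box<dyn Trashable + 'static>>>>,
}

pub struct TrashRef<'p, T> {
    ptr: *const T,
    pool: &'p TrashPool,
    index: usize,
}

impl<'p, T> Deref for TrashRef<'p, T> {
    type Target = T;
    fn deref(&self) -> &T {
        // The slot is only emptied by our own Drop, so the value is alive.
        unsafe { &*self.ptr }
    }
}

impl<'p, T> Drop for TrashRef<'p, T> {
    fn drop(&mut self) {
        self.pool.release(self.index);
    }
}

impl TrashPool {
    pub fn new() -> TrashPool {
        TrashPool { pool: RefCell::new(Vec::new()) }
    }

    pub fn dump<'p, 'v, T: 'v>(&'p self, input: T) -> TrashRef<'p, T> {
        let boxed = Box::new(input);
        let ptr: *const T = &*boxed;
        let boxed: Box<dyn Trashable + 'v> = boxed;
        // Erase the lifetime. The destructor can now only run through the
        // TrashRef, whose type still mentions T and so cannot outlive 'v.
        let erased = unsafe {
            mem::transmute::<Box<dyn Trashable + 'v>, Box<dyn Trashable + 'static>>(boxed)
        };
        let mut slots = self.pool.borrow_mut();
        slots.push(ManuallyDrop::new(erased));
        TrashRef { ptr, pool: self, index: slots.len() - 1 }
    }

    // Private: must only be called from TrashRef::drop.
    fn release(&self, index: usize) {
        let mut slots = self.pool.borrow_mut();
        // A boxed unit is zero-sized, so the placeholder allocates nothing.
        let placeholder: Box<dyn Trashable + 'static> = Box::new(());
        let old = mem::replace(&mut slots[index], ManuallyDrop::new(placeholder));
        drop(slots); // end the RefCell borrow before the destructor runs
        drop(ManuallyDrop::into_inner(old));
    }
}

/// Hypothetical helper type: counts how many times a destructor has run.
struct Bump<'a>(&'a Cell<u32>);
impl Drop for Bump<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns the destructor count after a normal release, and again after a
/// second value's handle was forgotten and the pool was dropped.
pub fn demo() -> (u32, u32) {
    let hits = Cell::new(0u32);
    let pool = TrashPool::new();
    drop(pool.dump(Bump(&hits)));
    let after_release = hits.get();
    mem::forget(pool.dump(Bump(&hits)));
    drop(pool);
    (after_release, hits.get())
}
```

In `demo`, the second `Bump` is never dropped: forgetting its handle means nothing ever calls `release` for that slot, and dropping the pool leaves the boxed value untouched. The cost is a memory leak instead of a destructor that runs after its borrows have expired.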

The main purpose of mem::forget() is to make it extremely obvious that not calling destructors is allowed :stuck_out_tongue:

Before mem::forget() was made safe, there was a big issue: constructing reference cycles with nested Rcs allowed forgetting a value without ever dropping it, and that made several safe stdlib APIs unsound because they relied on destructors being called. See this article for a good writeup on that.

The resolution was a policy of explicitly allowing missing destructors, and making std::mem::forget() safe as part of that.

Just to be clear, I can do it without using mem::forget too. It exists just to make a construction like this easier.

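use std::cell::RefCell;
use std::rc::Rc;

struct Helper<T> {
    t: T,
    r: Option<Rc<RefCell<Helper<T>>>>,
}
// Leaks `t` using only safe code: the Rc ends up pointing at itself,
// so its strong count never reaches zero and `t` is never dropped.
fn forget<T>(t: T) {
    let helper = Rc::new(RefCell::new(Helper {
        t,
        r: None,
    }));
    let helper2 = Rc::clone(&helper);
    helper.borrow_mut().r = Some(helper2);
}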

struct PrintStrOnDrop<'a> {
    to_print: &'a str,
}
impl<'a> Drop for PrintStrOnDrop<'a> {
    fn drop(&mut self) {
        println!("{}", self.to_print);
    }
}

fn main() {
    let a = "Hello world!".to_string();
    
    let trash = TrashPool::new();
    
    let on_drop = PrintStrOnDrop { to_print: &a };
    forget(trash.dump(on_drop));
    
    drop(a);
    drop(trash);
}

playground

I think leaking anything in the pool not dropped through the TrashRef should be ok?

Yes. Definitely feels acceptable to me! :tada:

Thanks for all your help!

What would you think if I made this change, migrated to some kind of reference counting approach (probably using Rc<>), implemented all the traits that make it ergonomic to use, and then published it as AutoreleasePool crate, adopting some of the method names and behaviors from the ObjC version of NSAutoreleasePool?

I'm sure this kind of memory management paradigm is useful for more people than just me.

Thanks again!

If you make any changes to how it works, you should probably post it here first to ensure all of the tricky details are just right, but sure.
