Any way for Rust to realise I won't drop something from a ! method...?

So... Context is I'm developing a runtime/OS for AVR microcontrollers. In particular the ATmega4809, which for an AVR is pretty advanced but still very, well, micro.

This is a process not without significant pain, but, I love Rust so I'm enjoying the suffering anyway. But there is one thing which I just find drives me crazy - and that's the Rust borrow checker's inability to understand that if my main() function never returns, it will never drop its variables.

This is frustrating for two reasons:

  1. It makes sharing things around in the OS 'client' code incredibly messy. Embedded systems typically require lots of mutable singletons - serial port, display device, some kind of shared data - which logically you'd set up at the beginning of your main() method and then off to the races using them in the rest of your code. But Rust won't let you do that. You have two suboptimal choices:
    a. Use Box::leak(), except Box() requires you have a dynamic allocator.
    b. Use static mut, except now you have to scatter unsafe code all over the place, and deal with the fact that mutable static initialisation is just ugly with either Options (and associated unwrapping) or deeply un-user-friendly MaybeUninit all over the place.
  2. Also, it wastes code space. A full 1KB of my test application is the Rust compiler generating drop_in_place implementations for things that will never be dropped. 1KB doesn't sound like much, but on a chip with 48KB of code space total it makes a difference.

Now for where I am, it's not blocking me 'doing' things - I am fortunate that I have a dynamic allocator, so the Box::leak() approach works, but it still feels rather dirty and it makes the API messy asking the application developer to even know they need to worry about this. I've tried to abstract it away a little with a StaticWrapper type class that kind of abstracts away the dirty, but even then it doesn't solve problem 2.

So I guess apart from just sharing my pain, I have two questions...

  1. Am I missing a third way, other than Box::leak()/static muts, of telling the borrow checker "this really really won't be going away, honest"?
  2. Is there any chance that someday it will be smart enough to know that if a function never returns, it will never drop its local variables? It seems like such an obvious thing - indeed a naive person (as I once was :wink: ) would wonder what even the point of the -> ! syntax was if not to accomplish that - that I assume there must be a really good reason it's not implemented, but... Can I dare to dream one day it will..?

If nothing else, thanks for sharing my pain :D.

All the best,
Tim


Is there any chance that someday it will be smart enough to know that if a function never returns, it will never drop its local variables?

Unfortunately, that isn't true about the language semantics: instead of returning, a function can also unwind (by panicking, if compiled with panic=unwind), which in your case would drop your variables in main(). This can't happen under panic=abort, but there's no way to, within the language, declare that there's no unwinding possible, so there's no way to get what you want automatically.
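To make the unwinding point concrete, here is a small sketch for a hosted target built with the default panic=unwind. The names Noisy, DROPPED, never_returns and local_is_dropped_on_unwind are made up for illustration. It shows a local in a `-> !` function being dropped when that function panics instead of returning:

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::sync::atomic::{AtomicBool, Ordering};

// Records whether the local's destructor ran (hypothetical helper).
static DROPPED: AtomicBool = AtomicBool::new(false);

struct Noisy;

impl Drop for Noisy {
    fn drop(&mut self) {
        DROPPED.store(true, Ordering::SeqCst);
    }
}

// Never returns normally, but it can still leave by unwinding.
fn never_returns() -> ! {
    let _local = Noisy;
    panic!("unwinding out of a diverging function");
}

// Assumes panic=unwind; under panic=abort the process would simply abort.
fn local_is_dropped_on_unwind() -> bool {
    DROPPED.store(false, Ordering::SeqCst);
    let result: Result<(), _> = catch_unwind(AssertUnwindSafe(|| {
        never_returns();
    }));
    result.is_err() && DROPPED.load(Ordering::SeqCst)
}
```

So the drop glue for `_local` has to exist unless the compiler can prove no panic can unwind through the function.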

Now, I can imagine a library-based solution: write a helper something like this. Disclaimer: I don't know if someone's already done this, and I'm not an unsafe code expert and not confident in the correctness of this, and also catch_unwind is not available in core so something else is needed...

#![feature(never_type)]

use std::panic::{catch_unwind, AssertUnwindSafe};

fn with_these_things_forever<T, F>(mut data: T, function: F) -> !
where
    T: 'static,
    F: FnOnce(&'static mut T) -> !,
{
    // Note: We must store the result of catch_unwind to avoid dropping it
    // and potentially causing another panic.
    let _panic_payload = catch_unwind::<_, !>(AssertUnwindSafe(|| {
        let p = &mut data as *mut T;
        let extended: &'static mut T = unsafe { &mut *p };
        function(extended)
    }));
    
    // We should never get here unless unwinding, and if we do, we abort.
    // Therefore, `data` will never be dropped.
    std::process::abort();
}

fn main() {
    let data: (i32, i32) = (0, 0);
    with_these_things_forever(data, |data| {
        let _mutable_state_1: &'static mut i32 = &mut data.0;
        let _mutable_state_2: &'static mut i32 = &mut data.1;
        // carry on with &'static muts, but never return
        loop {}
    });
}

Well, it's still static mut under the hood, but you can encapsulate it rather plainly:

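use core::cell::UnsafeCell;
use core::mem::MaybeUninit;

// A bare UnsafeCell is not Sync, so it cannot be a static on its own; wrap it.
struct Global(UnsafeCell<MaybeUninit<GlobalResource>>);
// SAFETY (assumed): only sound if access is single-threaded and main()
// initialises the value before anything reads it.
unsafe impl Sync for Global {}

static GLOBAL_RESOURCE: Global = Global(UnsafeCell::new(MaybeUninit::uninit()));

fn global_resource() -> &'static GlobalResource {
    // SAFETY: you promised 😢
    unsafe { &*(GLOBAL_RESOURCE.0.get() as *const GlobalResource) }
}

fn main() {
    unsafe {
        *GLOBAL_RESOURCE.0.get() = MaybeUninit::new(init_global_resource());
    }
    // rest
}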

That's not giving &'static mut access, though. But that does suggest a more ergonomic and fully no_std pattern than my previous idea, which requires an unsafe call but only in main:

extern crate core;
use core::cell::UnsafeCell;
use core::mem::MaybeUninit;

struct StaticResource<T> {
    data: UnsafeCell<T>,
}
// Safety: StaticResource will be accessed unsafely only once, hence
// from only one thread.
unsafe impl<T> Send for StaticResource<T> {}
unsafe impl<T> Sync for StaticResource<T> {}

impl<T> StaticResource<T> {
    /// Use this to statically initialize an instance with a constant value.
    const fn new(initial_value: T) -> Self {
        Self { 
            data: UnsafeCell::new(initial_value)
        }
    }

    /// Safety: This must be called at most once.
    unsafe fn get(&self) -> &'static mut T {
        &mut *self.data.get()
    }
}

/// If the initializer isn't const fn, use this
impl<T> StaticResource<MaybeUninit<T>> {
    const fn new_uninit() -> Self {
        Self::new(MaybeUninit::uninit())
    }

    /// Safety: This must be called at most once.
    unsafe fn get_init(&self, initial_value: T) -> &'static mut T {
        *self.data.get() = MaybeUninit::new(initial_value);
        self.get().assume_init_mut()
    }
}


fn main() {
    let my_static_mut: &'static mut i32 = {
        static STATIC: StaticResource<i32> = StaticResource::new(0);
        // Safety: this code must be executed only once
        unsafe { STATIC.get() }
    };

    *my_static_mut = 100;
    dbg!(*my_static_mut);
}

(It could be made safe at the cost of allocating 1 byte for a "already taken" flag.)
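A rough sketch of what that safe variant might look like, with a flag recording whether the reference has already been handed out. TakeOnce, take and COUNTER are names invented here, not anything from the code above, and it assumes the target supports AtomicBool::swap (on a target without it, a critical section around a plain bool would be the alternative):

```rust
use core::cell::UnsafeCell;
use core::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical wrapper: hands out its `&'static mut T` at most once.
pub struct TakeOnce<T> {
    taken: AtomicBool,
    data: UnsafeCell<T>,
}

// SAFETY: the flag guarantees at most one caller ever gets the reference.
// T: Send because that caller may be on a different thread.
unsafe impl<T: Send> Sync for TakeOnce<T> {}

impl<T> TakeOnce<T> {
    pub const fn new(value: T) -> Self {
        Self {
            taken: AtomicBool::new(false),
            data: UnsafeCell::new(value),
        }
    }

    /// Returns the unique mutable reference on the first call, None afterwards.
    pub fn take(&'static self) -> Option<&'static mut T> {
        if self.taken.swap(true, Ordering::AcqRel) {
            None
        } else {
            // SAFETY: this branch can run only once, so the reference is unique.
            Some(unsafe { &mut *self.data.get() })
        }
    }
}

// Example static; a real application would hold a peripheral or shared state.
static COUNTER: TakeOnce<i32> = TakeOnce::new(0);
```

The caller then writes something like `let counter = COUNTER.take().unwrap();` once in main(), with no unsafe block of its own.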


Maybe once_cell could help. The embedded-hal people might have solved this type of problem.

Yeah, so variants on the above approaches are essentially where I got to; it works, but still feels like I'm forcing the application developer to know about horrible implementation details. But I guess so be it.

Hadn't thought about unwinding - because, of course, if you use panic_unwind on a machine with 48k of program store, you only get about 6 bytes left for your own code after the panic handlers have embedded every dependency under the sun. It seems panic unwind is the root of much evil...

That definitely looks very interesting - thank you! I am definitely looking for a "cleaner" way of exposing this detail to the application developer, and on first glance this looks pretty nice.

(Not that there will likely ever be another application developer than me, but you know... Pride ;-))

this may be a spicy take but have you considered ManuallyDrop?
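For anyone who hasn't met it, a minimal sketch of what ManuallyDrop does. Uart, DROPS and drops_when_leaving_scope are invented for the example. A value wrapped in ManuallyDrop does not have its destructor run when it goes out of scope:

```rust
use core::mem::ManuallyDrop;
use core::sync::atomic::{AtomicUsize, Ordering};

// Counts destructor calls (hypothetical helper for the demonstration).
static DROPS: AtomicUsize = AtomicUsize::new(0);

// Stand-in for a peripheral handle with a destructor.
struct Uart;

impl Drop for Uart {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Returns how many destructors ran while the value went out of scope.
fn drops_when_leaving_scope(wrap_in_manually_drop: bool) -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    if wrap_in_manually_drop {
        // The wrapper has no drop glue, so Uart::drop is never called.
        let _uart = ManuallyDrop::new(Uart);
    } else {
        let _uart = Uart;
    }
    DROPS.load(Ordering::SeqCst) - before
}
```

Whether this actually removes the drop_in_place code from the final binary is something to measure rather than assume, and it does nothing for the lifetime problem.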


Whooo, spicy indeed - not a thing I knew existed :grin:. I'm definitely going to play with that tomorrow - I presume it won't affect the borrow checker's shenanigans, but it might help me save that 1K of pointless code! And definitely less spicy than the other plan I had for that (modifying the linker script :grimacing:)


I looked into OnceCell, but it's not really useful here. In #![no_std] contexts, the only available type is once_cell::unsync::OnceCell, which still requires us to put a Sync wrapper over it.

You could also try mem::forget, which prevents the value's drop code from being executed. I'm not 100% sure what would happen when unwinding, though. In theory I would expect the drop code to still run in that case; only ManuallyDrop would prevent that.

Fun fact: mem::forget is implemented using ManuallyDrop :smiley:

I believe MaybeUninit also behaves like ManuallyDrop. The reference doesn't mention this explicitly, but it's implicit in the available methods and in the fact that it's a union without a Drop impl.

I'd suggest trying to get your entire global state into one big GLOBALS value (using composition, of course), to help ensure that you're definitely assigning all globals, and ideally also with the constructors not having access to the globals module, to ensure you don't have order-of-init issues (I think the only good way to do this is a separate crate?)
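A sketch of the "one big GLOBALS value" idea, with made-up SerialPort and Display types and arbitrary field values standing in for real peripherals. Because the struct is built in a single expression, the compiler rejects the code if any field is left unassigned:

```rust
// Hypothetical device handles; real ones would wrap hardware registers.
pub struct SerialPort {
    pub baud: u32,
}

pub struct Display {
    pub width: u8,
    pub height: u8,
}

// All global state composed into one value.
pub struct Globals {
    pub serial: SerialPort,
    pub display: Display,
    pub ticks: u32,
}

impl Globals {
    // Every field must be given here, so nothing can be forgotten.
    pub fn init() -> Self {
        Globals {
            serial: SerialPort { baud: 9600 },
            display: Display { width: 128, height: 64 },
            ticks: 0,
        }
    }
}
```

The single Globals value can then be placed in whichever static wrapper you settle on, so only one static needs the unsafe handling.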

When I was doing something like this for trying to find the performance floor for GPU rendering in Rust this is what I ended up doing. Turns out, still sucks.

If you want mutability and you also have threads, maybe do something like the stdio types and initialise a thread-local copy on first access, if thread-locals are available to you: see stdio.rs in the standard library source.

All three creation methods (new(), uninit(), and zeroed()) in MaybeUninit's docs include a disclaimer that dropping a MaybeUninit<T> never calls T's drop code.


Due to the use of let _ =, the panic payload is dropped before the end of the function; it could itself drop and reinstate a panic.

Example.

Fixes:


Fun fact: I read about that gotcha, thought “yikes, that's bad”, and then didn't think of it when trying to use catch_unwind now.

I've now updated the code I posted to have that fix, just in case anyone decides to use it.

Yeah, I even checked for that when I saw the code and thought it was okay since it immediately calls abort(). I forgot that let _ = bindings in particular drop the value immediately. (That asymmetry between let _ = and let _ident = has always tripped me up in practice, even though it makes sense in the context of pattern semantics.)
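The asymmetry is easy to see in a small sketch (Logger and drop_order are invented for the example): the value bound with `let _ =` is dropped at the end of that statement, while the one bound to `_kept` lives until the end of the scope.

```rust
use std::cell::RefCell;

// Appends its name to a shared log when dropped (hypothetical helper).
struct Logger<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Logger<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        // `_` is a pattern that binds nothing, so the value is a temporary
        // and is dropped as soon as this statement ends.
        let _ = Logger { name: "underscore", log: &log };
        // `_kept` is a real binding, so the value lives until the scope ends.
        let _kept = Logger { name: "named", log: &log };
        log.borrow_mut().push("end of scope");
    }
    log.into_inner()
}
```

Calling drop_order() returns ["underscore", "end of scope", "named"].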

  • forget it
  • ManuallyDrop it (same as above, really)
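A sketch of the "forget it" option, assuming a hosted target with panic=unwind. Bomb and contain_panic are made-up names. The payload here panics in its own destructor, so it is leaked rather than dropped:

```rust
use std::mem;
use std::panic::{catch_unwind, panic_any};

// A panic payload whose destructor itself panics (hypothetical worst case).
struct Bomb;

impl Drop for Bomb {
    fn drop(&mut self) {
        panic!("payload drop panics");
    }
}

// Returns true if a panic was caught and its payload safely discarded.
fn contain_panic() -> bool {
    let result = catch_unwind(|| {
        panic_any(Bomb);
    });
    match result {
        Err(payload) => {
            // Leak the boxed payload instead of dropping it, so Bomb::drop
            // never runs and cannot start a second panic.
            mem::forget(payload);
            true
        }
        Ok(()) => false,
    }
}
```

Wrapping the result in ManuallyDrop right after catch_unwind returns would have the same effect.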

(hmm, now we kinda wish some sort of ManuallyDrop<Result<T, E>> -> Result<ManuallyDrop<T>, ManuallyDrop<E>> projection was a thing.)


That sounds pretty easy, given ManuallyDrop is basically a marker:

use core::mem::ManuallyDrop;

fn project<T, E>(x: ManuallyDrop<Result<T, E>>) -> Result<ManuallyDrop<T>, ManuallyDrop<E>> {
    // into_inner is an associated function, not a method
    ManuallyDrop::into_inner(x).map(ManuallyDrop::new).map_err(ManuallyDrop::new)
}

in core, that is. :p

(also catch_unwind should probably recommend the use of ManuallyDrop...)

I'm not sure I understand what you're getting at?