Fallible Arc Creation

I'm working on a project that must not panic on out-of-memory errors. With Box<T> we can accomplish this by creating a Layout for T, calling the allocator manually, writing a T to the allocated region, and then calling Box::from_raw. I'm struggling to see how we can create an Arc<T> in a similar fashion. Is there a way to create an Arc<T> while doing the allocation manually, so that we can recover from OOM errors?

My understanding was that on Linux, at least, calling malloc would very likely return "OK" even if the memory was not available at that time. That memory would not actually be mapped into one's program space until one accessed it later, at which point, by the magic of virtual memory, a page would be found for that address. But if no page was free, a segfault would happen.

The advice we were given was to malloc all the memory one's program would ever need at startup and then touch every page of it, just to be sure you really had it. Then, if no segfault happened, proceed to run the application.

As such it's not clear to me how calling the allocator manually helps.

What am I missing here?

On nightly you can write

std::alloc::set_alloc_error_hook(|_| panic!("alloc error"));

And wrap your Arc creations in catch_unwind.


Overcommit is an adjustable policy, and ulimit can lead to malloc failures independently of that as well, so there are use cases. On the flip side, you're always susceptible to the OOM killer and other process deaths. It all comes down to how much defense you want to play.


Hmm... that suggests to me that to be really sure a program does not fail for lack of memory one needs control over the entire machine, OS and all applications running on it.

In which case one can code one's application such that it keeps track of its data structures and ensures they never get too big. No need for special allocators and/or panic hooks.

By the way, the advice I was referring to was given many years ago during a training session by MontaVista on using their embedded Linux distribution. There was no mention of "ulimit" or the OOM killer. Not sure if such things existed at the time. Of course, we did have complete control over the embedded systems we were building (so much so that we did not even use MontaVista in the end).

Sorry, I should have mentioned that I'm in a no_std context running on bare metal, so Linux behavior is not particularly relevant to me. Calling the allocator manually helps because the allocator returns a null pointer on allocation failure, and that can be handled. Something like

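// Assumes `extern crate alloc;` at the crate root.
use crate::alloc::alloc::{alloc, Layout};
use crate::alloc::boxed::Box;

/// Error returned when the allocator reports failure.
pub struct OOM;

pub fn try_box<T>(x: T) -> Result<Box<T>, OOM> {
    let layout = Layout::new::<T>();
    // Zero-sized layouts must not be passed to alloc; Box::new does not allocate for them.
    if layout.size() == 0 {
        return Ok(Box::new(x));
    }
    // SAFETY: the layout has a non-zero size.
    let ptr = unsafe { alloc(layout) } as *mut T;
    if ptr.is_null() {
        return Err(OOM);
    }
    // SAFETY: ptr is non-null, aligned for T, and came from the global allocator with T's layout.
    unsafe {
        ptr.write(x);
        Ok(Box::from_raw(ptr))
    }
}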

Wrapping Arc creation in catch_unwind is something I hadn't considered. I would really like a way to do this without the unwind machinery, but this may be the best option we currently have.

Interesting. I did not know we could do even that.

Yes, we do have control over the entire machine, but external users are interacting with the machine, so we don't have full control over the requests that need to be fulfilled. The system is designed so that it only allocates when handling a user request. So if a request would push us past the amount of memory we have budgeted (which may be a subslice of the entire machine's resources), it's important that we return an error to the user and not crash the entire machine.
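To make the budgeting idea concrete, here is a minimal sketch of an allocator wrapper that returns a null pointer once a fixed byte budget would be exceeded. The name BudgetAlloc and its methods are hypothetical, and the sketch assumes std's System allocator as the backing store purely so that it runs on a hosted target; on bare metal the backing allocator would be whatever the platform provides.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper that refuses allocations beyond a fixed byte budget.
pub struct BudgetAlloc {
    limit: usize,
    used: AtomicUsize,
}

impl BudgetAlloc {
    pub const fn new(limit: usize) -> Self {
        BudgetAlloc { limit, used: AtomicUsize::new(0) }
    }

    /// Bytes currently accounted for.
    pub fn used(&self) -> usize {
        self.used.load(Ordering::Relaxed)
    }
}

unsafe impl GlobalAlloc for BudgetAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Reserve the bytes up front; the update is rejected if it would exceed the limit.
        let reserved = self.used.fetch_update(Ordering::Relaxed, Ordering::Relaxed, |used| {
            used.checked_add(layout.size()).filter(|&total| total <= self.limit)
        });
        if reserved.is_err() {
            // Over budget: report failure instead of aborting.
            return std::ptr::null_mut();
        }
        let ptr = unsafe { System.alloc(layout) };
        if ptr.is_null() {
            // The backing allocator failed, so give the reservation back.
            self.used.fetch_sub(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        self.used.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}
```

Installed with #[global_allocator], a wrapper like this makes the null-pointer path reachable by any code that calls the allocator directly, which is what a helper such as try_box relies on.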

With regard to keeping track of the data structures in use and ensuring they never get too big, I would argue that's exactly what our allocator is doing. We just need a better way of communicating between the code that allocates (Arc::new in this case) and the allocator itself. This exists for Box, but doesn't seem to exist for Arc. I was curious whether there was a way I was overlooking to do something similar for Arc, or whether a bug or feature request had already been filed for it.
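One way to avoid the unwind machinery is to hand-roll a small reference-counted pointer whose constructor calls the allocator directly, in the same spirit as the Box version. The following is only a sketch under stated assumptions: TryArc, Inner, and Oom are hypothetical names, there is no Weak support, and it uses std paths so that it runs on a hosted target (in no_std the core and alloc equivalents would be substituted, and the overflow abort would need a platform-specific replacement). As far as I know, nightly also offers Arc::try_new behind the allocator_api feature, which would make this unnecessary where nightly is acceptable.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ops::Deref;
use std::ptr::NonNull;
use std::sync::atomic::{fence, AtomicUsize, Ordering};

/// Hypothetical error type returned when the allocator reports failure.
#[derive(Debug, PartialEq)]
pub struct Oom;

// Heap block shared by all handles: the reference count plus the value.
struct Inner<T> {
    count: AtomicUsize,
    value: T,
}

/// Hypothetical minimal stand-in for Arc<T>: strong counts only.
pub struct TryArc<T> {
    ptr: NonNull<Inner<T>>,
}

// Same bounds that std places on Arc<T>.
unsafe impl<T: Send + Sync> Send for TryArc<T> {}
unsafe impl<T: Send + Sync> Sync for TryArc<T> {}

impl<T> TryArc<T> {
    pub fn try_new(value: T) -> Result<Self, Oom> {
        // Inner<T> always contains the counter, so the layout is never zero-sized.
        let layout = Layout::new::<Inner<T>>();
        // SAFETY: the layout has a non-zero size.
        let raw = unsafe { alloc(layout) } as *mut Inner<T>;
        // On failure `value` is simply dropped and the error is returned.
        let ptr = NonNull::new(raw).ok_or(Oom)?;
        // SAFETY: ptr is non-null, aligned for Inner<T>, and not yet initialized.
        unsafe { ptr.as_ptr().write(Inner { count: AtomicUsize::new(1), value }) };
        Ok(TryArc { ptr })
    }
}

impl<T> Clone for TryArc<T> {
    fn clone(&self) -> Self {
        // SAFETY: the block stays alive for as long as any handle exists.
        let old = unsafe { self.ptr.as_ref() }.count.fetch_add(1, Ordering::Relaxed);
        // Guard against counter overflow, mirroring what std's Arc does.
        if old > isize::MAX as usize {
            std::process::abort();
        }
        TryArc { ptr: self.ptr }
    }
}

impl<T> Deref for TryArc<T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: the block stays alive for as long as any handle exists.
        unsafe { &self.ptr.as_ref().value }
    }
}

impl<T> Drop for TryArc<T> {
    fn drop(&mut self) {
        // SAFETY: the block is still alive because this handle has not been released yet.
        if unsafe { self.ptr.as_ref() }.count.fetch_sub(1, Ordering::Release) != 1 {
            return;
        }
        // Synchronize with every earlier Release decrement before destroying the value.
        fence(Ordering::Acquire);
        // SAFETY: this was the last handle, so nothing else can observe the block.
        unsafe {
            std::ptr::drop_in_place(self.ptr.as_ptr());
            dealloc(self.ptr.as_ptr() as *mut u8, Layout::new::<Inner<T>>());
        }
    }
}
```

A caller would write something like TryArc::try_new(request_state) and map the Err(Oom) case to an error response for the user. The trade-off is that this type is not interchangeable with std's Arc, so APIs that require an actual Arc<T> cannot accept it.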
