Opening a file without allocating on Unix?

File::open winds up copying the path it's given. I understand why the current implementation does so, but I wonder if there's a way to avoid it through the fs API?

(Short of extern-ing open(2) and using FromRawFd::from_raw_fd, I mean, which is what I'm currently doing.)

More specifically: The way that Path, OsStr, and CString are specified makes me wonder if the current method could even be reimplemented to avoid allocating. Thoughts?
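To make the question concrete: on Unix a Path is arbitrary bytes with no trailing nul, while open(2) needs a nul-terminated string, so some copy is unavoidable, but it doesn't have to live on the heap. Here is a minimal sketch of a stack-buffer approach. The helper name with_cstr_on_stack and the 512-byte limit are hypothetical choices for illustration, not anything taken from libstd.

```rust
use std::ffi::CStr;
use std::io;
use std::os::unix::ffi::OsStrExt;
use std::path::Path;

// Arbitrary limit chosen for this sketch; longer paths are rejected
// rather than falling back to the heap.
const STACK_BUF_LEN: usize = 512;

/// Copies `path` into a stack buffer, nul-terminates it, and hands the
/// resulting `&CStr` to `f`. No heap allocation happens in this function.
fn with_cstr_on_stack<T>(
    path: &Path,
    f: impl FnOnce(&CStr) -> io::Result<T>,
) -> io::Result<T> {
    let bytes = path.as_os_str().as_bytes();
    // Leave room for the trailing nul byte.
    if bytes.len() >= STACK_BUF_LEN {
        return Err(io::Error::from(io::ErrorKind::InvalidInput));
    }
    let mut buf = [0u8; STACK_BUF_LEN];
    buf[..bytes.len()].copy_from_slice(bytes);
    // buf[bytes.len()] is already 0, so the slice below ends in a nul.
    // from_bytes_with_nul also rejects paths with an interior nul.
    match CStr::from_bytes_with_nul(&buf[..=bytes.len()]) {
        Ok(cstr) => f(cstr),
        Err(_) => Err(io::Error::from(io::ErrorKind::InvalidInput)),
    }
}
```

The trade-off is a hard upper bound on path length (or a heap fallback for long paths), in exchange for making the common case allocation-free.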

(I'm very allocation-conscious from my day job, since every allocation is a potential failure to consider. open(2) can fail for several reasons, but userland heap exhaustion is not among them, and it certainly won't kill the caller if the pointers are valid -- so I was surprised to see the additional risk in Rust.)

Failed memory allocations currently abort the process: lib.rs - source

In the future it might be possible to make it panic instead, but recovering from OOM conditions can be tricky. Currently your only option if you want to handle allocation failures manually is to avoid most of the standard library.


The Rust standard library (not core) makes the following assumption (edit: for userspace applications): if you're out of memory, you probably can't do anything useful anyway (even unwinding needs memory; edit: can allocate). In the future, Rust may provide ways to deal with OOM conditions (e.g. custom allocators that can try to free caches) but it doesn't do so now.

As for open, you could package the manual open/from_raw_fd calls into an extension trait:

trait NonAllocatingOpen {
    fn open_noalloc(path: &CStr) -> io::Result<File> { /* ... */ }
}
impl NonAllocatingOpen for File {}
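Filling in the elided body, one possible version looks like the sketch below. It makes several assumptions: open(2) is declared by hand rather than pulled from a bindings crate, the file is opened read-only, O_RDONLY is hard-coded to 0 (its value on Linux, macOS, and the BSDs), and platform-specific flags such as O_CLOEXEC and retrying on EINTR are left out for brevity.

```rust
use std::ffi::CStr;
use std::fs::File;
use std::io;
use std::os::raw::{c_char, c_int};
use std::os::unix::io::FromRawFd;

extern "C" {
    // open(2) from the platform C library. It is variadic because the
    // mode argument is only passed together with O_CREAT.
    fn open(path: *const c_char, oflag: c_int, ...) -> c_int;
}

// Assumption: O_RDONLY is 0 on the target platform.
const O_RDONLY: c_int = 0;

trait NonAllocatingOpen: Sized {
    /// Opens `path` read-only without copying it to the heap.
    fn open_noalloc(path: &CStr) -> io::Result<Self>;
}

impl NonAllocatingOpen for File {
    fn open_noalloc(path: &CStr) -> io::Result<File> {
        // Safety: `path` is a valid nul-terminated string for the
        // duration of the call.
        let fd = unsafe { open(path.as_ptr(), O_RDONLY) };
        if fd < 0 {
            // Wraps errno in an io::Error.
            return Err(io::Error::last_os_error());
        }
        // Safety: `fd` was just returned by open(2) and nothing else
        // owns it, so File can take ownership and close it on drop.
        Ok(unsafe { File::from_raw_fd(fd) })
    }
}
```

With the trait in scope, a call looks like File::open_noalloc(CStr::from_bytes_with_nul(b"/dev/null\0").unwrap()). The caller is responsible for supplying an already nul-terminated path, which is what lets the function skip the copy.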

However, at the end of the day, anything in std can allocate so if you can't have allocations, you should consider sticking to core.


Thanks! I'm currently trying to avoid core because of the instability, and to see how conscious I can be of allocations using libstd.

Are the assumptions you've listed documented somewhere, and I've missed them?

In practice, the embedded platforms I'm used to working on would not have this impedance mismatch, because I can eliminate all need for nul-terminated strings under the hood (unlike Unix).

Not that I know of. Initially, libstd had a runtime so the argument was: "no one will use this for embedded/kernel programming because it has a runtime". Therefore, the devs decided that aborting on OOM was reasonable. That has changed but, as far as I know, no one has revisited that decision. Relevant reddit thread: https://www.reddit.com/r/rust/comments/341v3n/cs_honors_thesis_reenix_implementing_a_unixlike/

My "nothing useful" statement wasn't correct. It's more that most userspace applications won't be able to do anything useful without heap space (as a matter of fact, the Linux kernel (usually) doesn't even let userspace programs handle OOM conditions). Also, unwinding itself doesn't require allocations but destructors are allowed to allocate.


Thanks.

I was curious what unwinding implementation Rust was using that required heap allocation -- your second explanation makes more sense.

Note also that OOM from the kernel's perspective and allocation failure from a userland heap's perspective are different events -- OOM is a system-wide event made necessary by Linux and Unix's free-for-all attitude toward page overcommit. The systems I work on don't behave this way. So while I think there are good reasons to ignore heap allocation failure in application software, we shouldn't conflate the two.

libstd (as opposed to libcore)'s panic implementation does allocate memory to store the panic message:

// We do two allocations here, unfortunately. But (a) they're
// required with the current scheme, and (b) we don't handle
// panic + OOM properly anyway (see comment in begin_unwind
// below).

let mut s = String::new();
let _ = s.write_fmt(msg);
begin_unwind_inner(Box::new(s), file_line)

Interesting. Thanks!