Can the main thread fork safely from within thread::scope?

I have a resource leak issue from unscoped secondary threads that no longer exist after the main thread forks. That's expected, but I'd like to fix it somehow.

I could fix it if I instead scoped those threads and managed their resources outside that scope, passing refs into the thread closures. However, I would then need to fork the main thread from within the thread::scope closure, with the expectation that the forked main thread can return out of the thread::scope closure without having to wait/join for other threads that no longer exist in the forked process.

Would this work?

An alternative idea is to have all of those secondary thread resources owned by a static mutex, but this seems a tad wasteful because none of those resources are ever shared - the forked main thread would only access them (to drop them) when the secondary threads are gone (because not inherited by the child fork).

That won't work - the forked main thread would be locked out of the mutex by the non-existent secondary threads. I suspect, because ownership implies exclusivity in Rust, that there is no safe way to do this cleanup without the original fork-within-thread::scope idea.

The more I think about this, the worse it seems.

For data that was mutable (via ownership, &mut or interior) by a thread that suddenly disappears (due to a fork), there's not much hope. Not only is there no way (except in the interior mutability case) for the remaining thread in the forked child process to get at that data, the data may be in an invariant-breaking state because the last thread may have been in the process of modifying it when it disappeared (from the viewpoint of the forked child process thread). This goes for locks/mutexes governing shared ownership as well.

The only data that can be guaranteed safe is data that is constant with respect to the disappearing thread(s). Or shared non-blocking data, like atomics.

I think all I can salvage are fds - I can have the threads store raw copies of any they use in AtomicI32s. The forked child thread has to cast these back to OwnedFds unsafely and close (by dropping) them.
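
A minimal sketch of that fd-salvaging idea, using only std. The names `WORKER_FD`, `publish_fd` and `reclaim_fd` are hypothetical, and the sketch assumes the worker hands ownership of the fd to the slot (via `into_raw_fd`) rather than keeping its own `File` alive:

```rust
use std::fs::File;
use std::os::fd::{FromRawFd, IntoRawFd, OwnedFd, RawFd};
use std::sync::atomic::{AtomicI32, Ordering};

/// Sentinel meaning "no fd is stored in the slot".
const NO_FD: RawFd = -1;

/// Hypothetical slot that a worker thread publishes its fd into.
static WORKER_FD: AtomicI32 = AtomicI32::new(NO_FD);

/// Worker side: give up ownership of the file and publish the raw fd.
fn publish_fd(file: File) {
    WORKER_FD.store(file.into_raw_fd(), Ordering::Release);
}

/// Child side, after the fork: take the fd out of the slot, if there is one.
fn reclaim_fd() -> Option<OwnedFd> {
    let raw = WORKER_FD.swap(NO_FD, Ordering::AcqRel);
    if raw == NO_FD {
        None
    } else {
        // SAFETY (assumption of this sketch): the fd is still open and
        // nothing else in this process owns it, because the only other
        // user was a thread that does not exist in the child.
        Some(unsafe { OwnedFd::from_raw_fd(raw) })
    }
}
```

Dropping the returned `OwnedFd` closes the descriptor. Atomics never block, so the child cannot be locked out by a vanished thread, and close(2) is on the async-signal-safe list.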

I'm now wondering about the global allocator. How does it manage to always be in an invariant-preserving state after a fork? I guess this is a problem that was solved a long time ago for C/C++, but I'd like to know how. Maybe a similar mechanism can be added to Rust for this similar case...

From the man page:

After a fork() in a multithreaded program, the child can safely call only async-signal-safe functions (see signal-safety(7)) until such time as it calls execve(2)

Signal safety is a painfully restrictive concept. It means that almost everything is forbidden and nothing is safe. Notably you can't touch anything that could allocate or deallocate memory, so you're not allowed to create any String or drop any Vec. You can't panic either.

Oh - just watch me :scream:

This means my original C app I have almost completely translated into Rust is wrong as well.

Fortunately, forking is an optional behavior - one I added to get the safety and security of process isolation. I could add exec, but that would lose the benefit of common initialization of data done prior to execs.

You can safely fork if you do this as the very first thing in main, before any threads are spawned or any allocations made. Then your child process can do whatever it wants without signal safety restrictions.

I do lots of allocations before forking. I've pulled out the forking code. I'll live without it.

Thanks for pointing out that paragraph in the fork man page. I knew about async-safe functions for signal handling purposes, but didn't know the same rule applied following a fork.

It applies to forking in multithreaded processes because only the forking thread is present in the child process. Another thread may be in the middle of a memory allocation at the time of the fork, so the child may be created with the memory allocator in an inconsistent state. If you can guarantee that the process is single-threaded, you can fork. Note, however, that the child process still differs from the parent (see man fork for details), so be careful not to break objects that already exist in the process. Also, the child usually exists to perform some job rather than to be a complete copy of the parent, so it's worth making sure that the function that created the child never returns in the child, whether by a normal return or by a panic. This can be done by running the child's code inside std::panic::catch_unwind (a function, not a macro) and aborting if a panic is caught. The parent will then see that the child was killed by SIGABRT, which can be used to detect the problem.
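
A rough sketch of that "child never returns" pattern. `run_job` and `child_main` are hypothetical names, the fork call itself is left out, and the sketch assumes the default panic strategy (unwinding), since `catch_unwind` cannot catch anything under `panic = "abort"`:

```rust
use std::panic::{self, UnwindSafe};
use std::process;

/// Runs `job` and returns the exit code it produced,
/// or `None` if the job panicked.
fn run_job<F>(job: F) -> Option<i32>
where
    F: FnOnce() -> i32 + UnwindSafe,
{
    panic::catch_unwind(job).ok()
}

/// Intended to be called in the child right after the fork.
/// It never returns into the parent's code path.
fn child_main<F>(job: F) -> !
where
    F: FnOnce() -> i32 + UnwindSafe,
{
    match run_job(job) {
        // Normal completion: leave with the job's exit code.
        Some(code) => process::exit(code),
        // Panic: abort, so the parent sees the child die with SIGABRT.
        None => process::abort(),
    }
}
```

The parent can then tell the two outcomes apart from the child's wait status: an ordinary exit code versus termination by signal.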

If it is allowable to do allocations but not create threads prior to forking, then my original C app is safe, at least. The reason I was creating another thread prior to forking in Rust was to do signal handling. In C I used a signal handler, not a separate thread.

I've since switched the Rust version to use a signalfd mechanism in the main thread (which sits in an epoll loop), which is much cleaner all around, and does not require a separate thread for signal handling. I may consider adding back the forking option if I need it, now that the process will be single-threaded at the point of the fork.

I've never heard of allocations before a fork making it unsafe. Problems may arise with regions of memory that are madvise()d with MADV_DONTFORK or MADV_WIPEONFORK. (You should still make sure that no library function you call before forking spawns threads.) There are also things that change in child processes (POSIX timers are not inherited, for example), and you should be especially careful when working with inherited file descriptors: they refer to the same open file descriptions as those in the parent process, which may lead, for example, to uncontrolled changes of the file offset.

Given you're rewriting a C program in Rust, I'd like to ask why you need to fork. I've worked with servers written in C that spawned a child process for every connection, and for computationally heavy tasks. In those cases, processes may be replaced with threads (though it's better to use a thread pool than to spawn a thread for each background task). I/O-bound tasks (such as servicing a connection) may be handled with non-blocking I/O. If you are handling connections in a separate process for better security, I'd suggest performing execve() on a worker process binary to avoid having a "leftover" state (then you may simply use std::process::Command or tokio::process::Command).

UPD: Also, one of the use cases for forking is daemonization, which is okay, but should happen as early as possible.

The app was offering forking for better security. I didn't exec after the fork because it had already set up considerable memory state that would be expensive for the child processes to duplicate without inheriting. That state is RO after being initialized, so there's also footprint and cache benefit to sharing it across children.

As for daemonization - yes, that is another option the C app has (it is a server, BTW) that I removed from the Rust version. The C version was using daemonizing primarily as a way to signal the invoking script that its server socket is ready to accept client connections. I have a workable alternative in place where the server signals readiness to its invoking script through a fd it is passed. This mechanism has the additional advantage that the server can be launched within a separate sandbox (bubblewrap) from the invoking script, and the sandboxing won't interfere. So, the loss of daemonizing is less problematic than the loss of forking client sessions.

unfortunately, this isn't actually true in rust, at least not if you're using the standard library's main glue.

the standard library allocates before main is called. i know because i tried (and failed) to remove those allocations.

Allocating before fork is not a problem, though. The issue is allocating after fork and before exec in a multithreaded program. The reason is that some other thread might hold the allocator lock at the moment of forking. Since only the thread calling fork exists in the child, that lock will never be unlocked there.

Summarizing this so that I can close out this thread...

If the parent process is single-threaded at the time of the fork, then the child does not have to restrict itself to async-signal-safe functions before (if ever) calling execve, even though the fork(2) man page does not say so explicitly.

Anyone disagree?

There are three things here:

Generally I would recommend aiming for one of the first two options.

What the standard says is that in the child you should only use async signal safe functions if you are multi-threaded. You can use anything you want on the parent side regardless of multi-threading. If you are not multi-threaded it doesn't say what you can use, which can be inferred to be "anything you like".

Just be sure there isn't any sneaky thread started by, for example, a static constructor in a linked C++ library.

It is also worth taking a moment to think about why the standard is written like this. The issue here (and with signals) is that there might be locks that will never be unlocked, causing deadlocks inside some functions. The reasons differ, but the problem and symptoms are the same.

The list of async-signal-safe functions is the short list of functions that are guaranteed to work. More may work in a specific implementation, or you may get lucky and not hit the issue (until the code is in production, when you will hit it at 3 am on Christmas Eve).

I would prefer if possible to use a high level abstraction such as Command to run a new process instead. Less error prone.
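
For comparison, a minimal sketch of the Command route using only std. The helper name `run_worker` is hypothetical, and the program and arguments are whatever worker binary you would ship:

```rust
use std::io;
use std::process::Command;

/// Runs `program` with `args` as a separate process, waits for it, and
/// returns its exit code (`None` if it was terminated by a signal).
fn run_worker(program: &str, args: &[&str]) -> io::Result<Option<i32>> {
    let status = Command::new(program).args(args).status()?;
    Ok(status.code())
}
```

For example, `run_worker("sh", &["-c", "exit 3"])` should yield `Ok(Some(3))` on a typical Unix system. The worker starts from a fresh image, so nothing from the parent's heap or lock state carries over, at the cost of redoing any initialization.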

Is there a way in Rust to introspect the number of threads in the current process? If I add forking back in as an option, I'd at least like to use a debug_assert! to check that there's only one at the time of the fork.

There's no cross-platform way as far as I'm aware. You'd have to tackle platforms individually. For example, there's /proc/self/task on Linux... if proc is mounted.
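
A Linux-only sketch of the /proc/self/task approach, using only std. The function names are hypothetical, and like any such check it trusts that /proc really is a procfs mount:

```rust
use std::fs;

/// Counts the threads of the current process by listing /proc/self/task,
/// which holds one directory entry per thread on Linux.
/// Returns `None` if the directory cannot be read (for example on a
/// non-Linux platform, or when proc is not mounted).
fn thread_count() -> Option<usize> {
    let entries = fs::read_dir("/proc/self/task").ok()?;
    Some(entries.filter_map(Result::ok).count())
}

/// `Some(true)` if exactly one thread exists, `Some(false)` if more,
/// `None` if the count is unavailable.
fn is_single_threaded() -> Option<bool> {
    thread_count().map(|n| n == 1)
}
```

Before forking, something like `debug_assert_ne!(is_single_threaded(), Some(false))` would catch a stray thread where the check works while staying silent where it cannot run. Whether "unknown" should pass or fail is a policy choice.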

Here's one stab at Unixy targets. It's not perfect there. It doesn't check that /proc is really proc (but here's a project that does).

If you need "was never multithreaded" you're even more out of luck. (I'm not sure if you need that or not.)

There was more conversation in this IRLO thread and the linked issues. I'm afraid the conversation is quite large though. FWIW no such attempted check ended up in set_var.

Thanks for pointing out num_threads! My server app is *nix only. I'll accept that this added protection might not work on some platforms. I can put caveats in my doc about that. My primary concern with re-introducing forking is that, as new capabilities of the app are developed, something will be added that spawns a thread prior to the fork (which is what happened), and we end up with a very hard to debug probabilistic deadlock situation. But I don't anticipate that development and testing will be done other than on Linux with proper /proc setup, where num_threads should work properly as is.

I will add forking and daemonizing back in as a non-default #[cfg(feature...)] and non-default command-line options.

To summarize - if you want to fork and not exec in Rust:

  • Make sure the app is single threaded at the point of the fork - perhaps using the num_threads crate to check this prior to the fork
  • Some platforms won't be able to check for single-threaded-ness properly, so make sure they're documented, and that adequate testing of forking vs. anything that might spawn threads is done where the single-threaded-ness check works ("normal" Linux with proper /proc setup).
  • If you need a signal handler prior to the fork, use something that doesn't use additional threads, such as signalfd (the nix crate has this). This works well if your app sits in an event loop (epoll, for instance) prior to forking.
  • If possible, don't depend on forking - make it a non-default option, preferring multi-threaded or async instead.
  • If you really need forking, maybe you should reconsider exec-ing right after, or use std::process::Command. You lose inherited state, though. Maybe that can be serialized/deserialized to transfer it from parent to children?
