The use case here is a library that wants to avoid leaking resources on dlclose, and thus registers fini handlers in its ELF file to deallocate global resources. However, those handlers also run during exit, so if threads are still running and using the library at that point, the resources may be freed while they are still in use. For dlclose, the caller has to ensure that no code in that library is still running, so there's no race condition in that case -- but imposing a similar requirement on exit is not feasible.
Here are some proposals for what could be done on some platforms:
For ELF-based platforms, -Wl,-z,nodelete does it. If the library knows its own name, dlopen(myname,RTLD_NOW|RTLD_LOCAL) and throwing away the result does it at runtime on any platform with dlopen. If you want to write code that's compatible with static linking into an environment that might not have dlopen, declaring a weak reference to dlopen and testing it before calling should work.
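To make the dlopen variant concrete, here is a minimal sketch in C, assuming GCC or Clang for the weak attribute. The helper name `pin_library` is made up for illustration, and the library is assumed to know its own path or soname.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Weak reference: resolves to NULL if no dlopen is linked in
   (for example a static build without dynamic loading support). */
extern __typeof__(dlopen) dlopen __attribute__((weak));

/* Hypothetical helper: pin the library at `path` by taking one extra
   reference that is never released. Returns 0 on success, -1 otherwise. */
int pin_library(const char *path)
{
    if (dlopen == NULL)              /* no dlopen in this environment */
        return -1;

    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL)              /* wrong name, or loading failed */
        return -1;

    /* Deliberately no dlclose(handle): the extra reference keeps the
       library loaded even if the host later calls dlclose on its own handle. */
    return 0;
}
```

A library would call this once from an initializer with its own name. The handle is thrown away on purpose, so the reference count never drops to zero.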
More broadly, however, this seems to be an open problem. Long-term, platforms would ideally provide a portable API to achieve this. I have no intention of being involved in that, but this would be a good starting point for people to discuss what such an API could look like and how libcs could be convinced to adopt it.
Can this even be done from Rust? I was under the impression that Rust didn't really support weak linking.
As I see it, this needs to be solved cross-language and cross-OS/libc. That is a lot of coordination. But before we even begin to sketch a design to take to a wider audience: do we want to focus on fixing unloading (so that it works) or focus on fixing the API so you can avoid unloading?
There are probably use cases that really want to clean up resources on unload:
Code hot reload for development
Upgrading modules to newer versions (like Erlang but way unsafer in native code).
Loadable/unloadable plugins or game mods (without having to restart the software).
These might not be well served by the "leaking resource" option, which to me indicates it would be preferable to solve it so that it works (as far as possible, since e.g. Musl won't actually unload as I understand it).
Can the library register an atexit function that runs first and configures the library to leak its resources? (Or perhaps instruct the process loading the library to install this, so that the atexit handler doesn't get unloaded.)
I like that idea, but there doesn't seem to be any way to unregister atexit handlers, so it looks like it sets up a time bomb if dlclose is called first. It would require handling on the application side (as you pointed out), which is likely brittle.
In addition, if you are following an existing plugin protocol you might not have the ability to introduce such a breaking change (e.g. PAM, or if you are writing a plugin for someone else's proprietary program).
I have a recollection that Borland solved a similar problem...
When the executable was shutting down (like from a call to std::process::exit), it would check each referenced library for an entry point with a specific name. I'll use __borland_exit_handler as an example.
If the __borland_exit_handler entry point was present, it would be called. If not, obviously, nothing special would happen.
That gives the library writer an opportunity to safely handle process exit.
The burden on the executable developer was zero. The code was automatically generated by the compiler + linker and the run-time library included the handler-caller.
The burden on the library writer was what they made it. If they wanted an exit handler they just exported a function with the predefined name.
I also recall they had something similar for dynamically loaded libraries. There were predefined entry points for initialization and shutdown that, if present, Borland's loader would call. That gives the library writer an opportunity to safely handle dynamic loading.
The drawback was the reluctance of folks not using Borland products to embrace the idea. But, nothing stopped them from using it.
Windows still has this for DLLs in the form of the DllMain function, and ELF has its .fini_array section for executables and libraries which is what this is about. The difference is that DllMain knows whether it's being called because of FreeLibrary or ExitProcess, while the functions in .fini_array don't know whether they're being called from dlclose or exit, so they don't know what it's safe to do.
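For reference, the DllMain distinction looks roughly like this. It is only a sketch: `should_free_globals` is a made-up helper name, and the actual cleanup is left as a comment. It relies on the documented behaviour that, for DLL_PROCESS_DETACH, the third argument is NULL when FreeLibrary was called and non-NULL when the process is terminating.

```c
#include <stddef.h>

/* Hypothetical policy helper: free globals only on an explicit unload.
   `reserved` is the third DllMain argument for DLL_PROCESS_DETACH:
   NULL for FreeLibrary, non-NULL when the process is terminating. */
int should_free_globals(const void *reserved)
{
    return reserved == NULL;
}

#ifdef _WIN32
#include <windows.h>

BOOL WINAPI DllMain(HINSTANCE instance, DWORD reason, LPVOID reserved)
{
    (void)instance;
    if (reason == DLL_PROCESS_DETACH && should_free_globals(reserved)) {
        /* Explicit FreeLibrary: the caller guarantees nobody is still
           using the DLL, so global resources could be released here. */
    }
    /* On process termination we fall through and leak on purpose. */
    return TRUE;
}
#endif
```

The functions in .fini_array take no such argument, which is exactly the missing piece on the ELF side.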
I can't tell whether Borland C++ #pragma exit functions get called for FreeLibrary.
But it's worth noting that Musl isn't the only runtime to take a dim view of it.
Apple's dynamic linker will never unload libraries that contain any Swift or Objective-C code, any thread-local variables, or any userland DTrace probes. (Ref: source code.)
glibc won't unload libraries with pending destructors for thread-local variables. If your library initializes a thread-local variable on any given thread, it prevents the library from being unloaded until that thread exits. So for instance, if you initialize a thread-local on the main thread, you're probably never going to be unloaded. (Ref: fasterthanlime blog post.)
On many platforms (including glibc and Apple platforms), atexit magically determines which library is calling it and schedules the callbacks to be executed when that library is unloaded. They're treated the same as destructors. This prevents the time-bomb issue, but it also makes atexit unsuitable for detecting whether the process is actually exiting.
This can be worked around by directly calling __cxa_atexit with a null third parameter. But then it goes back to being a time bomb. Also, __cxa_atexit is technically not supposed to be called directly, and while it's reasonably portable, AFAIK that doesn't extend to Windows.
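A rough sketch of that workaround in C, to show where the time bomb sits. All names other than `__cxa_atexit` are made up, the hand-written declaration is itself outside the intended use, and the assumption that the handler runs before the library's destructors at exit would need to be verified on each target.

```c
#include <stdlib.h>

/* Itanium C++ ABI runtime hook, declared by hand here.
   Calling it directly is not its intended use. */
extern int __cxa_atexit(void (*fn)(void *), void *arg, void *dso_handle);

static int process_exiting;     /* set once exit() has started */
static char *global_buffer;     /* stand-in for a global resource */

static void mark_exiting(void *unused)
{
    (void)unused;
    process_exiting = 1;
}

__attribute__((constructor))
static void lib_init(void)
{
    global_buffer = malloc(64);
    /* NULL dso_handle: the handler is tied to process exit rather than
       to this library. This is the time bomb: if the library is
       dlclosed first, exit() later jumps into unmapped code. */
    __cxa_atexit(mark_exiting, NULL, NULL);
}

__attribute__((destructor))
static void lib_fini(void)
{
    if (process_exiting)
        return;                 /* exit(): leak, threads may still be running */
    free(global_buffer);        /* dlclose(): caller guarantees no users */
    global_buffer = NULL;
}
```

So the flag does distinguish exit from dlclose, but only as long as nobody unloads the library before the process exits.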
Here is a way that this could work: call __cxa_atexit with a pointer into an anonymous mmapped page that you copied code into (setting the right permissions). That code can then detect whether the library is still loaded by examining /proc/self/maps (on Linux; maybe other Unixes have something similar). Now you only leak a single page.
Bonus: to know whether you have been reloaded, you can enumerate all executable anonymous pages and search them for a magic number you put there earlier, to see if you need to create the mapping or not. (Thread safety is an issue if some other thread unmaps a mapping you are scanning, one that wasn't related to your library. Workaround: catch SIGSEGV.)
Needless to say this is a terrible idea on many levels.
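Terrible or not, the "is the library still loaded" check on its own is easy to sketch. The function name `address_is_mapped` is made up, this is Linux-only, and it leaves out the hard part (copying position-independent code into the anonymous page).

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if `addr` lies inside a mapping listed in /proc/self/maps,
   0 if it does not, and -1 if the file cannot be opened. */
int address_is_mapped(const void *addr)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    uintptr_t target = (uintptr_t)addr;
    uintmax_t start, end;
    int found = 0;

    /* Each line begins with "start-end" in hexadecimal; the rest of
       the line (permissions, offset, path) is skipped. */
    while (fscanf(maps, "%" SCNxMAX "-%" SCNxMAX "%*[^\n]", &start, &end) == 2) {
        if (target >= start && target < end) {
            found = 1;
            break;
        }
    }
    fclose(maps);
    return found;
}
```

The handler living in the anonymous page would run a check like this against an address inside the library before touching anything that belongs to it.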