Static linking's runtime cost vs. multiple processes?

After successfully translating a C app to Rust ( :smile:!), I'm trying to determine the best way to package it. I currently have it as a single bin crate. But, the app in C also takes dll addons, which I've elected not to attempt in Rust. Instead I'm considering breaking the bin crate up into several lib crates that can then be compiled together as required to "simulate" (albeit at compile time) the choice and combination of addons. However, I'm having second thoughts because of the impact of static linking on runtime cost - and I'm not clear how this works.

If I have multiple processes running the exact same statically linked binary, I think modern Linux systems are smart enough NOT to duplicate the binary in runtime memory, and instead share that memory the same way they would share the memory of a shared lib. But only if the whole binary is exactly the same. However, I'm not at all confident of this. Does anyone know for sure? I've spent too many years writing C/C++ shared libs to remember the details of statically linked binaries. I figure Rustaceans would be much more familiar with this.

Because if that's the case, then building multiple static binaries with slightly different addons would not save memory. It would waste memory, because those slightly different binaries, when running at the same time, would not share any runtime memory. So I should not bother re-crate-ifying my Rust app, and should instead require all addons to be compiled in every time. There might be tens of different processes running different addon combos of this app at the same time, so the impact on RAM and cache performance is not insignificant, especially considering Rust binary image sizes.

Investigating this myself...

By running 4 processes with the same Rust bin and examining their /proc/PID/smaps contents, it sure looks like they share the binary in memory (comparing Rss to Pss sizes: each Pss is 1/4 of the Rss). If I'm interpreting that correctly, that might answer most of this. My guess is that Linux doesn't bother attempting to determine whether parts of a static binary are shareable when the whole is not, because that's what shared libs are for (where they're possible).
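For anyone who wants to repeat the Rss vs. Pss comparison without eyeballing smaps by hand, here is a minimal sketch (Linux only). The helper name `sum_rss_pss` is made up for illustration, and it assumes the usual smaps layout: a mapping header line followed by `Key: value kB` lines. Start several copies of the same binary and compare the numbers each one prints.

```rust
use std::{env, fs};

/// Sum the `Rss:` and `Pss:` values (in kB) over every mapping in `smaps`
/// whose header line ends with `path`. (Hypothetical helper, not a std API.)
fn sum_rss_pss(smaps: &str, path: &str) -> (u64, u64) {
    let (mut rss, mut pss) = (0u64, 0u64);
    let mut in_target = false;
    for line in smaps.lines() {
        let mut fields = line.split_whitespace();
        let Some(key) = fields.next() else { continue };
        if !key.ends_with(':') {
            // Mapping header: "<start>-<end> perms offset dev inode [path]".
            in_target = line.ends_with(path);
            continue;
        }
        if !in_target {
            continue;
        }
        // Lines without a numeric value (e.g. "VmFlags:") count as 0.
        let value: u64 = fields.next().and_then(|v| v.parse().ok()).unwrap_or(0);
        match key {
            "Rss:" => rss += value,
            "Pss:" => pss += value,
            _ => {}
        }
    }
    (rss, pss)
}

fn main() {
    let exe = env::current_exe().expect("cannot determine executable path");
    // /proc/self/smaps only exists on Linux; report the error elsewhere.
    match fs::read_to_string("/proc/self/smaps") {
        Ok(smaps) => {
            let (rss, pss) = sum_rss_pss(&smaps, &exe.to_string_lossy());
            println!("{}: Rss {rss} kB, Pss {pss} kB", exe.display());
        }
        Err(e) => eprintln!("could not read /proc/self/smaps: {e}"),
    }
}
```

If the file-backed pages really are shared, Pss for the binary's mappings should drop towards Rss divided by the number of running copies, while private (written-to) pages keep Pss close to Rss.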

This seems to suggest the most compact and cache-efficient way to run a bunch of Rust apps is to have them all compiled into one big static binary that can be run different ways. Busybox-style.

And it means I'd be wasting my time trying to repackage the app so that it can be compiled in multiple different addon configurations.

Is this correct?

Yes. That, combined with symlinks or hardlinks and looking at either `std::env::current_exe` or argv[0] (whichever actually works on your platform), is a great way to do it.
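A rough sketch of what that dispatch could look like. The applet names `app-l1` and `app-l2` and the `addon_*` functions are placeholders I made up, not anything from the actual app; the idea is just that each symlink or hardlink name selects a different entry point in the same binary.

```rust
use std::env;
use std::path::Path;

/// Strip any directory part from argv[0], so "/usr/bin/app-l1" becomes "app-l1".
fn applet_name(argv0: &str) -> &str {
    Path::new(argv0)
        .file_name()
        .and_then(|n| n.to_str())
        .unwrap_or(argv0)
}

// Placeholder entry points standing in for the real addons.
fn addon_l1() -> i32 {
    1
}

fn addon_l2() -> i32 {
    2
}

/// Run the entry point matching the invoked name, if there is one.
fn run_applet(name: &str) -> Option<i32> {
    match name {
        "app-l1" => Some(addon_l1()),
        "app-l2" => Some(addon_l2()),
        _ => None,
    }
}

fn main() {
    // argv[0] is whatever name the binary was invoked under (e.g. a symlink).
    let argv0 = env::args().next().unwrap_or_default();
    let name = applet_name(&argv0);
    match run_applet(name) {
        Some(code) => println!("{name} finished with code {code}"),
        None => eprintln!("unknown applet: {name}"),
    }
}
```

With `ln -s app app-l1` and `ln -s app app-l2`, every invocation maps the same file, so the code pages can be shared no matter which name was used.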

Why would there be any runtime cost for static linking?


The question as I understand it is "what is cheaper - load a large binary at once or split it into dynamic libraries, then load them independently?" This "large binary" and consequent costs can be thought of as "runtime cost for static linking".

Yes. But also add: if parts of the app are dynamic libraries, then it may be the case that not all of them need to be loaded.

Suppose I have a base app and 3 dlls, each 100K (the app included). Call the dlls L1, L2, L3. I have a scenario in which two processes run the app: one needs L1, the other L2, and neither needs L3. With dlls this would require 300K of RAM (1 copy in memory each of: app, L1, L2). If I had static linked all 4 parts into one binary, it would require 400K. If I had instead built two static binaries, app+L1 and app+L2, that would also require 400K (200K each, nothing shared). But if I then add a process that needs the app and L3, static linked as a third binary, the total is now up to 600K. If I had instead static linked the app and all 3 libs into one binary, those 3 processes (or more) would only require 400K total.
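The arithmetic above can be checked with a small model. This is only a sketch under the same simplifying assumptions as the example: every part is exactly 100K, identical files are shared completely, different files share nothing, and private data is ignored. The function names are made up for illustration.

```rust
use std::collections::BTreeSet;

const PART_KB: u64 = 100; // size of the app and of each addon

/// DLL case: the app plus each distinct addon is resident once.
fn dynamic_kb(procs: &[&[&str]]) -> u64 {
    let addons: BTreeSet<&str> = procs.iter().flat_map(|p| p.iter().copied()).collect();
    PART_KB * (1 + addons.len() as u64)
}

/// One static binary per distinct addon combo; only identical binaries share.
fn per_combo_static_kb(procs: &[&[&str]]) -> u64 {
    let combos: BTreeSet<BTreeSet<&str>> = procs
        .iter()
        .map(|p| p.iter().copied().collect::<BTreeSet<&str>>())
        .collect();
    combos.iter().map(|c| PART_KB * (1 + c.len() as u64)).sum()
}

/// Single static binary containing the app and every addon.
fn all_in_one_static_kb(total_addons: u64) -> u64 {
    PART_KB * (1 + total_addons)
}

fn main() {
    // Each inner slice lists the addons one process needs.
    let two: [&[&str]; 2] = [&["L1"], &["L2"]];
    let three: [&[&str]; 3] = [&["L1"], &["L2"], &["L3"]];

    println!("2 procs, dlls:             {}K", dynamic_kb(&two));
    println!("2 procs, static per combo: {}K", per_combo_static_kb(&two));
    println!("3 procs, static per combo: {}K", per_combo_static_kb(&three));
    println!("any procs, all-in-one:     {}K", all_in_one_static_kb(3));
}
```

The per-combo total grows with every new addon combination that is running, while the all-in-one total stays fixed regardless of how many processes or combinations there are.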

Also, where there is a RAM cost, there is also a cache cost (more cache misses), and a load-time cost whenever a new process loads anything not already shared by running processes.

The conclusion is that it doesn't make sense to package standard (static link only) Rust apps to compile in different ways depending on the need if one can instead static link all different parts together as a single app. Even though, typically, multiple processes will run the app, but may need different parts.

Static linking all parts together is what I call Busybox-style.

Something to keep in mind is that static linking does not necessarily mean a cost to loading everything, either. In most general-purpose operating systems, both executables and dynamic libraries are memory-mapped — only the parts of the program which are actually executing have to be copied into physical memory, and if there is memory pressure, then unused pages that have already been loaded can be discarded.


Maybe there is an opportunity for a new cargo feature: the ability to static link together multiple Rust apps into a single binary and automate using the 0th arg (the command name) to select among them.

Although I am wary of attack surface issues.