I have a project that ultimately uses a build system other than cargo and would like to include cargo-built artifacts like any other dependency. I have been researching best practices and found this GitHub thread and this pre-RFC among other sources. The general pattern other native build systems (like Bazel and Buck) are using is to link in the rlib artifacts from cargo, as they include only the object files for the immediate crate. This leaves external dependency management and final linking up to the non-Rust ("native") build system.
One of the pieces I still need to create is a library with the set of global Rust shims in it (symbols like __rust_alloc, __rust_start_panic, etc). In the pre-RFC they recommend building a "no-op" crate to get these symbols:
Note (non-normative): At the time of writing, the -C emit-std-bundle=yes flag can simply be a no-op, as the Rust compiler can successfully create such staticlibs already by compiling an empty crate. The purpose of the flag is to ensure that this behavior is preserved in the future in an opt-in fashion.
I tried this and sure enough, the symbols I'm looking for are in there. However, there are also a ton of other symbols that I do not want (e.g., some mangled Rust symbols from std and core, pthread APIs, memcpy, etc.)
What would a crate whose artifact contains just the global Rust shims in a staticlib look like?
Of course, make sure you're using --release so that optimizations (e.g. inlining) are applied. In theory, everything that remains in the staticlib after optimization is necessary (i.e. reachable from the symbols you do care about).
I followed the pre-RFC steps to produce this staticlib:
$ cargo new --lib stdrust
$ cd stdrust
$ echo "[lib]" >>Cargo.toml
$ echo "crate-type = ['staticlib']" >>Cargo.toml
$ RUSTFLAGS="-C emit-std-bundle=yes" cargo build --release
$ ls -l target/release/libstdrust.a
-rw------- 1 pcwalton staff 17031504 Jan 19 19:37 target/release/libstdrust.a
... with the exception of the emit-std-bundle argument, which would be a no-op at this point anyhow. I can confirm I passed the --release flag to cargo.
Inside my target/release folder, my libstdrust.a is 14.3MB, and contains hundreds of additional symbols I would not expect to be in there (e.g., mangled std and core functions). Here's a screenshot of what I am seeing inside one of the object files I extracted from the library:
[screenshot: symbol listing of one extracted object file]
Clearly I'm either missing something from the build steps, or what I thought needed to be globally accessible to all Rust code is not accurate. Any help here would be much appreciated.
My assumption is that this is (nearly) the total set of global Rust "magic symbols". These would never be found in an rlib, and would be a small amount of code compared to the rest of a Rust artifact. Everything else is name mangled or otherwise is not shared between Rust components. Is that an accurate statement?
The mangled symbols are almost certainly not shared with other compiled objects. There are some cases where generic monomorphizations are shared between codegen units, but I think that is always a matter of reusing upstream's instantiation; usually the generics are simply duplicated in every codegen unit that uses them.
You'll probably get a somewhat smaller staticlib with panic=abort, but the main contributor to code size is probably the formatting and I/O machinery, which the shims emitted for the leaf crate necessarily use.
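For reference, a minimal sketch of turning on panic=abort for the release build, in the same style as the Cargo.toml edits earlier in the thread. It assumes you run it from the root of the stdrust crate and that Cargo.toml has no [profile.release] section yet; how much size it actually saves is not something I have measured.

```shell
# Run from the stdrust crate root (assumed: no existing [profile.release]).
# Appends a release profile that aborts on panic instead of unwinding.
cat >> Cargo.toml <<'EOF'

[profile.release]
panic = "abort"
EOF
```

After that, rebuild with cargo build --release and compare the output of ls -l target/release/libstdrust.a against the previous size.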
Out of mild curiosity, does that show the full list of symbols or just the exported symbols? I thought we had made rustc better at not exporting each and every symbol from staticlib bundles. Maybe that was only for a target other than macOS.
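One way to check this from the command line is to compare nm's exported-only listing with its full listing and pull out just the shim symbols. The helper below is a rough sketch: rust_shims is a made-up name, and it assumes the shims appear unmangled with a __rust_ prefix (as with __rust_alloc and __rust_start_panic above), optionally with the extra leading underscore that macOS adds. Other toolchain versions may name them differently.

```shell
# Hypothetical helper: read `nm` output on stdin and print the defined
# symbols whose names start with __rust_ (or ___rust_ on macOS).
rust_shims() {
  awk '$NF ~ /^_?__rust_/ && $(NF-1) != "U" { print $NF }' | LC_ALL=C sort -u
}
```

Usage would look like nm -g target/release/libstdrust.a | rust_shims for exported symbols only, and the same without -g for every symbol in the archive. If the two lists match but the plain nm -g listing is still huge, the extra symbols really are being exported.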
You could also try throwing the output through strip to see if that impacts anything.