I've been working on a project where I have two macOS machines, one running on an x86_64 architecture and the other on arm64. I've observed that when I compile the same project on these machines targeting x86_64, a full Link Time Optimization (LTO) on the arm64 Mac finishes twice as fast as on the x86_64 Mac.
To leverage this, I created a build system that runs the rustc command on the x86_64 machine to compile all the rlibs, and then transfers the necessary files (rlib, rmeta, etc.) to the arm64 machine to perform a full LTO link.
However, I've encountered a compilation error stating:
Can't find crate for 'serde_derive' which 'thoth' depends on...
Common causes for missing std or core:
- You are cross-compiling for a target which doesn't have std prepackaged
The suggested solutions include adding a pre-compiled version of std with rustup target add or building std from source with cargo build -Z build-std.
These suggestions don't seem applicable in my case, as I've already added the x86_64 target to the arm64 Mac's Rust toolchain, and the original cargo command included '-Z build-std'.
As a workaround, I tried using Rosetta to enable the arm64 Mac to run x86_64 binaries and installed the x86_64 Rust toolchain on the arm64 Mac. I modified the build system to call the x86_64 version of rustc on the arm64 Mac. This approach works, but the speed advantage is greatly reduced: the arm64 Mac is only slightly faster than the x86_64 Mac.
Can anyone suggest how I might resolve this issue without losing the speed advantage of cross-compiling on the arm64 machine? Your insights would be much appreciated.
serde_derive is a procedural macro crate, so it is compiled for the host CPU, not the target CPU. I don't know how your build system works, so I can only guess. in cross-compiling scenarios, certain crates are compiled for the host and others for the target. notably, build.rs (and therefore all the build-dependencies crates too) is always compiled for the host, and so are procedural macro crates.
maybe you could share more details about your custom build system.
My build system functions as a rustc command wrapper. It analyzes the rustc command to determine if it should be run on a remote machine. If so, it fetches all dependency files related to this command, transfers them, and then dispatches the command for remote execution. In this context, the x86_64 Mac serves as the local machine, and the arm64 Mac is the remote machine.
In light of your response, are you suggesting that I should direct the compilation of all procedural macro crates (and possibly others) to the remote arm64 Mac?
All proc macros have to be compiled for the same architecture as the rustc that will use it. With regular cross-compilation cargo will compile proc macros for the host as rustc is compiled for the host too.
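The rule described above can be written down as a small sketch. The `UnitKind` enum and the `required_triple` function are hypothetical names made up for illustration; they are not part of cargo or rustc.

```rust
/// Kinds of compilation unit a wrapper might see (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UnitKind {
    /// A proc-macro crate: loaded as a dynamic library by rustc itself.
    ProcMacro,
    /// A build script (build.rs): executed on the machine that builds.
    BuildScript,
    /// A library that ends up in the final program. Libraries that are
    /// only dependencies of proc macros or build scripts would count as
    /// host units too; this sketch does not model that.
    Library,
    /// The final executable.
    Binary,
}

/// Returns the triple a unit has to be compiled for.
/// `host` is the triple of the rustc that loads or runs the artifact,
/// `target` is the triple of the final program.
fn required_triple<'a>(kind: UnitKind, host: &'a str, target: &'a str) -> &'a str {
    match kind {
        UnitKind::ProcMacro | UnitKind::BuildScript => host,
        UnitKind::Library | UnitKind::Binary => target,
    }
}
```

With an arm64 rustc building for x86_64, `required_triple(UnitKind::ProcMacro, "aarch64-apple-darwin", "x86_64-apple-darwin")` gives the arm64 triple, which is why an x86_64 build of serde_derive cannot be loaded there.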
Is there a reason you are not running rustc on the arm64 machine and using cargo's native cross-compilation support to use the arm64 toolchain to compile for x86_64?
Our decision is primarily based on the resources at our disposal. We have a substantial number of x86_64 Macs and a limited number of arm64 Macs. Given the cost associated with the M2 Ultra, it's currently more practical for us to utilize the resources we already have rather than acquire more arm64 machines.
how do you decide which command should be remote? by examining the --crate-type option, or the --target option? or some other logic?
it's not that simple unfortunately. since you mentioned lto, I assume you only dispatch to remote machine the final binary (or shared/dynamic library) which needs to invoke the linker, not the "normal" rust library crates that are dependencies of the final result?
for the proc-macro crate type, it should be compiled for the same architecture as rustc itself, because a proc-macro is compiled into a dynamic library and loaded by rustc during compilation. that means only the direct proc-macro dependencies of the binary crate should be compiled for arm64, but not those proc-macro crates that are dependencies of intermediate rust libraries. this also means you can't decide whether a command should be remote by examining the command line only, you must also retrieve the entire dependency graph metadata from cargo.
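For the part that can be read from the command line, here is a minimal sketch of pulling `--crate-type` and `--target` out of the arguments a wrapper receives. `option_value` and `is_proc_macro` are hypothetical helper names, and the sketch assumes each option appears at most once, in either the `--flag value` or the `--flag=value` form. It does not look at the dependency graph, so on its own it cannot tell which crates depend on a proc macro.

```rust
/// Returns the value of an option given as `--flag value` or `--flag=value`.
/// If the option is repeated, the first occurrence wins (a simplification).
fn option_value<'a>(args: &'a [String], flag: &str) -> Option<&'a str> {
    let prefix = format!("{flag}=");
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        if arg.as_str() == flag {
            // Separate form: the value is the next argument.
            return iter.next().map(String::as_str);
        }
        if let Some(value) = arg.strip_prefix(prefix.as_str()) {
            // Joined form: the value follows the `=`.
            return Some(value);
        }
    }
    None
}

/// True when the invocation builds a proc-macro crate.
fn is_proc_macro(args: &[String]) -> bool {
    option_value(args, "--crate-type") == Some("proc-macro")
}
```

A wrapper could call `is_proc_macro` on its argument list to spot the proc-macro builds, and `option_value(args, "--target")` to see which triple cargo asked for (it returns `None` when no `--target` was passed).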
also, since the remote command is cross compiling, you might need additional command line options beyond those cargo uses when it invokes the native rustc. I'm not an expert, but you can try comparing the command line your wrapper receives with the one cargo passes to rustc in a "real" cross-compiling build.
-Z build-std should not be necessary for this target with an operating system, it is mainly used for bare metal targets or not yet fully supported targets.
That's correct, as of now, we only dispatch the final LTO link to the remote machine.
To address the proc-macro crate type issue, what if we were to compile the proc-macro crate type on the local x86_64 machine for the x86_64 architecture, and concurrently fork this rustc command (with potential modifications) to the remote machine for arm64 compilation? We could gather the dependencies for the link job, excluding those from the proc-macro crate, allowing the arm64 version on the arm64 machine to be utilized. Would this approach be valid?
Without the unstable -Zdual-proc-macros (which is a bit of a hack), it isn't possible to use crates that depend on a proc macro on another target. -Zdual-proc-macros will build the proc macros for both the host and the target and record both in the crate metadata of the crate that uses a proc macro, so that dependent rustc invocations can load either version.
The option -Zdual-proc-macros you mentioned is specific to Cargo, correct? In my scenario, Cargo is invoked on an x86_64 machine, and the target is also x86_64, so it isn't recognized as a cross-compilation task. I suspect this option may not be applicable in this particular case.
However, I'm considering a different approach. What if I fork the rustc command from the x86_64 proc-macro, change the target option to --target aarch64-apple-darwin, and send it to the arm64 machine without altering any paths or other options? For the link command, the extern option would remain as --extern serde=./target/x86_64-apple-darwin/release/deps/libserde-xxxxxx.rmeta, even though the actual target would be aarch64-apple-darwin.
Would the rmeta ./target/x86_64-apple-darwin/release/deps/libserde-xxxxxx.rmeta be valid for the arm64 rustc in this context?
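For reference, the mechanical part of that idea (swapping the target and leaving every other argument alone) could look roughly like this. `with_target` is a hypothetical helper name, and the sketch says nothing about whether rustc accepts the resulting mix of artifacts, which is the open question above.

```rust
/// Returns a copy of `args` with the `--target` value replaced by
/// `new_target`. Handles both `--target value` and `--target=value`;
/// if no `--target` is present, one is appended. All other arguments,
/// including `--extern` paths, are left untouched.
fn with_target(args: &[String], new_target: &str) -> Vec<String> {
    let mut out = Vec::with_capacity(args.len() + 2);
    let mut replaced = false;
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        if arg.as_str() == "--target" {
            iter.next(); // skip the old value
            out.push("--target".to_string());
            out.push(new_target.to_string());
            replaced = true;
        } else if arg.starts_with("--target=") {
            out.push(format!("--target={new_target}"));
            replaced = true;
        } else {
            out.push(arg.clone());
        }
    }
    if !replaced {
        out.push("--target".to_string());
        out.push(new_target.to_string());
    }
    out
}
```

Calling `with_target(&args, "aarch64-apple-darwin")` on the forked proc-macro command would give the arm64 variant while the `--extern serde=...` argument stays exactly as cargo wrote it.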