[FeedbackWanted] Idea: Cargo wrapper for "zero setup" cross compilation

Wouldn't it be nice if e.g. cargo build --target arm-unknown-linux-gnueabi Just Worked? Without having to install a C cross toolchain (e.g. arm-linux-gnueabi-gcc), cross compiled standard crates (i.e. rustup target add) or cross compiled C libraries (e.g. glibc) on your system?

Well, I'm here to tell you that such a thing might be possible, and that cargo test --target arm-unknown-linux-gnueabi and cargo run --target arm-unknown-linux-gnueabi might Just Work as well.

Not with Cargo itself, though, but with a "transparent" Cargo wrapper like Xargo, which I'm going to call "Dock".

Dock: ship your Rust code anywhere (*)

How will it work?

TL;DR: You'll use it just like Cargo, but it'll use Docker and QEMU under the hood.

When you call dock build --target arm-unknown-linux-gnueabi inside a Cargo project, Dock will fetch a Docker image (from Docker Hub) that contains the C cross toolchain and cross compiled C libraries and then run your system Cargo (and rustc) inside a Docker container based on that image. This container will be destroyed right after the Cargo command ends. Oh, and if necessary Dock will also call rustup target add arm-unknown-linux-gnueabi outside the container for you.

This ephemeral Docker container will have write access to your Cargo project's target directory and to your system Cargo registry (plain Cargo has the same write permissions). However, the container will NOT have write access to the rest of your Cargo project (e.g. the src directory) or to any other directory. I believe this is a good idea because your Cargo build shouldn't touch any part of your Cargo project other than the target directory (doing so messes up Cargo's "freshness" check).
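
To make this concrete, here's a minimal sketch of the kind of docker invocation Dock might issue under the hood. The mount points, environment variables and image name are illustrative assumptions, not a final design:

# hypothetical expansion of `dock build --target arm-unknown-linux-gnueabi`:
# the host toolchain and registry are mounted into the container; the project
# is mounted read-only, with only the target directory writable on top
docker run --rm \
  --volume "$HOME/.cargo:/cargo" \
  --volume "$HOME/.rustup:/rustup" \
  --volume "$(pwd):/project:ro" \
  --volume "$(pwd)/target:/project/target" \
  --env CARGO_HOME=/cargo \
  --env RUSTUP_HOME=/rustup \
  --env PATH=/cargo/bin:/usr/local/bin:/usr/bin:/bin \
  --workdir /project \
  japaric/arm-unknown-linux-gnueabi:v0.1.0 \
  cargo build --target arm-unknown-linux-gnueabi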

dock test --target arm-unknown-linux-gnueabi and dock run will also make use of a Docker container, and will additionally use QEMU's user-mode emulation to transparently run the cross compiled binaries.

About the Docker images

There will be one Docker image per cross compilation target. Each image will contain all the packages necessary to cross compile and test/run the cross compiled binaries.

For maximum portability of the cross compiled binaries, the Docker images will be based on the oldest stable Ubuntu version to keep the glibc requirement of the cross compiled binaries as low as possible.

To increase the number of Cargo projects Dock can compile, the images will also contain extra cross compiled C libraries that are widely used by Cargo projects. For example: OpenSSL.

These images will be versioned using tags, and Dock will use the image whose tag matches Dock's own version (dock -V). This way, the Docker images can be updated with each Dock release.
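
For instance (the exact naming scheme is an assumption here; the image name matches the Dockerfile example further down):

# if `dock -V` reports version 0.1.0, then for
# --target arm-unknown-linux-gnueabi Dock would fetch:
docker pull japaric/arm-unknown-linux-gnueabi:v0.1.0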

What if I need more cross compiled C libraries?

Dock will let you override the Docker images it uses via a Dock.toml file, which is local to each Cargo project:

[target.arm-unknown-linux-gnueabi]
# this image can be local or be hosted on Docker Hub
# if both exist, Dock will prefer local images
image = "user/image:tag"

It's recommended to base these images on the Docker images Dock would normally use.

FROM japaric/arm-unknown-linux-gnueabi:v0.1.0
RUN apt-get update && \
    apt-get install -y --no-install-recommends packages more_packages
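
Then build and tag the image locally so the tag matches the image key in your Dock.toml (user/image:tag is the placeholder from above):

docker build -t user/image:tag .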

Will all of this actually work?

The "run Cargo inside a different Docker container for each target" approach have been battle tested for a long time in the libc repo and other repos.

The "use QEMU user emulation to transparently run cross compiled Rust programs/test suites" have been proven to work in the compiler-builtins repo.

The answer is: Yes, but someone has to sit down and put all the pieces together.

The Dock 0.1.0 milestone will be being able to dock build Cargo itself for x86_64, i686 and ARM. The stretch goal will be being able to dock test Cargo for the same architectures.

Caveats

  • Dock can only be invoked on x86_64 Linux. (Note: you will be able to use Dock on a Travis worker.)

  • (*) Dock can only cross compile/test Linux targets. (Targeting Android and MinGW might be possible too)

  • QEMU is good but not perfect and it may explode (read: segfault) while testing "some" Rust code.

  • If using dock run or dock test for the first time (during your current session), you may need to enter your password, because transparent QEMU emulation requires registering "binfmt interpreters" in /proc and that requires root privileges (see the sketch after this list).

  • Lots of disk IO? Continuously creating and destroying Docker containers seems wasteful.
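
For the curious, the binfmt registration mentioned in the caveat above looks roughly like this. This is a sketch assuming the standard 32-bit ARM ELF magic/mask and a qemu-arm binary at /usr/bin/qemu-arm; Dock would do the equivalent for you:

# register qemu-arm as the interpreter for 32-bit ARM ELF binaries
# (the kernel itself unescapes the \x sequences in the magic and mask)
echo ':arm:M::\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x28\x00:\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff:/usr/bin/qemu-arm:' |
  sudo tee /proc/sys/fs/binfmt_misc/register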

Untested, future enhancements

"Wait! It gets better?"

panic=abort

It may be possible to leverage some of Xargo's logic to make Dock compile the standard crates (up to std) using the panic=abort codegen option, thus making it possible to build binaries 100% free of landing pads (all the official binary releases of the std component are compiled with panic=unwind and thus contain landing pads). This should produce smaller and faster binaries.

It probably only makes sense to do this if profile.*.panic has been set to "abort" in the project's Cargo.toml.
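
For reference, that setting looks like this in Cargo.toml:

[profile.release]
# (and/or [profile.dev]; hence the profile.*.panic wildcard above)
panic = "abort"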


Thoughts? Questions? Would you use something like this? IMO, this could simplify lots of CI setups that have to deal with several non-native compilation targets.

cc @alexcrichton @brson

(It's late; I'm going to answer questions/comments tomorrow)


This is probably not the comment you've been looking for, but dock is probably not such a great name and command, because it will attract attention from Docker, Inc. (They do not have much choice in this matter: they need to enforce their trademark, or it will lapse.)


Exciting!

Consider using Vagrant as an alternative to Docker. I use it for my embedded open source projects in both C++ and Rust. It has two main advantages:

  1. It simplifies managing a VM per project clone, compared to Docker (where images are inherently machine-global). (Note that you've also gotten a bug report from me where Xargo doesn't handle per-clone settings well... maybe I just have more clones than most.)

  2. It works on Mac and Windows (and probably 32-bit Linux as well).

May I recommend https://github.com/tailhook/vagga instead? It doesn't need a Docker daemon running, but works with Docker images.

(Note the section on OS X/Windows, which does leverage docker or Vagrant if available on those platforms: Installation — Vagga 0.8.1 documentation)

This seems pretty slick; I've often wondered if we can achieve this level of ease. I will agree with @cbiffle, though, that the major drawback here is being built on Docker, which excludes OSX and Windows users (as the native toolchains can't run inside the Linux containers). Docker on Linux, though, definitely seems like the right option, as it's nicely configurable with lots of speed to boot.

I'd also be slightly wary of the QEMU emulation we've done up to this point. It's all "user mode" emulation, which in my experience feels broken once you try to do anything fancy. We've gone to great lengths to avoid using, for example, threads, which tend to segfault in QEMU user mode pretty quickly. I've experimented in the past with booting entire kernels within QEMU to run tests (e.g. the FreeBSD and NetBSD tests on the libc repo), but unfortunately the process is pretty slow and it's difficult to keep all the automation in-tree.

I personally feel we can get quite far setting aside the test portions of this and just focusing on the build for now. That is, let's get cross-compiles working seamlessly and continue to leave it up to the user to figure out how to run those binaries. That'd alleviate the QEMU question, at least.

Finally, this is also the initial vision of rustup, so there's some overlap here. But ideally rustup would provide this sort of automation to install toolchains and environments natively on the system for all manner of cross-compiles. Now, that's surely much messier than Docker, but it's cross-platform, which ensures we keep all platforms at the same level of experience. @brson may have more thoughts here, though. The "native NDK" support hasn't seen a lot of love in rustup just yet, however.

I have also been thinking about how to improve cross-compiles. Specifically, thinking about when you use system shared libraries. Rust is great now for the compile part compared to C/C++ because, thanks to the *-sys crates, you don't need the target file system with the headers. However, linking is a problem because you need the libraries to link against.

Really though, you don't need the libraries for linking. AFAIK the extern blocks have more or less all the information you need, i.e. the symbol names and the library name. From those you could create an "import library" to link against. This is easy for a Windows target: just put the names in a .def file and use lib.exe/dlltool. The concept of import libraries doesn't exist for ELF, though, AFAIK. Regardless, you should be able to create a dummy .so with just the symbol table and soname to link against.
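
A rough sketch of that ELF trick (the library and symbol names here are made up for illustration): emit dummy definitions for the symbols an extern block declares, link them into a stub shared object that carries the real library's soname, and point the linker at the stub. Only the linker ever sees it; at run time the dynamic linker loads the real library on the target.

# stub.c: dummy bodies for the symbols a hypothetical extern block imports
cat > stub.c <<'EOF'
void foo_init(void) {}
int foo_version(void) { return 0; }
EOF
# produce a link-only stub; -soname makes binaries record libfoo.so.1,
# not the stub's path (use the target's C compiler when cross compiling)
cc -shared -fPIC -o libfoo.so -Wl,-soname,libfoo.so.1 stub.c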

This would also help remove the GNU toolchain dependency, which you have because of libc even if you aren't using *-sys crates, and which, with the move to lld, won't be part of the toolchain anymore.

Obviously, this would only work for dynamic libraries; for static linking you will always need the target libraries.


This is something that I'd like to happen.

I think the container-based approach has some obvious advantages and makes sense to pursue. The big downside seems to be portability: you've got to be able to create a container that runs Linux. What's the workflow like for Windows and OS X users? I imagine that, if they can run docker, then they can cross to anything Linux can cross to, but e.g. the Windows->Android and macOS->iOS cases are not served by this design. Is there any hope of doing so? With those cases you also want access to your emulator's GUI, which is probably tricky with containerization.

Distributing the Android NDK / SDK in a container is possibly against the license, so that might require some special case logic.

I've also been intending to add more support to rustup for setting up and acquiring cross-compilation tooling, but I'm tempted to put it in the cargo ecosystem if it can be done nicely, and leave rustup purely to dealing with Rust binaries.

One of the things that I would prefer is if we didn't have to use a command other than cargo build, and this is one of the reasons to put cross-toolchain management in rustup - it can intercept cargo build.

I have other reasons for wanting to intercept cargo commands as well. I think we should define a rustup-compatible protocol for intercepting toolchain commands, so this kind of tooling can be done within the cargo ecosystem experimentally and not be blocked on upstream. There may be a number of things we want to do wrt tooling that can be accelerated by just making xargo and this tool look like cargo (would we even need std-aware cargo?).

I also am not crazy about the 'dock' name, even if it is intended to be tied strictly to Docker.


Finally got a chance to implement this. The result is cross. (Yeah, I changed the name; if you have better alternatives for the name, I'm all ears.)

As per the design I wrote in the OP, cross uses Docker and QEMU. Right now it supports 12 targets (more to come later), each with a different architecture (arm, x86_64, powerpc, mips, etc.).

Both compiling and testing have been implemented. 10 of the 12 targets can cross compile Cargo; 11 of the 12 can cross test the compiler-builtins crate.

I intend to keep the scope of cross to cross compiling from Linux to Linux. @brson has some ideas about a rustup plugin system where (crates.io) tools like cross, xargo and dinghy would become plugins that rustup calls when one issues e.g. a cargo build command, thus using the right tool for each scenario.
