Experiences and troubles integrating Rust and Cargo into a statically compiled, polyglot project


#1

At work I am rewriting a C component as an open-source (but not yet public) Rust component. It will be developed out of tree, in a separate repository.

mycrate has the following directory structure:

.
├── Cargo.toml         (workspace)
├── mycrate            (Rust API)
│   ├── Cargo.toml
│   └── src
│       └── lib.rs
└── mycrate-capi       (C API)
    ├── build.rs
    ├── Cargo.toml
    ├── include
    │   └── mycrate.h  (generated by cbindgen via build.rs, checked into github)
    └── src
        └── lib.rs

We use the following in mycrate-capi/build.rs as suggested by cbindgen’s documentation:

extern crate cbindgen;

use std::env;

fn main() {
    let crate_dir = env::var("CARGO_MANIFEST_DIR").unwrap();

    println!("cargo:rerun-if-changed=src/lib.rs");
    cbindgen::generate(&crate_dir).unwrap().write_to_file(
        "include/mycrate.h",
    );
}

I’ve set up mycrate-capi/Cargo.toml to output mycrate.a and mycrate.dylib by specifying the following:

# ...

[lib]
name = "mycrate"
crate-type = ["staticlib", "cdylib"]

So now I want to add this to my company’s repository. I want a shared workspace for the repository, with a common top-level target directory and Cargo.lock file. We might start adding proprietary Rust code as well, so more subcrates might be added in the future, but for now we leave the member list empty. I’ll add mycrate-capi as a dependency:

[workspace]
members = []

[dependencies]
mycrate-capi = { git = "ssh://git@github.com/mycompany/mycrate.git" }

Here is where we run into our first problem! Cargo ignores the [dependencies] section in a virtual manifest (a Cargo.toml with a [workspace] section but no [package]). So we need to add a hacky top-level crate to allow this:

[package]
name = "myproject"
publish = false

[workspace]
members = []

[dependencies]
mycrate-capi = { git = "ssh://git@github.com/mycompany/mycrate.git" }

The next problem is that although the C API is now built to target/<target>/deps/mycrate.a, the header file is nowhere to be seen! In fact it lives up in the global cargo cache, in ~/.cargo/git/checkouts/mycrate-*/*/mycrate-capi/include/mycrate.h. This is kind of concerning! It seems like mycrate-capi's build.rs is altering the global state of my cargo cache. I also have no good way of accessing that output from my project.

Any thoughts on what I could do instead? Some ideas:

  • Develop the C API in-tree for now, just keeping the Rust code public until better workflows are figured out. This seems to be what Mozilla is doing for the URL parser.
  • Add a shell script that just copies the source code in-tree. It’s kinda ugly, but this is what Mozilla seems to do with the mp4 metadata parser.
  • Set an environment variable like MYCRATE_INCLUDE_DIR that could be preferred over CARGO_MANIFEST_DIR to allow me to output the headers in a custom location. In my project’s build system I could then run MYCRATE_INCLUDE_DIR=./target/$RUST_TARGET/include rustup run $RUST_TOOLCHAIN cargo build --all $CARGO_FLAGS. This is a bit of a hack, and is not very nice from the perspective of having a general solution for the problem across the ecosystem, but perhaps could get us over the line in the interim.

EDIT:

For now I’ve updated my library crate to have the following:

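extern crate cbindgen;

use std::env;

fn main() {
    let crate_dir = env::var("CARGO_MANIFEST_DIR").unwrap();
    // Let the embedding project choose where the header goes; fall back to
    // the checked-in include/ directory.
    let header_out_dir =
        env::var("MYCRATE_HEADER_OUT_DIR").unwrap_or_else(|_| "include".to_owned());

    println!("cargo:rerun-if-changed=src/lib.rs");
    // Re-run the build script when the override changes between builds.
    println!("cargo:rerun-if-env-changed=MYCRATE_HEADER_OUT_DIR");
    cbindgen::generate(&crate_dir)
        .unwrap()
        .write_to_file(&format!("{}/mycrate.h", header_out_dir));
}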

Related: How to retrieve .h files from dependencies into top-level crate’s target/

cc. @aturon @alexcrichton


#2

This seems like an interesting problem, so let me throw in a couple of questions, despite the fact that I don’t fully understand what this is all about :slight_smile:

First, I think messing with sources from build.rs is not how things are supposed to work? Could you generate mycrate.h into OUT_DIR? I am not really sure that generating headers should be part of the Rust (Cargo) build process: I would say that they should be generated manually and just committed to version control.
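A minimal sketch of what writing into OUT_DIR might look like in mycrate-capi/build.rs. The include/ subdirectory and the header_path helper are made up for illustration, and the cbindgen call is left as a comment so the snippet compiles without that dependency:

```rust
use std::env;
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical helper: where the generated header would live under OUT_DIR.
fn header_path(out_dir: &Path) -> PathBuf {
    out_dir.join("include").join("mycrate.h")
}

fn main() {
    // Cargo sets OUT_DIR when it runs a build script; outside Cargo there is
    // nothing sensible to do, so just say so and stop.
    let out_dir = match env::var_os("OUT_DIR") {
        Some(dir) => PathBuf::from(dir),
        None => {
            eprintln!("OUT_DIR is not set; run this as a Cargo build script");
            return;
        }
    };

    let header = header_path(&out_dir);
    // Make sure $OUT_DIR/include exists before anything writes into it.
    fs::create_dir_all(header.parent().expect("header path has a parent"))
        .expect("could not create header directory");

    println!("cargo:rerun-if-changed=src/lib.rs");
    // The cbindgen call from the first post would go here, pointed at `header`
    // instead of the checked-in include/mycrate.h:
    // cbindgen::generate(&crate_dir).unwrap().write_to_file(&header);
}
```

This keeps the build script from touching the source checkout, at the cost of the header ending up in a hashed directory that the C build then has to find.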

Second, I don’t quite understand where the workspace comes from here… Specifically, I assume that the reverse dependency of your crate is the C code, which is built by some external build system. So adding your crate in [dependencies] does not make sense to me, because no Rust code depends on it. That’s why virtual manifests can’t have dependencies: there’s no lib.rs to put the corresponding extern crate foo in.

I would think that to integrate this with C code, you’ll want to add your repository as a submodule to the C source tree, and then call Cargo from the C’s Makefile or alternative.


#3

Thanks for replying! Sorry if I wasn’t very clear :sweat_smile:

The trouble with generating the header in OUT_DIR is that it becomes quite tricky to add the header to the include path for the C build, because you end up with a bunch of directories under target/<target>/build/mycrate-<hash> with different hashes for past compilations. Perhaps a way to query the build directory using cargo would help, e.g. a hypothetical cargo out-dir --package foo mycrate-capi, which you could then hook into another tool.
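To make that concrete, here is a rough sketch of such a lookup as a tiny standalone tool. It assumes the target/<profile>/build/<package>-<hash>/out layout described above and that the build script writes mycrate.h directly into OUT_DIR. The newest_header function is hypothetical, and picking by modification time is only a heuristic, not something Cargo guarantees:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Hypothetical helper: among `<build_root>/<package>-<hash>/out/<header>`
/// candidates, return the most recently modified one, if any exists.
fn newest_header(build_root: &Path, package: &str, header: &str) -> io::Result<Option<PathBuf>> {
    let prefix = format!("{}-", package);
    let mut newest: Option<(SystemTime, PathBuf)> = None;

    for entry in fs::read_dir(build_root)? {
        let entry = entry?;
        let file_name = entry.file_name();
        let name = file_name.to_string_lossy();
        // Keep only `<package>-<hex hash>`, so that looking up `mycrate`
        // does not also pick up `mycrate-capi-<hash>`.
        let is_match = match name.strip_prefix(prefix.as_str()) {
            Some(hash) => !hash.is_empty() && hash.chars().all(|c| c.is_ascii_hexdigit()),
            None => false,
        };
        if !is_match {
            continue;
        }

        // Directories without the header (stale or unrelated) are skipped.
        let candidate = entry.path().join("out").join(header);
        let modified = match fs::metadata(&candidate).and_then(|m| m.modified()) {
            Ok(time) => time,
            Err(_) => continue,
        };
        if newest.as_ref().map_or(true, |(best, _)| modified > *best) {
            newest = Some((modified, candidate));
        }
    }
    Ok(newest.map(|(_, path)| path))
}

fn main() {
    // Example arguments; a real tool would take these from the command line.
    match newest_header(Path::new("target/debug/build"), "mycrate-capi", "mycrate.h") {
        Ok(Some(path)) => println!("{}", path.display()),
        Ok(None) => eprintln!("no generated header found"),
        Err(err) => eprintln!("could not read build directory: {}", err),
    }
}
```

The C build could then take the parent directory of the printed path as its include path, though it would still be guessing which of the past compilations is the relevant one.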

The C code is at the application level, so it will have a Cargo.lock checked in to the repo, alongside the virtual workspace. We will most likely be writing some proprietary code in Rust, so it makes sense to at least have the workspace for that, but that code too will probably live in subcrates, included in the larger C project. The reason why I want to use Cargo is to offload the dependency management, and to allow the Rust components to share their target folder to improve build performance. It would also allow for the following workflows:

  1. Working on the C project independently of mycrate (using the latest published version)
  2. Working on mycrate independent of the C project
  3. Working on mycrate in combination with the C project (using local version in a repository alongside, for bug fixes and initial development)

I’ve been trying to avoid submodules, tbh, but I did think of them. My concern was about causing undue workflow friction to my fellow developers and my past experiences with git submodules have been not so pleasant. I guess if we were to use them we could recover the shared target directory behavior by setting the CARGO_TARGET_DIR. Not sure how to do an override to a local clone - perhaps that could be a feature added to the C build system?
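For what it’s worth, the shared target directory part could be sketched like this: a small driver that the C build system calls for each submodule crate. The cargo_build function and the paths in main are hypothetical; CARGO_TARGET_DIR and --manifest-path are the real Cargo knobs. The sketch only constructs and prints the commands rather than running them:

```rust
use std::path::Path;
use std::process::Command;

/// Hypothetical helper: a `cargo build` invocation for one crate that puts its
/// artifacts in a target directory shared with the other Rust components.
fn cargo_build(manifest_path: &Path, shared_target_dir: &Path) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.arg("build")
        .arg("--manifest-path")
        .arg(manifest_path)
        .env("CARGO_TARGET_DIR", shared_target_dir);
    cmd
}

fn main() {
    // Made-up locations for the shared target directory and the submodules.
    let shared = Path::new("build/rust-target");
    let manifests = [
        "vendor/mycrate/mycrate-capi/Cargo.toml",
        "rust/internal/Cargo.toml",
    ];
    for manifest in manifests {
        let cmd = cargo_build(Path::new(manifest), shared);
        println!("{:?}", cmd);
        // A real driver would call `status()` on the command here and check
        // the exit code.
    }
}
```

This recovers the shared build cache, but not the single Cargo.lock that a workspace would give us.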


#4

Thanks, now this is much clearer!

So looks like what we hypothetically want is cargo install (with support for lock files), which works for staticlibs, cdylibs, and can package headers as well.


#5

Possibly. Ideally sandboxed to a project directory too, to avoid polluting global state. But I’m wary of jumping to suggesting immediate solutions for Cargo itself. This post is intended both as a request for immediate suggestions (in case I’ve been missing something silly) and as a record of some of the real-world pains we have been facing. Hopefully this can help as we move forward on figuring out how to make Cargo easier to integrate with the outside world!

The tricky thing in the design space is trying to not unduly complicate cargo even more - it’s a little hard to fully get one’s head around now. Just been through the wringer trying to refresh myself on all the various ways to configure cargo, through manifest format, environment variables, build scripts, config files, etc…