How to do "header files" / separate compilation in Rust

One of the major differences between C/C++ and Rust is that in Rust the "compilation unit" is the crate, while in a C/C++ project it is a file plus all the header files that it includes. That means that for large projects Rust is typically looking at far fewer, larger compilation units than C/C++, and this can be a source of slower build times. (Obviously this can go both ways: #include files have to be compiled many times, so if you have a lot of code in there, looking at you C++, then compile times can balloon. But if you use header files the way they were originally intended in C, then this is actually a fairly reasonable approach to parallelizing compilation.)

My question is: what idioms can you use in Rust to do the equivalent of "header files" in C? Just splitting a project into many crates often doesn't work because of coherence issues or the orphan rule, so there doesn't seem to be an easy pattern to get really fine-grained crates. It's not even really clear to me how to do a forward declaration, where one crate says "this function exists", lots of other crates call that function, and then a downstream crate implements the function.

I am aware that you can do some of this by essentially exactly mimicking C here using extern "C" external function declarations, but ideally there would be a way to do this that doesn't require a bunch of added unsafe.

The closest you can get (without just using #[no_mangle] and extern linkage) is something like the log or tracing crates provide, where you call some install function early in the program to install a global hook object which implements some trait.
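For illustration, here is a minimal sketch of that install-a-global-hook pattern using only the standard library. The names `Backend`, `install_backend`, `compute`, and `Doubler` are hypothetical and not taken from log or tracing; in a real project the trait and the static would live in a small upstream crate, and the implementation in a downstream one.

```rust
use std::sync::OnceLock;

/// Hypothetical interface declared by the small "upstream" crate.
/// `Send + Sync` is required so the hook can be stored in a static.
pub trait Backend: Send + Sync {
    fn compute(&self, x: u32) -> u32;
}

/// The global hook. It stays empty until some crate installs an implementation.
static BACKEND: OnceLock<Box<dyn Backend>> = OnceLock::new();

/// Called once, early in the program, by whichever crate owns the implementation.
/// Returns `false` if a backend had already been installed.
pub fn install_backend(backend: Box<dyn Backend>) -> bool {
    BACKEND.set(backend).is_ok()
}

/// What all the other crates call. They only need the trait and this
/// function to compile, not the implementation.
/// Returns `None` if nothing has been installed yet.
pub fn compute(x: u32) -> Option<u32> {
    BACKEND.get().map(|backend| backend.compute(x))
}

/// Hypothetical implementation that would live in a downstream crate.
pub struct Doubler;

impl Backend for Doubler {
    fn compute(&self, x: u32) -> u32 {
        x * 2
    }
}
```

The binary would call `install_backend(Box::new(Doubler))` at the start of `main`, after which `compute(21)` returns `Some(42)`. The cost is a dynamic dispatch per call and a runtime "not installed" case that a true forward declaration would not have.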

There are some vague plans to eventually expose a way to do something like #[global_allocator] in user code, but no concrete proposals yet.

In general, though, if you just write less generic code (i.e. like you would in C), the compiler is fairly decent at incrementalizing compilation.
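One common way to "write less generic code" without changing the public API is a thin generic wrapper around a non-generic inner function. This is only a sketch; `extension_len` is a made-up example function. The body of `inner` is compiled once, and only the small wrapper is instantiated per caller type.

```rust
use std::path::Path;

/// Generic entry point: accepts &str, String, PathBuf, and so on.
/// Only this thin shim is monomorphized for each caller's type.
pub fn extension_len<P: AsRef<Path>>(path: P) -> usize {
    // Non-generic worker: the real logic lives here and is compiled once.
    fn inner(path: &Path) -> usize {
        path.extension().map_or(0, |ext| ext.len())
    }
    inner(path.as_ref())
}
```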

True, but that doesn't help with parallelizing full builds, since IIUC typechecking is single-threaded per crate.

Remember that cargo partially does this automatically now; see "Reporting build timings" in The Cargo Book.

It outputs the metadata needed to start compiling things depending on it -- basically like a header would -- so that you can start compiling crate B before LLVM has even started codegen for crate A. (And, like with inline things in C++ header files, the more generic code and #[inline] methods you have, the more ends up in those metadata files.)

So if you have something like fn get_the_thing() -> Box<dyn Foo>;, you get many of the same advantages you'd get with struct Foo; Foo* get_the_thing(); in C.
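A minimal sketch of that shape follows, with a module standing in for the upstream crate. `RealFoo` and `use_the_thing` are hypothetical names added for illustration. Callers see only the trait and a non-generic function signature, much like an opaque struct pointer in C.

```rust
/// Stands in for the upstream crate: only the trait and the
/// constructor function are public.
pub mod thing {
    pub trait Foo {
        fn describe(&self) -> String;
    }

    // Private concrete type; downstream code never names it.
    struct RealFoo {
        id: u32,
    }

    impl Foo for RealFoo {
        fn describe(&self) -> String {
            format!("RealFoo #{}", self.id)
        }
    }

    /// Non-generic signature: callers depend only on this and on `Foo`.
    pub fn get_the_thing() -> Box<dyn Foo> {
        Box::new(RealFoo { id: 7 })
    }
}

/// Stands in for a downstream crate that uses the opaque value.
pub fn use_the_thing() -> String {
    thing::get_the_thing().describe()
}
```

Because nothing here is generic or #[inline], changing the body of `describe` or the fields of `RealFoo` leaves the public signatures that dependents compile against unchanged.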

Do you have a specific project in mind where it's being too slow for you?

No, it is a recurring thing that has come up as a nice-to-have, and I was reminded of it when reading a Reddit thread. I mean, Rust projects taking nonzero time to compile, and people being nonzero annoyed at it, is a problem that will always exist. It is difficult to tell as a user how much is essential complexity and how much is caused by bad project architecture, but when I consider the structural differences between C and Rust that might be contributing to compilation time, this sticks out.

I do have a project, mm0-rs, which takes about 4 minutes to compile. I have used cargo build --timings on it and observed that about half of the time is spent compiling the single main crate (and I think codegen is only 30% of that time, so it's significantly bottlenecked on the single-threaded part). I have felt guilty about not breaking it up, but Rust doesn't really make it easy to break up monolithic crates, and I was wondering if anyone had any tricks for doing it with less pain.
