In Rust, there is a lot of generic code, and a lot of this code lives in libraries.
My understanding is that if the library defines struct S<T> { ... } then final stages
of the compilation of this type (monomorphization & code generation) will happen in the
downstream crate that uses the library. This means that if you develop an application,
which uses the library, then every recompilation will recompile some part of the library
as well, which increases (re)compilation time.
C++ works similarly, and it has a trick up its sleeve to mitigate this: in C++, you can
explicitly instantiate your S in the library for some type parameters (like S<i32>, S<i64>)
and avoid recompilation if downstream consumers instantiate S with the same parameters.
Is the same trick possible in Rust? In concrete terms, say I define a vector Vec3d<T> which
I want to be generic over the type of scalar. Will writing something like
Hm, I thought that this would actually just work. This means that my understanding of Rust's compilation model is wrong. I would be very grateful if someone could explain what is wrong with my mental model.
Let's say I have two files/crates: container.rs and main.rs
// container.rs
use std::fmt::Display;

pub struct Container<T> {
    pub t: T
}

impl<T> Container<T> {
    pub fn print(&self)
    where
        T: Display
    {
        println!("{}", self.t);
    }
}

pub mod impls {
    use super::Container;

    pub fn make_i32_container() -> Container<i32> {
        Container { t: 92 }
    }
}

// main.rs
extern crate container;

use container::Container;

fn main() {
    let c: Container<i32> = Container { t: 92 };
    c.print();
}
And I compile them using the following two commands:
I think that libcontainer.rlib contains (optimized, if --release is present) machine code or bitcode for the non-generic make_i32_container function. If this is true, it must contain code for the Container<i32> instantiation as well.
Then, when we compile main.rs, it seems to me that the compiler can see that we already have the necessary instantiation, and avoid specializing Container for i32 the second time. Or are template instantiations per-crate, and then the duplicates get removed during linking, like with C++ inline?
What do you mean by “Container<i32> instantiation” here? I imagine that from the codegen perspective it only makes sense to talk about the instantiation of functions (associated with the actual code), not structures. So in your case, there's no reason for libcontainer.rlib to contain a Container<i32>::print, as it's never actually called (since all the generic functions are instantiated on-demand).
actually causes <<container::Container<i32>>::print> to be embedded in the rlib. Unfortunately, if you add the -O flag, print disappears, as it's entirely inlined into print_i32_container. Adding an #[inline(never)] annotation to print makes it appear again (and causes the whole print_i32_container to compile to a single jmp). So if #[inline(never)] doesn't bother you... go for it!
(Also note that <Container<i32> as Display>::fmt is not present in the rlib, as it's fully inlined.)
So, to make this hacky "impl modules" work nicely, it would be nice to have an #[inline(never)] attribute on the function call itself.
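To make the "impl modules" idea concrete, here is a minimal sketch, not the code from the post above. It assumes a hypothetical render method (returning a String instead of printing, so the result is easy to check) and a hypothetical render_i32_container wrapper. The non-generic wrapper gives the library crate a concrete use of the generic method, and #[inline(never)] keeps that instantiation from being folded into the wrapper under optimization. Whether a downstream crate then reuses this copy rather than generating its own depends on compiler settings and is not shown here.

```rust
use std::fmt::Display;

pub struct Container<T> {
    pub t: T,
}

impl<T> Container<T> {
    // Hypothetical method standing in for `print`. `#[inline(never)]` asks the
    // compiler to keep each instantiation as a separate out-of-line function.
    #[inline(never)]
    pub fn render(&self) -> String
    where
        T: Display,
    {
        format!("{}", self.t)
    }
}

pub mod impls {
    use super::Container;

    // Hypothetical non-generic wrapper: because it calls `render` on a
    // concrete `Container<i32>`, the `i32` instantiation is generated when
    // this crate is compiled.
    pub fn render_i32_container(c: &Container<i32>) -> String {
        c.render()
    }
}

fn main() {
    let c = Container { t: 92 };
    println!("{}", impls::render_i32_container(&c));
}
```

The cost is the one already mentioned: with #[inline(never)] the optimizer cannot inline render into its callers, including callers in other crates.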
There's yet one more solution:
pub fn impls_generator() {
    ::test::black_box(Container::<i32>::print as fn(_));
}
black_box is not available on stable, but I think there should be some stable way to get black_box-like behaviour.
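For readers on newer toolchains: std::hint::black_box has since been stabilized (Rust 1.66), so the same idea can be sketched on stable. The version below is an illustration under that assumption; it changes impls_generator to return the function pointer, which is a choice made here so the result can be called, not something from the original snippet. The standard library documents black_box as best-effort only, so this is a hint to the optimizer rather than a guarantee that the symbol ends up in the rlib.

```rust
use std::fmt::Display;
use std::hint::black_box;

pub struct Container<T> {
    pub t: T,
}

impl<T> Container<T> {
    pub fn print(&self)
    where
        T: Display,
    {
        println!("{}", self.t);
    }
}

// Taking the address of `Container::<i32>::print` is a concrete use of the
// generic method, so the `i32` instantiation is requested in this crate.
// `black_box` discourages the optimizer from discarding the pointer.
pub fn impls_generator() -> fn(&Container<i32>) {
    black_box(Container::<i32>::print as fn(&Container<i32>))
}

fn main() {
    let print_i32 = impls_generator();
    print_i32(&Container { t: 92 });
}
```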
So this was about forcing Rust to generate code for a function. Whether dependent crates can actually use this code is another question, but I think yes?
The layout of the struct depends on the generic types, so one could talk about that being "pre-determined". However, I presume that's not where the bulk of compilation time is spent.
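As a small illustration of layout depending on the type parameter, the sketch below prints the size and alignment of two instantiations of the Container struct from earlier in the thread. The exact numbers are chosen by the compiler for the target, so none are listed here.

```rust
use std::mem::{align_of, size_of};

pub struct Container<T> {
    pub t: T,
}

fn main() {
    // Each instantiation is a distinct type with its own layout.
    println!(
        "Container<u8>:  size {}, align {}",
        size_of::<Container<u8>>(),
        align_of::<Container<u8>>()
    );
    println!(
        "Container<u64>: size {}, align {}",
        size_of::<Container<u64>>(),
        align_of::<Container<u64>>()
    );
}
```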
I'm not entirely sure what the goal here would be. I suppose having a non-optimized IR/MIR/whatever representation already done could help compilation time. But I doubt #[inline(never)] is what one would want just to preserve the function. You want the optimizations to apply, and inlining must occur for relevant functions. Also, once the function is inlined into your crate, codegen will differ based on context. So the only thing you can preserve here is likely the non-optimized (or minimally optimized, with local optimizations only) IR that you can reuse when linking that lib into others. I'm not sure how much time this would save, because AFAIUI the LLVM passes are what take a significant portion of compilation time. And given that you want the passes to work on inlined call graphs, I'm not sure what you'd save by preserving the basic IR of a generic instantiation.