Hi guys, I'm trying to write a program where a large vector of u64s, with about 53000 elements of predefined values, needs to be stored and used. However, the program takes a long time to compile, about 24 seconds on my Ryzen 2700X machine. It was taking 47 seconds previously, when I assigned the values after creating the matrices vector.
I feel these compile times are rather long for such a program, so I wanted to ask your opinion. I've created a semi-barebones repository here. Just trying to compile this should take a while.
So I wanted to ask:
Are these compile times expected for this scenario?
If not, is there any way to speed up the compilation? For example, the same code in C++ takes much less time to compile.
Should this be brought up with the Rust language team?
You may also want to post this on the internals forum. A lot more people working on rustc itself tend to hang out there.
I'm guessing this is because rustc needs to parse the long array literal and keep track of a non-trivial amount of extra information (e.g. span). This isn't great for memory usage or processing and a problem I encountered a lot when making include_dir!(). One possible solution would be to save all the numbers into a text file then use include_str!() to embed the contents of the text file into your library at compile time.
You'd need to parse the text at runtime, but I'm guessing you can use something like lazy_static!() to make sure that only needs to be done once.
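use lazy_static::lazy_static;
use std::str::FromStr; // needed for u64::from_str

lazy_static! {
    // Parsed once, on first access.
    static ref MY_NUMBERS: Vec<u64> = {
        let text = include_str!("my_numbers.txt");
        text.lines()
            .map(|line| u64::from_str(line).expect("Parsing should never fail"))
            .collect()
    };
}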
It's not an overly satisfying answer, but this workaround should help with compile times.
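For anyone who would rather not pull in lazy_static, the same parse-once idea can be sketched with std::sync::OnceLock from the standard library (Rust 1.70 or later). This is only a sketch: NUMBERS_TEXT is a made-up inline stand-in for include_str!("my_numbers.txt") so the snippet compiles on its own, the function name my_numbers is hypothetical, and the file is assumed to hold one decimal value per line.

```rust
use std::sync::OnceLock;

// Assumption: stand-in for `include_str!("my_numbers.txt")`, inlined so the
// sketch is self-contained. One decimal value per line.
const NUMBERS_TEXT: &str = "2251799813685248\n1125899906842624\n562949953421312\n";

static MY_NUMBERS: OnceLock<Vec<u64>> = OnceLock::new();

/// Parses the embedded text on the first call and returns the cached table
/// on every later call.
fn my_numbers() -> &'static [u64] {
    MY_NUMBERS.get_or_init(|| {
        NUMBERS_TEXT
            .lines()
            .map(|line| line.trim().parse::<u64>().expect("parsing should never fail"))
            .collect()
    })
}
```

The compiler only sees one string literal here, so it does not have to track each of the 53000 values as a separate expression; the cost moves to a one-time parse at startup.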
I have no idea, but I have a suspicion that the macro building the array is taking a long time to generate all the initialization code.
vec![
    0x8000000000000,
    0x4000000000000,
    ...
]
Is that matrix immutable in your finished code?
I have a program that uses a table of 40,000 prime numbers. I generate it as a global static array at build time. It takes no time at all for my build.rs to generate those primes. The resulting code gets compiled in no time.
Yes, that matrix is immutable in my code. In my case, those are precomputed values used in general while generating Sobol low-discrepancy sequences, so I don't even need to generate them; they're just there. So I wonder what's wrong...
If you can read them from a file you can have the build.rs script do that at compile time and create a static array of values which then gets compiled into your code. I reckon that will be pretty fast and then your executable does not need to carry that data file around.
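A rough sketch of the code-generation half of such a build.rs, in case it helps. The function name render_static_array and the item name MATRICES are made up for illustration; a real build script would read the values from the data file, write the returned string to a file under Cargo's OUT_DIR, and the crate would pull it in with include!.

```rust
use std::fmt::Write;

/// Renders `values` as the source text of a `static` array item.
/// Hypothetical helper: in a real build.rs the result would be written to
/// something like `$OUT_DIR/matrices.rs` and then pulled into the crate with
/// `include!(concat!(env!("OUT_DIR"), "/matrices.rs"));`.
fn render_static_array(name: &str, values: &[u64]) -> String {
    let mut src = String::new();
    // Writing into a String cannot fail, so unwrap() is fine here.
    writeln!(src, "pub static {}: [u64; {}] = [", name, values.len()).unwrap();
    for v in values {
        writeln!(src, "    {:#x},", v).unwrap();
    }
    src.push_str("];\n");
    src
}
```

The important detail is that the generated item is a plain static array rather than a vec![...] inside a function body.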
You can also use include_str or include_bytes. You then either parse them or if you store them in binary form simply use a transmute from &[u8] to &[u64] (or whatever).
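One caveat on the binary route: reinterpreting a &[u8] as a &[u64] is only sound if the bytes are suitably aligned and the slice length is adjusted, and a bare transmute does neither. A copy-based alternative avoids the unsafe code entirely. The sketch below assumes the file stores the values as little-endian 8-byte integers; decode_u64_le is a hypothetical helper name, and in real code the input would come from include_bytes!.

```rust
/// Decodes a byte buffer of little-endian u64 values.
/// Returns None if the length is not a multiple of 8.
/// Assumption: in real code `bytes` would be the result of `include_bytes!`
/// on a data file written in this little-endian layout.
fn decode_u64_le(bytes: &[u8]) -> Option<Vec<u64>> {
    if bytes.len() % 8 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(8)
            .map(|chunk| {
                // Copy into a fixed-size array so alignment never matters.
                let mut buf = [0u8; 8];
                buf.copy_from_slice(chunk);
                u64::from_le_bytes(buf)
            })
            .collect(),
    )
}
```

This costs one pass over the data at startup, which for about 53000 values should be negligible.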
That's the equivalent of what he's doing at the moment. Although instead of creating the static array of values using a build script, they're part of a file that already exists. The problem is that rustc still needs to parse and compile that static array (which would be a normal Rust file with a static).
I don't see how that would work... If the array is created at compile time there's no way to use it from your program unless the static data is bundled in your executable somehow. You could read the file from the filesystem entirely at runtime, but that means you need to distribute your data file alongside the binary.
I'm not sure how equivalent it is. My huge static array does not need any macro expansion. My example shows that parsing and compiling a huge static array is fast enough that you don't notice it happening.
It works by way of Rust's "include" capability. Include statements in Rust work like they do in C. In my case the array is generated into a file by build.rs and then included in the right place when the actual compilation takes place.
See "Build Scripts" in the Rust documentation and my example. Links above.
Interesting. Cancel what I said above then. Sounds like something buggy in the compiler.
Note that this is assuming you need to mutate the matrices field afterwards, if it's just a constant then you shouldn't even have a field for it in the Matrices struct.
Also, matrices.get(i).unwrap() can be replaced with matrices[i] (the latter has more information in the panic if it does go out of bounds, but it's also simpler).
I'd say it's only not a solution if you really have runtime data in that vec![...].
The only reason this has terrible compile times is because it's similar to writing thousands of vec.push(...) by hand, in its semantics, just faster to execute at runtime.
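To make the contrast concrete, here is a minimal sketch of the constant-table version: a static array instead of a vec![...] built at runtime, with plain indexing instead of get(i).unwrap(). The first two values are the ones quoted earlier in the thread; the other two are placeholders, and the name matrix_entry is made up for illustration.

```rust
// The table lives in the binary's read-only data; no initialization code is
// generated for it. Values after the first two are placeholders.
static MATRICES: [u64; 4] = [
    0x8000000000000,
    0x4000000000000,
    0x2000000000000,
    0x1000000000000,
];

/// Hypothetical accessor: nothing needs to own the table, so there is no
/// struct field for it. Indexing panics with the index and length in the
/// message if `i` is out of bounds.
fn matrix_entry(i: usize) -> u64 {
    MATRICES[i]
}
```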
While we could promote large constants in the same way we do for borrows, we don't always know sizes in MIR, so we'd need some heuristics instead for the general case.
But maybe @oli_obk and @spastorino / @ecstatic-morse can take a look at this (come to think of it, const-prop could technically evaluate this ahead of time and keep it in a ty::Const, because the type is fully known, in this very specific case).
It's not a focus right now, but sufficiently advanced const prop will probably figure this out at some point. We have a few intermediate steps that we need to resolve first, though.