Rust creates a 5GB 'query-cache.bin' file on every incremental build?

I have a crate that contains some data types for my project. They're serialized to JSON with serde. I've noticed that for this crate in particular, Rust generates a target/debug/incremental/<crate-name>/<uuid>/query-cache.bin file of approximately 5GB whenever I do a build. This is filling up my hard drive pretty quickly!

The complete source code of this crate is defined here.

For comparison, the query-cache.bin files for other crates in this project are on the order of 1MB. Am I doing something particularly dumb in this crate which is causing it to grow explosively in size? This is rustc 1.67 under macOS 12.6.

On an unrelated topic, pub struct Coins(pub u32); seems like a good candidate for #[repr(transparent)].

The only thing catching my attention is the amount of monomorphization. Maybe that's triggering an explosion of code.

But that is pure speculation.

Adding #[repr(transparent)] is unnecessary, as this codebase doesn't do anything unsafe with Coins. It's also not needed to enforce a calling convention within Rust code, since single-field structs with the default Rust layout are essentially repr(transparent) when interacting with Rust code (see "Structs and tuples" in the Unsafe Code Guidelines Reference).

Note that this isn't guaranteed. While single-field structs are essentially #[repr(transparent)] under the current compiler, this may change in a future version of Rust, so unsafe code shouldn't rely on it. With that said, I doubt the Rust developers would make a change that makes the calling convention worse for cases like this.
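To make the distinction concrete, here is a minimal sketch comparing the two representations. TransparentCoins and layouts_match are names made up for illustration; they are not from the crate in question.

```rust
use std::mem::{align_of, size_of};

// Default (Rust) representation: the layout is an implementation detail.
pub struct Coins(pub u32);

// Hypothetical variant: guaranteed to share layout and ABI with its single field.
#[repr(transparent)]
pub struct TransparentCoins(pub u32);

// Hypothetical helper: true if A and B have the same size and alignment.
pub fn layouts_match<A, B>() -> bool {
    size_of::<A>() == size_of::<B>() && align_of::<A>() == align_of::<B>()
}

fn main() {
    // Guaranteed by #[repr(transparent)].
    assert!(layouts_match::<TransparentCoins, u32>());

    // Not guaranteed for the default repr, so print it instead of asserting.
    println!("Coins matches u32: {}", layouts_match::<Coins, u32>());
}
```

Only the first check is something unsafe code could rely on; the second just reports what the compiler in use happens to do.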

Very true. The reason we don't give any guarantees for the default repr is that we want to make sure that we can always give safe code -- which doesn't care -- the fastest possible thing. We want the freedom to switch next year to something that hasn't even been invented yet.

(Now, for a single-field struct of an ordinary type like u32, it's unlikely that anything clever can be invented, but that's the general rule.)

I assumed from the long list of derives that the author intends Coins to be functionally the same as u32. Is my assumption incorrect?

On the potentially severe end of mitigations, you could turn off incremental compilation.
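For reference, a sketch of one way to do that for dev builds, via the profile setting in Cargo.toml (setting the CARGO_INCREMENTAL=0 environment variable should have the same effect without editing the manifest):

```toml
# Cargo.toml
# Disables incremental compilation for the dev profile, so no
# target/debug/incremental/ cache is written. Rebuilds will be slower.
[profile.dev]
incremental = false
```

The trade-off is that every rebuild of the crate starts from scratch, which is why this sits at the severe end.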

If you want Coins to be functionally the same as u32, you'd write it as:

type Coins = u32;

and in that case it would indeed be functionally the same as u32, with an identical representation.

If you make it a wrapper type:

struct Coins(u32);

then that's because you don't want it to be functionally identical to u32, and then typically there is no need to worry about whether representations are identical unless you're doing something low-level that requires it.
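A small sketch of the difference in practice. CoinsAlias, double, and add_coins are hypothetical names used only for this example.

```rust
// A type alias is just another name for u32.
pub type CoinsAlias = u32;

// A newtype is a distinct type that happens to wrap a u32.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Coins(pub u32);

// Hypothetical function that accepts any plain u32.
pub fn double(n: u32) -> u32 {
    n * 2
}

// Hypothetical function that only accepts the wrapper type.
pub fn add_coins(a: Coins, b: Coins) -> Coins {
    Coins(a.0 + b.0)
}

fn main() {
    // The alias is interchangeable with u32.
    let a: CoinsAlias = 5;
    assert_eq!(double(a), 10);

    // The newtype is not: double(c) would fail to compile,
    // so the inner value has to be unwrapped explicitly.
    let c = Coins(5);
    assert_eq!(double(c.0), 10);
    assert_eq!(add_coins(c, Coins(2)), Coins(7));
}
```

The compile error you'd get from passing a Coins where a u32 is expected is the whole point of the wrapper.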

I was able to get my crate down from 45,000 lines of generated code to 31,000 (per cargo expand) by removing the use of the enum_iterator crate on a particularly large enum -- it generates an exponential amount of code for each enum case! The rest seems to be serde boilerplate. This crate on its own generates more code than the rest of my project combined.
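For anyone in the same spot, one possible replacement for a derived iterator is a hand-written constant listing the variants. This is only a sketch with a made-up Resource enum, not necessarily what was done in this crate.

```rust
// Hypothetical enum standing in for the large one in the crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Resource {
    Wood,
    Stone,
    Gold,
}

impl Resource {
    // Hand-maintained list of all variants; no derive macro involved.
    pub const ALL: [Resource; 3] = [Resource::Wood, Resource::Stone, Resource::Gold];
}

fn main() {
    // Iterate over every variant without any generated code.
    for r in Resource::ALL.iter() {
        println!("{:?}", r);
    }
}
```

The downside is that the array has to be updated by hand whenever a variant is added, and the compiler won't warn if one is forgotten.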
