Why are builds of my crate's tests not consistent (cargo build vs cargo test vs VS Code Test Explorer)?

TL;DR
Why are the build results for my crate's tests not consistent? I see rebuilds when I alternate between the following, without any changes in between:

  • cargo build --all-targets
  • cargo test -p crate_a
  • VS Code Test Explorer building tests for crate_a

Background:

  • We use a Rust workspace with ~40 crates as members.
  • cargo build --all-targets builds ~600 individual crates (i.e. dependencies + workspace crates).
  • One of the crates in the workspace, crate_a, builds ~300 individual crates (i.e. dev-dependencies + the crate's tests themselves).
  • We make the same observations (see below) whether or not we use [workspace.dependencies] consistently (same versions and feature sets) for all crates in our workspace, including their dev-dependencies. We tested both cases.

Development practice

  • We use VS Code with the rust-analyzer extension as our development tool on Windows workstations.

  • Our .vscode/settings.json contains "rust-analyzer.testExplorer": true.

  • We use any of these when we want to verify/debug test cases:

    • from terminal: {workspace_root} cargo test (for the complete workspace)
    • from terminal: {workspace_root}/crate_a cargo test (for an individual crate in the workspace)
    • CodeLens: run individual test cases, with or without debugging, triggered from inside the source code editor
    • Testing view (Test Explorer): run all or individual test cases
  • We usually run {workspace_root} cargo build --all-targets before we run tests, hoping for both:

    • a) quick feedback on whether our change compiles and doesn't break compilation of other test cases
    • b) build results that are ready for running any number or selection of tests

We suffer from unexpected rebuilds (even without source code changes) in these scenarios, which leads to slow turnaround times.

Observation
Here is a scenario (note: no changes to the workspace happen in between):

  • cargo clean
  • cargo build --all-targets
    -> creates (among others) \target\debug\deps\crate_a-<hash_a>.exe
  • cargo test -p crate_a
    -> (unexpected) rebuilds some dependencies and creates
    target\debug\deps\crate_a-<hash_b>.exe
  • using test-lens to run a test case
    -> no rebuild (as expected), uses
    target\debug\deps\crate_a-<hash_b>.exe
  • using test explorer
    -> (unexpected) rebuilds some deps + *.exe
    -> seems to re-create
    (target\debug\deps\crate_a-<hash_b>.exe)
  • using test-lens to run a test case
    -> (unexpected) rebuilds some deps + *.exe
    -> seems to re-create
    (target\debug\deps\crate_a-<hash_b>.exe)
  • cargo test -p crate_a
    -> (as expected) no rebuild
    -> uses target\debug\deps\crate_a-<hash_b>.exe
  • cd crate_a
  • cargo test
    -> no rebuild (as expected)
    -> uses target\debug\deps\crate_a-<hash_b>.exe
  • using test lens
    -> (as expected) no rebuild
    -> uses target\debug\deps\crate_a-<hash_b>.exe
  • using test explorer
    -> (unexpected) rebuilds some deps + *.exe
    -> uses target\debug\deps\crate_a-<hash_b>.exe

Conclusion
There seems to be an inconsistency (unexpected for us) that triggers rebuilds:
a) between cargo build --all-targets (workspace) and building/running crate_a's tests
b) VS Code Test Explorer doesn't seem to 'accept' the existing build result of crate_a's tests and re-creates it even when it has already been built.

So for me it comes down to two questions:

  1. Why don't cargo build --all-targets and individual builds via cargo test -p crate_a create/re-use the same build result?
  2. Why do VS Code's "Test Explorer" and the other ways of building crate_a's tests trigger rebuilds of each other's output, even though they apparently use the same build result?

When you perform a build, the set of enabled features is determined by what is required by all packages being built. Therefore, when you restrict the build to one of your packages with -p crate_a, fewer features are enabled, requiring a different build of the packages with those features and all of their transitive dependents.
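One way to see this from the command line is to ask cargo tree which features of a dependency are enabled, and by whom, for each of the two package selections. The following is a minimal sketch, not a definitive recipe: feature_report is a hypothetical helper name, serde_core and crate_a are only example arguments, and cargo tree's package selection only approximates what cargo build --all-targets and cargo test -p crate_a actually resolve.

```shell
# Hypothetical helper (the name is an assumption, not part of Cargo).
# $1 = dependency to inspect, $2 = workspace member to compare against.
feature_report() {
    dep="$1"
    pkg="$2"
    echo "== features of $dep, all workspace members selected =="
    # -e features shows feature edges; -i inverts the tree so the output
    # lists which packages enable which features of $dep.
    cargo tree --workspace -e features -i "$dep"
    echo "== features of $dep, only $pkg selected =="
    cargo tree -p "$pkg" -e features -i "$dep"
}
```

Calling feature_report serde_core crate_a from the workspace root and comparing the two sections should reveal any feature that only appears in the workspace-wide resolution. If several versions of the dependency exist, cargo tree asks for a more specific package spec such as name@version.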

I thought I had an answer to the second question, but I just tested and I don’t. In general, besides the above, a common cause of rebuilds is environment variables differing between invocations. Also, if you use cargo build --verbose, Cargo prints why it thinks something needs rebuilding (as opposed to building a different feature configuration).
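As a rough sketch of how that could look in practice: recent Cargo versions print a "Dirty" line with the reason when --verbose is given, so filtering for it shows only the units Cargo considers stale. The helper name why_rebuild is hypothetical, and the exact wording of Cargo's status lines is an assumption that may differ between versions.

```shell
# Hypothetical helper: build (but do not run) the tests of package $1
# and keep only the lines in which Cargo explains a rebuild.
why_rebuild() {
    # Cargo writes its status output to stderr, hence 2>&1.
    cargo test -p "$1" --no-run --verbose 2>&1 | grep -i "dirty"
}
```

why_rebuild crate_a prints nothing if Cargo considers everything fresh. For far more detail, the same command can be run with the environment variable CARGO_LOG=cargo::core::compiler::fingerprint=trace set.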

Thanks a lot for your helpful answer.
As for my question 1), I am still trying to learn how to use cargo tree to corroborate this and to see exactly which dependencies have different feature sets in the two cases.
As for my question 2), your suggestion was really helpful. I found indications that rust-analyzer's Test Explorer feature sets the environment variable RUSTC_BOOTSTRAP, which causes rebuilds of one of our dependencies: proc-macro2. I filed a more specific question with a minimal reproducible example as a follow-up: VSCode - rust-analyzer cargo - annoying re-builds due to RUSTC_BOOTSTRAP changed.
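The effect described here can be checked from a terminal without VS Code. The sketch below is an assumption-laden reproduction, not rust-analyzer's actual invocation: it merely builds the tests twice, the second time with RUSTC_BOOTSTRAP set, so that Cargo's verbose output can show whether the changed variable alone makes anything dirty. The function name is hypothetical, and the syntax is POSIX shell (in PowerShell the variable would be set via $env:RUSTC_BOOTSTRAP instead).

```shell
# Hypothetical helper: build the tests of package $1 twice, toggling
# RUSTC_BOOTSTRAP in between, to see whether the variable alone causes rebuilds.
bootstrap_toggle() {
    # First build: variable as inherited from the current shell.
    cargo test -p "$1" --no-run
    # Second build: same command, but with RUSTC_BOOTSTRAP set for this call.
    RUSTC_BOOTSTRAP=1 cargo test -p "$1" --no-run --verbose
}
```

Running bootstrap_toggle crate_a and then repeating the plain cargo test -p crate_a --no-run should show whether the build flips back and forth each time the variable changes.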

Following up on question 1)

Using --verbose and CARGO_LOG=cargo::core::compiler::fingerprint=trace (and some help from AI), I found at least these two distinct root causes of the observation:
a) As indicated by @kpreid (thank you), a different set of features is effectively enabled for at least one of our dependencies in the two cases, because Cargo uses different dependency trees for the two variants (whole workspace vs. an individual crate's tests only). For example, serde_core had one additional feature enabled in the workspace build compared to the build with only the individual crate's dependencies.

However, we made another observation:

b) Not only did the feature set differ; in our case the profile for the unicode-ident crate differed as well. In the crate-only case, unicode-ident was only a build-time dependency, whereas in the whole-workspace case it was also a runtime dependency. This apparently triggered an 'optimization' in Cargo that compiles it without debug info, which resulted in a different fingerprint. I learned that this is documented in Profiles - The Cargo Book, although it doesn't clearly say (in a way that newbies like me would understand) what exactly is considered a build(-only) dependency. Meanwhile I learned that dependencies reached via a proc-macro may also be considered build-time-only dependencies in that sense.
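Whether this profile difference is really what changes the fingerprint can be probed with an experiment. The sketch below is only an assumption to test, not a confirmed fix: it passes an explicit debug setting for build dependencies through Cargo's --config option, on the theory that an explicit value stops Cargo from picking the setting on its own. The helper name is hypothetical.

```shell
# Hypothetical helper: run any cargo command with debug info explicitly
# enabled for build scripts, proc-macros and their dependencies (dev profile).
cargo_debug_override() {
    cargo --config 'profile.dev.build-override.debug=true' "$@"
}
```

For example, cargo_debug_override build --all-targets followed by cargo_debug_override test -p crate_a --no-run. If unicode-ident is no longer rebuilt between the two, the same setting could be made permanent under [profile.dev.build-override] in the workspace's Cargo.toml, so that editor-triggered builds see it as well.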

Discussion:
It feels a bit cumbersome that, in a Rust workspace scenario, there is no single cargo build command that would consistently (in a rebuild-robust way) build all potential targets in the workspace. I have also not made up my mind about what specifically I would wish for here:
i) Cargo allows building the tests for each crate individually in one batch (as cargo test --no-run -p crate_a would do for a single crate)
ii) cargo test -p ... uses the consistent workspace-wide build configuration (this would enable more features than necessary for certain dependencies, but would be more rebuild-friendly)

Conclusion:
This resolves the issue for me. Question 2) is still being followed up in the spin-off thread.

Cargo has unstable support for resolving features in a consistent way: Unstable Features - The Cargo Book
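A hedged sketch of what trying this could look like, assuming a nightly toolchain installed through rustup and assuming the unstable option keeps its current name and values (both may change, as with any -Z feature). The helper name is hypothetical.

```shell
# Hypothetical helper: run a cargo command on nightly with features resolved
# across the whole workspace instead of only the selected packages.
cargo_unified() {
    cargo +nightly -Zfeature-unification \
        --config 'resolver.feature-unification="workspace"' "$@"
}
```

cargo_unified test -p crate_a --no-run would then, in principle, build crate_a's dependencies with the same features as a workspace-wide build.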

There are workarounds on stable. The technique is called "cargo workspace hack", and cargo-hakari is one tool for managing it.
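For orientation, the typical cargo-hakari workflow looks roughly like the sketch below. The function name is hypothetical, the crate name passed to it is only an example, and the tool's own documentation should be checked for the current commands and options.

```shell
# Hypothetical helper: one-time setup of a workspace-hack crate via cargo-hakari.
# $1 = directory (and name) of the new workspace-hack crate.
setup_workspace_hack() {
    cargo install cargo-hakari --locked   # install the cargo subcommand
    cargo hakari init "$1"                # create the workspace-hack crate
    cargo hakari generate                 # write the unified dependencies into it
    cargo hakari manage-deps              # make every member depend on it
}
```

A call such as setup_workspace_hack workspace-hack would be run once from the workspace root. After dependency changes, cargo hakari generate is re-run, and cargo hakari verify can check that the generated file is still up to date.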

The extra builds caused by us trying to reuse builds across host/target profiles are interesting. That optimization in Cargo is complicated, limited, and blocks other optimizations. I wonder if it is worth revisiting it.