Extremely slow Windows compile times

My coworker who uses Windows is experiencing ultra-long compile times and I'm not sure why. On Linux and Mac, cargo check is almost instantaneous, but on their Windows machine check / build / build --release all seem to have insanely long wait times.

We've tried completely disabling Windows Defender, but that didn't change anything.

On Linux, after adding a newline to a file to get it to rebuild:

  • cargo build --release takes 1 minute
  • cargo build takes 21 seconds
  • cargo check takes 4 seconds

On Windows, with the equivalent newline:

  • cargo build --release takes 2 minutes and 29 seconds.
  • cargo build takes 2 minutes
  • cargo check takes 1 minute and 48 seconds.

Any help would be greatly appreciated.

Thank you!

That does seem absurdly long. I'm not certain what could be causing it but some possible reasons that I know have caused issues are:

  • using a non-local drive (either for the source or the target directory)
  • using a WSL drive from Windows or vice versa.
  • McAfee virus checker

Is this reproducible on most any Rust project (for example regex) for you? Or is it specific to the project you are working on?

One option is to try capturing a self-profile from the compiler and post the results (the output of summarize is probably sufficient).

After doing a self profile and looking at summarize, it says the CPU time was 20 seconds. However, when we built the project using cargo build it says 2 minutes.

Does this mean something on the OS is blocking rustc's CPU time?

If so, I'm not sure how to find that since we've already disabled Defender.

That is very strange! The self-profile times are based on wall-clock timers so even if the process isn't getting time on the CPU, that should still be tracked somewhere in the profile. Since it isn't, it's possible that cargo is waiting on something before starting the rustc process. You might want to try the cargo -Z timings flag to see if cargo is waiting on something.

1 Like

When it’s compiling, does any other process spike on cpu usage?

Windows IO is generally slow.
But situation will be better if you disable builtin antivirs (to completely disable it you have to use group policies)

Did you disable it via UI or via group policies?
Note that you cannot disable defender by normal means for certain

We disabled it via the UI. How can we check if it's still enabled via a group policy? I would have though if it were enforced by a policy Windows would just say "no you can't disable it".

If you have Pro or Enteprise version of windows you can use group policies to make sure real-time protection is disabled.
You can find policy here:
Computer Configuration > Administrative Templates > Windows Components > Microsoft Defender Antivirus

There you can find policy Turn off Defender Antivirus
Enabling this policy will disable antivirus protection and prevents you from enabling it via UI.
Note that I heard complaint from my friend who updates windows that it stopped working, but it works as intended on my machine

Even with Defender enabled, it should not add well over a minute to a four second cargo check. It's not that slow.

3 Likes

Before running this or the cargo rustc self-profile command, does it matter if we built the project already? Not sure if we should delete ./target beforehand.

It can make a difference because of incremental compilation but I would recommend just trying to reproduce the scenario that you originally reported about adding a newline to a file.

Although, if you can cargo check and then cargo check again with no changes and still see the same slowness, that is a clue as well.

We disabled Windows Defender as @DoumanAsh suggested, but no change to compile times were seen.

When we are compiling, Antimalware Service Executable is using about 0.1% CPU, so it doesn't look like it's doing anything.

@wesleywiser Without removing ./target and just adding a newline and running cargo check it finished in 1 minute 50 seconds. Running cargo check again with no changes finishes in 0.54s.

Running cargo rustc --manifest-path ./my-crate/Cargo.toml -- -Zself-profile :

  • Finished in 2 minutes 10 seconds.
  • CPU usage most of the time is less than 10%.
  • Here is the top portion of the summarize output since it was too big to include in this post:
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| Item                                             | Self time | % of total time | Time     | Item count | Incremental load time |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| LLVM_module_codegen_emit_obj                     | 20.50s    | 39.815          | 20.50s   | 257        | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| LLVM_passes                                      | 7.83s     | 15.204          | 7.83s    | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| codegen_module                                   | 7.58s     | 14.728          | 8.20s    | 256        | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| run_linker                                       | 7.53s     | 14.617          | 7.53s    | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| expand_crate                                     | 2.21s     | 4.288           | 2.40s    | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| LLVM_module_codegen                              | 711.77ms  | 1.382           | 21.22s   | 257        | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| LLVM_module_optimize_module_passes               | 571.49ms  | 1.110           | 571.49ms | 256        | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| monomorphization_collector_graph_walk            | 522.47ms  | 1.015           | 1.70s    | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| metadata_register_crate                          | 229.13ms  | 0.445           | 755.94ms | 305        | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| incr_comp_encode_dep_graph                       | 206.21ms  | 0.400           | 206.21ms | 663602     | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| typeck                                           | 199.19ms  | 0.387           | 986.55ms | 46         | 9.41ms                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| layout_raw                                       | 180.80ms  | 0.351           | 841.45ms | 62672      | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| copy_all_cgu_workproducts_to_incr_comp_cache_dir | 178.47ms  | 0.347           | 178.47ms | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| incr_comp_prepare_session_directory              | 175.07ms  | 0.340           | 175.07ms | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| evaluate_obligation                              | 145.02ms  | 0.282           | 204.20ms | 11532      | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| metadata_decode_entry_optimized_mir              | 134.04ms  | 0.260           | 136.34ms | 3093       | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| resolve_instance                                 | 123.98ms  | 0.241           | 301.46ms | 38708      | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| incr_comp_load_dep_graph                         | 123.85ms  | 0.240           | 123.85ms | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| free_global_ctxt                                 | 123.24ms  | 0.239           | 123.24ms | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| trait_impls_of                                   | 121.58ms  | 0.236           | 178.57ms | 241        | 0.00ns                |

It ends with Total cpu time: 51.4974245s

@wesleywiser Looking at cargo build -Z timings output, we compared removing ./target to just adding a newline. When removing ./target it shows all the dependencies, but that part is pretty fast. The worst offenders are our own workspace crates, which take up almost all the time.

There are two workspace crates which are library dependencies of one binary. That last part of the timings graph shows 54 seconds for the first library, 54 seconds for the second library, and 18 seconds for the binary itself.

It looks to me like something is causing incremental compilation to think it needs to rebuild your entire project. There are 256 instances of codegen_module and by default, we try to create 256 compilation units per crate when in incremental mode which suggests incremental thinks every compilation unit is dirty.

Do you by chance have any build scripts in your workspace which aren't using the rerun-if commands?


Can you also do cargo rustc --profile check --manifest-path ./my-crate/Cargo.toml -- -Zself-profile to see what is causing cargo check to take so long?

Thanks!

Sorry, we actually realized the summarize output I posted was incorrect (it was for a previous build where was HAD removed ./target).

Here is the correct output based on just adding a newline:

+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| Item                                             | Self time | % of total time | Time     | Item count | Incremental load time |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| run_linker                                       | 2.74s     | 30.607          | 2.74s    | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| expand_crate                                     | 2.15s     | 23.968          | 2.25s    | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| codegen_copy_artifacts_from_incr_cache           | 992.42ms  | 11.075          | 992.42ms | 256        | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| monomorphization_collector_graph_walk            | 518.25ms  | 5.783           | 1.45s    | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| copy_all_cgu_workproducts_to_incr_comp_cache_dir | 186.33ms  | 2.079           | 186.33ms | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| incr_comp_prepare_session_directory              | 167.44ms  | 1.869           | 167.44ms | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| incr_comp_encode_dep_graph                       | 131.02ms  | 1.462           | 131.02ms | 659024     | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| metadata_register_crate                          | 122.04ms  | 1.362           | 422.20ms | 305        | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| metadata_decode_entry_optimized_mir              | 113.73ms  | 1.269           | 115.82ms | 3080       | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| incr_comp_load_dep_graph                         | 107.18ms  | 1.196           | 107.18ms | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| resolve_instance                                 | 105.09ms  | 1.173           | 249.81ms | 37463      | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| trait_impls_of                                   | 99.58ms   | 1.111           | 141.78ms | 241        | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| evaluate_obligation                              | 75.29ms   | 0.840           | 80.08ms  | 6556       | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| free_global_ctxt                                 | 62.99ms   | 0.703           | 62.99ms  | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| codegen_crate                                    | 61.24ms   | 0.683           | 1.82s    | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| assert_symbols_are_distinct                      | 60.49ms   | 0.675           | 83.82ms  | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+

summarize says Total cpu time: 8.9611183s but the build says it finished in 1 minute and 51 seconds.

Do you by chance have any build scripts in your workspace which aren't using the rerun-if commands?

No we don't have any build scripts at all. The only thing I can think of is we added this to our project's .cargo config:

[target.x86_64-pc-windows-msvc]
rustflags = ["-C", "target-feature=+crt-static"]

However, we did remove that and tested without any noticeable difference in compilation time.

Results of cargo rustc --profile check --manifest-path ./my-crate/Cargo.toml -- -Zself-profile (also just adding a newline):

+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| Item                                             | Self time | % of total time | Time     | Item count | Incremental load time |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| expand_crate                                     | 2.15s     | 80.148          | 2.25s    | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| metadata_register_crate                          | 107.29ms  | 4.000           | 366.63ms | 305        | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| trait_impls_of                                   | 53.25ms   | 1.985           | 73.64ms  | 129        | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| metadata_decode_entry_adt_def                    | 28.14ms   | 1.049           | 43.64ms  | 3916       | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| free_global_ctxt                                 | 23.65ms   | 0.882           | 23.65ms  | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| incr_comp_encode_dep_graph                       | 22.22ms   | 0.828           | 22.22ms  | 155199     | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| visible_parent_map                               | 21.99ms   | 0.820           | 44.07ms  | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| incr_comp_load_dep_graph                         | 20.86ms   | 0.778           | 20.86ms  | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| metadata_decode_entry_item_children              | 17.24ms   | 0.643           | 17.24ms  | 14016      | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| incr_comp_serialize_result_cache                 | 16.15ms   | 0.602           | 38.52ms  | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| metadata_decode_entry_type_of                    | 15.74ms   | 0.587           | 57.97ms  | 3979       | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| metadata_decode_entry_item_attrs                 | 14.41ms   | 0.537           | 14.41ms  | 3937       | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| encode_query_results_for                         | 14.30ms   | 0.533           | 14.30ms  | 25         | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| crate_lints                                      | 12.28ms   | 0.458           | 81.13ms  | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| late_resolve_crate                               | 10.89ms   | 0.406           | 27.12ms  | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| typeck                                           | 10.55ms   | 0.393           | 11.25ms  | 0          | 10.55ms               |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| item_types_checking                              | 8.63ms    | 0.322           | 116.07ms | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| implementations_of_trait                         | 8.40ms    | 0.313           | 15.33ms  | 39345      | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| metadata_decode_entry_implementations_of_trait   | 6.91ms    | 0.258           | 6.91ms   | 39345      | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| hir_lowering                                     | 5.40ms    | 0.201           | 5.40ms   | 1          | 0.00ns                |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+
| incr_comp_prepare_session_directory              | 5.14ms    | 0.192           | 5.14ms   | 1          | 0.00ns                |

summarize says Total cpu time: 2.6826241s but the command that generated this said it finished in 1 minute 45 seconds.

Ok, those results seem much more reasonable. It sounds like then that the issue lies outside rustc but perhaps in cargo? I would try running this and see what the graphs cargo generates looks like

https://doc.rust-lang.org/cargo/reference/unstable.html#timings

I think we just figured out the cause, but it's still weird since it only affects Windows. We use SQLx, and setting SQLX_OFFLINE=true on the machine in question fixes everything and compile times are back down to normal. If this isn't set, SQLx will typecheck our SQL queries against a live database. Setting it to true will use a file cache of the queries instead.

On Linux, we don't need to enable SQLX_OFFLINE to get fast compile times. cargo check on Linux takes about 3 seconds.

I'm going to look into this more Monday to figure out if this is some Windows related issue in general by testing on other Windows machines, or if it's just affecting the one we've been testing on. If there is something throttling the Postgres network requests then that would explain it.

So my only question now is - shouldn't those network request times be included in a timings profile?

3 Likes

So specifically, the slowdown was library-provided proc macros.

Yeah, it seems like proc macros should be accounted for in compiler timing profiles! I think the bug should be filed on rustc, not cargo.

3 Likes