Is it possible to speed up the `cargo build` command?

qarmin · December 25, 2023, 10:01pm

Hi,

When I add a single space to a slint file in my project and compile it again, the build takes ~30 seconds.

I used:

the cargo build command
mold instead of llvm (it is visible at the end that linking takes ~1 second)
cranelift instead of llvm

About 10 seconds of build script work is visible. The slow performance is reported in Why adding space to slint file, cause running 8s build script? · slint-ui/slint · Discussion #4215 · GitHub (the build script is responsible for generating Rust code)

but there are still ~20 seconds of single-threaded operations in rustc.

Hotspot results

I wasn't able to compile the app with RUSTFLAGS="-Zself-profile -Zcodegen-backend=cranelift" CARGO_PROFILE_DEV_CODEGEN_BACKEND=cranelift cargo +nightly rustc to check the internals of the Rust compilation times,
due to an error (I already added cargo-features = ["codegen-backend"] at the top of Cargo.toml)

RUSTFLAGS="-Zself-profile -Zcodegen-backend=cranelift" CARGO_PROFILE_DEV_CODEGEN_BACKEND=cranelift cargo +nightly rustc
error: config profile `dev` is not valid (defined in `environment variable `CARGO_PROFILE_DEV``)

Caused by:
  feature `codegen-backend` is required

  The package requires the Cargo feature called `codegen-backend`, but that feature is not stabilized in this version of Cargo (1.77.0-nightly (363a2d113 2023-12-22)).
  Consider adding `cargo-features = ["codegen-backend"]` to the top of Cargo.toml (above the [package] table) to tell Cargo you are opting in to use this unstable feature.
  See https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#codegen-backend for more information about the status of this feature.

Project (Krokiet package) - GitHub - qarmin/czkawka: Multi functional app to find duplicates, empty folders, similar images etc.

I took the attempts to improve performance above from endler.dev/2020/rust-compile-times/

steffahn · December 25, 2023, 10:30pm

There’s also an unstable option to have more multi-threading in rustc, perhaps it might help?

qarmin · December 25, 2023, 10:50pm

Times look a little better (20s -> 16s), but this is still too much for me.

jumpnbrownweasel · December 26, 2023, 12:35am

There have been quite a few recent threads on speeding up compile time using various mechanisms, and various problems people have had.

https://users.rust-lang.org/t/soft-question-significantly-improve-rust-compile-time-via-minimizing-generics/103632

https://users.rust-lang.org/t/compile-time-efficiency-of-c-templates-vs-rust-generics/103532

https://users.rust-lang.org/t/compile-time-help/103397

https://users.rust-lang.org/t/is-rust-compile-time-really-that-slow/102863

https://users.rust-lang.org/t/extreme-long-compilation-time/101885

https://users.rust-lang.org/t/rust-compile-times-and-dependency-graphs/101209

https://users.rust-lang.org/t/can-wasm32-compile-time-often-be-10x-faster-than-x86-64-compile-time/100313

https://users.rust-lang.org/t/dramatic-increase-in-compile-time-with-fat-lto-in-release-build-causes-and-troubleshooting/99539

https://users.rust-lang.org/t/compile-time-issues-after-merging-microservices/98484

anon80458984 · December 26, 2023, 4:19am

I don't know if this solves your problem (I never had slow build.rs issues), but I'm throwing out some suggestions since some of my posts got linked. My overall strategy for minimizing Rust compile time is:

Are all my CPU cores being utilized?

Yes => Need to buy more cores.
No => We can speed up compile time via increasing parallelism.

How do we increase parallelism?

Run cargo build --timings, stare at the graph, and look for points where the CPU cores have low utilization. This happens because a small set of crates {foo, bar, blah} is compiling, and everything else is WAITING because it depends on {foo, bar, blah}.

This then becomes a game of "can I break foo, bar, blah into smaller crates?" Can I make things that depend on foo, bar, blah not depend on them, or depend on something smaller? This ends up being a game of moving around structs / enums / traits so that your dependency graph of crates is as flat as possible.

Shallow DAG = lots of things can run in parallel
Deep DAG = lots of stuff running in serial
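
To make the shallow vs. deep point concrete, here is a minimal sketch that computes the critical path of a crate graph, i.e. the longest chain of crates that must compile one after another. The crate names and per-crate costs are hypothetical; real numbers would have to come from something like the cargo build --timings report. Even with unlimited cores, a build cannot finish faster than this chain.

```rust
use std::collections::HashMap;

/// Longest chain of compile costs (in seconds) ending at `krate`.
/// `deps` maps a crate to the crates it depends on; `cost` maps a crate
/// to its own compile time. Assumes the graph is acyclic (a DAG).
fn critical_path<'a>(
    krate: &'a str,
    deps: &HashMap<&'a str, Vec<&'a str>>,
    cost: &HashMap<&'a str, u32>,
    memo: &mut HashMap<&'a str, u32>,
) -> u32 {
    // Reuse the result if this crate was already visited.
    if let Some(&cached) = memo.get(krate) {
        return cached;
    }
    // A crate can only start once its slowest dependency chain is done.
    let mut longest_dep = 0;
    if let Some(children) = deps.get(krate) {
        for &dep in children {
            longest_dep = longest_dep.max(critical_path(dep, deps, cost, memo));
        }
    }
    let total = cost.get(krate).copied().unwrap_or(0) + longest_dep;
    memo.insert(krate, total);
    total
}

/// Lower bound on wall-clock build time for `root`, given unlimited cores.
fn build_lower_bound<'a>(
    root: &'a str,
    deps: &HashMap<&'a str, Vec<&'a str>>,
    cost: &HashMap<&'a str, u32>,
) -> u32 {
    let mut memo = HashMap::new();
    critical_path(root, deps, cost, &mut memo)
}
```

With four crates of equal cost, a chain app -> c -> b -> a has a lower bound of four crate-times, while app depending directly on a, b, and c has a lower bound of only two, because a, b, and c can compile in parallel.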

In the end, I achieved non-incremental (recompile every crate when any dependency changed) build times of around 5k-10k LOC / second. If you significantly beat this, I'm interested in learning how (and how many cores you are using).

Also, in my experience, "dumb" macro_rules! and procedural macros (that I wrote) did not hurt me as much; but some advanced #[derive(...)] from popular packages (not naming names w/o benchmarks) were a bit slow.

In my experience, generics could also be really expensive. "Zero cost abstractions" often have compile time costs, so lots of fn blah<T: Trait>(...) were replaced with fn blah(arg: &dyn Trait) when possible.
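
As a rough illustration of that swap, here is a minimal sketch using std::fmt::Display as a stand-in trait. The function names are hypothetical. The generic version is compiled again for every concrete type it is called with, while the trait-object version is compiled once and dispatches through a vtable at run time.

```rust
use std::fmt::Display;

/// Generic version: the compiler generates a separate copy of this
/// function for each concrete `T` used at a call site (monomorphization).
fn describe_generic<T: Display>(value: T) -> String {
    format!("value = {}", value)
}

/// Trait-object version: a single compiled copy; the call to `Display`
/// goes through dynamic dispatch instead.
fn describe_dyn(value: &dyn Display) -> String {
    format!("value = {}", value)
}
```

Both produce the same result for the same input. The difference is where the cost lands: more codegen work at compile time for the generic version, an indirect call at run time for the dyn version.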

qarmin · December 26, 2023, 9:27am

Slint generates a lot of Rust code that I cannot change, and for my part I use generics only in performance-sensitive code.

cargo build times: monomorphization_collector_graph_walk takes less than a second. Maybe that is a big number, but it is still a lot lower than the rest.

+-------------------------------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| Item                                                                    | Self time | % of total time | Time     | Item count | Incremental load time | Incremental result hashing time |
+-------------------------------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| LLVM_module_codegen_emit_obj                                            | 11.57s    | 23.320          | 11.57s   | 3          | 0.00ns                | 0.00ns                          |
+-------------------------------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| live_symbols_and_ignored_derived_traits                                 | 10.23s    | 20.610          | 10.38s   | 1          | 0.00ns                | 1.32ms                          |
+-------------------------------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| LLVM_passes                                                             | 7.85s     | 15.828          | 7.85s    | 1          | 0.00ns                | 0.00ns                          |
+-------------------------------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| finish_ongoing_codegen                                                  | 4.95s     | 9.974           | 4.95s    | 1          | 0.00ns                | 0.00ns                          |
+-------------------------------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| run_linker                                                              | 1.86s     | 3.750           | 1.86s    | 1          | 0.00ns                | 0.00ns                          |
+-------------------------------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| codegen_module                                                          | 1.67s     | 3.364           | 2.08s    | 2          | 0.00ns                | 0.00ns                          |
+-------------------------------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| expand_crate                                                            | 1.65s     | 3.324           | 2.01s    | 1          | 0.00ns                | 0.00ns                          |
+-------------------------------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| monomorphization_collector_graph_walk                                   | 860.51ms  | 1.734           | 1.86s    | 1          | 0.00ns                | 0.00ns                          |
+-------------------------------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+

Most of these links are not very useful here, because they mostly apply to compiling the entire project with various additional optimizations.

qarmin · December 26, 2023, 9:33am

The problem I have is with recompiling my app after a change, not with recompiling all crates.

Cargo timings

I already split the app into multiple crates, so that option is no longer available.

the8472 · December 26, 2023, 10:48am

Since it's a check build that's all time spent in the frontend. Running with RUSTFLAGS="-Z time-passes" might provide some additional information.

If slint generates code from scratch, I wonder whether incremental compilation even makes sense, or whether most of it gets invalidated on each build. Try comparing incremental and non-incremental build times.

anon80458984 · December 26, 2023, 11:30am

Is this correct:

you have one crate, krokiet that is causing all the problems
krokiet/build.rs is taking up 7s
krokiet/src is taking up 17.3 s, of which 17s looks to be PARSING?

Questions that come to mind are: what is the build script doing, and why can you not split up krokiet/src?

anon80458984 · December 26, 2023, 11:32am

one more question:

Is krokiet a crate you wrote or a dependency?

If a dependency, why is it constantly rebuilding?

If your own code, why can you not split it up?

bjorn3 · December 26, 2023, 11:59am

It is the execution of the build script that takes this much time, not compilation.

I am pretty sure it is czkawka/krokiet at master · qarmin/czkawka · GitHub, which is a UI frontend they wrote using slint.

For the build script, the only thing done is recompiling the slint ui using the slint_build crate. The build script should only be rerun when actually changing the ui definition as cargo:rerun-if-changed is used by slint_build.
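
For readers unfamiliar with the mechanism, here is a minimal sketch of what such a directive looks like. A build script prints lines of this form to stdout, and Cargo then reruns the script only when one of the named files changes. The helper names and the ui/main.slint path are hypothetical, and this is not slint_build's actual code.

```rust
/// Format the line a build script prints to tell Cargo:
/// "rerun me only if this file changes".
fn rerun_if_changed(path: &str) -> String {
    format!("cargo:rerun-if-changed={}", path)
}

/// Directives for every watched file (hypothetical helper).
fn rerun_directives(paths: &[&str]) -> Vec<String> {
    paths.iter().map(|p| rerun_if_changed(p)).collect()
}
```

In a build.rs, main would print each returned line with println!. Touching any listed file, even by adding a single space, is then enough to trigger a rerun of the script.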

system · March 25, 2024, 11:59am

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.
