This is a real example of Rust's slow build times during development. Can you spot the issue?

I created a small example of what I'm trying to achieve on my machine here: GitHub - frederikhors/clean_architecture_test at crate-2-model-5.

The only issue I'm having is the long wait between a file change and recompilation with cargo run during development.

The so-called "edit -> build -> run" cycle. I need to run it because it's a web backend.

In this really small project example you can reproduce my issue by following these steps:

  • cargo build

  • edit the file src/crate1/app/model4/create.rs removing dbg!("try me");

  • cargo build again

  • on my machine it compiles in 8 seconds:

cargo build
   Compiling crate1 v0.0.0 (C:\project\src\crate1)
   Compiling services v0.0.0 (C:\project\src\services)
   Compiling graphql v0.0.0 (C:\project\src\graphql)
   Compiling exec v0.0.0 (C:\project\src\exec)
    Finished `dev` profile [unoptimized] target(s) in 8.36s

And this is just an example project: there is no business code, there are no dependencies between files, there is nothing: it is just a skeleton!

On the same machine a slightly larger project on each change takes 1 minute or more (without cranelift).

Now my question is: how can I avoid the recompilation chain for every change at least in the files under /app (where business logic lives)?

I'm ready to radically change the structure. These are just tests I'm doing; I don't have a project in production. I'm deciding whether Rust is for me or not (and with these slow times it certainly can't be!).

On my Linux laptop, I repeated the build twice and got:

📦[nerditation@tumbleweed clean_architecture_test]$ vim src/crate1/src/app/model4/create.rs 
📦[nerditation@tumbleweed clean_architecture_test]$ cargo build
   Compiling crate1 v0.0.0 (/tmp/clean_architecture_test/src/crate1)
   Compiling graphql v0.0.0 (/tmp/clean_architecture_test/src/graphql)
   Compiling services v0.0.0 (/tmp/clean_architecture_test/src/services)
   Compiling exec v0.0.0 (/tmp/clean_architecture_test/src/exec)
    Finished `dev` profile [unoptimized] target(s) in 3.82s
📦[nerditation@tumbleweed clean_architecture_test]$ vim src/crate1/src/app/model4/create.rs 
📦[nerditation@tumbleweed clean_architecture_test]$ cargo build --timings
   Compiling crate1 v0.0.0 (/tmp/clean_architecture_test/src/crate1)
   Compiling services v0.0.0 (/tmp/clean_architecture_test/src/services)
   Compiling graphql v0.0.0 (/tmp/clean_architecture_test/src/graphql)
   Compiling exec v0.0.0 (/tmp/clean_architecture_test/src/exec)
      Timing report saved to /tmp/clean_architecture_test/target/cargo-timings/cargo-timing-20250203T015802Z.html
    Finished `dev` profile [unoptimized] target(s) in 2.27s

the timing report looks like:

Unit                      Total  Codegen     Features
1. exec v0.0.0 bin exec   1.7s
2. crate1 v0.0.0          0.4s   0.1s (18%)
3. graphql v0.0.0         0.1s   0.0s (12%)
4. services v0.0.0        0.1s   0.0s (15%)

Almost all of the time is spent building the binary; I suspect it's the linker that takes the majority of the build time. Maybe try a different linker than the default system linker.


OK, imagine the project is real and you get 30 seconds on each save. Can you please help me with the architecture of the files?

Can you explain why you are doing an actual build on every save? Typically you should use cargo check on saves, and that's the default for rust-analyzer. cargo check is usually very fast.
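To make that concrete, here is a sketch of a save-triggered check loop using the third-party cargo-watch tool (assumed to be installed via `cargo install cargo-watch`; the flags are cargo-watch's own):

```shell
# Re-run `cargo check` on every file change; `-x run` then rebuilds and
# restarts the backend only after the check succeeds.
# The command watches until interrupted with Ctrl-C.
cargo watch -x check -x run
```

This way the fast type-checking pass catches most errors, and the full build+link cost is only paid when you actually want to run the server.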


1.7 second build time on my machine. The linker time is still a significant contributor (1.3s spent on the exec bin). I'm using rust-lld. [1]

If you cannot upgrade your hardware (you probably should, anyway) then the best thing you can do about it is build less code. crate1 is a dependency of exec, and also a transitive dependency via graphql and services. Making a change to crate1 causes a cascading rebuild of everything that depends on it. Change the architecture (and your workflow) so that your primary rebuild cycle occurs in the top-level crate.
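One way to apply this (a sketch with illustrative paths, not the poster's actual layout): keep the rarely-edited crates as leaves, and move the frequently-edited business logic into the top-level binary crate, so an edit there recompiles one crate and relinks instead of cascading through the whole tree.

```toml
# exec/Cargo.toml -- hypothetical layout, names are illustrative.
# Frequently-edited business logic lives inside `exec` itself
# (e.g. exec/src/app/...), so touching it rebuilds only `exec`.
[package]
name = "exec"
version = "0.0.0"
edition = "2021"

[dependencies]
# Stable, rarely-edited crates; changing these still rebuilds
# everything that depends on them.
graphql = { path = "../graphql" }
services = { path = "../services" }
```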

Another idea: try the cranelift codegen backend. Reduces the build time on my end by about 20% to 1.4 seconds.
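For reference, enabling cranelift for dev builds looks roughly like this on a nightly toolchain (treat this as a sketch; the keys follow the rustc_codegen_cranelift setup instructions):

```toml
# .cargo/config.toml -- requires nightly and the cranelift component:
#   rustup component add rustc-codegen-cranelift-preview --toolchain nightly
[unstable]
codegen-backend = true

[profile.dev]
codegen-backend = "cranelift"
```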


  1. The linker is under a lot of stress even in this so-called "skeleton" project. At least 200 dependencies need to be linked together. I get a lot of misplaced criticism for recommending thin dependency trees, but there's an undeniably good reason for it. ↩︎


It looks like this question is also cross-posted to reddit.

When asking questions, please be so kind as to indicate cross-posts, so that people answering your question don't need to come up with answers that were already given elsewhere. It's the least you can do to value everyone's time. Furthermore, seeing replies others have already given can help someone write better answers themselves; it allows elaborating on or correcting things others have said; and it ensures all the context from things like questions directed at the OP, etc., is present.


On Windows, a common slowdown is Defender scanning each generated file, every time. You could try exempting your dev folder to see if that helps.


Given it's running well and fast for others, which OS are you on, and what is your setup? Are you using a devcontainer or something similar? Is this running on Windows or macOS in a Linux VM with a local drive mounted into the VM? Also, is this on an HDD?

If you're on straight Linux and have enough RAM, just to be sure, could you put your project into /dev/shm (which should be memory-backed), try to compile there, and report back?

According to the cargo build output, it's on the C: drive of a Windows installation.


Yeah, that's the slowest place. Any other drive in Windows will have fewer installed filters, and thus be faster.

(Of course, you can also set up a Dev Drive to avoid even more overhead; see "Set up a Dev Drive on Windows 11" on Microsoft Learn.)


The linker suggestion is good. As far as I know the fastest linker for x86-64 Linux is mold: GitHub - rui314/mold: Mold: A Modern Linker 🦠
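Wiring mold into cargo looks roughly like this (a sketch following the mold README; assumes clang and mold are installed, Linux only):

```toml
# .cargo/config.toml
[target.x86_64-unknown-linux-gnu]
linker = "clang"
rustflags = ["-C", "link-arg=-fuse-ld=mold"]
```

Alternatively, `mold -run cargo build` wraps a single build without any config changes.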

I'm already using cranelift.

the best thing you can do about it is build less code. crate1 is a dependency of exec, and also a transitive dependency via graphql and services. Making a change to crate1 causes a cascading rebuild of everything that depends on it. Change the architecture (and your workflow) so that your primary rebuild cycle occurs in the top-level crate.

In the github project there is also the branch with only one model and one crate: GitHub - frederikhors/clean_architecture_test at crate-1-model-1

Can you suggest how?

I fixed this. It's not Defender.

You can experiment with cargo build --timings and cargo rustc -- -Zself-profile to try finding the slowest part of the build process.

If it's caused by monomorphization then working around this issue is straightforward. You may need to patch many of the transitive dependencies that the project uses, though. There are around 200 of them.
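As a generic illustration of the kind of patch (a standalone sketch, not code from this project or its dependencies): the standard trick is to funnel a generic wrapper into a non-generic inner function, so each instantiation is only a thin shim and the real body is compiled once.

```rust
use std::path::Path;

// A generic API like this is monomorphized once per caller type,
// multiplying the LLVM IR the backend has to chew through:
fn name_len_generic(p: impl AsRef<Path>) -> usize {
    p.as_ref().as_os_str().len()
}

// The usual workaround: keep the generic shim tiny and put the real
// work in a non-generic inner function that is compiled only once.
fn name_len(p: impl AsRef<Path>) -> usize {
    fn inner(p: &Path) -> usize {
        p.as_os_str().len()
    }
    inner(p.as_ref())
}

fn main() {
    // Each distinct argument type adds only a trivial shim now.
    assert_eq!(name_len("abc"), 3);
    assert_eq!(name_len(String::from("hello")), 5);
    assert_eq!(name_len_generic("abc"), 3);
}
```

Applying this pattern inside hot generic crates shrinks the total IR that shows up in tools like cargo llvm-lines.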

cargo llvm-lines -p exec gives me (this is the top-N with at least 1000 lines of IR):

 Lines                 Copies               Function name
  -----                 ------               -------------
  278457                13948                (TOTAL)
   12090 (4.3%,  4.3%)     65 (0.5%,  0.5%)  async_graphql::resolver_utils::container::Fields::add_set::{{closure}}
    4700 (1.7%,  6.0%)     18 (0.1%,  0.6%)  <futures_util::stream::futures_unordered::FuturesUnordered<Fut> as futures_core::stream::Stream>::poll_next
    3718 (1.3%,  7.4%)     13 (0.1%,  0.7%)  async_graphql::resolver_utils::container::resolve_container_inner::{{closure}}
    3508 (1.3%,  8.6%)    159 (1.1%,  1.8%)  core::result::Result<T,E>::map_err
    3060 (1.1%,  9.7%)     18 (0.1%,  2.0%)  <futures_util::future::try_join_all::TryJoinAll<F> as core::future::future::Future>::poll
    2977 (1.1%, 10.8%)     13 (0.1%,  2.1%)  async_graphql::resolver_utils::container::Fields::add_set
    2791 (1.0%, 11.8%)     49 (0.4%,  2.4%)  <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter
    2754 (1.0%, 12.8%)     18 (0.1%,  2.5%)  <futures_util::stream::futures_ordered::FuturesOrdered<Fut> as futures_core::stream::Stream>::poll_next
    2736 (1.0%, 13.8%)      8 (0.1%,  2.6%)  async_graphql::resolver_utils::list::resolve_list::{{closure}}
    2536 (0.9%, 14.7%)     16 (0.1%,  2.7%)  async_graphql::resolver_utils::list::resolve_list::{{closure}}::{{closure}}
    2502 (0.9%, 15.6%)     18 (0.1%,  2.8%)  futures_util::stream::futures_unordered::FuturesUnordered<Fut>::new
    2472 (0.9%, 16.5%)     42 (0.3%,  3.1%)  alloc::vec::Vec<T,A>::extend_trusted
    2442 (0.9%, 17.3%)     66 (0.5%,  3.6%)  <alloc::sync::Weak<T,A> as core::ops::drop::Drop>::drop
    2416 (0.9%, 18.2%)     96 (0.7%,  4.3%)  std::panic::catch_unwind
    2124 (0.8%, 19.0%)     18 (0.1%,  4.4%)  futures_util::stream::futures_unordered::ready_to_run_queue::ReadyToRunQueue<Fut>::dequeue
    1980 (0.7%, 19.7%)     18 (0.1%,  4.6%)  futures_util::stream::futures_unordered::FuturesUnordered<Fut>::release_task
    1919 (0.7%, 20.4%)     89 (0.6%,  5.2%)  core::option::Option<T>::map
    1907 (0.7%, 21.1%)    103 (0.7%,  5.9%)  core::iter::adapters::map::map_fold::{{closure}}
    1905 (0.7%, 21.7%)     92 (0.7%,  6.6%)  <core::result::Result<T,E> as core::ops::try_trait::Try>::branch
    1873 (0.7%, 22.4%)     35 (0.3%,  6.8%)  <alloc::vec::into_iter::IntoIter<T,A> as core::iter::traits::iterator::Iterator>::fold
    1854 (0.7%, 23.1%)     18 (0.1%,  7.0%)  futures_util::stream::futures_unordered::FuturesUnordered<Fut>::unlink
    1746 (0.6%, 23.7%)     18 (0.1%,  7.1%)  futures_util::stream::futures_unordered::FuturesUnordered<Fut>::link
    1677 (0.6%, 24.3%)     65 (0.5%,  7.6%)  async_graphql::resolver_utils::container::Fields::add_set::{{closure}}::{{closure}}
    1582 (0.6%, 24.9%)     18 (0.1%,  7.7%)  futures_util::stream::futures_unordered::FuturesUnordered<Fut>::push
    1529 (0.5%, 25.4%)     59 (0.4%,  8.1%)  <alloc::boxed::Box<T,A> as core::ops::drop::Drop>::drop
    1448 (0.5%, 25.9%)     16 (0.1%,  8.2%)  tokio::runtime::task::core::Cell<T,S>::new
    1408 (0.5%, 26.5%)     19 (0.1%,  8.4%)  <&T as async_graphql::base::OutputType>::resolve::{{closure}}
    1383 (0.5%, 26.9%)      1 (0.0%,  8.4%)  async_graphql::http::multipart::receive_batch_multipart::{{closure}}
    1278 (0.5%, 27.4%)     18 (0.1%,  8.5%)  <core::slice::iter::IterMut<T> as core::iter::traits::iterator::Iterator>::fold
    1267 (0.5%, 27.9%)      1 (0.0%,  8.5%)  <crate1::ports::graphql::model1::Model1 as async_graphql::resolver_utils::container::ContainerType>::resolve_field::{{closure}}
    1253 (0.4%, 28.3%)    148 (1.1%,  9.6%)  core::ops::function::FnOnce::call_once
    1251 (0.4%, 28.8%)    179 (1.3%, 10.9%)  alloc::boxed::Box<T>::new
    1248 (0.4%, 29.2%)     80 (0.6%, 11.4%)  tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
    1241 (0.4%, 29.7%)     10 (0.1%, 11.5%)  <axum::serve::WithGracefulShutdown<L,M,S,F> as core::future::into_future::IntoFuture>::into_future::{{closure}}::{{closure}}
    1234 (0.4%, 30.1%)      2 (0.0%, 11.5%)  alloc::collections::binary_heap::BinaryHeap<T,A>::sift_down_range
    1232 (0.4%, 30.5%)     16 (0.1%, 11.6%)  tokio::runtime::task::harness::poll_future
    1186 (0.4%, 31.0%)      1 (0.0%, 11.6%)  <async_graphql::extensions::apollo_persisted_queries::ApolloPersistedQueriesExtension<T> as async_graphql::extensions::Extension>::prepare_request::{{closure}}
    1170 (0.4%, 31.4%)      1 (0.0%, 11.6%)  <axum::serve::WithGracefulShutdown<L,M,S,F> as core::future::into_future::IntoFuture>::into_future::{{closure}}
    1151 (0.4%, 31.8%)     18 (0.1%, 11.8%)  futures_util::future::try_join_all::try_join_all
    1114 (0.4%, 32.2%)      1 (0.0%, 11.8%)  async_graphql::schema::prepare_request::{{closure}}
    1105 (0.4%, 32.6%)      1 (0.0%, 11.8%)  pest::error::Error<R>::format
    1064 (0.4%, 33.0%)     18 (0.1%, 11.9%)  <futures_util::stream::try_stream::try_collect::TryCollect<St,C> as core::future::future::Future>::poll
    1056 (0.4%, 33.4%)     96 (0.7%, 12.6%)  std::panicking::try::do_catch
    1036 (0.4%, 33.7%)      9 (0.1%, 12.7%)  async_graphql::types::external::optional::<impl async_graphql::base::OutputType for core::option::Option<T>>::resolve::{{closure}}
    1034 (0.4%, 34.1%)     22 (0.2%, 12.8%)  core::iter::traits::exact_size::ExactSizeIterator::len
    1028 (0.4%, 34.5%)     18 (0.1%, 13.0%)  <futures_util::future::try_maybe_done::TryMaybeDone<Fut> as core::future::future::Future>::poll

async-graphql looks like it might be compile-time-hostile. This is not the full story, as Result::map_err() also shows up in the top-5, and that could be used by anything.
