I am building a program for wasm32-unknown-unknown and I'm trying to improve the overall compilation time. I compile these wasm modules on the fly for dynamic use. This works pretty well, but compilation can be slow. One thing I did to speed it up was to use prebuilt binary dependencies (.rlib files) and invoke rustc directly. I've been trying to get a better sense of why linking might be slow and what I can do about it. On the whole, the generated program doesn't seem to have that many symbols: if I look at the output wasm module with twiggy, I see that it only has around 4000 functions. How does one debug or understand link time?
I've confirmed that this is not specific to wasm by also compiling the same file and modules for native x86, where it was even a tad slower.
I've disabled LTO and set the optimization level to 1. Are there any other flags I should be passing?
The timings are as follows:
time: 0.000; rss: 85MB -> 87MB ( +1MB) setup_global_ctxt
time: 0.000; rss: 89MB -> 89MB ( +0MB) crate_injection
time: 0.047; rss: 89MB -> 132MB ( +42MB) expand_crate
time: 0.047; rss: 89MB -> 132MB ( +42MB) macro_expand_crate
time: 0.008; rss: 132MB -> 132MB ( +0MB) AST_validation
time: 0.002; rss: 133MB -> 133MB ( +1MB) finalize_macro_resolutions
time: 0.052; rss: 133MB -> 153MB ( +20MB) late_resolve_crate
time: 0.004; rss: 153MB -> 154MB ( +0MB) resolve_check_unused
time: 0.005; rss: 154MB -> 154MB ( +0MB) resolve_postprocess
time: 0.065; rss: 132MB -> 154MB ( +22MB) resolve_crate
time: 0.011; rss: 195MB -> 195MB ( +0MB) drop_ast
time: 0.113; rss: 154MB -> 186MB ( +32MB) looking_for_derive_registrar
time: 0.165; rss: 154MB -> 188MB ( +34MB) misc_checking_1
time: 0.173; rss: 188MB -> 235MB ( +47MB) coherence_checking
time: 1.375; rss: 188MB -> 335MB ( +148MB) type_check_crate
time: 0.791; rss: 335MB -> 443MB ( +108MB) MIR_borrow_checking
time: 0.151; rss: 443MB -> 459MB ( +16MB) MIR_effect_checking
time: 0.031; rss: 460MB -> 460MB ( +0MB) module_lints
time: 0.031; rss: 460MB -> 460MB ( +0MB) lint_checking
time: 0.037; rss: 460MB -> 460MB ( +0MB) privacy_checking_modules
time: 0.103; rss: 459MB -> 460MB ( +1MB) misc_checking_3
time: 0.018; rss: 460MB -> 462MB ( +1MB) monomorphization_collector_root_collections
time: 0.530; rss: 462MB -> 489MB ( +27MB) monomorphization_collector_graph_walk
time: 0.050; rss: 489MB -> 499MB ( +10MB) partition_and_assert_distinct_symbols
time: 0.000; rss: 499MB -> 500MB ( +1MB) write_allocator_module
time: 0.371; rss: 500MB -> 693MB ( +192MB) codegen_to_LLVM_IR
time: 0.970; rss: 460MB -> 693MB ( +232MB) codegen_crate
time: 0.023; rss: 702MB -> 607MB ( -96MB) free_global_ctxt
time: 4.685; rss: 611MB -> 533MB ( -78MB) LLVM_passes
time: 0.001; rss: 530MB -> 524MB ( -6MB) join_worker_thread
time: 4.534; rss: 607MB -> 524MB ( -82MB) finish_ongoing_codegen
time: 0.049; rss: 525MB -> 362MB ( -162MB) run_linker
time: 0.051; rss: 524MB -> 362MB ( -162MB) link_binary
time: 0.051; rss: 524MB -> 362MB ( -162MB) link_crate
time: 4.585; rss: 607MB -> 362MB ( -244MB) link
time: 8.376; rss: 26MB -> 110MB ( +84MB) total
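To see at a glance where the time goes, the timings above can be sorted by duration. This is only a minimal sketch, assuming every log line keeps the `time: <seconds>; rss: ... <pass_name>` shape shown above; `parse_line` and `parse_time_passes` are hypothetical helper names, not part of rustc, and the sample log in `main` is a subset of the lines above.

```rust
/// Parse one `-Z time-passes` line of the form
/// `time: 4.685; rss: 611MB -> 533MB ( -78MB) LLVM_passes`
/// into (pass name, seconds). Returns None for lines that do not match.
fn parse_line(line: &str) -> Option<(String, f64)> {
    let rest = line.trim().strip_prefix("time:")?;
    let (secs, _) = rest.split_once(';')?;
    let secs: f64 = secs.trim().parse().ok()?;
    // Assumption: the pass name is the last whitespace-separated token.
    let name = line.split_whitespace().last()?;
    Some((name.to_string(), secs))
}

/// Collect all passes and sort them from slowest to fastest.
fn parse_time_passes(log: &str) -> Vec<(String, f64)> {
    let mut passes: Vec<(String, f64)> = log.lines().filter_map(parse_line).collect();
    passes.sort_by(|a, b| b.1.total_cmp(&a.1));
    passes
}

fn main() {
    // A subset of the timings above, copied verbatim.
    let log = "\
time: 1.375; rss: 188MB -> 335MB ( +148MB) type_check_crate
time: 0.791; rss: 335MB -> 443MB ( +108MB) MIR_borrow_checking
time: 0.970; rss: 460MB -> 693MB ( +232MB) codegen_crate
time: 4.685; rss: 611MB -> 533MB ( -78MB) LLVM_passes
time: 0.049; rss: 525MB -> 362MB ( -162MB) run_linker
time: 4.585; rss: 607MB -> 362MB ( -244MB) link
time: 8.376; rss: 26MB -> 110MB ( +84MB) total";
    for (name, secs) in parse_time_passes(log) {
        println!("{secs:>7.3}s  {name}");
    }
}
```

Note that some entries appear to be nested: `link` (4.585) is roughly `finish_ongoing_codegen` (4.534) plus `link_crate` (0.051), so a sorted list counts that time more than once.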
The rustc invocation looks like this:
rustc
'-Z' 'time-passes'
'-Z' 'self-profile'
'-C' 'lto=false'
'--crate-name' 'codegen_template'
'--edition=2021'
'/tmp/.tmpBHV6qW/generated.rs'
'--crate-type' 'cdylib'
'--emit=link'
'-C' 'opt-level=1'
'-C' 'embed-bitcode=no'
'--out-dir' '/tmp/.tmpBHV6qW'
'--target' 'wasm32-unknown-unknown'
'--cfg' 'feature="generated"'
'-L' 'dependency=state/dependencies'
'-L' 'native=state/dependencies'
'--extern' 'anyhow=state/dependencies/libanyhow-a0947b93a25d3bfc.rlib'
'--extern' 'fallible_iterator=state/dependencies/libfallible_iterator-9baa7ab087c54c45.rlib'
'--extern' 'go_snapshot_common=state/dependencies/libgo_snapshot_common-cc4395f07b82ce0d.rlib'
'--extern' 'serde=state/dependencies/libserde-ede1b502919b56a5.rlib'
'--extern' 'serde_json=state/dependencies/libserde_json-80284d4b15e94187.rlib'
'--extern' 'uuid=state/dependencies/libuuid-cafa71ee460ce900.rlib'
'--extern' 'wit_bindgen=state/dependencies/libwit_bindgen-520f0af3a488af65.rlib'
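Since the modules are compiled on the fly, an invocation like the one above has to be assembled programmatically. The following is a rough sketch of building such an argument list, not the actual code that produced it: `rustc_args` is a hypothetical helper, the paths in `main` are placeholders, and only flags that already appear in the invocation above are used (the nightly-only `-Z` flags, `--cfg`, and `-L native=` are left out for brevity).

```rust
use std::path::Path;

/// Build the rustc argument list for one generated module.
/// `externs` pairs a crate name with its .rlib file name inside `deps_dir`.
fn rustc_args(src: &Path, out_dir: &Path, deps_dir: &Path, externs: &[(&str, &str)]) -> Vec<String> {
    let mut args: Vec<String> = vec![
        "--crate-name".into(), "codegen_template".into(),
        "--edition=2021".into(),
        src.display().to_string(),
        "--crate-type".into(), "cdylib".into(),
        "--emit=link".into(),
        "-C".into(), "opt-level=1".into(),
        "-C".into(), "lto=false".into(),
        "-C".into(), "embed-bitcode=no".into(),
        "--out-dir".into(), out_dir.display().to_string(),
        "--target".into(), "wasm32-unknown-unknown".into(),
        "-L".into(), format!("dependency={}", deps_dir.display()),
    ];
    // One `--extern name=path` pair per prebuilt dependency.
    for (name, rlib) in externs {
        args.push("--extern".into());
        args.push(format!("{name}={}", deps_dir.join(rlib).display()));
    }
    args
}

fn main() {
    let args = rustc_args(
        Path::new("/tmp/example/generated.rs"), // placeholder path
        Path::new("/tmp/example"),              // placeholder path
        Path::new("state/dependencies"),
        &[("serde", "libserde-ede1b502919b56a5.rlib")],
    );
    // Print the command instead of running it, so the sketch has no side effects.
    // std::process::Command::new("rustc").args(&args).status() would execute it.
    println!("rustc {}", args.join(" "));
}
```

Keeping the argument list in one function makes it easy to change a single flag between runs and compare the resulting timings.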