Request: derank crates.io results by number of dependencies, including transitives

Rust crates are trending towards way too many dependencies, like NPM packages. This increases the attack surface, decreases maintainability, and significantly increases the amount of time spent waiting for packages to build in CI/CD pipelines. Most of my compile time is spent twiddling my thumbs waiting for hundreds of extraneous transitives to finally finish compiling.

I request that crates.io begin downranking results by the number of dependencies, including transitives. Given that rustc compile time for even a single crate is slow, this problem compounds quickly when consuming typical crates in the wild.
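To make the proposed metric concrete, here is a rough sketch of counting dependencies "including transitives" and sorting results by that count. Everything in it is an assumption for illustration: the crate names, the `DepGraph` type, and the `transitive_dep_count` and `rank_by_deps` functions are hypothetical, and nothing here reflects how crates.io actually ranks results.

```rust
use std::collections::{HashMap, HashSet};

/// Direct dependencies of each crate, keyed by crate name (hypothetical data).
type DepGraph = HashMap<&'static str, Vec<&'static str>>;

/// Count the distinct crates reachable from `root`, excluding `root` itself.
fn transitive_dep_count(graph: &DepGraph, root: &str) -> usize {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut stack: Vec<&str> = vec![root];
    while let Some(name) = stack.pop() {
        if let Some(deps) = graph.get(name) {
            for &dep in deps {
                // `insert` returns false for crates already visited, so a
                // dependency shared by several crates is counted only once.
                if seen.insert(dep) {
                    stack.push(dep);
                }
            }
        }
    }
    seen.remove(root); // ignore a cycle leading back to the root
    seen.len()
}

/// Order results so that crates with fewer transitive dependencies come first.
fn rank_by_deps<'a>(graph: &DepGraph, mut results: Vec<&'a str>) -> Vec<&'a str> {
    results.sort_by_key(|name| transitive_dep_count(graph, name));
    results
}

/// A made-up dependency graph: `web_kit` pulls in `bytes_util` via two paths.
fn sample_graph() -> DepGraph {
    HashMap::from([
        ("web_kit", vec!["http_core", "json_lite"]),
        ("http_core", vec!["bytes_util"]),
        ("json_lite", vec!["bytes_util"]),
        ("tiny_fmt", vec![]),
    ])
}

fn main() {
    let graph = sample_graph();
    for name in rank_by_deps(&graph, vec!["web_kit", "tiny_fmt", "http_core"]) {
        println!("{name}: {} transitive dependencies", transitive_dep_count(&graph, name));
    }
}
```

Note that the count says nothing about how large each dependency is or how long it takes to build, only how many distinct crates end up in the graph.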

This change may not have the effect you want it to have. In particular:

  • It would create an incentive for libraries to “vendor” other libraries by copying those libraries into themselves, resulting in duplicated instead of reused compilation of that copied code, making overall compile time worse.

  • When considering the effect of the dependency graph on compilation time, unused code is the worst thing. Unused code can sometimes be trimmed out using features/cfg, but a single library with many features is often harder to correctly[1] maintain than separate, smaller libraries. But your proposal would favor large libraries (with or without features) over small ones. Large libraries are more likely to have significant unused code.

  • Regarding security, larger libraries are also harder to review/audit, and require bigger re-reviews when they are updated.
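As a small illustration of trimming unused code with features/cfg, the sketch below gates one function behind a Cargo feature. The feature name `hex` and both functions are hypothetical; in a real crate the feature would also need to be declared under `[features]` in Cargo.toml.

```rust
/// Always compiled: the core functionality of a hypothetical library.
pub fn parse_number(text: &str) -> Option<i64> {
    text.trim().parse().ok()
}

/// Only compiled when the hypothetical `hex` feature is enabled. Users who
/// never enable it do not pay to compile this function.
#[cfg(feature = "hex")]
pub fn parse_hex(text: &str) -> Option<i64> {
    i64::from_str_radix(text.trim().trim_start_matches("0x"), 16).ok()
}

fn main() {
    println!("{:?}", parse_number(" 42 "));
    // `cfg!` is evaluated at compile time and reports whether the feature is on.
    println!("hex support compiled in: {}", cfg!(feature = "hex"));
}
```

Enabling `hex` here only adds an item and changes nothing that already exists, which is the additive shape features are expected to have. Keeping that property true across every combination of features is where the maintenance burden of a large, many-featured library tends to show up.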


  [1] (feature additivity and semver)


You might be interested in Lib.rs

Unlike crates.io, it makes an (opinionated) effort in ranking crates by their "quality", which does include dependencies. You can read more about the ranking algorithm in the About page.


Derank by number of programming language tokens, including transitives.

That's not a well defined metric.

Do you include cfg'd-out code? Then multiplatform libraries are penalized for having multiple copies of the same code, of which only one will ever be compiled at a time.

Do you include the code from expanding macros and generating code in build scripts? Otherwise you could just implement everything in macros, or spam derive macros, which overall slows down compilation. If you do want to count it, though, you have to start compiling code and pick a specific platform to compile for.

Do you include codegen of generics? That can easily multiply the amount of generated code without increasing the size of the source code much.
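A minimal sketch of the generics point, using a made-up `largest` function: the source contains one short definition, but each distinct element type it is called with gets its own compiled copy (monomorphization), so a token count of the source would not reflect the codegen work.

```rust
/// One generic definition in the source (hypothetical example function).
fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().copied();
    let first = iter.next()?;
    Some(iter.fold(first, |best, x| if x > best { x } else { best }))
}

fn main() {
    // Three element types (i32, f64, char) mean three separate instantiations
    // of `largest` for the compiler to generate code for.
    println!("{:?}", largest(&[3, 9, 4]));
    println!("{:?}", largest(&[1.5, 0.25]));
    println!("{:?}", largest(&['a', 'z', 'q']));
}
```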

All of this is about size of code/codegen, but speed of compilation is not influenced just by that. Parallelizing compilation is a pretty big speedup, at least on high-end CPUs, and splitting into different crates usually helps with that, at the cost of slightly increased code/codegen. Do you want to penalize this approach of speeding up compilation?

How do you decide what's extraneous? Your logic applies to the crates you write as well. You can run cargo tree and start pruning dependencies from your projects that aren't necessary.