Does cargo download a full copy of the index?

An API request for the index repository (requires jq as written; drop the pipe to see the raw JSON):

wget -O - http://api.github.com/repos/rust-lang/crates.io-index --header='Content-Type:application/json' | jq .size
# 546MB

This shows the index is about 500 MB, unless I am misreading the result.

  • Does cargo download the full index? In that case, what is the reason?

  • Separate question: when installing Rust binaries, should one (as a general rule) use a package manager such as Homebrew or APT (Linux), or simply run cargo install?

Cargo currently uses the sparse index by default for the crates.io registry. The legacy protocol used a git repository, which is quite large on the first pull. See:

https://crates.io/data-access

Regarding your second point, maybe this section from the Rust CLI working group helps.

It’s best to use this for distributing tools that are targeted at other Rust developers. For example, a lot of cargo subcommands like cargo-tree or cargo-outdated can be installed with it.

Thanks! I ignored the "sparse" text subconsciously. I'm unsure how to get the size of it though?

I will take a look! From the quote alone I can't really tell what is meant. I was mostly asking about installing, say, the Alacritty terminal emulator.

The sparse index protocol involves requesting the index data for individual packages (that’s what makes it “sparse”). There is never any reason to download all of the index data that way, so there is no reason to care about its size. (If you did want to download all data, Git is the right way to do that, I believe.)
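To make the "individual packages" part concrete, here is a minimal sketch of how a crate name maps to a single file in the index, following the directory layout described in the Cargo Book's registry index documentation. The function name index_path is my own, and the base URL https://index.crates.io/ is the sparse endpoint as I understand it; treat both as illustration rather than as Cargo's actual code.

```rust
/// Relative path of a crate's metadata file inside the index.
/// Returns None for names that cannot be valid crate names here
/// (empty or non-ASCII), since the slicing below assumes ASCII.
fn index_path(name: &str) -> Option<String> {
    if name.is_empty() || !name.is_ascii() {
        return None;
    }
    // Index file paths are lowercase.
    let name = name.to_ascii_lowercase();
    let path = match name.len() {
        1 => format!("1/{name}"),
        2 => format!("2/{name}"),
        // Three-letter names are grouped by their first character.
        3 => format!("3/{}/{name}", &name[..1]),
        // Everything else: first two characters, then the next two.
        _ => format!("{}/{}/{name}", &name[..2], &name[2..4]),
    };
    Some(path)
}

fn main() {
    for krate in ["serde", "syn", "cargo-tree"] {
        if let Some(path) = index_path(krate) {
            // One small HTTP request per crate, instead of a full clone.
            println!("https://index.crates.io/{path}");
        }
    }
}
```

Each printed URL corresponds to one small file with one JSON line per published version of that crate, which is why the total size of the index does not matter for everyday use.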

cargo install is a very limited package manager, primarily used to install developer tools for people already working with Rust code. In particular, it cannot install any files besides one or more executables; it cannot install man-pages, .desktop files, a macOS .app bundle, or anything else an application might ideally be bundled with. Therefore, if you have the option of installing the application you want through a regular package manager you are already using, that might be better.

The one thing that cargo install can do that other package managers probably won’t is compile the latest version of every dependency of the package you install. This can be useful if those dependencies have bug fixes you care about.


Note that the 500 MB includes the entire history of the repository. A checkout without history (a shallow clone) should be much smaller.

Just adding as a ref for myself, regarding the quote above:

From FAQ - The Cargo Book

Nightly version of Cargo Home - The Cargo Book

registry: Packages and metadata of crate registries (such as crates.io) are located here.

  • registry/index: The index is a bare git repository which contains the metadata (versions, dependencies etc) of all available crates of a registry.

So I'm unsure now how the "sparse" fits that description.

Is only the metadata for the packages in use downloaded, or the metadata for all of them?

If I understand correctly, by default Cargo doesn't populate registry/index directly (see Configuration - The Cargo Book). I'm not sure where the metadata is actually stored. Maybe this deserves an issue on the Cargo repository asking for clarification?

But according to the book, there aren't any other directories with metadata, are there? See Cargo Home - The Cargo Book.

Just .cargo/registry
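One way to settle this on your own machine is simply to look at what sits under registry/index. Below is a small sketch that lists the directories there. The helper name index_dirs is made up for this post, and the fallback to ~/.cargo when CARGO_HOME is unset is an assumption that only holds on Unix-like systems.

```rust
use std::path::{Path, PathBuf};
use std::{env, fs, io};

/// Names of the directories directly under `<cargo_home>/registry/index`,
/// sorted alphabetically.
fn index_dirs(cargo_home: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(cargo_home.join("registry").join("index"))? {
        let entry = entry?;
        if entry.file_type()?.is_dir() {
            names.push(entry.file_name().to_string_lossy().into_owned());
        }
    }
    names.sort();
    Ok(names)
}

fn main() {
    // Assumption: use CARGO_HOME if set, otherwise fall back to $HOME/.cargo.
    let cargo_home = env::var_os("CARGO_HOME")
        .map(PathBuf::from)
        .or_else(|| env::var_os("HOME").map(|h| PathBuf::from(h).join(".cargo")));

    match cargo_home.map(|home| index_dirs(&home)) {
        Some(Ok(dirs)) => dirs.iter().for_each(|d| println!("{d}")),
        Some(Err(e)) => eprintln!("could not read registry/index: {e}"),
        None => eprintln!("could not determine the Cargo home directory"),
    }
}
```

If the sparse protocol is in use, I would expect a directory named after the sparse endpoint to show up there rather than a git clone, but I have not confirmed that against the documentation, so check what the listing says on your setup.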