I published a crate on crates.io last year, and it now has 13 versions. However, looking at the statistics, I see that older versions are still getting downloaded! And more strangely, they get downloaded almost equally! I suspect bots, AI systems, or IDE tooling may be scraping the package.
Usually, peaks (on the chart) happen with each release of Rustc.
One "organic" reason is the existence of Cargo.lock, which, when present, instructs cargo to continue using the locked version of each crate even if a newer, compatible release is available. This is deliberate, and not every downstream project can be expected to adopt updates at the same cadence (or, in fact, to adopt updates at all) as a result.
There almost certainly is some non-cargo traffic reflected in there, but I would look for mundane explanations, not adverse ones, initially.
The other question I have is, what do those figures mean for you? What decisions are you making because of them?
I can't speak for others, but I put an action in my repositories to launch tests once the version of Rust has changed, just to make sure they're still fine. That should normally load the dependencies and count as a download for them. Chances are I'm not alone in doing this.
These make sense, but even on small packages I've uploaded with little to no announcement (with zero dependents, and not used in any larger project of mine on GitHub), I've seen the same steady pattern of downloads.
There's definitely a fair amount of bots downloading every crate on crates.io. Why? No idea. But if I were to offer some crackpot theories: I suspect some are mirroring all the crates for use in airgapped environments. And some are mirroring it for archival reasons. And I'm sure there's a fair amount of research on Rust and its ecosystem going on, so there's probably some university projects downloading every single crate when they run their "I invented a new kind of linter" tests.
I've suggested that crates.io make it possible for bots to identify themselves as such, so their downloads wouldn't count (actually getting everyone running a download bot to use this feature is left as an exercise to people who love impossible tasks). To maintainers with hundreds of thousands of downloads it makes no difference, but it matters to me because I would like to be able to use the download counter to know if people are actually using my crates, so I know what to prioritize.
Cargo.lock does not preserve the patch version; it moves to the latest one (most of the time, this is the default and preferred behavior). But for minor/major versions, Cargo.lock can actually be a reason! And your scenario is correct.
However, looking closely at the download counts for older versions, it is obvious that all versions get downloaded equally! This implies there are machines scanning crates: they look at the index of a crate, download the list, and compare or learn from updates.
Whatever, I publish my crates under an MIT license; anybody (anybot, anyAI) can use them for any purpose. They are free.
What decision do I want to make? I am curious to know whether my changes actually settle down and benefit people and bots. Nothing more.
I don't know how crater manages the downloads (it won't be using cargo add though), but it downloads the crates and runs cargo check, cargo build or cargo test depending on what kind of build it is.
Thinking about it some more, this will have a double impact - not only is it downloading all the crates (although I'm not sure if the initial download would be recorded), but if it's doing a cargo build then it will download all the deps of that crate as well.
Cargo.lock will preserve the exact version of the crate you're depending on (it includes a cryptographic hash of the crate so if crates.io tried to supply different code it would presumably error). The version changes you're talking about only happen when there's either no Cargo.lock or when you explicitly ask for it, either by changing Cargo.toml or by running cargo update or similar.
I'd expect crater to cache the crates that have already been downloaded and built, though I haven't checked its code to be sure (what I saw is that it purged its cache when the disk was 50% full, which seems to imply that, and this: "To reduce computation time crater does not reset to a pristine environment between crates so that the builds of dependencies can be reused. So build artifacts accumulate and can fill up the disk.").
Version preservation occurs only if the requirement starts with =, such as =0.2.1; this is referred to as a comparison requirement. I haven't seen many people use this versioning method, though. It is recommended to let the last part (the patch) of the version float to the latest.
Cargo, by default, ignores the patch number and always uses the latest; 1.2.3 means >=1.2.3, <2.0.0 and for zero major version: 0.2.3 means >=0.2.3, <0.3.0. Therefore, cargo won't freeze the version number; on the contrary, it ignores the Cargo.lock version for the patch part (by default).
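The version ranges quoted above (what a default requirement such as 1.2.3 or 0.2.3 accepts) can be sketched in a few lines of Rust. This is only an illustration of the range rule, not Cargo's actual code: the names `parse`, `upper_bound` and `matches` are made up for this example, and pre-release tags and partial requirements such as "1.2" are ignored. It says nothing about whether Cargo.lock is consulted.

```rust
/// A (major, minor, patch) triple; pre-release tags are ignored in this sketch.
type Version = (u64, u64, u64);

/// Parse a full "x.y.z" string; returns None for anything else.
fn parse(s: &str) -> Option<Version> {
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let v = (parts.next()??, parts.next()??, parts.next()??);
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some(v)
}

/// Exclusive upper bound of a default requirement:
/// bump the left-most non-zero component and zero the rest.
fn upper_bound(req: Version) -> Version {
    match req {
        (0, 0, patch) => (0, 0, patch + 1),
        (0, minor, _) => (0, minor + 1, 0),
        (major, _, _) => (major + 1, 0, 0),
    }
}

/// True if `candidate` falls in the range req <= candidate < upper_bound(req).
fn matches(req: Version, candidate: Version) -> bool {
    candidate >= req && candidate < upper_bound(req)
}

fn main() {
    let req = parse("1.2.3").unwrap();
    assert!(matches(req, parse("1.9.0").unwrap()));
    assert!(!matches(req, parse("2.0.0").unwrap()));

    let req0 = parse("0.2.3").unwrap();
    assert!(matches(req0, parse("0.2.9").unwrap()));
    assert!(!matches(req0, parse("0.3.0").unwrap()));
}
```

Tuples compare lexicographically in Rust, which is why a plain `>=` / `<` pair is enough here.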
I think you wrote a similar comment some days ago, which I could not fully understand and which confused me. This statement confuses me even more. Everything I have read in the last 18 months was more like:
The Cargo.lock file is used to freeze the exact versions of dependencies that your project has resolved and built with. By default, Cargo does not ignore the versions in Cargo.lock for the patch part. Instead, it uses the exact versions recorded in Cargo.lock unless you explicitly run cargo update, which may update the versions in the lock file to the latest compatible version.
This summary is from Perplexity -- I had some trouble finding it in the Cargo book. If you think you are really correct, please provide an exact link to the sources -- a few books and other Rust-related texts might then need updates.
Indeed, it requires the --locked option to use the same exact dependencies (ref), otherwise it will select the latest possible dependencies. I used to believe that the file alone was sufficient to do that, too.
See above, or with cargo help install.
EDIT: It's not entirely clear. The cargo help lists that option regardless of the command, but I don't think it requires it all the time. I believe that for cargo build, it will follow by default the content of the lock file when it exists, which spares the programmer any disruptive update when they're busy testing their code. That's not the case for cargo install, however.
"Cargo uses the lockfile to provide deterministic builds at different times and on different systems, by ensuring that the exact same dependencies and versions are used as when the Cargo.lock file was originally generated."
So it is really difficult to understand all the details. I will try to study it in more detail soon.
If my memory is correct, you recently posted an issue in the GitHub issue tracker of the official Rust book about the conflict of major zero version as described in SemVer specification (for major zero, arbitrary API changes are allowed) compared to Rust, where major zero releases are handled in a special way, as pointed out by @raeisi above.
Yes, I was only giving a reference for cargo install vs Cargo.lock, not for the versioning. That's the reference I gave in the other thread about versioning.
That bit hints at this behaviour, though I'm sure I saw it elsewhere. I'm sure it's in The Rust Programming Language book because it struck me when I read that the first time, though it's not a reference strictly speaking (but it has enough scrutiny).
This guide uses the terms “major” and “minor” assuming this relates to a “1.0.0” release or later. Initial development releases starting with “0.y.z” can treat changes in “y” as a major release, and “z” as a minor release. “0.0.z” releases are always major changes. This is because Cargo uses the convention that only changes in the left-most non-zero component are considered incompatible.
EDIT: And that's the relevant section in the "Book". Although the excerpt above in the Cargo book is precise enough, after all.
From what I see in the sources of Cargo, the build process starts with resolving the versions, which is guided by Cargo.lock when available.
The default value of the workspace's ignore_lock field, which makes Cargo ignore the Cargo.lock file, is false, so by default, it doesn't ignore it.
The install command explicitly sets that flag depending on the lock options and whether they can be fulfilled.
The build compilation process just looks at the option.
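To make the behaviour described above concrete, here is a deliberately simplified model of how one dependency version might be chosen. It is a sketch under assumptions, not Cargo's resolver: `resolve`, `compatible` and the version numbers are hypothetical, and the `ignore_lock` parameter only mirrors the field named above. The locked version wins whenever it still satisfies the requirement; otherwise the highest compatible published version is picked.

```rust
type Version = (u64, u64, u64);

/// Default requirement rule: at least `req`, and below the next bump of the
/// left-most non-zero component.
fn compatible(req: Version, v: Version) -> bool {
    let upper = match req {
        (0, 0, p) => (0, 0, p + 1),
        (0, m, _) => (0, m + 1, 0),
        (m, _, _) => (m + 1, 0, 0),
    };
    v >= req && v < upper
}

/// Hypothetical model of picking one dependency version.
/// `published`: versions available in the registry.
/// `locked`: the version recorded in Cargo.lock, if a lock file exists.
/// `ignore_lock`: models the flag discussed above (false by default).
fn resolve(
    req: Version,
    published: &[Version],
    locked: Option<Version>,
    ignore_lock: bool,
) -> Option<Version> {
    if !ignore_lock {
        if let Some(v) = locked {
            // Keep the locked version as long as Cargo.toml still allows it.
            if compatible(req, v) {
                return Some(v);
            }
        }
    }
    // No usable lock entry: take the newest compatible release.
    published.iter().copied().filter(|&v| compatible(req, v)).max()
}

fn main() {
    // Made-up registry contents for illustration.
    let published = [(2, 0, 99), (2, 0, 100), (2, 0, 101)];
    let req = (2, 0, 0);

    // With a lock file, the older locked version keeps being used.
    assert_eq!(resolve(req, &published, Some((2, 0, 100)), false), Some((2, 0, 100)));
    // Without one (or when the lock is ignored), the latest compatible wins.
    assert_eq!(resolve(req, &published, None, false), Some((2, 0, 101)));
    assert_eq!(resolve(req, &published, Some((2, 0, 100)), true), Some((2, 0, 101)));
}
```

In this model, every project that still carries an older lock entry keeps downloading that older version, which would be consistent with older releases continuing to show up in the statistics.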
Again, that makes sense when someone is working on a project not to update the dependencies unless asked explicitly. Likewise, when you have a CI chain running, you don't want to start chasing problems due to constant updates of the dependencies.
That's why crates usually have a lot of downloads of previous versions, even when there are newer SemVer-compatible versions available. For example, see syn and the introduction of versions 2.0.100 and 2.0.101 (ignore the blue ones, which may include version 1):
As a baseline for bot downloads on crates.io, consider my bears library. This is a good baseline because the code doesn't do anything yet, and definitely has no users. Yet it has over 4k downloads. The spikes over 100 correspond to new patch releases:
If roughly a hundred companies were keeping mirrors of crates.io for offline or internal availability, this might explain the activity. Anyone developing an LLM for Rust by downloading my code off crates.io as training material is definitely going on a wild goose chase.