What is the best way to address the crate versioning hell problem?

Hi there, I am writing some code and want to minimize its dependencies. However, I have realized that multiple versions of the same crate are compiled in the dependency tree.

For example, my crate depends on crate A and crate B. Crate A depends on C with exact version 1.0, and B depends on C with version 2.0. Thus both versions 1.0 and 2.0 of C have to be compiled.

Since most crates today declare their dependencies as

crate-name = "^version"

each crate ends up requiring its own versions of its dependencies.

Of course, I can tweak the dependency versions in my own code to reduce the number of versions of the same crate that get compiled into my binary. However, this is not a long-term solution, and I think it's quite possible that in the future there will be a version hell problem that blows up your binary size.

So is there a good way to address that?

:neutral_face: I wouldn't call that "versioning hell". As long as the compiler builds an application that works, isn't that the opposite of any kind of "hell"?

As for building multiple versions of the same crate, it's practically unavoidable. This is a feature of Cargo, which allows the application to build, instead of creating a real "dependency hell" that causes builds to fail because of breaking API changes.

Using the example you provided, the public API exposed by crate C version 2.0 is almost certainly incompatible with the public API exposed by crate C version 1.0 (following semver conventions). So trying to build only one version of the crate in a diamond dependency graph will fail with very high probability. Even if by some chance it succeeds now, what happens when you pull in another dependency later which (perhaps transitively) depends on crate C version 3.0? Now you're in real hot water. The only way to fix this is to get the maintainers of crates A and B to update to the latest version of crate C. You might want to fork them both and do this upgrade, referencing your forks in Cargo.toml.
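To make the semver point concrete, here is a minimal sketch of the caret rule as I understand it from the Cargo documentation: an update is allowed as long as the left-most non-zero component stays the same. The names `caret_matches` and `can_share` are made up for this illustration (they are not part of Cargo or the semver crate), and pre-release versions are ignored.

```rust
/// A semver version reduced to (major, minor, patch).
type Version = (u64, u64, u64);

/// Returns true if `candidate` satisfies the caret requirement `^req`.
fn caret_matches(req: Version, candidate: Version) -> bool {
    if candidate < req {
        return false; // never select something older than requested
    }
    match req {
        // ^0.0.z allows only 0.0.z itself
        (0, 0, _) => candidate == req,
        // ^0.y.z allows newer patches within 0.y
        (0, minor, _) => candidate.0 == 0 && candidate.1 == minor,
        // ^x.y.z allows anything newer within major version x
        (major, _, _) => candidate.0 == major,
    }
}

/// Two caret requirements on the same crate can be served by a single
/// build only if the newer of the two versions satisfies both of them.
fn can_share(a: Version, b: Version) -> bool {
    let newer = a.max(b);
    caret_matches(a, newer) && caret_matches(b, newer)
}
```

Under these assumptions, `can_share((1, 0, 0), (1, 2, 0))` is true, so one copy of C is enough, while `can_share((1, 0, 0), (2, 0, 0))` is false, which is the case where both versions have to be built.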


But I don't think this is really the question you are asking. You want to reduce the binary size. Ok, so there are a few things you can do for that. First, check out the invaluable min-sized-rust repo to help you strip out most of the binary bloat that Rust provides by default. And second, perhaps far more importantly, install cargo-bloat and use it to find the biggest offenders.

The next steps are a little subjective and definitely need to be handled on a case-by-case basis, but these may be useful tips...

In my case, cargo-bloat told me that half of my executable was from goblin's Object file parsing for Windows and macOS, which I knew I did not need (I only wanted ELF parsing). A quick dig through the docs and I discovered that using goblin::Object::parse() pulls in all of the Windows PE and Mach-O parsing and handling code, and that I could get rid of it by using the more specific goblin::elf::Elf::parse() directly. TL;DR: half of my binary size was dead code and it was easy to find with cargo-bloat: https://github.com/rust-console/cargo-n64/commit/8cd9123fd176a96e68b0c0edcaf72788775137c2

In another project, the biggest contributor to bloat was clap. At which point I realized that I don't need clap at all, because std::env is good enough, even if the code doesn't look quite as pretty. clap was adding like 100KB of uselessness to my 25KB executable. :sob:
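For a sense of what "std::env is good enough" can look like, here is a rough sketch of hand-rolled argument parsing using only the standard library. The `Options` struct and the `-v` / `-o` flags are hypothetical, chosen just for illustration; this is not the code from the project mentioned above.

```rust
/// Options for a hypothetical tool; the fields are illustrative only.
#[derive(Debug, Default, PartialEq)]
struct Options {
    verbose: bool,
    output: Option<String>,
    inputs: Vec<String>,
}

/// Parses arguments by hand instead of pulling in an argument-parsing crate.
fn parse_args<I>(args: I) -> Result<Options, String>
where
    I: IntoIterator<Item = String>,
{
    let mut opts = Options::default();
    let mut iter = args.into_iter();
    while let Some(arg) = iter.next() {
        match arg.as_str() {
            "-v" | "--verbose" => opts.verbose = true,
            "-o" | "--output" => {
                // This flag consumes the next argument as its value.
                let value = iter.next().ok_or("missing value for --output")?;
                opts.output = Some(value);
            }
            flag if flag.starts_with('-') => {
                return Err(format!("unknown flag: {}", flag));
            }
            // Anything else is treated as a positional input.
            other => opts.inputs.push(other.to_string()),
        }
    }
    Ok(opts)
}
```

In a real binary you would call it as `parse_args(std::env::args().skip(1))`. It is less pretty than a derive-based parser and gives you no generated help text, which is the trade-off being described.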

I use cargo tree -d to identify where duplicates come from, and then bug their authors to upgrade.
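If you just want a quick list of which crates are duplicated, the same information can be approximated from Cargo.lock. The sketch below is only that, an approximation: it assumes the usual layout of `[[package]]` tables with `name = "..."` and `version = "..."` lines, and `duplicate_versions` and `quoted_value` are names invented for this example. Unlike cargo tree -d, it does not tell you which dependency pulled each version in.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns every crate name that appears with more than one version in the
/// text of a Cargo.lock file, together with those versions.
fn duplicate_versions(lockfile: &str) -> BTreeMap<String, BTreeSet<String>> {
    let mut versions: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    let mut in_package = false;
    let mut current_name: Option<String> = None;

    for line in lockfile.lines().map(str::trim) {
        if line.starts_with('[') {
            // A new table starts; only [[package]] tables are of interest.
            in_package = line == "[[package]]";
            current_name = None;
        } else if !in_package {
            continue;
        } else if let Some(value) = quoted_value(line, "name") {
            current_name = Some(value.to_string());
        } else if let Some(value) = quoted_value(line, "version") {
            if let Some(name) = current_name.take() {
                versions.entry(name).or_default().insert(value.to_string());
            }
        }
    }

    // Keep only the crates that occur in two or more versions.
    versions.retain(|_, found| found.len() > 1);
    versions
}

/// Extracts `value` from a line of the form `key = "value"`.
fn quoted_value<'a>(line: &'a str, key: &str) -> Option<&'a str> {
    let rest = line.strip_prefix(key)?.trim_start();
    let rest = rest.strip_prefix('=')?.trim();
    rest.strip_prefix('"')?.strip_suffix('"')
}
```

Feed it the result of `std::fs::read_to_string("Cargo.lock")` and print the map to see what to bug people about.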

Thanks for the reply. But I would like to say a few things regarding the answer.

Using that term doesn't mean anything breaks. I think you already got what I mean by it: having multiple versions of the same crate compiled into the binary may be undesirable. So I am not saying anything about breakage at all.

I am not saying that having multiple versions of a crate in the same project is bad or a bug. But things may get out of control, right? And this feature doesn't always yield the desired result. For now it's not a problem for sure, but what if things get out of control in the future? I am not arguing this is bad; I am asking how to manage it.
Ideally, all crate owners keep their dependencies up to date, so there's no problem at all. And I am totally fine with having just a few versions of the same crate compiled. However, it's quite possible that a lot of crates depend on outdated crates. The version lag keeps growing, and different crate owners may be on different pages about the same dependency (relying on different old versions, etc.).
I am not saying it's currently an issue, but I am worried that it will eventually get out of control.
So my actual question is: how can we manage this undesired dependency issue, or how can you convince me that this is desired?

In my personal experience, several of the most commonly downloaded crates already get compiled in different versions when I run cargo build, which increases both the build time and the binary size. I am OK with everything currently, but I feel this may bother me in the future if we don't have an effective way to control it.

I really like the binary size control stuff you shared. But I think my question is a different one.

Yes, but we cannot expect this to work every time. Plus, if we don't take this seriously, might we have to ping a hundred different authors to update in the future? I just feel worried.

One of the things I noticed about cargo tree involves projects which use the "semantic-versioning hack" to implement a backwards-compatible API atop a new version of the crate.

These still show up as duplicates, even though there is some underlying magic implementing everything from the same crate.

If I recall, some common crates like rand and regex use this, and it makes dependency tracking tools a bit context-sensitive.
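For anyone who hasn't seen the trick, here is a toy sketch of the idea, with two modules standing in for two releases of a hypothetical crate. In the real thing, the last 1.x release depends on 2.x in its Cargo.toml and re-exports from it; `lib_v1`, `lib_v2`, `Handle`, and the functions below are all invented for illustration.

```rust
// Stand-in for the new major version of a hypothetical crate.
mod lib_v2 {
    #[derive(Debug, PartialEq)]
    pub struct Handle(pub u32);

    pub fn open(id: u32) -> Handle {
        Handle(id)
    }
}

// Stand-in for a final 1.x release: it defines no type of its own and
// re-exports the one from 2.x, so both versions share the same `Handle`.
mod lib_v1 {
    pub use super::lib_v2::Handle;

    /// Old API kept for compatibility, implemented on top of the new one.
    pub fn open_default() -> Handle {
        super::lib_v2::open(0)
    }
}

/// A downstream function written against the 2.x API.
fn describe(handle: &lib_v2::Handle) -> String {
    format!("handle #{}", handle.0)
}
```

A value obtained through the old API, such as `lib_v1::open_default()`, can be passed straight to `describe`, because it is literally the same type. Both package versions are still present in the dependency graph, though, which is why a duplicate report lists them.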

Try programming in a language that doesn't allow multiple versions of packages (the traditional form of version hell). What you see in Rust is more like version purgatory.

This really comes down to an issue that I call "code hygiene". Some examples of things that fall under the hygiene category are lack of tests, coverage, CI, documentation, and most importantly maintenance. So asking to fix the maintenance burden is almost like asking how to fix poor documentation or missing tests. It's the human element at play, here, and that seems to be the hardest problem of all.

I think one reason why Rust dependencies tend to fall out of date is that no one is telling crates that their dependencies are getting outdated, and Cargo goes for a conservative dependency version choice by default.

As a fix for that, I have started playing with dependabot. Works pretty well.

I agree with you to some extent. But what I mean is that letting this get out of control is undesirable.
I have no doubt that this feature, which supports multiple versions of a crate, is cool.

However, this in turn gives all developers more tolerance for their version lag. What I am asking is: what's the best way to manage that, both within the ecosystem and within my crate?

Just think about C and C++, where versioning hell is always an issue. But this in turn pushes developers to keep their dependencies up to date.

Again, I am just asking for a way to manage that; I am not saying that having different versions of the same crate is bad at all.

Hmm, that's awesome. This may be what I am looking for. :slight_smile:

Started using it, looks really nice. If every crate maintainer puts effort into this, it's not a big deal at all.

That's true. But unfortunately a huge dependency tree will likely have this issue somewhere. The difference between Rust and C here is that C library maintainers are constantly pushed to update their dependencies because of the versioning hell issue, whereas in Rust, since we have this nice feature, I am more worried about it.

I think the bot is one of the things that try to manage this.

…or are prevented from updating, because most other users have an old version, and an update would be too disruptive for everyone.

Anyway, think about what the C situation implies:

  • inflexibility increases amount of pain that library users experience,
  • so that users stuck with an unusable set of libraries put more pressure on maintainers,
  • so that maintainers put more effort or increase urgency of updating dependencies.

Getting maintainers to do work by annoying their users is just unhealthy for everyone involved. In Rust you can say pretty please directly to maintainers, or make a PR (or patch or fork if everything else fails), and still have deps updated without intentionally breaking builds and annoying people.

I doubt I can convince you, but I believe that the current situation is desired. Currently, if I write code that works, there is no need for me to modify my code. That is good, because it allows me to create and share more crates, by focusing my time on fixing bugs and making improvements. [Unrelated but hilarious aside: Google's keyboard thought maybe I'd be fixing my code and making tortillas] Less effort modifying working code to work around API changes in other crates is a huge net plus.

A workaround is for crates to make fewer breaking changes, which everyone will agree is also a net plus. But another benefit (which I'm not the first here to point out) is that authors needn't be afraid of making a breaking change.
Unless their code creates types that show up in the API of many other crates, in which case we do indeed have version hell, as would be the case with any programming language.
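A toy sketch of that last case, with two modules standing in for two semver-incompatible releases of a crate C. All names here (`c_v1`, `c_v2`, `Point`, `a_point`, `b_norm1`, `convert`) are hypothetical; the point is only that the compiler treats the two `Point` types as unrelated, even though they look identical.

```rust
// Two modules stand in for releases 1.0 and 2.0 of a hypothetical crate "C".
mod c_v1 {
    #[derive(Debug, PartialEq)]
    pub struct Point {
        pub x: i32,
        pub y: i32,
    }
}

mod c_v2 {
    #[derive(Debug, PartialEq)]
    pub struct Point {
        pub x: i32,
        pub y: i32,
    }
}

// Crate "A" exposes the 1.0 type in its public API.
fn a_point() -> c_v1::Point {
    c_v1::Point { x: 3, y: -4 }
}

// Crate "B" expects the 2.0 type in its public API.
fn b_norm1(p: &c_v2::Point) -> i32 {
    p.x.abs() + p.y.abs()
}

// `b_norm1(&a_point())` does not compile (mismatched types), so the
// application has to convert field by field at the boundary.
fn convert(p: c_v1::Point) -> c_v2::Point {
    c_v2::Point { x: p.x, y: p.y }
}
```

With private fields or many such types, the conversion glue stops being this easy, and that is where having both versions in the graph starts to hurt.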

:man_cook::burrito:

That's a problem I reported here: https://github.com/rust-lang/crates.io/issues/1794

Dependabot is indeed a fix but it requires crate authors to use it. I feel that notifying authors when they stop being up-to-date would be a better default.