Too many low-level crates are still at 0.x.x and unstable

I didn't say 0.x.x is inappropriate for production. However, if they published a 1.0.0 version, that would clearly signal to me that the package is "as ready as it is ever going to be". I've worked at institutions that enforce "no 0.x.x versions in production" as a hard rule (they often fail to look at transitive dependencies, perhaps on the theory that the burden is carried by the direct dependency that includes them), but I advise them to be more flexible, because some 0.x.x releases are much better than other packages that publish higher numbers.

To me, 0.x.x means the author(s) think this package is not yet fully stable, and 1.0.0 means the authors believe the package is stable enough. I know there are no guarantees, neither before nor after 1.0.0, but the explicit declaration of intent does help. I encourage package maintainers to aim for a 1.0.0 release sooner rather than later, because it helps every project that considers your package for inclusion with its decision-making process.

I know you didn't. I'm talking about a property of your proposed crates.io feature.

The problem is that 1.0 does not have a canonical meaning. You have your opinions about what it means and I have mine. They aren't the same.

I'd adjust that: to me, 0.x.x means the author thinks the package may still need improvement. And I'd add that x.y.z where x >> 1 probably means the package is unstable and the author doesn't understand semver, in which case the version number will never be able to tell me whether the package is stable. Obviously that depends on how old the package is, but given the age of Rust right now, odds are good that a package at version 5.x.y is less stable than an equally widely used package at 0.3.x.

It is complicated: my Cargo.toml contains elasticsearch = "7.10.1-alpha.1". Apparently they decided to align with the version of their Java API, and I have to admit that API is very stable. So in this case I think I understand what that version means, and I hope Cargo does too. I'm not too worried about whether it is philosophically entirely correct. When the major version number is high compared to the age of the ecosystem, that is a reason to look for an explanation, but I sure hope it doesn't mean the package is likely to be less stable.

I also have rsa = "^0.3" (I should upgrade to 0.4) and sshkeys = "^0.3", and although there are plenty of indications that these packages are of high quality and in fact quite stable, I would find it comforting if the authors considered what separates these packages from what they think is needed for a 1.0.0 release, and set getting there as a goal. My Cargo.lock has 161 dependencies at 0.x.x out of 220. That is not an exception I can investigate to reassure myself; it is the common pattern, and checking out all of them is not realistic.
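For concreteness: Cargo treats the leftmost non-zero component as the breaking-change marker, so a ^0.3 requirement never auto-upgrades to 0.4; you have to edit the requirement by hand. A minimal sketch of those matching rules, using the third-party semver crate purely for illustration (it assumes semver = "1" as a dependency):

    use semver::{Version, VersionReq};

    fn main() {
        let req = VersionReq::parse("^0.3").unwrap();
        // Releases within the 0.3 line are considered compatible...
        assert!(req.matches(&Version::parse("0.3.9").unwrap()));
        // ...but 0.4.0 counts as a breaking change, so the requirement
        // in Cargo.toml must be bumped manually.
        assert!(!req.matches(&Version::parse("0.4.0").unwrap()));
    }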

I did not read this whole thread and I am not sure that my comment is relevant or appropriate here, but my main issue with third-party packages is that there is so much functionality overlap, no clear winner or best practice, and no clear future maintenance of anything. What is becoming obsolete? What is already obsolete? Which libraries are the future? How can anybody know?

Believe it or not, learning the rust language is relatively easy compared to dealing with the library diaspora.

I also seem to have used a package that changed a config setting that caused cargo clean to try to truncate /tmp, which makes me very leery of any third-party packages. Yet there is no way to do anything significant with rust except with risky third-party crates, and it is completely impractical to expect every developer to do a comprehensive source code audit on every dependency, especially when that code could change at any time. I fear that this situation reduces the rust community: the language is a bit of a challenge, but the libraries are worse.

I guess my real issue is with the shallowness of the standard library, which I believe is relevant to the entire issue of immature code sitting out in the repositories. Maybe there is something between standard and third-party, something like "rust-approved-and-maintained", for critical things like (for me) at least JSON, an HTTP server, and an HTTP client. I realize these needs vary by project. Another part of a partial solution (because there cannot be a comprehensive solution without serious investment and change) could be to limit third-party package submissions/availability, and then focus on getting everything that remains to a stable release state.

I understand the rust philosophy around this, but I find it impractical and even risky in reality.

1 Like

I don't think this is a problem unique to Rust. As an example, NPM has over 1.5 million packages and has the same problem where there are multiple similar packages with overlapping feature sets, and often no clear "best" option.

After a while you start to get a feel for which packages are better and will recognise which authors consistently produce high-quality stuff. There are also packages that are commonly accepted as the go-to solution for a particular problem: serde is the go-to serialization framework, serde_json the choice for working with JSON, reqwest if you need an HTTP client, and so on.
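To make "go-to" concrete, here's a minimal serde_json round-trip; it assumes serde (with the derive feature) and serde_json as dependencies, and the Package struct is just a made-up example:

    use serde::{Deserialize, Serialize};

    // One derive annotation covers JSON and every other serde format.
    #[derive(Serialize, Deserialize, Debug)]
    struct Package {
        name: String,
        version: String,
    }

    fn main() -> Result<(), serde_json::Error> {
        let pkg: Package = serde_json::from_str(r#"{"name":"serde","version":"1.0.0"}"#)?;
        let json = serde_json::to_string_pretty(&pkg)?;
        println!("{json}");
        Ok(())
    }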

In my opinion, having some sort of "rust-approved-and-maintained" label would be harmful to the ecosystem in the long run:

  1. Having centrally approved packages puts too much power in the hands of a small number of people
  2. It's expensive - most projects are done in people's spare time for fun, and the author reserves the right to walk away or ignore PRs whenever they want. If you are endorsing a package as being maintained, then you need to ensure it'll actually be maintained, which means you need to provide an incentive (e.g. money) and a guarantee (e.g. an employment agreement). Would you be willing to help sponsor people to do this?
  3. It smothers growth and experimentation - why would I try to develop a better HTTP client when I know nobody will use it?
  4. "The standard library is where modules go to die" - it's hard to iterate/improve when a package is officially endorsed. You need to religiously maintain backwards compatibility, people will complain loudly when you try to kill off poorly designed APIs, and so on.
  5. You are trying to outsource your dependency selection to someone else - who says their choice is the right fit for you, or that they aren't attempting a supply chain attack?

Unfortunately, there is no single "best" solution for dependency management, and for any possible solution I'm sure we'd be able to find problems with it. People laugh at how a simple "hello world" web app in JavaScript requires thousands of dependencies due to its small/non-existent standard library; they sing the praises of Go's extensive standard library even though its third-party dependencies were effectively no better than git submodules until 2018; and Python has a huge standard library with several built-in HTTP clients, yet people still recommend a third-party library.

5 Likes

That category exists. The library team does maintain some crates under the rust-lang umbrella; regex and libc fall into that category.

The problem is that we don't have the capacity to maintain a bunch of stuff. And for cases like JSON, where the standard is serde_json and it is maintained by a library team member (though not under the umbrella of the library team), there is no obvious advantage to the library team building a parallel implementation. In theory, the library team could adopt serde_json (and perhaps even serde), but obviously that requires the library team to have the resources to do so, and of course the willingness and consent of the current maintainers. And even aside from that, the current arrangement seems to be working quite well, so there's really not a ton of reason to do it in the first place.

These are things that "sound" like good ideas, but when you actually try to put them into practice, where the rubber meets the road, things get a lot murkier.

I agree, but I don't think it's impractical to do due diligence on the dependencies that you bring in. I certainly do that for my applications and libraries. I don't audit every line of code, but I do at least the following:

  • Look at the primary maintainers. Do I know them? Do I trust them? What else have they done? e.g., I'm not going to have any trust issues using something from David Tolnay, so I'm less likely to scrutinize his libraries in more detail. But if it's someone who just published their first crate, yeah that might trigger "audit every line."
  • Look at the commit log, issue tracker, README and docs. Do they look healthy? If not, maybe other parts aren't so healthy.
  • Look at the transitive dependencies that get brought in. Do they look reasonable? And possibly apply due diligence to each such dependency.
  • Look at who else is using the crate. Maybe I don't know the crate author, but maybe someone else I know and trust has used their crate. That's a good sign.

And so on. The cargo crev project is a lovely idea for systematizing something like the above process through a web of trust. I love the idea. I just wish I had more time to be a more active participant in crate reviews.

13 Likes

No disagreement, just perspectives.

I have spent a great deal of time in software sales, which has given me a different perspective, one based less on the advantages of a technology than on its viability. In terms of adoption, good technology is often less important than good sales and marketing, which Rust does not have at all. The optimal position is to have all three. It is also critical to focus on the competition.

Right now, I feel that Rust itself has good technology. If you think of C and C++ as its competitors, there is certainly no question, but those languages have had decades to evolve and to supply and consolidate libraries, publish books, develop toolchains, and get into codebases and universities, which makes Rust seem very immature and incomplete. For the larger market, the competitors probably include Java, PHP, JavaScript, Python, and C#. Of those, the only one I personally would consider not a disaster is C#, which shows either my bias or my preferences. The JavaScript dependency graph for hello world makes NodeJS a complete non-starter for me, much worse than the Rust library situation. I see significant potential challenges maintaining large codebases with the others.

One of the reasons I like C# is specifically the libraries. Standalone .NET Core has an incredible depth of libraries, including an HTTP client, an HTTP server, and JSON. In addition, it has a major vendor behind it, which is obviously a relatively unfair advantage. Yet still, if someone can do JSON better than Microsoft, there is nothing stopping them (I only recently switched from Newtonsoft).

Microsoft is not stupid, and we should learn from them. One of my early problems with Java was the lack of standard libraries for basics like XML, which .NET has always had. Only after Newtonsoft (a third-party JSON library) had about a billion downloads did Microsoft implement System.Text.Json. Microsoft basically stole from Newtonsoft, optimized, simplified, and so on. I know that seems unfair, but it is a successful tactic.

While I actually prefer Rust to C#, it is impossible to commit completely to a platform that appears to lack the most basic functionality for modern programming. Imagine requiring a third party to convert strings to integers; that is basically what JSON deserialization, an HTTP client, and an HTTP server are in this context at this time. Imagine depending on a third party to invoke any API, which is currently the situation for all HTTP APIs; that is unacceptable. The technology world moves quickly, and Rust risks falling even further behind. When you see this, it makes you want to walk away.

While I like serde, Rust itself has almost none of the functionality of .NET, which scares me as a developer: for what other components of my critical enterprise project might I need to depend on risky, non-commercially-supported, partially-documented, ever-changing third-party components? How much third-party code do I want to audit? If I do not have time to check the code, I certainly do not have time to vet the parties behind it, who may change or disappear at any time, and may even be bad actors biding their time.

I really feel that if we want to see Rust adoption, we need to address these core issues. I may be wrong, but I was under the impression that Google was promoting Rust, which could address the resource issue. If there really is a resource issue, then Rust simply is not viable for the enterprise, as few will commit their billable resources to addressing those issues under the obvious risks.

It would take much less work for developers to implement a single HTTP client than to maintain dozens. Consolidating this effort should help to push the most fundamental libraries forward.

I've used C# quite a lot in past jobs and really liked how you can get quite far without needing any third-party libraries, so I can definitely relate to where you are coming from.

However, it feels like your experience with the .NET ecosystem has skewed your perspective a little and you are conflating "a reliable, trusted library for X doesn't exist" with "a reliable, trusted library for X hasn't been written by Microsoft/the Rust foundation/Google/SomeBigCorporation".

Big companies have a lot of inertia and are often quite slow to change, but we've seen a massive uptick in Rust use from your typical FAANG companies over the last year (or at least, they're now public about it).

I feel like we aren't far away from having the sort of visible enterprise support you are talking about. These companies are already invested in Rust and even if it may not look like they have published many fundamental crates, they still sponsor quite a few Rust library authors and projects.

As one example, David Tolnay (author of serde and 4 of the top 10 most downloaded crates on crates.io) has 126 GitHub Sponsors, where one of the publicly visible sponsors is Microsoft.

7 Likes

Yes. This issue--a large fraction of crates being 0.x forever, even after seeing wide production adoption--has been brought up many times. I think it's many years past the point where just encouraging developers to behave differently can plausibly be expected to make a difference.

Why is it like this? I think the root problem is Semver. Semver is fundamentally broken.

According to the Semver rules, when you move to version 1.0, you're not just promising that version 1.0 will have a stable API. You're promising that every version you ever make in the future will have a stable API. I.e. you're no longer allowed to try out experimental API changes unless you do a major version bump for every experiment!

From the Semver FAQ:

Q: If even the tiniest backwards incompatible changes to the public API require a major version bump, won’t I end up at version 42.0.0 very rapidly?

A: This is a question of responsible development and foresight. Incompatible changes should not be introduced lightly to software that has a lot of dependent code. The cost that must be incurred to upgrade can be significant. Having to bump major versions to release incompatible changes means you’ll think through the impact of your changes, and evaluate the cost/benefit ratio involved.

This is condescending nonsense. Someone released version 1.x, and now a few years later they want to iterate and get user feedback on new APIs before releasing version 2.x?

According to the Semver FAQ, they are irresponsible and lack foresight.

This does not match how good APIs are developed. E.g. Rust itself has Nightly, which is allowed to have unstable features that can be iterated on without forcing a major version bump in Stable. Rust's version numbers would be in the tens of thousands by now if it had to follow the Semver rules.
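For illustration, a nightly-only feature has to be opted into with a crate-level gate, so the unstable surface never appears in a stable release at all. A sketch using the real (and still unstable) never_type feature:

    // Builds only on a nightly toolchain; stable rejects the gate.
    #![feature(never_type)]

    // Using `!` as an ordinary type (here an error that can never occur)
    // is the unstable part; it can be redesigned freely, with no version
    // bump of the language, because stable users cannot reach it.
    fn parse_infallible(s: &str) -> Result<String, !> {
        Ok(s.to_owned())
    }

    fn main() {
        let v = parse_infallible("hello").unwrap(); // can never panic
        println!("{v}");
    }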

There are other options: e.g., I think some projects use even minor versions for stable releases and odd minor versions for unstable releases. Or something else. But the Semver rules are the direct cause of the version 0.x situation today, and I don't think it will ever get better without a change.

[edit: Sorry, "fundamentally" was too strong; it's a minor mistake with significant consequences. I think the consequences only show up for Rust libraries because this is the first ecosystem that's even partially capable of checking the rules in software.]

[more edit: @CAD97 points out that in fact SemVer does allow what I said it didn't, via pre-release tags and/or continuing to use 0.X for experiments even after 1.0 has already shipped. Mea culpa.]

1 Like

Semver does have a blessed solution for post-1.0 experimentation, though! It's just not immediately obvious (which is in and of itself still a problem). The major options AIUI are:

  • Out-of-band experimentation that isn't versioned under the published semver version at all. This is the Rust Nightly model, where nightly's version is "what day is it" and offers absolutely no compatibility guarantees.
  • In-band experimentation by just using more 0.X versions. Nothing in the semver spec mandates that versions are strictly increasing; it just mandates which versions are API compatible and in which direction.
  • Pre-release experimentation by using pre-release tags (e.g. 1.5.3-experiment.2.8). Prereleases in strict semver are equivalent to the 0.X versions: absolutely no compatibility guarantees. However, Cargo Semver does upgrade within prereleases, using lexicographic or numeric sorting.

I think the interesting takeaway is that (per the semver spec) 0.X is just a "free" prerelease space for 1.0. I think it's unfortunate that Cargo will auto-update from 1.0.0-experimental-glam.8.2 to 1.0.0-experimental-serde.1.0 with the default/^caret compatibility. If Cargo offered a mode that only upgraded within a single prerelease track, it'd be easier to use prerelease tracks for independent pre-release testing; as it stands you're effectively limited to absolutely no stability in prereleases (and somewhat arbitrary lexicographic automatic upgrades, basically limiting you to alpha/beta/pre), rather than having an extension like 0.X that allows for marking breaking changes (or more importantly, compatibility) within prereleases.
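To make the hazard concrete, here is a small check of those matching rules with the semver crate (which implements Cargo's semantics); the version strings are the hypothetical ones from above:

    use semver::{Version, VersionReq};

    fn main() {
        let req = VersionReq::parse("^1.0.0-experimental-glam.8.2").unwrap();
        // A later build on the same experimental track matches, as hoped...
        assert!(req.matches(&Version::parse("1.0.0-experimental-glam.9.0").unwrap()));
        // ...but a lexicographically later, unrelated track matches too,
        // which is exactly the surprising auto-upgrade described above.
        assert!(req.matches(&Version::parse("1.0.0-experimental-serde.1.0").unwrap()));
    }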

The version specifier 1.0.0-experimental-glam.* is currently an error. Even if Cargo keeps the current sorting and automatic upgrades for the default/^caret and ~tilde version restrictions (which currently work and give the anything-goes upgrades) on prereleases, I think it would be good for Cargo to support wildcards in the prerelease string to match any dot-separated identifiers, allowing upgrades within a specific prerelease "track".

I published funver with the discussed prerelease versions, so you can test Cargo's behavior.

10 Likes

As a library author, I have started going straight to 1.0.0 whenever I publish a crate that I think other people will use. The ability to distinguish between minor and patch releases (e.g. 1.1.0 versus 1.0.1) is enough of a benefit to outweigh the downsides, for my use cases. And when I do end up needing breaking changes, I can bump the major version.
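The distinction is mechanical, not just cosmetic: under Cargo's caret semantics, a post-1.0 consumer automatically receives both kinds of release while still being able to tell them apart by position. A small illustration with the semver crate, purely as a sketch:

    use semver::{Version, VersionReq};

    fn main() {
        let req = VersionReq::parse("^1.0.0").unwrap();
        assert!(req.matches(&Version::parse("1.0.1").unwrap()));  // patch: fixes only
        assert!(req.matches(&Version::parse("1.1.0").unwrap()));  // minor: new APIs
        assert!(!req.matches(&Version::parse("2.0.0").unwrap())); // major: breaking
    }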

I encourage other library authors to consider doing the same. At the very least, it would prevent more situations like libc where it becomes too disruptive to bump the version to 1.0 later on.

On the other hand, most of my open-source libraries are fairly small and self-contained. I imagine that larger, multi-crate ecosystem projects have different trade-offs to consider.

4 Likes

That doesn't work so well when you don't have a reliable test suite or any intention of detecting regressions. In my mind, that's what "experimental" development means. There are reasons to publish experimental crates, and also reasons to advertise them. But I still wouldn't call anything 1.0 if there isn't meaningful regression testing going on.

As one of the RustCrypto maintainers (sha3 is part of our project), I would like to add a bit to @BurntSushi's comment. Even without the need for advanced const generics, we have plans for certain breaking changes around low-level APIs. So, yes, we do plan to change its API relatively soon.

I think that for any non-trivial crate it's worth having a cool-down period of at least 6-12 months. If no need for breaking changes arises during this time, then the last v0.x release can be promoted to v1.0 (and with the semver trick the migration can be made really painless). I have seen a number of crates whose v2.0 release came only several months after v1.0 (in some cases it was only days!). It's especially frustrating when those v2.0s exist only because of relatively minor breaking changes. So I personally prefer to be a bit more careful in this regard.
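For readers who haven't seen it, the semver trick is roughly: after 1.0 ships, publish one final 0.x release that is a thin facade over 1.0, so both version lines expose literally the same types and downstream crates can migrate independently. A sketch for a hypothetical crate foo:

    // lib.rs of the final foo 0.x release. Its Cargo.toml declares a
    // dependency on the crate's own 1.0 line:
    //
    //     [dependencies]
    //     foo = "1.0"
    //
    // Re-exporting everything makes the items of `foo 0.x` and `foo 1.0`
    // resolve to the exact same types.
    pub use foo::*;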

Also, in RustCrypto we have a relatively conservative MSRV policy (i.e. an MSRV bump is considered a breaking change), so ideally we would like MSRV-dependent version resolution, as described in RFC 2495, to be implemented first. In the future it would allow us to bump the MSRV of our crates without making a new major release for each such bump.

5 Likes

I'm working on a hobby operating system that's still at 0.1.0. It consists of two crates -- the main kernel crate and libk. The libk crate is (also) stuck at 0.1.0. I do this primarily because I don't want to be constantly updating the version number on every change, and the crate changes pretty quickly (though it's slowed considerably since college, GSoC, and life have taken the majority of my time away from it). I know semver and I like it... It's just knowing when to actually change the version that trips me up. If I did it on every change (I tend to test my code before committing, and to ensure that I always push code that at least builds), I'd probably be on version 0.550.0 by now. :slight_smile:

Yeah, unless you are publishing on crates.io (or an alternate package registry), the version number doesn't really matter.

We do the same at my company for our internal Rust crates, which live in a workspace inside a Git repo: they are all at version 0.1.0 forever, because there was no reason to change them from the default. If we publish one of them on crates.io someday, then we will start bumping its version number.

6 Likes

Some commentary from myself as a prominent Node.js package author, and prior to that a prominent jQuery plugin author, on what Rust can expect going forward, and some things to do to mitigate the issues:

To summarise specific to this issue, taking the patterns from other markets and open-source ecosystems and applying them to the Rust community:

  1. an ecosystem is created
  2. lots of indie devs and early adopters create many independent packages
  3. indie devs are not funded, packages get stale [where Rust just left]
  4. companies and money start moving in [where Rust is now]
  5. consolidators/plagiarisers move in and reimplement existing packages, most popular first
  6. original work by indie devs is eventually mostly replaced by consolidated cohorts, be they companies or large open-source groups/individuals

This works because it consolidates fragmented resources and attention into a pooled maintenance effort (including attention, contribution, and sponsorship). Historically that effort has accrued to a select few companies/individuals/groups, but if approached proactively it can be more community-oriented.

For Rust, what can be done to speed this up:

  • set up a cohort, sort the packages by most used first, and start reimplementing them, ideally with the original maintainers joining the cohort and the funds received by the cohort's brand being distributed to the maintainers

If this isn't done collectively, then opportunists will do it, so at the least a concerted effort should be made that reflects democratic values rather than purely market forces.

You are using a lot of jargon that I don't understand in this context. What do you mean by 'plagiarism'? How does the MIT license not permit 'plagiarism'? What resource is being taken in the land grab? Package names? And how does that relate to semantic versioning of Rust crates?

You write things like:

"[...] then we have the [...] power to negotiate our rights back"

But it is unclear to me what rights you think you've lost or who you want to negotiate with.

Fair enough. That external post is in a forum focused on licensing and lawyers. I've updated my OP to contain a summary applicable to this forum.

Re semantic versioning, others in this thread have already been addressing that.

Moderation note: Folks, this topic is veering off course. It's becoming a dumping ground for just about any ecosystem issue, but the original topic is about crate stability and its presumed connection with version numbers.

10 Likes