Rust should have a big standard library and here's why

C++ is a bad example here. Here's a problem that the C++ Standards Committee is debating now (emphasis mine):

Current implementations of std::regex are often 2x slower than equivalent use of regular expressions in interpreted languages like Python and PHP, and are roughly 10-100x slower than equivalents in modern systems languages like rust or go.

C++ ended up with a poorly performing regex implementation and is stuck with it, because fixing it would break ABI. Rust hasn't matured to the point of worrying about ABI yet, but the same class of problems applies to all standard libraries: once something is in, it may not be possible to fix or improve it.

Rust had built-in serialization already. It is now the rustc-serialize crate. If that had been blessed into the standard library, we'd be stuck with it instead of having the much nicer serde.

So I think Rust is doing much better than other languages. Instead of having to tell users "no no no, don't use this part of the standard library! Use the faster better leaner 3rd party crate instead!" it can skip the first part, and go straight for "use the faster better leaner 3rd party crate".

39 Likes

Coming from Java, it took me a while to get used to a small stdlib. However, I love it now (that's not to say I don't see any problems in the crate system, but that's another story). Often, precisely because the stdlib doesn't have this-or-that, the community recommends a well-known, well-tested crate, the same for everyone, so we don't have the kind of split Java has. I mean, a lack of features in the stdlib means the ecosystem evolves faster toward a better solution than the stdlib would ever have been.
One thing I would consider, though, is breaking backwards compatibility once in a while. Regardless of how small the stdlib is, sooner or later a mistake will be made and we'll have to live with it forever. I would accept a situation where, once a year, about 1-3% of crates are broken. It would be easy enough for maintainers to stay up to date, and unmaintained crates shouldn't be used anyway. As long as deprecated code doesn't compile (instead of "compiles but doesn't work"), I'm personally fine with it. It's even better, because it gives me hope that the language will evolve and stay modern. I wish some things were gone from Java.

2 Likes

Even C has problems with its standard library. And that is a pretty minimal library.

For example, the ancient and extremely error-prone string functions, the use of which is discouraged nowadays.

Or various thread-unsafe functions.

It makes a lot of sense to me to let the Rust developers and user community develop whatever libraries they need, outside of a minimal and essential standard.

Then, as long as the original libs are available forever programs can use them forever. No breakage.

As and when new improved versions of the same functionality come along, with whatever breaking changes, people can upgrade their programs to use them. Or not.

In a world with package managers like Cargo and package repositories like the Crates system it becomes unnecessary to include everything and the kitchen sink into a big language + standard library blob.

Oddly enough, I have seen Bjarne Stroustrup campaigning for more stuff to be put into the C++ standard library, so that C++ can hold its head high compared to Java. Perhaps that desire is now realized to be a bad idea, what with the arrival of 'modules' in C++.

10 Likes

The problem is that breakage isn't limited to library crates, but extends to everyone using the language. When I've written a tool that works, I don't want to have to check every year to see if it's been broken. I don't even know quite how many tools and web apps in total I've written using Rust, but it's wonderful that they keep working, and even the unmaintained crates they use keep working.

I just found myself debugging a little queueing system I wrote a few years back. It turned out that the bug was in clap, and upgrading to recent crates fixed it. It was code that I hadn't touched for 15 months. A few weeks ago I ran across some code I needed that had gone untouched for even longer. I am able to work on more different projects because no work is needed just to keep the existing projects working.

9 Likes

@droundy That's nice, and I do understand that, but even in a small standard library mistakes are bound to happen. It's human error and we can't control it. So tell me this: would it be better to keep something flawed in the standard library for backwards compatibility, or to improve on it even if that means breaking backwards compatibility? A good system would be to keep breaking changes in the nightly branch of Rust for a while, so developers have time to adapt before they're pushed to the stable branch. But knowingly keeping flawed components in the standard library long-term is definitely not the way to go. Also, a side note: for everybody saying official crates are the way to go instead of putting things in the standard library, tell me why. The way I see it, it's all standard, just under a different namespace, so what's the point? Why not make it simpler and put it all under the standard library?

I would say to keep it. It doesn't hurt (much), and users can go with a third-party crate. mpsc channels are the classic example here. The best approach, of course, is to be extremely conservative about letting anything into the standard library (i.e. keep it lean), since external crates can use versioning to avoid breaking backwards compatibility, but the standard library can't.

4 Likes

@droundy It definitely does hurt, and over time it'll add up if nothing's done about it. Things added to the standard library are always reviewed before they go in, but mistakes still get past, so being extremely conservative isn't going to cut it. It's like saying C++ is okay to use if you're super careful: a memory bug will eventually wiggle its way into your program, just as a mistake will eventually wiggle its way into the standard library.

We learned from Python that a breaking change in the language hurts more. I agree that keeping bad things in the stdlib is bad, but a breaking change is worse, as it will split the ecosystem. An ecosystem split is one of the key reasons, we think, why Dlang hasn't grown much.

6 Likes

This is precisely the question I want to ask you: Why should the Rust ecosystem put all this effort and time (both of which are scarce resources as it currently stands) in maintaining something of which we already know that parts will definitely be implemented with deficiencies? And moreover, something of which you acknowledge that it's mostly nothing more than a namespace change?

What justifies putting all that effort into this?

Definitely not, because it degrades the good reputation of the standard library.
In Python, I currently don't know which parts of the standard lib are good and which are not. The consequences are: 1. I tend not to use Python anymore (the scripts I wrote in Python are now either back in Bash or rewritten in Rust), and 2. if I have to use Python, I avoid its stdlib altogether.

A blunt-but-not-inaccurate way of putting the above is "(in Python) the stdlib is where code goes to die".

3 Likes

It's also worth noting that Rust's standard library is growing larger over time, but very slowly. Its small size is partly a factor of its relative youth (combined with the conservative approach to growing it).

There are definitely a few things that I think could usefully be moved into the standard library after being proven and stabilized in external crates, and at least some of them might, some day. (The areas I'm thinking of include generic math traits; more date/time features; more unicode processing; scoped threads…)

14 Likes

The reason is to allow for semantic versioning separately from the language/stdlib.

You say you want more things in the stdlib. So think of it this way: the std library is those things well known / canonical / obvious / intrinsic enough that their API isn't going to change, and they can stay stable forever. Other "standard but separate" crates (serde, rayon, syn) are under different namespaces because those namespaces each have their own opt-in breaking changes. And with the "semver trick" where old versions are implemented by delegating to the new versions, you can even have a breaking change in API while keeping only one actual implementation of the crate (for some kinds of breaking changes).
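As a rough illustration of the "semver trick" described above, here is a minimal sketch in which two modules stand in for two semver-major releases of one crate. Everything in it (the `v1` and `v2` modules, the `Celsius` type, the `parse` functions) is invented for illustration; in a real crate the old release would be published separately and depend on the new one.

```rust
// Two modules stand in for two semver-major releases of one hypothetical crate.

mod v2 {
    /// The single real implementation lives in the newer version.
    #[derive(Debug, PartialEq, Clone, Copy)]
    pub struct Celsius(pub f64);

    /// New API: reports bad input as an error instead of panicking.
    pub fn parse(input: &str) -> Result<Celsius, std::num::ParseFloatError> {
        input.trim().parse::<f64>().map(Celsius)
    }
}

mod v1 {
    // Re-export the new type so values from both versions are interchangeable.
    pub use super::v2::Celsius;

    /// Old API (panics on bad input), kept alive by delegating to v2.
    pub fn parse(input: &str) -> Celsius {
        super::v2::parse(input).expect("invalid temperature")
    }
}

fn main() {
    let old = v1::parse("21.5");
    let new = v2::parse("21.5").unwrap();
    // Same type and same implementation behind two API generations.
    assert_eq!(old, new);
    println!("{:?}", old);
}
```

Callers written against the old signature keep compiling, while the only code that has to be maintained is the `v2` implementation.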

Rust already has an "opt-in breaking change" mechanism for the language in the edition flag. You can think of semver-major versions as opt-in breaking changes at the granularity of the top-level library namespace.

Semver is the best of both worlds of stability and breaking improvements. Somehow "blessing" externally versioned libraries, independent of the forever-stable language, is the best of both worlds between a lean, perma-stable vocabulary std and breaking changes to ecosystem resources. (The standard library could theoretically get its own distinct version, though this is highly unlikely.)

The solution to breaking improvements isn't to silently (or even loudly) break old, correct code. It's to create a new, opt-in version (a semver-major version bump) and transition users to it, while still allowing code written against the old version to work (and ideally, still get the new internal improvements, if not the new APIs). And it is infinitely easier to do so if the libraries are versioned at a finer granularity than "everything," which I hope is obvious.

10 Likes

@CAD97 Do you think we'd see terminal coloring and a clean case-insensitivity implementation in the standard library, since those are things that'll stay the same?

1 Like

This may break with the creation of a new terminal or a semver-major release of an existing one.

What is the reason to say str::to_lowercase and friends aren't "clean"?

@Cerber-Ursi I never even thought of using to_lowercase for case-insensitivity! I've been using UniCase this whole time, thanks for that! Also, couldn't the implementation for coloring detect the terminal to decide whether it's compatible? Maybe there could be a function that returns a bool indicating whether coloring is supported, so the developer could act on that however they'd like.

Excuse me while I laugh for a second.

On case insensitivity: yes, the API of comparing two strings case insensitively is simple... Or is it? Unicode is complex, and comparing with a full case fold isn't always what you want from a "case insensitive" comparison. Plus, even which glyphs are the same when case is changed differs between locales. Unicode offers a good default choice, but it isn't always the one you want. Plus, then you'd have to carry the Unicode folded comparison tables around, even if you only care about ASCII case insensitivity. Case insensitivity is only simple if you pretend languages other than English don't exist. Even the to_lowercase solution is imperfect, because it requires a new lowercase string for both sides (that is theoretically unnecessary) and doesn't even consider ä (composed, U+00E4) and ä (decomposed, a followed by U+0308) as equal, because they are made up of different codepoints. As far as I'm concerned, there are two correct ways to deal with user-facing strings: as a blob, not touched by the program, or with full i18n and l10n.
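To make the points about allocation and composed versus decomposed characters concrete, here is a minimal sketch using only std. The helper name `lowercase_eq` is made up for this example; `str::to_lowercase` and `str::eq_ignore_ascii_case` are the real std methods.

```rust
/// Hypothetical helper: compare by lowercasing both sides.
/// Note that this allocates a new String for each argument.
fn lowercase_eq(a: &str, b: &str) -> bool {
    a.to_lowercase() == b.to_lowercase()
}

fn main() {
    // ASCII-only comparison: no allocation and no Unicode tables involved.
    assert!("Hello".eq_ignore_ascii_case("hELLO"));

    // Non-ASCII letters need the Unicode-aware path.
    // U+00C4 is the capital A with diaeresis, U+00E4 the small one.
    assert!(lowercase_eq("\u{00C4}PFEL", "\u{00E4}pfel"));
    assert!(!"\u{00C4}PFEL".eq_ignore_ascii_case("\u{00E4}pfel"));

    // Composed (U+00E4) versus decomposed ('a' + U+0308): both render the same,
    // but std does not normalize, so the comparison fails.
    assert!(!lowercase_eq("\u{00E4}", "a\u{0308}"));

    // Lowercasing is not case folding: sharp s (U+00DF) stays as it is,
    // so it never matches "SS".
    assert!(!lowercase_eq("STRASSE", "stra\u{00DF}e"));
}
```

Handling the last two cases would need normalization and case-folding tables, which is the extra data the post above mentions having to carry around.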

On terminal coloring: sure, you could have a canonical ANSI API. It could even be one that supports jit optimization of ANSI codes. But there are terminals that don't support ANSI, and there will be new terminals with new APIs in the future. As soon as a new terminal invents a new way of doing colors that isn't compatible with your API, you have to make breaking changes to support it along with everything else. (If you even want to support the Windows Console (default for cmd.exe), you cannot just use print! to print anymore; you need to talk to the console through the console API rather than stdout.)

Many things seem obvious at first glance, but there is actually quite a lot of (somewhat) hidden complexity as soon as you dig a little and consider forwards compatibility with the unpredictable future.

18 Likes

@CAD97 UniCase (unicase - Rust) has an ASCII-only mode too, and for terminal coloring the API would stay largely the same even if a new terminal came with a new API. For example:

fn main() {
    // colorSupport() returns a bool to detect if the current terminal supports coloring with supported APIs
    if colorSupport() {
        println!("{}", "Hello world!".color(foreground(red), background(rgb(100, 37, 58))));
    } else {
        println!("Hello world!");
    }
}

Coloring like this would be easy to use in production, because both coloring text and checking for support are simple. If a new terminal isn't compatible, the developer can use colorSupport() to work around the issue while the stdlib gets updated to support the new API. The public part of the API would remain the same, so nothing would break. Things would change in the background to keep supporting newer terminals, but I don't see how that would ever cause code breakage for somebody who's using it.
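For comparison, here is a minimal sketch of how something like the proposed check and coloring call could be written today outside std. The functions `color_support` and `rgb_foreground` are hypothetical stand-ins for the API in the example, and the detection logic (the `NO_COLOR` and `TERM` environment variables plus `IsTerminal`) is only a heuristic assumed for this sketch, not a complete answer.

```rust
use std::env;
use std::io::{self, IsTerminal};

/// Hypothetical helper: a guess at whether stdout accepts ANSI color codes.
fn color_support() -> bool {
    // Respect the informal NO_COLOR convention.
    if env::var_os("NO_COLOR").is_some() {
        return false;
    }
    // Skip terminals that declare themselves "dumb".
    if env::var("TERM").map(|t| t == "dumb").unwrap_or(false) {
        return false;
    }
    // Only color when writing to an actual terminal, not a pipe or file.
    io::stdout().is_terminal()
}

/// Hypothetical helper: wrap text in a 24-bit ANSI foreground color sequence.
fn rgb_foreground(text: &str, r: u8, g: u8, b: u8) -> String {
    format!("\x1b[38;2;{r};{g};{b}m{text}\x1b[0m")
}

fn main() {
    if color_support() {
        println!("{}", rgb_foreground("Hello world!", 100, 37, 58));
    } else {
        println!("Hello world!");
    }
}
```

The check says nothing about whether the terminal understands 24-bit sequences in particular, or whether it uses ANSI at all, so a real implementation would need considerably more terminal-specific knowledge than this.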

Are you suggesting that the Rust standard library should carry around, forever, all the mechanics of displaying colored and formatted text?

To this day Linux carries with it support for driving every terminal ever created. Most of which nobody has seen for thirty years. See Termcap - Wikipedia

Luckily, that is all handled by libraries such as ncurses, rather than being dead weight in the C standard library.

6 Likes

Now, see, we already have a problem with that rgb function, because there are plenty of Linux terminals that support colors but do not support 24-bit color (being instead limited to a palette of 16 or 256 colors). The standard default 16-color palette does not even contain an orange color!

People may customize their colors (through mechanisms completely unknown to the stdlib), so the standard library cannot even try to approximate rgb(100, 37, 58) to one of the existing colors.
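To show what such an approximation would involve, here is a minimal sketch that maps an RGB value to the nearest entry of the 6x6x6 color cube in the default xterm 256-color palette. The function names are invented, the grayscale ramp (indices 232 to 255) is ignored, and the level table assumes the default palette, which is exactly the assumption that breaks when a user has customized their colors.

```rust
/// The six intensity levels of the default xterm 6x6x6 color cube.
const CUBE_LEVELS: [u8; 6] = [0, 95, 135, 175, 215, 255];

/// Index (0..=5) of the cube level closest to one 8-bit color component.
fn nearest_level(component: u8) -> u8 {
    (0u8..6)
        .min_by_key(|&i| (i16::from(component) - i16::from(CUBE_LEVELS[i as usize])).abs())
        .unwrap()
}

/// Hypothetical fallback: palette index of the closest color in the cube.
/// Cube entries occupy indices 16..=231.
fn approximate_256(r: u8, g: u8, b: u8) -> u8 {
    16 + 36 * nearest_level(r) + 6 * nearest_level(g) + nearest_level(b)
}

fn main() {
    let index = approximate_256(100, 37, 58);
    // rgb(100, 37, 58) lands on cube coordinates (1, 0, 1).
    assert_eq!(index, 53);
    println!("\x1b[38;5;{index}mHello world!\x1b[0m");
}
```

With the default palette, index 53 corresponds to rgb(95, 0, 95), already a visibly different color from the one requested, and with a remapped palette the program has no way to know what the user will actually see.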

3 Likes

How about different levels of promises?

I'd suggest we not call it std, stdx or anything that contains the keyword "standard", and then make a weaker promise about its compatibility.

For example, for everything "standard"ized, it must be kept the same during the entire major version (the 1 in 1.42 for example). But for something else, it can be deprecated and removed 20 minor versions (the 42 in 1.42) / 12 months later.

Also, when something is deprecated, make sure the IDE does a great job of nagging the user to upgrade.
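Rust already has a building block for this kind of staged removal in the `#[deprecated]` attribute. A minimal sketch follows; the function names and the version number in `since` are made up for illustration.

```rust
/// Hypothetical old API, kept only so that existing callers still compile.
#[deprecated(since = "1.2.0", note = "use `parse_port` instead")]
pub fn port_from_str(s: &str) -> u16 {
    parse_port(s).unwrap_or(0)
}

/// Hypothetical replacement that reports failure instead of returning 0.
pub fn parse_port(s: &str) -> Option<u16> {
    s.trim().parse().ok()
}

fn main() {
    // Without the allow attribute, rustc would emit a deprecation warning here
    // that includes the note above.
    #[allow(deprecated)]
    let old = port_from_str("8080");
    assert_eq!(Some(old), parse_port("8080"));
}
```

Because `deprecated` is an ordinary lint, a crate that prefers "doesn't compile" over a warning can opt in with `#![deny(deprecated)]`.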

Personally I never liked the idea of absolute backwards compatibility, but the severity of drawbacks that come with breaking changes is incredible, as many users in this topic have already pointed out.

So one point I wanted to bring up, that I didn't see mentioned here, is that Rust has a system in place for introducing breaking changes without breaking backwards compatibility: Editions.
When I first heard about the idea of editions, I was kind of mindblown, because this allows Rust to keep evolving without splitting the community. Being able to work together across edition boundaries is amazing.

That said, editions can introduce breaking changes to some things but not to others. For example, as far as I know, they can't provide a new API for things in the standard library.

So for all intents and purposes, let me introduce this scenario to you:

  • Rust has made a promise to keep backwards compatibility and we want it to keep its promises :laughing:
  • expanding std without being able to make breaking changes to it might be a bad idea
  • a system to introduce breaking changes without breaking backwards compatibility would solve both problems

Maybe, we should discuss how to introduce breaking changes to std through editions, before we decide on expanding std, considering all the problems this could introduce.

Personally, I think the suggestion I'm offering here should be introduced no matter if we want to expand std or not, after all, things like mpsc are things we already want to improve, among other things.

More on the topic though, I don't want std to expand :smiley:.
Mainly for one big reason: I think the resources Rust has are already stretched thin as it is.
I also agree with most of the arguments that were mentioned by opponents of expanding std in this topic, no need to list them again.