Why are there no concurrent collections in std?

Hi all! I want to migrate something I have implemented in Java to Rust for a performance gain. However, I could not find concurrent collections in Rust's std, especially a concurrent hashmap. I did find several immature ones on the internet, but I am not sure if they are properly tested.

Since it's pretty challenging to implement concurrent collections, and such collections are useful in concurrent scenarios, it would be good to provide a version in std, as Java did. Do you think this makes sense for Rust?

2 Likes

If you can use what ::crossbeam has to offer, go for it. Else cc @anon15139276, who should be very well suited to answer that.

::crossbeam does not offer a hash table at the moment.

I think it's really best to include all concurrent collections in std, so that all the contributors can join forces.

1 Like

It's also hard to design, which is why we don't want std stuck with some particular API.

I would hope most external crates would welcome collaboration as well!

2 Likes

I agree Rust is worse than Java in this respect for now. People are working on it. The issue to subscribe to is https://github.com/crossbeam-rs/rfcs/issues/32.

1 Like

A similar bad example: the C standard library does not even have a hash table.

Of course, but for the moment it's much more efficient and practical to let all the contributors work on the same library and polish it heavily.

Why is the rust-lang/rust repository the only location that all the collaborators can work on the same library and polish it heavily? If anything, it's significantly more difficult to iterate on code in rust-lang/rust due to the size and CI times.

27 Likes

Concurrent collections are as fundamental and necessary as ordinary collections. By that logic, why doesn't rust-lang/rust drop collections too, to make it lighter and shorten CI times? It's not a problem for Java to include concurrent collections, so I don't expect it to be a problem for Rust.

People were already discussing this (e.g. a lock-free hashmap) long ago: https://github.com/rust-lang/rfcs/issues/659. However, I still have not seen a mature one to use.

chashmap is fairly widely used and actively maintained.

It's completely possible that concurrent collections will be added to libstd in the future, but this can only really happen after they are developed and stabilized outside of libstd.

12 Likes

The most significant feature of Rust is its memory safety and thread safety. I don't see why people here think that concurrent/thread-safe collections are not important for std. I am going to create an issue in the RFCs repo for further discussion.

RFC issue: https://github.com/rust-lang/rfcs/issues/2679

Rust makes extremely strong guarantees around backwards compatibility, so if the wrong API gets added to std, we'll be stuck with it for a long time (see the Error trait, which has deprecated methods that can effectively never be removed). Making breaking changes is much easier outside of std - you can just release a new major version, and nobody's existing code will break because of semantic versioning.

So I wouldn't say the reason these things aren't in the standard library is because people think they're not important - quite the opposite, they're not in std yet because they're so important that we need to be sure they're done right :slight_smile:
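To make the Error trait example above a bit more concrete, here is a minimal sketch. `ConfigError` and `parse_port` are hypothetical names made up for illustration. The type implements `Display` and the newer `source` method, while the older `description` and `cause` methods remain on the trait as deprecated items.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type, used only for illustration.
#[derive(Debug)]
struct ConfigError {
    message: String,
    source: Option<Box<dyn Error + 'static>>,
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "config error: {}", self.message)
    }
}

impl Error for ConfigError {
    // `source` is the current way to expose an underlying error.
    // The deprecated `description` and `cause` methods are still part of
    // the trait; we simply rely on their default implementations.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.source.as_deref()
    }
}

/// Hypothetical helper: parses a port number and wraps any parse failure.
fn parse_port(text: &str) -> Result<u16, ConfigError> {
    text.trim().parse::<u16>().map_err(|e| {
        let boxed: Box<dyn Error + 'static> = Box::new(e);
        ConfigError {
            message: format!("invalid port {:?}", text),
            source: Some(boxed),
        }
    })
}
```

Calling `parse_port("not-a-port")` yields an error whose `source()` is the underlying integer parse error. New code only needs `Display` and `source`, yet std still has to carry the old methods.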

26 Likes

Agreed, but at least there should be something experimental (e.g. in rust-lang-nursery?), since this issue has been open for more than 4 years: https://github.com/rust-lang/rfcs/issues/857.

Concurrent data structures are a great testing ground for the Rust language as well. Such data structures seem harder to design and implement given ownership. Rust itself could also benefit from designing and implementing them.

The std is kept lean and small on purpose. I mean, there was even quite a strong push against having a linked list in there, on the basis that that is a niche data structure.

Furthermore, depending on what you do and how you do it, concurrent data structures might or might not be important. The way I've seen things designed in Rust, I seldom see the need to use a concurrent hash map. I'm not saying never, but possibly your point of view is just a bit biased by your experience.

And last, crossbeam is considered de facto the standard library for concurrent data structures. Why does it have to live in nursery when people already consider it the library to go to?
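As one possible reading of the point above about seldom needing a concurrent hash map (not necessarily the design the poster has in mind), here is a std-only sketch where each thread fills its own private `HashMap` and the results are merged at the end. The function name `count_words` and the word-count task are assumptions chosen for illustration.

```rust
use std::collections::HashMap;
use std::thread;

/// Counts words by giving each thread a private map and merging at the end,
/// so no map is ever shared between threads.
fn count_words(chunks: Vec<Vec<String>>) -> HashMap<String, usize> {
    // Spawn one worker per chunk; each worker owns its chunk and its map.
    let handles: Vec<_> = chunks
        .into_iter()
        .map(|chunk| {
            thread::spawn(move || {
                let mut local: HashMap<String, usize> = HashMap::new();
                for word in chunk {
                    *local.entry(word).or_insert(0) += 1;
                }
                local
            })
        })
        .collect();

    // Merge the per-thread maps on the calling thread.
    let mut total: HashMap<String, usize> = HashMap::new();
    for handle in handles {
        let local = handle.join().expect("worker thread panicked");
        for (word, n) in local {
            *total.entry(word).or_insert(0) += n;
        }
    }
    total
}
```

Ownership moves each chunk into its worker and each finished map back out through `join`, so no locking is needed. This only fits workloads that can be partitioned up front; when threads really must update one shared table, a concurrent map or a lock around a `HashMap` is still the tool to reach for.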

4 Likes

std is a necessary evil, not a place to put useful stuff in.

We have a cautionary tale already: Rust put channels in the standard library. They work, but they turned out to be relatively slow, unnecessarily limited, and missing important functionality. And now it's nearly impossible to do anything about it due to libstd's backwards compatibility guarantees. OTOH crossbeam-channel has broken compatibility 3 times (which is doable for 3rd party crates), but has evolved to have a more flexible interface, more features, and a faster implementation.
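For readers who have not used the std channels, here is a minimal sketch of `std::sync::mpsc`. It shows one limitation that is easy to demonstrate, which is my own pick and not necessarily what the poster means by "limited": the channel is multi-producer but single-consumer, so `Sender` can be cloned while `Receiver` cannot. The function name `sum_from_producers` is hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

/// Sums the values sent by several producer threads over one std channel.
fn sum_from_producers(producers: u32, per_producer: u32) -> u32 {
    let (tx, rx) = mpsc::channel::<u32>();

    let handles: Vec<_> = (0..producers)
        .map(|_| {
            // `Sender` is `Clone`, so every producer gets its own handle.
            let tx = tx.clone();
            thread::spawn(move || {
                for value in 1..=per_producer {
                    tx.send(value).expect("receiver dropped");
                }
            })
        })
        .collect();

    // Drop the original sender so the receiving iterator ends once
    // every producer has finished and dropped its clone.
    drop(tx);

    // `Receiver` is not `Clone`: only this single consumer drains the channel.
    let total: u32 = rx.iter().sum();

    for handle in handles {
        handle.join().expect("producer thread panicked");
    }
    total
}
```

With 3 producers each sending 1 through 4, `sum_from_producers(3, 4)` returns 30. Fanning work out to several consumers needs something beyond this API, such as wrapping the receiver in a lock or using a multi-consumer channel from a crate.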

This is not a unique case. Similar things have happened many times in many languages. Python's standard library has even been called a place "where modules go to die".

So "why concurrent collections aren't in std" is asking "why won't you commit to maintaining forever an implementation that will be deprecated in favor of a better crate on crates.io?". Rust skips the first part entirely — just go to crates.io and find an implementation that suits your needs.

29 Likes

Note that there is a proposal to replace the std channel implementation with crossbeam-channel. I think std should have channels. They are a basic primitive.

https://stjepang.github.io/2019/03/02/new-channels.html

Your argument is a general one, but I want to talk specifically about concurrent collections.

  1. Maintaining concurrent collections is as important as maintaining standard collections nowadays. I don't see why they would be deprecated instead of being improved. I don't see Java suffering from its concurrent library.

  2. Unfortunately, I don't really see a mature crate for concurrent collections. The people behind crossbeam have been doing a great job since 2015, but there are only 4 of them, probably working in their spare time.

  3. Thread-safe and performant collections are important and necessary.

This is my last reply in this thread. Let's see, several years from now, whether people would like to put concurrent collections into the std of a new language.

Because backwards-incompatible API changes are forbidden on principle, and not everything can be improved just with additions and internal changes.

If there is no existing implementation that you'd be happy with, there's probably no obvious API and implementation that std could adopt that you'd like. There's no reason why an std implementation would be any better. And std has to get it right on the first try.

That's true, but IMHO these have nothing to do with being in std. Crates.io is full of crates that are important, necessary, and often more performant than std.

The other way to look at it is that std is glue code and an interface to the compiler. crates.io is Rust's standard library.

20 Likes

Moving the collections into std would not solve this problem - it'd still likely be the same people working on them (as they're the ones with the necessary expertise - concurrency is hard!), only now they'd be hamstrung by the fact they can never make breaking changes.

19 Likes

You should note that even rand is an external crate, located outside of std. The Rust stdlib is intentionally limited to foundational types that are necessary for communication between crates, like String or Vec, or that support language syntax, like Iterator or Future. Because, unlike C/C++, a Rust installation comes bundled with a state-of-the-art package manager, you can get those libs in just a few keystrokes!

5 Likes

My favorite example of this is how Python has not one, not two, but four ways to do string formatting. You have the old-school % formatting, then you have string.Template with $-interpolation, then you have str.format, which Rust borrows heavily from, and now you also have f-strings, which are similar to str.format but are baked into the parser and can interpolate arbitrary expressions. And of course that's in addition to the host of third-party string templating libraries that came into existence because the standard library is inadequate for one reason or another, but can't be changed because its API and behavior are fixed.

(For the record, Python is one of my top 2 favorite languages and the "batteries included" stdlib really shines for things I use Python for. But I think the minimal stdlib approach is better for Rust.)

Others have already responded to this, and you've said you won't keep posting, but I find it curious that you consider this a reason to put it into std. If anything, I think it cuts the other way: surely it would be better to have a mature, time-tested crate first, and then migrate it into std, than to create a new, immature implementation and then fix it "forever" (well, until deprecation) as a part of std.

I don't object to the idea of putting concurrent collections in std; I'd just rather it be done right than done right now.

14 Likes