Hi community,
I've been a C++ developer for 25 years now. I started learning Rust 2 weeks ago and I'm already stumbling over the confusing "let's reinvent the wheel" standard library and the custom crates, where others try to do things a bit differently, which creates maximum overhead if you simply want to swap out your underlying container.
I would like to know what your opinion on this is.
Are there other developers who would like to have and/or implement containers like the ones in the STL?
I came from C++ to Rust, and I honestly like the Rust Stdlib much better than the C++ one. My only complaint about the Rust standard library is that it is missing things that are IMO critical, such as an async runtime.
The whole vibe with crates.io can start to feel really npm-like, and that takes some getting used to. But it majorly helps to realize that unlike in the case of NPM, most Rust crates exist because the smart author wrote a bit of reusable and nice code and wants others to not have to write the same thing.
Welcome to Rust!
Checking the source code after these 2 weeks of learning, I stumbled upon this: https://github.com/rust-lang/rust/issues/82766
My compiler didn't suggest the functionality I was looking for: pub fn try_insert()
P.S. Could it be that reserving new vector space doesn't reallocate in place but does alloc/copy/dealloc? (mod.rs::Allocator.grow())
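For what it's worth, the "insert only if the key is absent" behaviour that try_insert is aiming at can be approximated on stable Rust with the entry API. This is only a sketch: `insert_if_absent` is a made-up helper name and the `String`/`i32` types are placeholders, not anything from the linked issue.

```rust
use std::collections::btree_map::{BTreeMap, Entry};

/// Inserts `value` only if `key` is absent. On success returns a mutable
/// reference to the new value; on failure hands the rejected value back.
/// (`insert_if_absent` is a hypothetical helper name, not a std API.)
fn insert_if_absent<'a>(
    map: &'a mut BTreeMap<String, i32>,
    key: String,
    value: i32,
) -> Result<&'a mut i32, i32> {
    match map.entry(key) {
        Entry::Vacant(slot) => Ok(slot.insert(value)),
        Entry::Occupied(_) => Err(value),
    }
}

fn main() {
    let mut map = BTreeMap::new();
    assert!(insert_if_absent(&mut map, "a".to_string(), 1).is_ok());
    // A second insert with the same key is rejected and the map is unchanged.
    assert_eq!(insert_if_absent(&mut map, "a".to_string(), 2), Err(2));
    assert_eq!(map["a"], 1);
}
```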
Well. I have a newsflash for you: C++ is not the only language in existence.
Rust's standard library borrows heavily from ML, and ML, as Wikipedia tells us, is 52 years old.
Thus we may turn the blame around and ask why C++ renamed map to transform and fold to accumulate (it did use the natural name reduce in C++17, though).
But yes, if you think that C++ does it right and literally everyone else does it wrong… then yes, it's confusing.
I think it is important to note that Rust comes with a "built-in" build tool, cargo, which makes building your Rust applications very easy (no need for a separate Make, CMake, etc.). Also, because cargo is tightly integrated with crates.io, adding "external" crates (libraries) to your Rust project is a piece of cake. Much less painful than adding 3rd-party libraries to a C++ project!
So, the fact that Rust's std lib tends to be a bit "slim" is not really a drawback, since you can very easily add the "missing" features that you need via the corresponding crates (from crates.io).
Excuse me, I didn't talk about naming, I talked about consistency.
Let me give an example for function insert():
In C++ STL: on vector you get an iterator, on a set, you get an iterator, on a map, you get an iterator.
In rust std: on Vec, you get nothing, on BTreeSet, you get a bool, on a BTreeMap, you get Option<V>
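To make the comparison concrete, here is a small sketch of the three std return types listed above. The values are arbitrary; only the return types matter.

```rust
use std::collections::{BTreeMap, BTreeSet};

fn main() {
    // Vec::insert(index, element) returns (): there is nothing to report.
    let mut v = vec![1, 3];
    v.insert(1, 2);
    assert_eq!(v, [1, 2, 3]);

    // BTreeSet::insert returns bool: true if the value was newly inserted.
    let mut s = BTreeSet::new();
    assert!(s.insert("a"));
    assert!(!s.insert("a"));

    // BTreeMap::insert returns Option<V>: the previous value, if there was one.
    let mut m = BTreeMap::new();
    assert_eq!(m.insert("k", 1), None);
    assert_eq!(m.insert("k", 2), Some(1));
}
```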
Same story as with C++: the returned types depend on the useful info that can be returned. Instead of a pair of iterator and bool you get an Option, because it makes sense. In C++ it wasn't possible to return an Option and, worse yet, the old value is simply lost.
Rust's standard library goes after being useful instead of trying to pretend that different things are identical.
This is very nice when you don't need to work with generics, but quite a problem when you need to do that.
My problem with the C++ STL is that you have types called vector and unordered_map instead of Vec and HashMap. I am not even joking, names of collections/data structures shouldn't be fully lowercase O.O
If you've worked for 25 years with a technology, it's reasonable that you find it more intuitive, and that a new technology can be confusing. There doesn't need to be any fault in that, neither on your part nor with either of the two technologies.
Spend some time looking through Rust's std and there's a good chance you'll come to terms with it. I think it has very high-quality documentation, which helps.
Regarding the crate ecosystem, it can be daunting at first. There are a lot of excellent crates out there, but of course also a lot of... let's say less excellent ones. How do you find out? Over time you'll learn to recognize the crate authors, so you'll know who to trust. But at the beginning, it helps to have a look at the number of downloads per month and the number of other crates that depend on that crate. Popularity of a crate is of course not a guarantee per se, but it's often a decent proxy.
First, thank you for all the activity and all the suggestions! I think I have to go into more detail.
I have a tree, where every node contains a BTreeMap and a string key. The Map contains strings for looking up the children. You might see that now I have the String within the node and the same within the BTreeMap of the parent node.
As I don't want to duplicate Strings, nor wrap them in an Rc, I thought about using references (&str) in both places and having a global lookup container that owns the Strings I reference. I decided to use a BTreeSet to avoid duplicate strings.
But now, after I inserted the (&str, Node) into the map, I need to return a mutable reference to the inserted Node, because only at the end of the loop I know whether the last node is reached, which has to be marked as a leaf.
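The "insert, then get a mutable reference to what was inserted" step can be sketched with the entry API, which hands back a `&mut` to the existing or newly created value. This is a minimal sketch that assumes owned `String` keys rather than the `&str` scheme described above; the `Node` layout and the `insert_path` name are guesses for illustration, not taken from the actual project.

```rust
use std::collections::BTreeMap;

// Hypothetical node layout, with owned String keys for simplicity.
#[derive(Default)]
struct Node {
    children: BTreeMap<String, Node>,
    is_leaf: bool,
}

/// Walks (and creates) one child per path segment, then marks the last node.
fn insert_path(root: &mut Node, path: &[&str]) {
    let mut current = root;
    for segment in path {
        // `entry(..).or_default()` returns `&mut Node` for the existing or
        // freshly inserted child, so no second lookup is needed.
        current = current.children.entry(segment.to_string()).or_default();
    }
    // Only after the loop do we know which node is the last one.
    current.is_leaf = true;
}

fn main() {
    let mut root = Node::default();
    insert_path(&mut root, &["usr", "lib"]);
    assert!(!root.children["usr"].is_leaf);
    assert!(root.children["usr"].children["lib"].is_leaf);
}
```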
Side note: I implemented this in C++ and thought it would be a good deep dive for learning Rust. In the C++ version I just used a set. That was my first idea here as well, but somehow I stumbled over how to store Nodes in a BTreeSet that I can search with a &str.
I uploaded my project here, if you want to take a look: github
How do you plan to avoid stale references if you don't use Rc and don't use an owned data structure?
It wouldn't work. References are non-owning in Rust. In C++ they are sometimes used that way too, but there they can be long-lived. In Rust that simply wouldn't work.
I'm afraid you would need to think about the design some more, because just from the description it sounds like half of the planned functions contain hidden invariants that the user is not supposed to break (e.g. the user is not supposed to remove Strings while they are needed)… and Rust doesn't like these.
Don't. Just don't. That's not quite “I guarantee you'll fail to implement your little project”, but close to it. In the extremely unlikely case that you successfully finish it (1% probability), you would be able to claim that you managed, in your first project, to do something that most rustaceans don't even attempt in their first year or two. But the much more likely outcome is that you would simply fail to finish it and the whole thing would collapse under its own weight.
Just go with Rc (or Arc if you want multi-threading). You would be able to speed that code up later if it were really needed that badly.
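A rough sketch of what the Rc route could look like, assuming each node stores its key as `Rc<str>` and the parent's map uses a clone of the same `Rc<str>` as its key. The `Node` fields and the `child` method are hypothetical names for illustration.

```rust
use std::collections::BTreeMap;
use std::rc::Rc;

// Hypothetical layout: the node and its parent's map share one string.
struct Node {
    key: Rc<str>,
    children: BTreeMap<Rc<str>, Node>,
    is_leaf: bool,
}

impl Node {
    fn new(key: Rc<str>) -> Self {
        Node { key, children: BTreeMap::new(), is_leaf: false }
    }

    /// Returns the child named `name`, creating it if needed.
    fn child(&mut self, name: &str) -> &mut Node {
        if !self.children.contains_key(name) {
            let key: Rc<str> = Rc::from(name);
            // Cloning an Rc only bumps a reference count; the text is shared.
            self.children.insert(Rc::clone(&key), Node::new(key));
        }
        self.children.get_mut(name).expect("child was just ensured")
    }
}

fn main() {
    let mut root = Node::new(Rc::from(""));
    let lib = root.child("usr").child("lib");
    lib.is_leaf = true;
    // One allocation for "lib", held by the node and by its parent's map.
    assert_eq!(Rc::strong_count(&lib.key), 2);
    assert!(root.children["usr"].children["lib"].is_leaf);
}
```

Lookups by plain `&str` work here because `Rc<str>` implements `Borrow<str>`.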
Just use RefCell here, don't try to play these games. Again: not impossible, but extremely tricky. And usually not needed.
You are trying to write C++ in Rust, this simply wouldn't work. Rust makes things safe that are extremely dangerous in C++… by simply making the majority of them illegal.
Better to write a working program first and then try to refactor it, rather than try to write a perfect program and fail.
You may be interested in "string interning", an approach often used by compilers that need to have tons of duplicates of the same identifier. There are several crates for this in Rust. But depending on your exact usage pattern that might not be a good fit. Some of the interners are ref counted, others only allow for adding but not removing entries (unless you start over with a new interner). So it all depends on how long lived your process is, how many duplicates can be expected, etc.
In this case, I think the optimal solution in both c++ and rust is string interning. There are many such libraries on crates.io. I won't recommend any because they are all slightly different, so you must decide which one works best for your use case.
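To give an idea of what interning means in practice, here is a deliberately naive append-only interner. `Interner` and `Symbol` are hypothetical types written for this post, not the API of any particular crate, and this version still keeps two copies of each distinct string internally, which real crates avoid.

```rust
use std::collections::HashMap;

/// A small copyable handle that stands in for an interned string.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
struct Symbol(u32);

/// Append-only interner: entries can be added but never removed.
/// (Hypothetical type; naive in that each string is stored in both fields.)
#[derive(Default)]
struct Interner {
    ids: HashMap<String, u32>,
    strings: Vec<String>,
}

impl Interner {
    /// Returns the existing symbol for `s`, or registers a new one.
    fn intern(&mut self, s: &str) -> Symbol {
        if let Some(&id) = self.ids.get(s) {
            return Symbol(id);
        }
        let id = self.strings.len() as u32;
        self.strings.push(s.to_owned());
        self.ids.insert(s.to_owned(), id);
        Symbol(id)
    }

    /// Looks up the text behind a symbol produced by this interner.
    fn resolve(&self, sym: Symbol) -> &str {
        &self.strings[sym.0 as usize]
    }
}

fn main() {
    let mut interner = Interner::default();
    let a = interner.intern("usr");
    let b = interner.intern("lib");
    let c = interner.intern("usr");
    assert_eq!(a, c); // same text, same symbol
    assert_ne!(a, b);
    assert_eq!(interner.resolve(b), "lib");
}
```

Nodes and maps would then store `Symbol` values, which are `Copy` and `Ord`, instead of `&str` or `String`.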
This is probably the biggest difference between c++ and rust. In c++ (and other languages like C) we freely and frequently return pointers to the internal contents of some container (or iterators, which are kind of just fancy pointers) but those can be invalidated at any time, for practically any reason. We then rely on sometimes convoluted invariants stated in the function's documentation about exactly which conditions must be met for that iterator to continue to be valid. Since those conditions are not checked by machines, they follow no unified conventions and can be arbitrarily complex. In rust, such functions are rarely used; instead we get more basic operations with stricter requirements, but those are completely machine verified (at least in safe code).
This is, after all, the main value proposition of rust - formal checking of conditions about the validity of references built into the compiler. The major gain of this is eliminating an enormous class of bugs commonly found in programs.
One negative consequence of taking what has traditionally been human-verified and machine-verifying it is that those humans who are domain experts in that verification process feel frustrated because they cannot do things which they have always done successfully. Machine verification cannot possibly prove that all valid programs are in fact valid, so to gain the property of eliminating some class of bugs, it must reject some valid programs.
Ultimately, there is an escape hatch. rust has a mechanism for saying "I know this is right but I cannot prove it", so we have not really lost anything, only gained safety in the 99% of cases which the compiler can prove are valid.
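As a tiny illustration of that escape hatch, here is a sketch using `unsafe` with `slice::get_unchecked`, which skips the bounds check. The function name is made up, and in real code the checked `data[i]` would be perfectly fine here; the point is only to show where the human-verified claim goes.

```rust
/// Sums the elements at positions 0, 2, 4, ...
/// The index is always in bounds by construction, but that is our claim,
/// not something the compiler verifies inside the `unsafe` block.
fn sum_even_positions(data: &[u32]) -> u32 {
    let mut total = 0;
    let mut i = 0;
    while i < data.len() {
        // SAFETY: the loop condition guarantees `i < data.len()`.
        total += unsafe { *data.get_unchecked(i) };
        i += 2;
    }
    total
}

fn main() {
    assert_eq!(sum_even_positions(&[1, 2, 3, 4, 5]), 9);
    assert_eq!(sum_even_positions(&[]), 0);
}
```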
Unfortunately, the differences are even deeper than just reinventing the wheel. Things that seem similar between C++ and Rust are often quite different language features that have different purposes. Rust doesn't have a common ancestor with the C family of languages, only copied some superficial syntax from C++.
Rust's Iterators are not supposed to abstract over different collections/containers as whole objects. They're a low-level primitive for loops. They are only a single-use, temporary, opaque state for reading a stream of elements once (a syntax sugar for item* next() function called repeatedly until it returns nullptr and then can't be called ever again). It's for chaining loops, not for representing containers.
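A short sketch of the "temporary, single-use state" point: an iterator is a separate object borrowed from the collection, it is advanced with `next()`, and a second traversal simply asks the collection for a new one. `total_len` is a hypothetical helper added for illustration.

```rust
/// Chains a loop-like pipeline over a slice; the slice itself is untouched.
fn total_len(words: &[&str]) -> usize {
    words.iter().map(|w| w.len()).sum()
}

fn main() {
    let words = vec!["ab", "c"];

    // The iterator is its own short-lived value, not the container.
    let mut it = words.iter();
    assert_eq!(it.next(), Some(&"ab"));
    assert_eq!(it.next(), Some(&"c"));
    assert_eq!(it.next(), None); // exhausted

    // To traverse again, request a fresh iterator from the collection.
    assert_eq!(total_len(&words), 3);
}
```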
Rust doesn't have an abstract concept of a collection/container. Instead, Rust has traits that define more specific things you can do with types (iterate, slice, clone, extend, create), and types implement a subset they can. You can define your own trait and make types implement it (it's not a base class, so you can add your own trait's implementation to any existing type, or even abstractly implement it for types that don't exist yet).
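For example, a trait you define yourself can be implemented for existing std types. In this sketch `Bag`, `add`, `count` and `fill` are all hypothetical names chosen for illustration.

```rust
use std::collections::BTreeSet;

/// Hypothetical trait: "something you can add an item to and count".
trait Bag<T> {
    fn add(&mut self, item: T);
    fn count(&self) -> usize;
}

// Implementing our own trait for a type we did not write is allowed.
impl<T> Bag<T> for Vec<T> {
    fn add(&mut self, item: T) {
        self.push(item);
    }
    fn count(&self) -> usize {
        self.len()
    }
}

impl<T: Ord> Bag<T> for BTreeSet<T> {
    fn add(&mut self, item: T) {
        self.insert(item);
    }
    fn count(&self) -> usize {
        self.len()
    }
}

/// Generic code is written against the trait, not a concrete container.
fn fill<B: Bag<u32>>(bag: &mut B) -> usize {
    for n in [3, 1, 3] {
        bag.add(n);
    }
    bag.count()
}

fn main() {
    let mut v: Vec<u32> = Vec::new();
    let mut s: BTreeSet<u32> = BTreeSet::new();
    assert_eq!(fill(&mut v), 3); // Vec keeps the duplicate
    assert_eq!(fill(&mut s), 2); // BTreeSet drops it
}
```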
Rust's references are loans, which are restricted to a temporary scope[1], and very strictly control access as either shared (with no or limited mutability) or completely exclusive. So a "mutable reference" in Rust is probably going to be frustratingly disappointing for you when you apply it like a C++ reference or a C pointer. It's less of a pointer to an element, and more like a compile-time zero-cost non-copyable exclusive write lock on an object it borrows from, making the source frozen and completely unusable for as long as the loan exists. References also come with a built-in impossible-to-remove restriction that they can never ever leave the scope from which they are borrowing. These restrictions are super useful for making multi-threaded code safe. They make capturing closures foolproof even in complex scenarios. They help make minimal and hard-to-abuse library APIs (e.g. you can expose a mutable reference to a field in your object, and still have 100% certainty that the caller won't keep it and won't be messing with your field while you use it). But in most cases they're not the right language feature for just "referencing" stuff in your implementation of a data structure (they're not even allowed to reference data from the same object they're stored in), and what to do instead is "it depends".
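The "exclusive write lock" behaviour can be seen in a few lines. This is a sketch with a hypothetical function name; the commented-out line is the kind of access the borrow checker rejects while the loan is live.

```rust
fn exclusive_loan_demo() -> Vec<i32> {
    let mut scores = vec![1, 2, 3];

    let first = &mut scores[0]; // exclusive loan of `scores` starts here
    *first += 10;
    // Uncommenting the next line is a compile error, because `scores` is
    // still exclusively borrowed for as long as `first` is used below:
    // scores.push(4);
    *first += 1; // last use of `first`; the loan ends after this line

    scores.push(4); // fine now, the loan is over
    scores
}

fn main() {
    assert_eq!(exclusive_loan_demo(), [12, 2, 3, 4]);
}
```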
Ok,
first let me tell you that the subject is not really where my problems come from. I was digging too deep and had to go back.
I found I'm not the only one having the issue of sharing even immutable types. See [Tracking Issue for `BTreeSet` entry APIs · Issue #133549 · rust-lang/rust · GitHub] - here he wanted to have a struct containing a string and a BTreeSet which uses this string for lookup.
Regarding the subject: I wanted to have a Map container with "shared" traits. That is, if I want to replace the BTreeMap with another map that has the same behaviour, I would like to have the same interface.
When searching for a BTreeMap replacement I found other crates, flat_map and lite_map. Unfortunately it's not easily possible to replace BTreeMap with e.g. LiteMap, as the range() function is missing. Or no, it's not missing, but it has another name and slightly different parameters.
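One way to paper over such naming differences is a small trait of your own that names only the operation the tree code needs, implemented once per backing container. In this sketch `KeysFrom`, `keys_from` and `first_key_from` are hypothetical names, and `SortedVecMap` is a hand-written stand-in for a "flat" map, not the API of flat_map or lite_map.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Hypothetical trait: all keys >= `start`, in ascending order.
trait KeysFrom {
    fn keys_from(&self, start: &str) -> Vec<&str>;
}

impl<V> KeysFrom for BTreeMap<String, V> {
    fn keys_from(&self, start: &str) -> Vec<&str> {
        // A tuple of bounds lets us query a String-keyed map with a &str.
        self.range::<str, _>((Bound::Included(start), Bound::Unbounded))
            .map(|(k, _)| k.as_str())
            .collect()
    }
}

/// Stand-in for a flat map; the vector must be kept sorted by key.
struct SortedVecMap<V>(Vec<(String, V)>);

impl<V> KeysFrom for SortedVecMap<V> {
    fn keys_from(&self, start: &str) -> Vec<&str> {
        // Binary search for the first key that is not less than `start`.
        let first = self.0.partition_point(|(k, _)| k.as_str() < start);
        self.0[first..].iter().map(|(k, _)| k.as_str()).collect()
    }
}

/// Caller code depends only on the trait, so the container can be swapped.
fn first_key_from<'a, M: KeysFrom>(map: &'a M, start: &str) -> Option<&'a str> {
    map.keys_from(start).into_iter().next()
}

fn main() {
    let pairs = vec![("a".to_string(), 1), ("b".to_string(), 2), ("c".to_string(), 3)];
    let tree: BTreeMap<String, i32> = pairs.iter().cloned().collect();
    let flat = SortedVecMap(pairs);

    assert_eq!(tree.keys_from("b"), ["b", "c"]);
    assert_eq!(flat.keys_from("b"), ["b", "c"]);
    assert_eq!(first_key_from(&tree, "bb"), Some("c"));
    assert_eq!(first_key_from(&flat, "d"), None);
}
```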