Things in std with strong alternatives

There's certainly a lot less cruft in Rust's standard library than in other languages', but there are still cases where a library ends up doing a better job than std for certain applications. What parts of std might you replace with a 3rd party lib?

I've got a couple of the notable ones to start us off:

std::sync::mpsc is pretty generally understood to be suboptimal at best. Both flume and crossbeam-channel are popular choices for replacements that outperform std's channels in most ways. IIRC, flume's been under consideration for uplifting into std.
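For reference, here's a minimal sketch of the std API in question: several producers feeding one consumer. The function name is made up for illustration, and I'm not showing the flume or crossbeam-channel equivalents here.

```rust
use std::sync::mpsc;
use std::thread;

/// Illustrative helper: spawns `producers` threads that each send the numbers
/// 1..=per_producer over one std channel, then sums everything on the single
/// consumer side.
fn sum_from_producers(producers: u64, per_producer: u64) -> u64 {
    let (tx, rx) = mpsc::channel::<u64>();

    let handles: Vec<_> = (0..producers)
        .map(|_| {
            // Each producer thread gets its own clone of the sender.
            let tx = tx.clone();
            thread::spawn(move || {
                for n in 1..=per_producer {
                    tx.send(n).expect("receiver is still alive");
                }
            })
        })
        .collect();

    // Drop the original sender so the receiver's iterator ends once every
    // producer has finished and dropped its clone.
    drop(tx);

    let total: u64 = rx.iter().sum();
    for handle in handles {
        handle.join().expect("producer thread panicked");
    }
    total
}
```

Calling sum_from_producers(4, 10) adds up 1 through 10 four times over.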

std::collections::LinkedList is almost never actually what you want. Most of the time, you're just as well off, and probably better off, with Vec instead. In the cases where you actually benefit from a linked list, you probably need more invasive control than std's LinkedList can give you.

A fun one, std::env::home_dir can give bad results on common platforms, and as such the std library suggests using a 3rd party crate for the functionality instead. Where to store files is a surprisingly difficult question to ask, let alone answer, and has potentially unexpected issues cross-platform. The home crate is what's used by cargo/rustup. Do your own research here, though; I'm very uncertain what the tradeoffs are.

It used to be that std's Mutex, Once, etc. weren't all that great, due to specifics on how they were using the OS primitives to ensure safety, and parking_lot was just generally a better choice. However, std's synchronization implementations were improved significantly, and now it's much more of a trade-off between different implementation choices, with parking_lot having the benefit of greater cross-platform consistency, but std benefitting from using the OS's primitives.

Similarly, std's float parsing used to be suboptimal, and likely will be again in the future in the lag between new algorithm development and std adopting them. Before the most recent iteration of the cycle, lexical greatly outperformed std both in accuracy and speed, but currently std's implementation is in a very reasonable position for the tradeoff between code size and performance.
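A quick way to poke at the accuracy side with nothing but std is a round-trip check: format a float, parse it back, and compare. This is only a sketch with made-up helper names, not a benchmark, and it says nothing about lexical.

```rust
/// Illustrative helper: parses a decimal string with std's `FromStr` for f64.
fn parse_f64(text: &str) -> Option<f64> {
    text.trim().parse::<f64>().ok()
}

/// Illustrative helper: formats `value` with `Display` and parses it back.
/// For finite values the parsed result should compare equal to the input.
fn round_trips(value: f64) -> bool {
    matches!(value.to_string().parse::<f64>(), Ok(parsed) if parsed == value)
}
```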

And of course, we've got to mention hashbrown, which beat std's HashMap implementation so decisively that std now quite literally uses hashbrown's implementation. The only reason to use hashbrown directly currently is if you want/need to use the raw entry or raw table access it provides.

More in the trade-off zone, you might consider using camino instead of std's Path{Buf}, if you can guarantee that your application won't ever need to deal with non-wellformed-Unicode paths.
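To make the trade-off concrete, this is the kind of Option handling std's Path forces on you whenever you want a &str, because a path may not be valid Unicode. It's a std-only sketch; both helper names are made up, and the non-UTF-8 example is Unix-only.

```rust
use std::path::Path;

/// Illustrative helper: returns the file name as `&str`, or `None` if the
/// path has no file name or the name is not valid Unicode.
fn utf8_file_name(path: &Path) -> Option<&str> {
    path.file_name()?.to_str()
}

/// Illustrative helper (Unix only): builds a path whose file name is not
/// valid UTF-8, which std's Path accepts without complaint.
#[cfg(unix)]
fn non_utf8_path() -> std::path::PathBuf {
    use std::ffi::OsStr;
    use std::os::unix::ffi::OsStrExt;
    // The byte 0xFF never appears in valid UTF-8.
    Path::new(OsStr::from_bytes(b"report-\xFF.txt")).to_path_buf()
}
```

With a type that guarantees UTF-8 up front, the None case in utf8_file_name goes away, at the cost of rejecting paths like the one non_utf8_path builds.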


I remember when people used to use std::io::Error as a catch-all error type because Box<dyn std::error::Error> wasn't very useful.
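For anyone who hasn't seen both styles side by side, here's a rough sketch. The function names are made up; the first squeezes a parse failure into io::Error, the second lets ? box the original error.

```rust
use std::error::Error;
use std::io;

/// Catch-all style: wrap every failure in `io::Error`.
fn parse_port_io(text: &str) -> Result<u16, io::Error> {
    text.trim()
        .parse::<u16>()
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidInput, e))
}

/// `Box<dyn Error>` style: `?` converts the original error automatically,
/// and the caller can still downcast to the concrete type.
fn parse_port_boxed(text: &str) -> Result<u16, Box<dyn Error>> {
    Ok(text.trim().parse::<u16>()?)
}
```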

The std::fs::canonicalize() function returns UNC paths on Windows (which is 100% correct - we're just asking the kernel for the canonical path), but a lot of programs assume paths look like C:\blah and will choke when you give them \\?\C:\blah. If I ever need to do canonicalisation on Windows, I'll always reach for the dunce crate. That one isn't really an issue with std though - it's mainly because developers use string manipulation when working with paths rather than an abstraction that handles these subtleties.
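If you want to see what you're getting back without resorting to string matching, std's own path components can tell you. A small sketch with made-up helper names; note that prefix components are only parsed on Windows, so the check is always false elsewhere. This is not what dunce does internally.

```rust
use std::fs;
use std::io;
use std::path::{Component, Path, PathBuf};

/// Illustrative helper: canonicalizes the current directory.
fn canonical_cwd() -> io::Result<PathBuf> {
    fs::canonicalize(".")
}

/// Illustrative helper: true when the path starts with a Windows verbatim
/// prefix such as `\\?\C:`. Always false on non-Windows platforms, where
/// paths have no prefix component.
fn has_verbatim_prefix(path: &Path) -> bool {
    matches!(
        path.components().next(),
        Some(Component::Prefix(prefix)) if prefix.kind().is_verbatim()
    )
}
```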


As a bit of history, it used to be that these \\?\ paths would bypass the MAX_PATH limit in Windows because they skip the user space path handling in win32.dll (or whatever it was) and get passed directly to the kernel's path handling, and it's the user space that actually had the limit.

At least Node (due to the silly lengths node_module paths got to) used that form a lot. Perhaps that's why that form got used by Rust too? MAX_PATH doesn't apply as of early Windows 10 IIRC, so there's not as much cause to see these now.

Anecdotally, on Windows I've made myself a small 5-line program because I might be one of the only people who actually regularly[1] needs the UNC path. The fact that std does that by default meant I didn't have to search and reach for anything else.


  1. once every four months maybe? ^^

The ticket is "File a issue with the standard library?" (Issue #18 on zesterer/flume, GitHub). I'm partial to flume, but crossbeam-channel seems likely to cross the finish line first.

Some crates I would consider here are directories, directories-next, and pathos.

Trading off in a different direction, if one wants to ensure through the type system that the paths one constructs are canonical, there is canonical-path.

fs-err wraps the standard fs API to give more informative error messages; I don't think it has downsides.
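To show the general idea (not fs-err's actual implementation or message format), here's a hand-rolled sketch of wrapping one std fs call so the error names the path. The helper name is hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: like `fs::read_to_string`, but the returned error
/// mentions which path failed while keeping the original `ErrorKind`.
fn read_to_string_with_path(path: &Path) -> io::Result<String> {
    fs::read_to_string(path).map_err(|err| {
        io::Error::new(
            err.kind(),
            format!("failed to read `{}`: {}", path.display(), err),
        )
    })
}
```

The plain std error for a missing file only carries the OS message, so the extra context is what saves you when several files are involved.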

To give due credit, I came to canonical-path and fs-err because they're re-exported by abscissa_core.


It still does. You have to enable long path awareness in the executable manifest and change a system registry value to avoid the MAX_PATH limitation.


Oh, that's new to me. I thought (until now) parking_lot was generally faster.


Another thing to note regarding parking_lot: It supports Sendable lock guards using the send_guard feature, while std::sync::MutexGuard is always !Send.


std now uses futexes on Linux, which is great if you only care about Linux, but AFAIK all other platforms are still suboptimal.


Except code size?


The ultimate reason is that DOS-style paths (like C:\path) in modern Windows are emulated on top of NT paths (e.g. \Device\HarddiskVolume3\path). This emulation is lossy because not all NT paths can be represented as DOS-style paths, and DOS-style paths do lossy things like trimming trailing spaces and dots (this made slightly more sense in an old DOS shell). In short, canonicalize needs to be able to return a correct path even if it's not representable as a normal DOS-style path. Removing path length limits (well, up to i16::MAX u16s) is definitely a useful bonus though.

So, yeah, canonicalize uses \\?\ as a shorthand for "this is an NT path pretending to be a DOS path, please don't do anything weird with it". This makes converting NT to DOS simpler and also makes it easy to turn it back into a proper NT path. (Side note: the only change is that \\?\ is converted to \??\, which is a special NT directory full of symlinks like C: which points to whatever the real path for the C: drive is.)


They improved Windows and a number of the BSDs as well. It's still pthread on macOS, as apparently priority inheritance is more important.

Tangential, but that issue is an example of where the standard library preserving flexibility paid dividends. It changed the behavior of what happens when you try to double-lock a Mutex on at least some platforms.
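Since std leaves the exact behavior of a second lock() on the same thread unspecified (it won't return, but it might panic or deadlock), the portable way to probe a mutex is try_lock. A minimal sketch, with a made-up helper name:

```rust
use std::sync::{Mutex, TryLockError};

/// Illustrative helper: true if the mutex is held right now, by this thread
/// or any other. A poisoned but unlocked mutex reports false here.
fn is_locked<T>(mutex: &Mutex<T>) -> bool {
    // On success the temporary guard is dropped at the end of this
    // expression, so the probe does not keep the mutex locked.
    matches!(mutex.try_lock(), Err(TryLockError::WouldBlock))
}
```

The answer can be stale by the time you act on it, so this is only good for diagnostics, not for deciding whether a later lock() is safe.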


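Small note about LinkedList vs Vec: VecDeque is amazing. Removals at the start are O(1), and arbitrary removals only have to shift whichever side of the element is shorter, so it performs well in a lot of the cases where people reach for a linked list.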
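A tiny sketch of the two operations, using only std (the helper name is made up):

```rust
use std::collections::VecDeque;

/// Illustrative helper: removes the front element, then the element at
/// `index` (counted after the first removal), and returns both.
fn pop_front_then_remove(
    queue: &mut VecDeque<i32>,
    index: usize,
) -> (Option<i32>, Option<i32>) {
    // Constant time: nothing is shifted.
    let front = queue.pop_front();
    // Shifts the elements on whichever side of `index` is closer to an end;
    // returns None if `index` is out of bounds.
    let middle = queue.remove(index);
    (front, middle)
}
```

Starting from 1 through 6 and calling it with index 2 takes out 1 and then 4, leaving 2, 3, 5, 6.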

The GhostCell research by @RalfJung et al. claims to avoid (some of?) the trade-offs of the standard library's cell types:

An apparently separate implementation[1] is on Crates.io:

Credit @mathstuf for pointing me to this on IRLO.


  1. "The official implementation can be found at https://gitlab.mpi-sws.org/FP/ghostcell/-/tree/master/ghostcell, along with examples. The current implementation will be upgraded soonish, now that I'm aware of it." (from the README)

In case it helps anyone, there exists an equivalent LCell<'id, T> type from the qcell crate, which also defines similar QCell<T>, TCell<Q, T>, and TLCell<Q, T> types which track the owner through runtime checks. The latter types can often be simpler to use than lifetime brands, while still keeping the property of being able to pass around a small (for QCell) or zero-size (for TCell and TLCell) owner object.
