Weird std lib implementations from zero overhead / C++ perspective


#1

Hi,

I’m very new to Rust, just working my way through the Book and trying out various smallish examples to get an impression of the language.

But I couldn’t help noticing some surprisingly inefficient or even strange implementations in std, from the point of view of a C++ person. It makes me wonder whether library modernisation talks are still happening and whether I should try to contribute :wink:

A few of the examples I noticed today are composing error types and std::time::Duration/Instant.

The current Duration/Instant implementation seems plain absurd. The split seconds/nanoseconds storage makes it heavier than it needs to be, and it introduces nontrivial overhead on every operation for normalisation. Duration also seems to be defined as non-negative only, which sounds like WTF? And then std::sys::unix::time::Instant has a completely different representation, which makes interop even more expensive.
Compare that with the C++ chrono implementation, where a duration is just an int64_t type-tagged with its precision, and a time_point is in turn a duration type-tagged with its epoch. This makes all (most) operations extremely cheap and efficient. On top of that, durations generalise easily and efficiently to calendar manipulation, as can be seen in https://github.com/HowardHinnant/date.
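
To make the size and signedness points concrete, here is a minimal sketch. `SignedNanos` is a hypothetical type made up for this comparison, not anything in std or in C++ chrono, and the exact size of `Duration` is whatever the platform gives it.

```rust
use std::mem::size_of;
use std::ops::Sub;
use std::time::Duration;

/// Hypothetical signed duration: a bare nanosecond count in one i64.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct SignedNanos(i64);

impl Sub for SignedNanos {
    type Output = SignedNanos;

    fn sub(self, rhs: SignedNanos) -> SignedNanos {
        // Plain integer subtraction with no normalisation step.
        // Overflow panics in debug builds.
        SignedNanos(self.0 - rhs.0)
    }
}

fn main() {
    // Duration stores whole seconds and sub-second nanoseconds separately.
    println!("std Duration: {} bytes", size_of::<Duration>());
    println!("SignedNanos:  {} bytes", size_of::<SignedNanos>());

    // Duration cannot go below zero, so this checked subtraction yields None.
    let std_diff = Duration::from_secs(1).checked_sub(Duration::from_secs(2));
    println!("std 1s - 2s = {:?}", std_diff);

    // The signed sketch simply produces a negative tick count.
    let diff = SignedNanos(1_000_000_000) - SignedNanos(2_000_000_000);
    println!("signed 1s - 2s = {:?}", diff);
}
```

The trade-off is range: a single signed 64-bit nanosecond count covers far less time than a separate 64-bit seconds field.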

Another example is propagating errors. From what I’ve seen so far in the Book, the recommended way is to re-wrap multiple errors at the library boundary into a composite enum and add an impl Error for it. But this looks quite a bit more labour-intensive and less generic than C++ std::error_code from <system_error> (I guess I should try implementing it in Rust some day; it might work here even better than in C++).

Some of the benefits of C++ system_error:

  • The type is fixed-size and lightweight: 2 words.
  • It is polymorphic in the underlying error type: we store the actual error code plus a pointer to a static object corresponding to the category.
  • It’s super cheap to test for concrete errors: just compare the actual code value and the category pointer directly.
  • It allows checking for more generic “error conditions” by dynamically mapping codes, but this is a more expensive check.
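
A rough sketch of how the first three points might translate to Rust. Everything here (`Category`, `ErrorCode`, the two static categories, and the numeric codes) is hypothetical and invented for illustration. It uses a plain struct with a function pointer rather than a trait object, since a `&dyn Trait` reference is already two words on its own.

```rust
use std::fmt;
use std::ptr;

/// Hypothetical error category: one static instance per error domain.
struct Category {
    name: &'static str,
    message: fn(i32) -> &'static str,
}

fn io_message(code: i32) -> &'static str {
    match code {
        2 => "no such file or directory",
        13 => "permission denied",
        _ => "unknown I/O error",
    }
}

fn parse_message(code: i32) -> &'static str {
    match code {
        1 => "unexpected end of input",
        _ => "unknown parse error",
    }
}

static IO_CATEGORY: Category = Category { name: "io", message: io_message };
static PARSE_CATEGORY: Category = Category { name: "parse", message: parse_message };

/// Hypothetical error value: an integer code plus a pointer to its category.
#[derive(Clone, Copy)]
struct ErrorCode {
    code: i32,
    category: &'static Category,
}

impl PartialEq for ErrorCode {
    fn eq(&self, other: &Self) -> bool {
        // Cheap concrete test: same code and same category address.
        self.code == other.code && ptr::eq(self.category, other.category)
    }
}

impl fmt::Display for ErrorCode {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let text = (self.category.message)(self.code);
        write!(f, "{}:{} ({})", self.category.name, self.code, text)
    }
}

fn main() {
    let not_found = ErrorCode { code: 2, category: &IO_CATEGORY };
    let other_domain = ErrorCode { code: 2, category: &PARSE_CATEGORY };

    println!("{}", not_found);
    println!("same error? {}", not_found == other_domain);
    println!("size: {} bytes", std::mem::size_of::<ErrorCode>());
}
```

The “error condition” mapping from the last bullet is left out of this sketch; it would need an extra function in `Category` that decides whether a code is equivalent to some generic condition.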

#2

int64 nanoseconds from the epoch doesn’t have enough range to be a good representation; it has “only” ~4 times the range of the int32 seconds behind the year 2038 problem (roughly ±292 years versus ±68 years). struct timespec has separate seconds and nanoseconds fields for the same reason, though C++ chrono does not.
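
The arithmetic behind that ratio can be checked with a few lines of Rust. The function names are made up for this sketch, and a year is assumed to be an average Gregorian year of 365.2425 days.

```rust
/// Seconds in an average Gregorian year (365.2425 days); an assumption of this sketch.
const SECONDS_PER_YEAR: i64 = 31_556_952;
const NANOS_PER_SECOND: i64 = 1_000_000_000;

/// Whole years representable on one side of the epoch by an i64 nanosecond count.
fn i64_nanos_range_years() -> i64 {
    i64::MAX / NANOS_PER_SECOND / SECONDS_PER_YEAR
}

/// Whole years representable on one side of the epoch by an i32 second count.
fn i32_secs_range_years() -> i64 {
    i64::from(i32::MAX) / SECONDS_PER_YEAR
}

fn main() {
    let wide = i64_nanos_range_years();
    let narrow = i32_secs_range_years();
    println!("i64 nanoseconds: about +/-{} years", wide);
    println!("i32 seconds:     about +/-{} years", narrow);
    println!("ratio:           about {:.1}x", wide as f64 / narrow as f64);
}
```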


#3

If you look closer, the standard hardcodes neither precision nor accuracy.
The current default choice is int64 nanos, which seems to cover an interesting range of history with excessive precision.

But it’s super easy to either increase the bit width (int128_t?) or reduce the precision if you want to work with historical periods. https://github.com/HowardHinnant/date is actually a great example: it defines days as an int32 with 1-day granularity.

Although I think this particular trick isn’t doable in Rust yet: I don’t think it can take integral values as type parameters yet. Though it could be emulated, I guess.
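
One way such an emulation might look is a marker type per precision, with the ratio carried in associated constants of a trait. `Period`, `Ticks`, and the marker types `Nano`, `Milli`, and `Day` are all hypothetical names for this sketch; it is not how any existing crate does it.

```rust
use std::marker::PhantomData;

/// Hypothetical trait: the tick period as a compile-time ratio NUM/DEN of a second.
trait Period {
    const NUM: i64;
    const DEN: i64;
}

struct Nano;
struct Milli;
struct Day;

impl Period for Nano {
    const NUM: i64 = 1;
    const DEN: i64 = 1_000_000_000;
}

impl Period for Milli {
    const NUM: i64 = 1;
    const DEN: i64 = 1_000;
}

impl Period for Day {
    const NUM: i64 = 86_400;
    const DEN: i64 = 1;
}

/// A signed tick count tagged with its period; the tag takes no space at run time.
struct Ticks<P: Period> {
    count: i64,
    _period: PhantomData<P>,
}

impl<P: Period> Ticks<P> {
    fn new(count: i64) -> Self {
        Ticks { count, _period: PhantomData }
    }

    /// Convert to another period. Truncates toward zero and does not guard
    /// against overflow of the intermediate product.
    fn convert<Q: Period>(self) -> Ticks<Q> {
        Ticks::new(self.count * P::NUM * Q::DEN / (P::DEN * Q::NUM))
    }
}

fn main() {
    let two_days = Ticks::<Day>::new(2);
    let as_millis: Ticks<Milli> = two_days.convert();
    println!("2 days = {} ms", as_millis.count);

    let as_nanos: Ticks<Nano> = as_millis.convert();
    println!("2 days = {} ns", as_nanos.count);
    println!("size of Ticks<Nano>: {} bytes", std::mem::size_of::<Ticks<Nano>>());
}
```

Mixing `Ticks<Day>` with `Ticks<Milli>` without an explicit `convert` is a type error, which is the property the C++ tagging gives you.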


#5

The time library in the standard library is incomplete. The Rust Project Developers offer a more complete variant, the time crate, which features a better Duration that supports negative durations. In addition to that, there is also a chrono library for Rust.


#6

This is really confusing… I would never have looked for other time libraries when the standard library already offers an API.
Will time or chrono ever be merged into the standard lib?


#7

Better alternatives to standard library implementations exist in almost every language.

I would imagine that, once they have matured, the advantages of these external libraries will eventually be merged into the standard library.