Rust beginner notes & questions

haha, totally agree.

to me, this is what peter_bertok is trying to get at, and that in his view rust is already starting to slip into some bad habits that led to complicated c++, the latter of which has had many years to accumulate such cruft. Perhaps that's just the nature of an aging programming language that has to evolve with other advances (we don't really know since this is all pretty new to us as humans, maybe it'll be normal in a few hundred years that programming languages just need to die :woman_shrugging: )

I tend to think about things in terms of "sophistication == actively refining simplicity" and "complication == actively emerging conflict"... and "simplicity" as "achieving a goal with least resistance". I think here the goal is for Rust to be sophisticated without being complicated. Writing code is really a struggle between sophistication and complication, and it's never-ending.

2 Likes

My point is that these bad design decisions were entirely predictable, and could have been avoided with a tiny bit of experience and foresight. It's just a matter of inspecting the history of other languages.

In my experience, most popular languages go through a life-cycle that goes something like:

  1. No templates, trivial APIs. Everybody is impressed at how lightweight and simple everything is. There are many convenience wrappers around OS concepts such as the POSIX or Win32 file descriptors wrapped in a convenient Stream class, hiding some of the 1960s legacy behind a thin facade. A half-arsed attempt at i18n is grudgingly included, but there is clearly an anglo-centric feel to the language. That's okay, all the initial developers are in the western world, so you can get away with this. This is 1990s C++, C# 1.0, Java 1.0, and Go currently.
  2. That template itch just won't go away, so it is hacked into the language. A lot of APIs are duplicated, such as IEnumerable vs IEnumerable<T> in C#. Legacy APIs like Stream are left byte-only, because it's just too hard to fix it now. This is C# 2.0, Java SE5, early 2000s C++, etc...
  3. There's a grudging acceptance that the rest of the world will refuse to learn English, so the i18n APIs are aggressively expanded. Now there are two versions of the String APIs, one with the old defaults and one with the new comparison options. There are usually several of each of the date, time, and calendar types now, because the real world is complicated. Sometimes things just get shredded because an intermediate library hasn't been updated to handle DateTimeOffset or whatever. Bad luck!
  4. Turns out templates are hard! They need all sorts of restrictions such as being constrained to value (copy) types, types that implement a particular API, and so forth. The C++ guys are still trying to work this out. Rust has had this since before v1.0 via Traits. <- You are here
  5. At some point someone realises that with the advances made in step #4, a lot of APIs can be rewritten to be vastly more elegant. "Obviously", making things like Stream a lightweight synchronous wrapper around a byte-only file descriptor was a mistake, so now the entire language is slowly reinvented piece by piece, leaving an absolute mess of incompatible APIs next to a bunch of legacy garbage. Modern C++, dotnet core 2.x, and Java 10 are here now.
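
For readers who haven't seen what step 4 looks like in Rust, here is a minimal sketch of a generic function constrained by trait bounds. The function name `describe_twice` is made up for illustration; `Copy` and `Display` are real standard library traits.

```rust
use std::fmt::Display;

/// Hypothetical example: accepts only types that are cheap to copy
/// *and* implement the `Display` API.
fn describe_twice<T: Copy + Display>(value: T) -> String {
    let first = value; // copies the value
    let second = value; // still usable, because T: Copy
    format!("{} {}", first, second)
}

fn main() {
    println!("{}", describe_twice(7));
    println!("{}", describe_twice('x'));
    // describe_twice(String::from("no")); // would not compile: String is not Copy
}
```

The bounds are checked where the function is called, so a type that doesn't satisfy them is rejected at compile time rather than deep inside the generic body.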

First, please read this article because it spectacularly illustrates my point: Pipelines - a guided tour of the new IO API in .NET

The dotnet core guys came up with this, and it's great, but now 99% of the code out there is based on the old stuff, so this won't be used much. Third-party libraries and large enterprise systems will continue to use Stream, directly or indirectly. For example, XmlReader will probably never get properly rewritten in terms of the Pipe API, because it would be a breaking change to make proper use of the efficiencies, such as allowing consumers to use Span<char> instead of heap-allocated String instances. It's too late. The language was built up incrementally, and you'd have to make a new -- incompatible -- version to really make use of the features. (Just look at Python 2 to 3 or Perl 5 to 6 to see how easy that is!)

We know what a stream API should look like in any language that's gone past stage #4. My point is that Rust essentially started at stage #4, its developers had all of this history to reference, yet std::io::Read looks very much like the C# 1.0 Stream API that is finally getting replaced. It inherently makes copies, it is inherently byte-based, and it mixes in unrelated string APIs that prevent a backwards-compatible upgrade to a template-based version. Etc, etc...

In fact, as an API the Rust version is objectively worse, to the point that I can 100% guarantee you that it must be eventually thrown out and replaced by something better thought out.

For example, the Rust Read trait forces UTF-8 on you, so even if you just want to read UTF-16 out of a binary stream, then you... have to go down a completely different API path! Err... wat? Even C# 1.0, back in 2002, got this right! It has separate TextReader classes to wrap byte streams in a specifiable encoding. The underlying Stream class makes no such i18n assumptions. Remember... not everybody speaks English and not everything is UTF-8, no matter how hard we want this to be true!

This is what disappoints me about Rust. It got a running start compared to other languages, it was developed with decades of history to reference, yet it seems to insist on repeating the same mistakes...

PS: I'm not the only one with this point of view: https://www.reddit.com/r/programming/comments/8vjjgu/pipelines_a_guided_tour_of_the_new_io_api_in_net/e1ow1an/

PPS: I take it all back, I just had a play with the C# Pipelines API, and it turns out that it is not template based, its data stream is always made up of bytes. I was tricked by seeing SomeClass<byte> in code samples, but that's just code reuse. The API itself does not generalise to other value types such as char or whatever. Sigh...

6 Likes

A more accurate moniker is "difficult for beginners". Once you know the language, you are as fluent in it as in any other (even more so in comparison with some like C++ where you have to stop and look over your shoulder every now and then).

2 Likes

Note that I consider this a far lower "cognitive tax" than having to run the borrow checker entirely in my own brain like I do in C++. As a result I'll at least try borrowing things (and .clone() if it gets awkward) in Rust where in C++ I'd just copy-construct from the get-go since it's so hard to be sure I didn't break something.
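
To make the "try borrowing, and .clone() if it gets awkward" approach concrete, here is a small sketch. The function names `longest` and `longest_owned` are invented for this example.

```rust
/// Borrows the slice: no copy is made, and the compiler checks that
/// the returned reference cannot outlive `words`.
fn longest(words: &[String]) -> Option<&String> {
    words.iter().max_by_key(|w| w.len())
}

/// The `.clone()` escape hatch: returns an owned copy, so the result
/// is no longer tied to a borrow of `words`.
fn longest_owned(words: &[String]) -> Option<String> {
    longest(words).cloned()
}

fn main() {
    let mut words = vec!["borrow".to_string(), "checker".to_string()];
    let owned = longest_owned(&words).unwrap();
    words.clear(); // fine: `owned` does not borrow from `words`
    println!("{}", owned);
}
```

Had `main` kept the reference returned by `longest` alive across `words.clear()`, the compiler would reject the program, which is the check I'd otherwise have to do in my head in C++.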

It's certainly true that designing in a way that best leverages the checks is something that needs to be learned, and that some things (certain kinds of graphs seem to be the usual example) just aren't an elegant fit for the model.

It'd be great to hear exactly what they were from them. This thread has a ton of things in it, from tiny to philosophical, so I suspect their list would be somewhat different.

1 Like

Some things can't be changed now, so there's no point in stressing about them. Some other things can be fixed or added with a deprecation. So my suggestion is to write down a very long list of the things you don't like, remove the ones you believe are impossible to change, and open a separate RFC for each of the rest for the Rust 2018 edition (or where possible even for Rust 2015). And then be humble when people point out practical problems. Most ideas will be closed or shot down, but if even very few see the light in Rust 2018 you will have a positive and multiplicative impact on future Rust users :slight_smile:

10 Likes

I’m curious if you’ve been able to get over your std::io::Read gripe and look at Rust some more, beyond your initial post in this thread.

I also think it’s incredibly unrealistic to expect a language and/or its stdlib to be flawless, regardless of how many have come before. Doubly so if you actually intend for people to use it rather than sit in someone’s imagination.

2 Likes

Perhaps you could start an RFC and we could all iterate on it to create a "proper" Stream API for Rust? I think you've pointed out a lot of really good ideas and points that should be addressed. I pretty much agree with your analysis, though I would not have had the foresight to categorize the issues so succinctly.

1 Like

Rust has many warts. Its feature-set does appear less cohesive and consistent when compared to some languages (C# comes to mind). There are countless other problems mainly due to its youth. Then there's the borrow checker. So feeling negative emotions is almost a rite of passage for a Rust beginner. I don't want to belittle your feelings, but in my experience once you get past this initial despair and when you get to coding for production instead of doing toy programs is when you come to appreciate what a life saver and how brilliant this language is. This is because Rust's strengths which far outweigh its faults unfortunately become apparent only when you've done any real-world work in it. And that'll perhaps forever be Rust's curse.

8 Likes

C# is a very high bar, it's one of the best designed languages out there (still, I think Rust is better for my usages).

3 Likes

I love Rust - let me preface with that.

Unfortunately, I think the tears at the seams (i.e. seeming incohesion/inconsistency) are visible at both the early/beginner stage and also at a later stage, although they're for different reasons. I am very hopeful that, over time, they'll be ironed out to a point where they're barely noticeable.

That said, no language is perfect. Rust is doing something novel, certainly so for any language that can be called mainstream. I think it's understandable that it'll have some growing pains, both at the lang and stdlib levels. There's just no way around it. I think the community's (and the Rust core team's) priorities align very well with mine (i.e. robustness, expressiveness, correctness to borderline pedantry, and performance).

Rust will definitely not be for everyone, just like no other language is universally praised or liked. The people that will like it are the ones holding the same core values as Rust and willing to put in the time to learn it, with all its quirks and idiosyncrasies.

4 Likes

Absolutely. I would say the same about C#. That said, one thing that can be mentioned in Rust's defense is that C# has GC, which makes a lot of decisions easier. We only have to look at the state of the art prior to Rust when it comes to non-GC languages to appreciate that Rust is a huge improvement. That said (recursively), not all of Rust's difficulty or unsightly parts stem from the memory-management challenge it has set for itself.

4 Likes

Yup. The rough edges don't really go away with experience. However, you kinda learn to live with them since you get so much in return.

I also happen to really like C#, and used to use it quite extensively. But, it's also not perfect :slight_smile:. The bifurcation of reference vs value types is there, and there are some footguns with using value types. Some of you might remember how lambdas used to desugar in for loops, capturing the value only for the last iteration in some cases (that was fixed at some point). The .NET standard lib used to be extremely allocation happy and it was a challenge to write performant code. Although methods were sealed by default, classes weren't which ought to have been the better default. C# made the same mistake as Java of having a volatile field keyword, which is completely backwards in today's thinking - it's not the field that's volatile, but accesses to it (and each of those may have their own memory ordering requirements to boot). There was no good memory model for the language (not sure if there is one now, on par with say Java's memory model). null is still present. Exceptions are unchecked, which is fine if you hate incorrect usage of Java style checked exceptions, but makes it incredibly difficult to write robust code. And so on.
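
As an aside on the volatile point: Rust's atomics take the "ordering belongs to the access" view, since every load and store names its own memory ordering. A minimal sketch follows; the function name `publish_and_read` is made up, while the atomic types and `Ordering` variants are from the standard library.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical demo: one thread publishes a value, the caller waits for it.
fn publish_and_read() -> u32 {
    let data = Arc::new(AtomicU32::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let writer = {
        let (data, ready) = (Arc::clone(&data), Arc::clone(&ready));
        thread::spawn(move || {
            data.store(42, Ordering::Relaxed); // payload write, no ordering of its own
            ready.store(true, Ordering::Release); // publishes the writes before it
        })
    };

    // This Acquire load pairs with the Release store above.
    while !ready.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
    let value = data.load(Ordering::Relaxed);
    writer.join().unwrap();
    value
}

fn main() {
    println!("{}", publish_and_read());
}
```

The same `ready` flag could be read with a different ordering elsewhere, which is exactly what a per-field `volatile` keyword can't express.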

7 Likes

It's definitely not all, but it's very close to it I think. Particularly if you generalize "memory management" to soundness. A lot of the difficulty is being a low level language trying to marry high level features while being sound at compile time. That's a very hefty (and praiseworthy!) goal.

4 Likes

I think to call C# a really well-designed language is a bit of an overstatement. As far as I'm concerned (and I like both Java and C# for what they are) it is just Java with a little better support for value types. They frankly got it wrong with exceptions (as you mentioned). They got it wrong in how they handle "null" (as you mentioned). They got it wrong wrt volatile (as you mentioned). It really is only a marginal, at best, improvement (and that is even debatable) over Java. I really can't see much real advantage of C# over Java for most cases. Java tends to push as much as possible to libraries/JDK whereas C# tends to incorporate new language features more often, but I really don't see that one is necessarily that much better or worse than the other. I prefer Java exception handling over C# exception handling, but I like the Rust way of error/alternative handling even better. I think Rust is getting A LOT of things right, but there is definitely room for improvement and the comments by the OP can help inform the discussion (even if they do, at first reading, come off a little snide or combative).

5 Likes

IMO, C# is a well-designed language. Is it perfect? No, as mentioned. But I don't know any perfect language. I've not followed it too closely in the last few years, but I recall in the beginning there was nice consistency and "flow" to features added in version N and how they enabled something else in version N+1. There's a lot right about C# if you don't mind a GC/JIT/managed runtime.

To call C# "just Java with a little better support for value types" is ... disingenuous at best :slight_smile:. I really don't want to sidetrack this thread into Java vs C# (or Rust vs C#, for that matter), so I'll stop here. But I've used C# and Java extensively, and their comparison ends right on the surface for me.

5 Likes

std::io::Read doesn't force UTF-8. In fact, it does not imply any encoding. It's just a stream of bytes. It can be a UTF-8 encoded text file from local disk, EUC-KR encoded HTML from a gunzip stream, or even a JPEG-encoded picture of a kitten from the internet.
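
A quick sketch to illustrate: the helper below (the name `read_all_bytes` is made up) drains any `Read` implementor without ever interpreting the data as text.

```rust
use std::io::{self, Read};

/// Hypothetical helper: drains any reader into a Vec<u8>.
/// No text encoding is involved at any point.
fn read_all_bytes<R: Read>(mut reader: R) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    reader.read_to_end(&mut buf)?;
    Ok(buf)
}

fn main() -> io::Result<()> {
    // A JPEG file starts with 0xFF 0xD8 0xFF; that is not valid UTF-8.
    // `&[u8]` implements Read, so it stands in for a file or socket here.
    let not_text: &[u8] = &[0xFF, 0xD8, 0xFF, 0xE0];
    let bytes = read_all_bytes(not_text)?;
    println!("{:02X?}", bytes);
    Ok(())
}
```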

Read is used as a low-level abstraction in an I/O context. It only cares about bytes, because everything in memory is bytes! An arbitrarily typed generic iterator, which should be a std::iter::Iterator, can be constructed on top of it.
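
Here is a rough sketch of such an iterator, assuming you want little-endian 16-bit units (for example UTF-16LE code units). The `U16Le` adapter is hypothetical, not something from std.

```rust
use std::io::{self, Read};

/// Hypothetical adapter: yields little-endian u16 values from any reader.
struct U16Le<R> {
    inner: R,
}

impl<R: Read> Iterator for U16Le<R> {
    type Item = io::Result<u16>;

    fn next(&mut self) -> Option<Self::Item> {
        let mut buf = [0u8; 2];
        // Read the first byte on its own so that a clean end of input
        // simply ends the iteration.
        loop {
            match self.inner.read(&mut buf[..1]) {
                Ok(0) => return None,
                Ok(_) => break,
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => return Some(Err(e)),
            }
        }
        // A missing second byte is a real error (UnexpectedEof).
        if let Err(e) = self.inner.read_exact(&mut buf[1..]) {
            return Some(Err(e));
        }
        Some(Ok(u16::from_le_bytes(buf)))
    }
}

fn main() {
    // "hi" encoded as UTF-16LE.
    let bytes: &[u8] = &[0x68, 0x00, 0x69, 0x00];
    let units: Vec<u16> = U16Le { inner: bytes }
        .collect::<io::Result<Vec<u16>>>()
        .unwrap();
    println!("{:?}", units);
}
```

Reading one byte at a time like this is slow on an unbuffered source, so in practice you'd probably wrap the reader in a `BufReader` first.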

I think what gives you that impression is std::io::BufRead::read_line(), which assumes that the input stream is UTF-8 encoded. This is just a simple shortcut for the common case, as most streams we handle line by line are UTF-8 encoded. But if that's not your case, you can always bypass such high-level APIs and handle bytes directly.
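
For example, `BufRead::read_until` gives you line-by-line reading on raw bytes. A minimal sketch, where `read_raw_line` is a made-up helper and the input is Latin-1 text that is not valid UTF-8:

```rust
use std::io::{self, BufRead, Cursor};

/// Hypothetical helper: reads one line as raw bytes,
/// stripping the trailing b'\n' if present.
fn read_raw_line<R: BufRead>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut line = Vec::new();
    reader.read_until(b'\n', &mut line)?;
    if line.last() == Some(&b'\n') {
        line.pop();
    }
    Ok(line)
}

fn main() -> io::Result<()> {
    // "café\n" in Latin-1: the byte 0xE9 on its own is not valid UTF-8.
    let latin1: &[u8] = b"caf\xE9\n";

    // read_line rejects this input with an InvalidData error...
    let mut text = String::new();
    let err = Cursor::new(latin1).read_line(&mut text).unwrap_err();
    assert_eq!(err.kind(), io::ErrorKind::InvalidData);

    // ...while read_until hands the bytes back untouched.
    let raw = read_raw_line(&mut Cursor::new(latin1))?;
    assert_eq!(raw, b"caf\xE9".to_vec());
    Ok(())
}
```

From there you can decode the bytes with whatever encoding the data actually uses.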

2 Likes

read_to_string (Read in std::io - Rust) is what doesn't really belong there, but I suspect it was added as a convenience. An implementation that doesn't have UTF-8 strings internally can return an error for that method, but that method ought not to be there in the first place.

1 Like

I honestly wish I could do exactly that, but I don't use Rust enough to really contribute meaningfully. I've dabbled with it just long enough to determine that it won't help me in any future projects.

Right now, for the kind of work I'm doing, the runtime overhead of C# for me is relatively unimportant compared to its productivity, which is the best of any language I've personally used. My next step up would be switching to F# on dotnet core, as that would both boost my productivity and performance significantly. The extra 20%-50% runtime performance from Rust just isn't worth it compared to the drop in productivity.

For example, Rust Windows interop is... not pretty right now. There just aren't the same kind of pre-packaged, ready-to-use wrappers around the Win32 APIs that C# has. Does it have the ability to call COM yet? DCOM+? Can you create a socket server with Active Directory Kerberos authentication? Can I validate a certificate against the machine trust store? Last time I checked, there were blocking issues for most of my use-cases, and to be honest I gave up after getting bogged down in all the niggling little issues related to UTF-16 string handling.

At the end of the day, 90% of desktops are still Windows, and well over 50% of all enterprise servers run it too. Rust is very Linux/POSIX centric. All the performance or safety in the world doesn't help if I can't get off the ground and make productive progress on a useful project...

2 Likes

Regarding Read and UTF-16: you can always write an extension trait which implements convenience UTF-16 methods while using raw byte IO under the hood. Should UTF-16 methods, or methods which accept different encodings, be in std? Personally I don't think so, but it's a good idea for a crate. (maybe it already exists?)
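
Something along these lines, as a rough sketch. The trait `ReadUtf16Ext` and its method are invented for illustration and exist neither in std nor (as far as I know) in any particular crate; it assumes little-endian input and reads the whole stream at once.

```rust
use std::io::{self, Read};

/// Hypothetical extension trait layered on top of raw byte IO.
trait ReadUtf16Ext: Read {
    /// Reads the rest of the stream as UTF-16LE and converts it to a String.
    fn read_utf16le_to_string(&mut self) -> io::Result<String> {
        let mut bytes = Vec::new();
        self.read_to_end(&mut bytes)?;
        if bytes.len() % 2 != 0 {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "odd number of bytes in UTF-16 stream",
            ));
        }
        let units: Vec<u16> = bytes
            .chunks_exact(2)
            .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
            .collect();
        // Unpaired surrogates are reported as InvalidData.
        String::from_utf16(&units).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
    }
}

// Blanket impl: every reader gets the method for free.
impl<R: Read + ?Sized> ReadUtf16Ext for R {}

fn main() -> io::Result<()> {
    // "héllo" encoded as UTF-16LE.
    let mut input: &[u8] = &[0x68, 0x00, 0xE9, 0x00, 0x6C, 0x00, 0x6C, 0x00, 0x6F, 0x00];
    let text = input.read_utf16le_to_string()?;
    println!("{}", text);
    Ok(())
}
```

Because of the blanket impl, bringing the trait into scope is enough for files, sockets and in-memory buffers to pick up the method, and nothing in std has to change.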