Awkward guessing of types in Rust

As someone else who had a major struggle learning Rust, even after decades of programming in dozens of other languages, I think the above quoted comment is a bit harsh. I recommend reading what has become one of my favorite testimonials about learning Rust. I find particularly noteworthy this observation/question near the end:

What if Rust is only the means to teach good programming, by forcing good style on its users?

5 Likes

Yes but then every language claims that, in one form or another. I have seen them all come and go: "recursion is the solution, program as data is the solution, object oriented programming is the solution, declarative programming is the solution, functional programming is the solution....".

I am playing the devil's advocate here to stimulate discussion on behalf of those who might try Rust and fail. Isn't Rust just saying: "extreme typing is the solution"? Discuss.

1 Like

No, Rust is not saying "extreme typing is the solution" any more than Ada or other strongly-typed languages. For me Rust is saying that attention to ownership, and mutation and other access rights, and the lifetimes thereof, is an essential element in writing correct code for the age of multi-core processors (rather than the PDP-11-like single-core derivatives that sequential programming evolved to serve).

9 Likes

To paraphrase the idea: perhaps it's not Rust that is hard, rather, writing correct programs is what is hard.

6 Likes

Ha! Yes.

However I have been banging my head against the issue of writing correct programs for over three decades. In application areas from embedded system process control to secure military communications systems to avionics controls. Using languages like Coral, PL/M, Lucol, Ada and C, to C++ in modern times.

Which is why I'm very enthusiastic about Rust. Finally a language that addresses many of the concerns I have had with all the above and more. The first interesting language I have learned since I discovered ALGOL at uni or the advanced features that I never suspected were in JavaScript.

Turns out Rust is a rather large language. It's a lot to get one's head around. A lot of the features, and the terminology used to describe them, are alien if one comes from a background of "normal" languages such as the ones I listed above.

As an example, things like lambda functions and closures are something I only became familiar with when having to use JavaScript a couple of years ago.

No, in this discussion it's the difficulty of getting into the Rust language itself and its libraries that is the topic for me.

As it happens, I thought I was getting along rather well. In the two or three months since I discovered Rust I already have a system up and running that is Rust end to end: from the remote embedded sensor nodes, to the services they connect to in the cloud, to the Rocket web server and database interfaces. In fact we are betting the whole company on Rust at this moment. A total switch from Node.js and C++.

All that without having any idea about those weird lifetime annotations or how to write macros! Let alone the async/await stuff and a lot more.

Hmm...perhaps I should take time out from creating stuff in Rust for more detailed study of the language itself :slight_smile:

4 Likes

Maybe it's just a question of familiarity or background, then?

I don't consider myself an expert in Rust by any means, but I never felt I was struggling with the whole language per se. There have been a few moments of irritation with lifetimes, of course, but which language doesn't cause any irritation?

I have to admit though that I have been programming in C++ for a long time before having discovered Rust, and all the idioms C++ programmers try to adopt in order to increase the chance of writing correct code are basically what Rust's ownership/borrowing rules codify in a more unified, formalized manner.

1 Like

Well there we go. C++ is also a huge and complex language. I don't believe there is a human alive who understands all the details of its syntactic and semantic parts and how they all interact. I have seen Bjarne himself faltering at understanding a problem with a few lines of C++ on a screen at a presentation. (Judging from some of Bjarne's statements, he himself is worried that C++ has jumped the shark.)

Despite having used C++ extensively I'm sure that if I was quizzed on it you would find I know almost nothing about it. Template metaprogramming, the subtleties of inheritance and virtual functions, etc., etc.

Frankly, I decided years ago that C++ had long since passed the point of absurdity in complexity, without offering anything new that I wanted and while building on foundations of sand. I did not actually want to learn all that stuff anymore. Since then they have added things like lambdas and closures and move semantics... I'm out of that asylum.

Rust on the other hand has shone a beacon calling me from the beginning, all the way from the simplicity of main(), to the cargo system, to its rigorous type system, to its excellent error messages, to its C-like performance and most importantly its emphasis on memory safety. I will be studying it a lot more, if I get time off from actually using it :slight_smile:

That whole deal with ownership does not seem to be a big deal, I love that the compiler enforces the rules. I have been writing threaded code since forever so I get the idea. My first multi-processor system experience was a rack of 8 bit Intel 8085 processors that had shared RAM and all the code written in assembler. The only mutual exclusion issue I remember from that time was a bug in the hardware of the RAM arbitration logic that I fixed!

4 Likes

I’ll share that I find using an IDE with first class support for Rust absolutely essential to my experience. I love being able to pull up type information at will for exactly this reason. I’ve been really, really pleased with CLion, and I’ve certainly learned faster because of it.

2 Likes

I find it to be the exact opposite. I've been on a C codebase at work, trying to parallelize it and fix some bugs. Runtime errors are the absolute best case scenario because most of the time you can just backtrace them. Usually what happens instead is you just get garbage data because function write_foo was reading from array while function spawn_task was filling it with the next set of data. Except you didn't know array was being shared among threads because there's no type level representation of that kind of information, so you spent a week looking for bugs in your logic before figuring it out, once a user detected the bug which you never found because it's data-dependent and none of your test cases happened to exercise it. By the way, array is actually buried deep in a struct that contains both shared and non-shared data of the same type, and constness doesn't propagate through pointers, so once you've identified the problem, there's no good way to make it read-only so the compiler can help you fix it; you just git grep for things that look like they might be modifying array in a bad way and hope you didn't miss any.

(None of the above is an exaggeration of real events.)
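For contrast, here is a minimal sketch of how the same situation tends to look in Rust. The names write_foo and spawn_task are borrowed from the story above; their bodies, the u32 element type and the buffer size are made up for illustration. The only point is that the sharing shows up in the function signatures, so a reader cannot miss it.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Fills the shared buffer with the next batch of data on another thread.
/// The `Arc<Mutex<..>>` in the signature says "this crosses threads".
fn spawn_task(buffer: Arc<Mutex<Vec<u32>>>, batch: u32) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        // The data can only be reached through the lock.
        let mut data = buffer.lock().unwrap();
        for slot in data.iter_mut() {
            *slot = batch;
        }
    })
}

/// Reads the buffer while holding the lock, so no writer can be mid-update.
fn write_foo(buffer: &Mutex<Vec<u32>>) -> u32 {
    let data = buffer.lock().unwrap();
    let total: u32 = data.iter().sum();
    total
}

fn main() {
    let buffer = Arc::new(Mutex::new(vec![0u32; 4]));
    let handle = spawn_task(Arc::clone(&buffer), 7);
    handle.join().unwrap();
    println!("sum = {}", write_foo(&buffer));
}
```

Handing a plain `&mut Vec<u32>` to the spawned thread while the main thread keeps reading it would be rejected at compile time, which is exactly the week of debugging described above.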

I'd love to be writing this in Rust instead. Yeah, I'd get way more compile time errors, but that's a good thing. Actually, it's a great thing. The sooner I know there's a problem with my code (... or someone else's), the quicker I can fix it, and the more information I have about it, the better. Far from being baroque, I find Rust's error messages to be surprisingly direct and helpful. (If there's a particular situation that gives a distinctly confusing error message, feel free to file a bug against the compiler; they take that kind of thing quite seriously.) Now C++ template substitution errors, those are baroque! :wink:

Back to the main topic, I mostly use Vim myself but I have given the IntelliJ Rust plugin a try and it is quite nice. It removes a lot of the guesswork around types, so if you find that to be an obstacle, I would definitely recommend giving it a try. IntelliJ (not CLion) is free and cross-platform.

6 Likes

Experience shows that in general this is not the case.

Personal anecdote:

On the day of the rollout of a system comprising over 100 embedded compute nodes I discovered a bug in their firmware that could cause a slow memory leak and eventually failure. I calculated that the whole network would go down in about 7 days. It took me those 7 days to find the bug, devise a fix and test it all properly. On the 7th day the system went down, but I had the fix ready. Installing the fix involved a week-long site visit in a different country. All in all an expensive disaster that would not have happened if we could have used Rust.

Industry experience:

Microsoft attributes 70% of security vulnerabilities in the OS software over recent years to memory use errors that could have been prevented using Rust: /blog/2019/07/a-proactive-approach-to-more-secure-code/

Similar bug rates due to such errors are also reported for the Linux kernel and other major projects.

All in all we see that although Rust's checks may be annoying to some, they have the potential to prevent masses of problems and hence save a lot of wasted time and expense.

8 Likes

I personally think the problem is the tendency to optimize the writing of code and not the reading of code. That and the tools that you choose to use.

In the simple case below:

  let v1 = vec![1, 2, 3];
  let v1_iter = v1.iter();
  let total: i32 = v1_iter.sum();

You must know that vec! returns type Vec and track that type of v1 throughout the code, until of course it is reassigned later in the function if a developer chose to do that.

You could have written:

  let v1: Vec<i32> = vec![1, 2, 3];
  let v1_iter: std::slice::Iter<'_, i32> = v1.iter();
  let total: i32 = v1_iter.sum();

Of course this is so basic in Rust, one would likely argue that if you don't know that vec! returns Vec<??> you should be learning more before complaining.

But if you are the type of developer who works with constantly evolving and DRY code where you have to look at each method to understand what each variable type is, it can quickly make reviewing code extremely painful.

In fact, the code bases I currently maintain, which are mostly in C# and Typescript would never tolerate someone writing the code:

  var myLocalValue = myFunction();
  myLocalValue.doSomething();

I think the only type of development where this would be acceptable is if you are not creating DRY code, have a very small list of APIs that any of your code calls directly, and the cost to add the type information is greater than the cost of every programmer who may need to understand that code learning the APIs like the back of their hand so they can quickly review and understand a set of code.

As for tools, I had been trying to get up to speed with vim. But I ended up just using Visual Studio Code. Visual Studio Code is fantastic for jumping to the source code you are referencing so you can understand the types you are working with. I still prefer specifying the types I am working with for my local variables and having the compiler confirm everything is set, versus writing code that looks like JavaScript that is checked and compiled.

A side note on the trick mentioned in this post of writing invalid Rust code and using the compiler error to learn a type: that strategy is specifically outlined in the Rust book, The Rust Programming Language. It might be worth a read.
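If you would rather not break the build to find out a type, here is a small sketch of a run-time alternative using std::any::type_name. The helper type_of is a name I made up for this post, and the exact strings it prints are not guaranteed to stay the same between compiler versions, so treat it as a debugging aid only.

```rust
use std::any::type_name;

/// Hypothetical helper: returns the compiler's name for the type of `_value`.
fn type_of<T>(_value: &T) -> &'static str {
    type_name::<T>()
}

fn main() {
    let v1 = vec![1, 2, 3];
    let v1_iter = v1.iter();

    // Print the inferred types instead of guessing them.
    println!("v1:      {}", type_of(&v1));
    println!("v1_iter: {}", type_of(&v1_iter));

    let total: i32 = v1_iter.sum();
    println!("total:   {}", type_of(&total));
}
```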

I am still very new to Rust so I might be missing something. But I have been writing code in a variety of languages since 1991. In that time, I have found that not knowing the type information by just reading a line of code is costly. You can practically measure the efficiency of a language for a developer by how much time they have to spend searching to understand a line of code. The absolute worst I have experienced is a large undocumented JavaScript project where you actually have to find and inspect the unit tests to understand the exact intent of a function.

1 Like

So you are basically arguing against type inference. Note that Rust only has local type inference, which strikes a balance between ease of writing and ease of reading. Top-level functions and trait impls must always have full type information explicitly spelled out. Experience shows that this is way more important than having explicit types for each variable (or even every temporary expression) inside a function, since once you know the context, you know a lot more about what to expect inner expressions to be typed as.
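To make the "local only" part concrete, here is a small sketch. The function total_length is invented for this post. Its signature has to spell out every type, while the locals inside the body are left to inference.

```rust
/// The signature is fully explicit: parameter and return types are mandatory.
fn total_length(words: &[&str]) -> usize {
    // Inferred: an iterator yielding `usize` values.
    let lengths = words.iter().map(|w| w.len());

    // Inferred as `usize`, because `usize` values are added to it below.
    let mut total = 0;
    for len in lengths {
        total += len;
    }
    total
}

fn main() {
    // "type" has 4 characters and "inference" has 9.
    println!("{}", total_length(&["type", "inference"]));
}
```

A reader who has seen the signature already knows what goes in and what comes out, which is the context that makes the unannotated locals easy to follow.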

And indeed, I don't find the "but you have to know that vec![] has type Vec" argument convincing at all (see my comment about very common APIs above).

3 Likes

I think that for many who are new to Rust, using IntelliJ at first is helpful. Below is a comparison of what you see in VSCode (top) vs. IntelliJ IDEA (bottom). Yes, in VSCode you can just hover to see the types, but early on, when you don't have reliable intuition about the type of most things, it's nice to just have the types (and argument names) for everything appear magically. Note that the code is not actually changed -- it's just presented to you as if you had entered the type annotations.

After you have been coding for a while in Rust, I suspect that that will become visual noise that doesn't add much value, because you "just know" what is going on. I'm not there yet, personally. But perhaps at that point you will prefer to just see the code as it is and hover over anything you need explained.

[Screenshot: src/main.rs of a hello_world project, shown in VSCode (top) and IntelliJ IDEA (bottom)]

I'd really like to be able to just turn that on and off with a key combo in VSCode. :slight_smile:

4 Likes

I am starting to use VSCode in the hope of this and other great things yet to come.
Oh, unfortunately it needs RLS and that does not work with nightly, so I am back to square one.

Vec was a simple example to illustrate something that anyone would understand. But if I am navigating into other code to understand it, as I am doing with diesel right now, I am not dealing with such common types as a vector. Diesel is actually pretty straightforward compared to something like hyper, which I was looking at when I wrote the original post.

I always try to optimize the reading / understanding of code rather than the writing of it. And the time it takes to understand code is a substantial cost for large projects. Especially if that code is to be reviewed by someone other than the author.

So I would say that limiting your code to only the syntax necessary for the compiler to understand your intent and not future readers of the code (or your future self assuming you write modules and leave them alone for 6 months to 4 years as they do the job originally intended) is a bad idea. Of course, my views are limited to those who write and maintain large bodies of code that are under constant development/improvement. If you constantly work on a small body of code or write "throw away" small projects that after the three month development cycle you will never look at or reuse the code outside of fixing bugs, then my metrics are probably not applicable to your development experience.

When I read a method signature and find that it returns not a typed variable (a concept I can label) but something that implements 4 traits, with interdependencies between the values in those traits, I then have to work out which methods I can pass that return value to by looking over documentation or function signatures. That is a mental overload for me when I am reviewing code I didn't write and don't want to spend a substantial amount of time becoming familiar with.

It may be that as I gain experience with Rust, something will just click. I am very new and just trying to get my bearings by understanding the full language syntax, what it is capable of and how some of the most useful crates (for me) are written. But right now it seems like rather than focusing on code, type definitions, and method return types to understand what can plug into what and what is meant to be used with what, you need to rely heavily on documentation provided by the author of the module. I am used to writing self-describing classes, interfaces, methods, and types where documentation is practically "in the way", since the code naming, signatures, and single point of information for what a class or method can do are practically self-documenting.

Currently I am finding that reading the author's web sites (ex: diesel.rs), reviewing the crate documentation (ex: docs.rs/crates/diesel), and then opening the raw project are the only effective ways of getting the understanding I am looking for (ex: ~/.cargo/registry/src/github.com-XXX/diesel-1.4.3).

The documentation is the only way I have found to effectively find the trait implementation for types as while most of the traits are included in the .rs file that defines the struct, not all are. Let alone the blanket traits. I am used to being able to look at a .h file, or a single source file to understand everything I needed to about a class and "wrap it up with a nice bow".
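For anyone following along who has not met blanket implementations yet, here is a minimal sketch of why they are hard to find from a struct's own source file. The trait Describe and the struct Celsius are both made up for this example. Note that nothing next to Celsius mentions Describe, yet the method is available on it.

```rust
use std::fmt;

/// Hypothetical trait, invented for this illustration.
trait Describe {
    fn describe(&self) -> String;
}

/// Blanket impl: every type that implements `Display` also gets `Describe`,
/// including types defined in entirely different files or crates.
impl<T: fmt::Display> Describe for T {
    fn describe(&self) -> String {
        format!("<{}>", self)
    }
}

/// Hypothetical type whose definition never mentions `Describe`.
struct Celsius(f64);

impl fmt::Display for Celsius {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} C", self.0)
    }
}

fn main() {
    println!("{}", 42_i32.describe());
    println!("{}", Celsius(21.5).describe());
}
```

The generated documentation for a type collects these under its blanket implementations section, which is why the docs end up being the practical place to look.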

Any advice you can give a newbie like myself for how to quickly understand the intent of code, how to use it, and how to perform static analysis on that code efficiently from the code rather than the author's documentation would be much appreciated.

1 Like

And this is not a good argument, as I mentioned earlier. If you need to look up the API anyway, there's no additional effort needed for seeing the types it is composed of, because they are literally right there in the documentation.

Me too. Rust has a few, much worse misfeatures that cause way more hardship with reading than type inference does, see e.g. the pattern matching "ergonomics" thing that was pushed through a couple months ago, despite widespread controversy.

I have to admit I don't understand what this is meant to be. Return types are clearly indicated in method signatures in rustdoc-generated documentation. Or is this a complaint against trait objects and/or impl Trait which hide concrete types in favor of statements of trait conformance? If so, I disagree with it because in the case of relying on traits, concrete types don't usually contain more useful meaning than knowing what traits they implement. And for writing decoupled code (which no doubt should be important when dealing with large systems), programming against interfaces rather than concrete types is one of the most important pieces of practice.
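In case it helps to see what I mean by a statement of trait conformance, here is a small sketch. The function even_squares is invented for this post. Its concrete return type is a nested adapter type that cannot even be written out, and the caller does not need it. Knowing that it is an Iterator over u32 is enough.

```rust
/// Returns "something that iterates over `u32`" without naming the concrete type.
fn even_squares(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0).map(|n| n * n)
}

fn main() {
    // The caller relies only on the `Iterator` interface.
    let squares: Vec<u32> = even_squares(7).collect();
    println!("{:?}", squares);
}
```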

With all due respect, if you are "very new" to a language, I would indeed advise you to have more experience with it before giving negative criticism about its fundamental parts. Perhaps after a couple years your opinion will still stand, in which case maybe Rust is not a language for you. But asserting that things are wrong because you are not used to the particular style and idioms the language supports seems fundamentally ill-advised to me.

But then again, isn't this literally the primary purpose of the documentation? I'm seriously puzzled because this sentence sounds to me like "the documentation works better for finding out human intent and derived facts about the code than the raw code", which I don't find the least bit controversial. There are methodologies like literate programming that try to interleave documentation with code, but in my experience they aren't substantially more helpful than having separate documentation and code.

There is no royal road to geometry. In my opinion, quickly (or at all) understanding the code requires a lot of reading. And by reading I mean both reading the particular project you're interested in, as well as reading and trying to understand a lot of other code in general. One of the most helpful experiences I've had in learning a new language and ecosystem was always the act of contributing to open source projects. This might sound like a cliché, but as an example, I've contributed smaller patches, improvements, and bug fixes to Serde, and as a result, I can now use it without thinking nearly as much as in the beginning, when I was just starting to get familiar with it.

This one in particular sounds like you want a real static analysis tool. I can't give you profound advice on that because I didn't yet need to use one myself along with the compiler. That is, unless you mean "perform static analysis in my own head", in which case I think we're back at square 1. You need to read a lot of code and gain experience with idioms, APIs and coding style in general in order to be able to (realistically) do that. (Of course, one could program in a way where one runs a separate static analysis tool after every change in order to query the type of variables and functions, etc., but that doesn't sound productive to me.)

2 Likes

This, as far as I understand what you are saying sounds very familiar.

It is the complaint that programmers made a lot when they started using C++ two or three decades ago.

All of a sudden parameters and return types were not simple things one could immediately comprehend, like ints, chars, arrays of said things, pointers to structs. Oh no, now they were some abstract type whose full meaning could not be seen immediately. It was an instance of some class, that was derived from some other class, which was derived from ..... Then we throw a bunch of generics and templates in there. It becomes a major task to follow that chain of abstraction and find out what is actually going on, what is what and what gets called by whom and when!

The only way to make progress is to give up the idea of actually understanding what is going on and write code according to the documentation, which is of course equally hard to follow so lots of examples are required.

Which is why huge and complex IDEs became so popular when formerly a simple text editor was sufficient. You need all that intelli-sense stuff to get a clue about what you are looking at.

Rust of course is also a large and complex language, not to mention the libraries that come with it. It has the same problem of that impenetrability of layers of abstraction for the reader to deal with.

As I found when trying to use the postgres crate with TLS. Still, using it with a thread pool defeats me.

Hopefully, familiarity eventually eases this problem. I never really did become comfortable with C++ in that respect though.

3 Likes

To be entirely fair, extremely generic functions with lots of type-level invariant wizardry are very difficult to understand. Even more importantly, all that isn’t always necessary.

It’s possible and perfectly fine to just use Rust as a safer C without slapping V-TECH stickers on the type system and pretending it’s Prolog.

If you’re just writing a program, keep it as simple as you want. If you’re writing a library which absolutely must be usable in many situations while upholding many guarantees, that’s what the powerful trait constraint features are for.

In C, if you want generics, it’s fairly common to just duplicate a function and append the type name to the function name. That’s fine too. That’s close to what the generated code comes out to.
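A quick sketch of the comparison, with function names I made up for this post. The first two functions are the C habit carried over to Rust: one copy per type, with the type in the name. The generic version asks the compiler to produce a specialized copy for each type it is actually called with.

```rust
// C-style: one hand-written copy per type.
fn max_i32(a: i32, b: i32) -> i32 {
    if a > b { a } else { b }
}

fn max_f64(a: f64, b: f64) -> f64 {
    if a > b { a } else { b }
}

// Generic: the compiler generates a specialized copy per type used.
fn max_of<T: PartialOrd>(a: T, b: T) -> T {
    if a > b { a } else { b }
}

fn main() {
    // Both styles give the same answers.
    println!("{} {}", max_i32(3, 8), max_of(3, 8));
    println!("{} {}", max_f64(2.5, 1.0), max_of(2.5, 1.0));
}
```

Either style is fine for an application. The generic one mostly pays off once the list of types grows or is not known in advance.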

Abstraction is good. Abstraction is bad. Both can be true.

5 Likes

Oh yes. I have long since maintained that all those "zero cost abstractions" of C++ have an enormous cost. The cost of all those millions of programmer cycles wasted as they try to understand what the hell is going on in the code they have to read.

I have seen a lot of excessive abstractions suggested to simple problems here in the short time I have been around. Bad for human comprehension, bad for machine performance. Sadly when I mention it those posts have ended up being hidden.

4 Likes

While I respect that Rust is a complex language, I think you're describing a past that never really existed. There's really no such thing as "just an int", even in C, except for the most trivial of functions. Here's a section from the man page for the waitpid function describing its return value:

If wait() or waitpid() returns because the status of a child process is available, these functions shall return a value equal to the process ID of the child process for which status is reported. If wait() or waitpid() returns due to the delivery of a signal to the calling process, -1 shall be returned and errno set to [EINTR]. If waitpid() was invoked with WNOHANG set in options, it has at least one child process specified by pid for which status is not available, and status is not available for any process specified by pid, 0 is returned. Otherwise, (pid_t)-1 shall be returned, and errno set to indicate the error.

Because C is such a simple language (and also because it was designed in an era of register-starved architectures, among other reasons), it packs a lot of subtlety into those "simple" ints and chars. In Rust you could express a lot of that if-then logic, which in C is just part of the man page, into traits with compile-time checks that can ensure you don't accidentally pass -1 when you meant to pass 0, or forget to test the return value is > 0 rather than just != 0.
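As a rough sketch of what that could look like, here is the man page's prose turned into a type. I am using a plain enum rather than traits to keep it short. WaitOutcome and classify are names invented for this post, not part of any real binding, and the EINTR value of 4 is the Linux one, hard-coded as an assumption so the example needs no dependencies.

```rust
/// Hypothetical typed view of the cases the man page describes in prose.
#[derive(Debug, PartialEq)]
enum WaitOutcome {
    /// Status is available for the child with this process ID.
    ChildStatus { pid: i32 },
    /// WNOHANG was set and no child has status available yet.
    NoStatusYet,
    /// The call was interrupted by a signal.
    Interrupted,
    /// Any other failure, carrying the errno value.
    Failed { errno: i32 },
}

/// Assumed value of EINTR (4 on Linux); real code would take it from a libc binding.
const EINTR: i32 = 4;

/// Converts the raw return value and errno into one of the named cases.
fn classify(ret: i32, errno: i32) -> WaitOutcome {
    match ret {
        pid if pid > 0 => WaitOutcome::ChildStatus { pid },
        0 => WaitOutcome::NoStatusYet,
        _ if errno == EINTR => WaitOutcome::Interrupted,
        _ => WaitOutcome::Failed { errno },
    }
}

fn main() {
    // The compiler rejects this `match` if any case is left out.
    match classify(0, 0) {
        WaitOutcome::ChildStatus { pid } => println!("child {} changed state", pid),
        WaitOutcome::NoStatusYet => println!("nothing to report yet"),
        WaitOutcome::Interrupted => println!("interrupted, try again"),
        WaitOutcome::Failed { errno } => println!("failed with errno {}", errno),
    }
}
```

With this shape, confusing "> 0" with "!= 0" is no longer possible at the call site, because the caller never sees the raw integer.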

Forgetting to check a particular error case is the source of a lot of bugs and security vulnerabilities in C-like languages (article, from yesterday). All the complicated type constraints, traits and relationships are really just taking all that stuff that was previously in the documentation and promoting it to the language itself.

So if anything I would say the trend is the opposite: moving essential information out of the documentation and into the code.

4 Likes