Common newbie mistakes or bad practices

Some additional ones I have seen:

  • In a CLI program, intermingling I/O, argument parsing, etc. with actual domain logic. Domain logic should have its own module or even crate for clean re-usability.
  • Again, w.r.t. CLI, parsing arguments manually, or even trying to configure clap manually, instead of using structopt to create a strongly-typed input/config struct.
  • Thinking of error handling as an "afterthought" or as some additional annoyance instead of designing with Result and ?-bubbling upfront.
  • Overusing slice indexing when iterators would be cleaner/faster, or the converse, overusing iterators where indexing would be easier on borrowck.
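To make the first and third points concrete, here is a minimal sketch of domain logic that does no I/O and is designed around Result and ? from the start. The names sum_lines and SumError are made up for illustration; a CLI front end would read the input, call this function, and decide how to print the error.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Errors the (hypothetical) domain logic can produce.
#[derive(Debug, PartialEq)]
pub enum SumError {
    /// The input held no numbers at all.
    Empty,
    /// A line could not be parsed; `line` is 1-based.
    BadNumber { line: usize, source: ParseIntError },
}

impl fmt::Display for SumError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SumError::Empty => write!(f, "input contained no numbers"),
            SumError::BadNumber { line, source } => write!(f, "line {}: {}", line, source),
        }
    }
}

impl std::error::Error for SumError {}

/// Pure domain logic: no printing, no argument parsing, no process exit.
pub fn sum_lines(input: &str) -> Result<i64, SumError> {
    let mut total = 0i64;
    let mut seen = false;
    for (idx, line) in input.lines().enumerate() {
        let line = line.trim();
        if line.is_empty() {
            continue; // blank lines are skipped, not treated as errors
        }
        // `?` bubbles the error up with the line number attached.
        let n: i64 = line
            .parse()
            .map_err(|source| SumError::BadNumber { line: idx + 1, source })?;
        total += n;
        seen = true;
    }
    if seen {
        Ok(total)
    } else {
        Err(SumError::Empty)
    }
}
```

Because the function takes a &str and returns a Result, it can be unit-tested and reused from a library crate without dragging the argument parser along.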
10 Likes
  • Path::to_str().unwrap() instead of Path::display() to put a path into an error message
  • Trait over-use: implementing From for something which could have been a named factory function; introducing a trait where a bunch of inherent methods would do.
  • Trait misuse: implementing TryFrom<&str> instead of FromStr.
  • Unnecessary .unwraps: if opt.is_some() { opt.unwrap() }.
  • Generic bloat: pub fn do_something(path: impl AsRef<Path>) { 200 line function to monomorphise in every crate }
  • Cyclic dependencies between crates (when a leaf crate has a dev-dependency on the root crate)
  • Somewhat arbitrary splitting of code into crates in general.
  • Error management. Rust has the tools to implement the best possible error management of any language, but there's no pit of success there. One stable state is a giant enum which combines errors from different subsystems, has an Other(String) variant just in case, and which is used for basically anything.
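A rough sketch of how three of the points above (Path::display, the unnecessary unwrap, and generic bloat) might be addressed. The function names describe and first_word_len are hypothetical and only there to have something to call.

```rust
use std::path::Path;

/// Thin generic shim: only this small wrapper is monomorphised per caller type.
pub fn describe(path: impl AsRef<Path>) -> String {
    // The real work lives in a non-generic inner function that is compiled once.
    fn inner(path: &Path) -> String {
        // `display()` cannot panic on non-UTF-8 paths, unlike `to_str().unwrap()`.
        match path.extension().and_then(|ext| ext.to_str()) {
            Some(ext) => format!("{} has extension {}", path.display(), ext),
            None => format!("{} has no extension", path.display()),
        }
    }
    inner(path.as_ref())
}

/// Length of the first whitespace-separated word, or 0 if there is none.
pub fn first_word_len(opt: Option<&str>) -> usize {
    // Instead of `if opt.is_some() { opt.unwrap() }`, bind the value directly.
    if let Some(text) = opt {
        text.split_whitespace().next().map_or(0, str::len)
    } else {
        0
    }
}
```

The inner-function trick keeps the convenient impl AsRef<Path> signature while the long body is only instantiated for &Path.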
14 Likes

Can you elaborate on why this might be a mistake/bad practice?

I'm guessing it might be because now you've got multiple pieces of code all trying to write to the same thing instead of going through a common abstraction/mediator?

1 Like

@Michael-F-Bryan The context is something I see people attempt rather often and then ask questions about in the Tokio discord. They are writing e.g. a chat server, and then they define a map that looks something like this:

HashMap<UserId, TcpStream>

but then they run into trouble with this because they want to be able to read from every IO resource at the same time, and they wrap it in a mutex. This pretty much always ends up falling apart. It is common enough that I wrote the article Actors with Tokio so I could link people making this mistake to an article explaining what they should do instead. (I.e. they should instead spawn an actor per connection, then put actor handles in the hash map.)

To be fair, there are some valid reasons to put IO resources in mutexes. For example, you could put a single writer into the mutex and have threads take turns writing stuff to it. Database pools also do it.

Perhaps a better description of the mistake is "putting IO resources in collections" or "trying to have a single async task manage a variable number of IO resources".
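For illustration, here is the shape of "put actor handles in the map, not the IO resource", sketched with std threads and channels instead of Tokio so that it stays self-contained. This is an analogy under stated assumptions, not the code from the article: a Vec<String> stands in for the socket, and ConnectionHandle, spawn_connection and broadcast are invented names.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

type UserId = u32;

/// What goes in the map: a cheap handle, not the I/O resource itself.
pub struct ConnectionHandle {
    tx: Sender<String>,
    join: JoinHandle<Vec<String>>,
}

/// Spawn one actor per connection. The actor alone owns the "socket"
/// (here a Vec<String> standing in for a TcpStream's write half).
pub fn spawn_connection() -> ConnectionHandle {
    let (tx, rx) = mpsc::channel::<String>();
    let join = thread::spawn(move || {
        let mut written = Vec::new();
        // The loop ends when every sender has been dropped.
        for msg in rx {
            written.push(msg);
        }
        written
    });
    ConnectionHandle { tx, join }
}

impl ConnectionHandle {
    /// Queue a message; returns false if the actor has already shut down.
    pub fn send(&self, msg: &str) -> bool {
        self.tx.send(msg.to_string()).is_ok()
    }

    /// Close the channel and wait for the actor, returning what it "wrote".
    pub fn close(self) -> Vec<String> {
        let ConnectionHandle { tx, join } = self;
        drop(tx);
        join.join().expect("connection actor panicked")
    }
}

/// No mutex needed: the map is only read, and each actor serialises its own writes.
pub fn broadcast(conns: &HashMap<UserId, ConnectionHandle>, msg: &str) -> usize {
    conns.values().filter(|conn| conn.send(msg)).count()
}
```

With Tokio the same structure would presumably use a spawned task and an async channel per connection, but the point is the same: the map holds senders, and nobody else touches the stream.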

8 Likes

An addendum to this: relying on some IDE plugin's mangling of the error message rather than the actual rustc output.

rant

I have every respect for the people who develop the various IDE tools, and this is not meant to bash on IDEs or people who use them. But this mistake happens daily. I used to think that IDEs were fine in their own way but just never liked them much myself; now, though, I'm increasingly of the opinion that one needs at least a basic understanding of the command line tools before graduating to a pushbutton GUI. This shift in opinion is almost entirely due to the endless stream of people who post screenshots in which rustc's beautiful, colorized error message with ASCII-art arrows pointing at relevant sections of text and helpful suggestions has been stripped of its formatting, reflowed, truncated, and stuffed into a pop-up in what I usually assume is VS Code.

Perhaps for languages that have less helpful error messages - which is most languages - this is not a problem. For people who already know to check the terminal output and not just point at things with the mouse cursor, again, not a problem. But you don't have to spend long in these forums or Discord to realize that for a lot of people the IDE is just getting in the way.

19 Likes

This. I come from a university where the norm for teaching beginner programming classes in C++ is "use this IDE, press the green arrow, and you are good to go". The official advice for depending on external libraries is not to learn the usage of the linker; it is instead a pre-built zip file with an empty project in it that has been configured by the lecturer using the arcane options in the GUI to link against that library, and which you have to unzip again and again when you want to start a new project.

Abstraction is nice, but I don't believe anyone can realistically hope to use any sort of tool without at least a basic understanding of the level of abstraction immediately below it.


As an aside, the resource consumption of modern IDEs just infuriates me to the extreme. The latest version of Xcode for instance is more than 10 Gigabytes, compressed. It also consumes ridiculous amounts of RAM for no good reason. In comparison, I use Vim as a fully-featured IDE, and it was something like 40 Megabytes last time I checked.

13 Likes

On the other hand, when people read rustc output, they frequently seem to skip over the main error: message and look only at the secondary messages, probably because they are more emphasized by the whitespace and ASCII art.

13 Likes

Which is clearly demonstrated by how often we get users in this forum who post excerpts from error messages that don't contain the main error.

11 Likes

Exactly. Perhaps there could be some kind of experiment with adjusting rustc's output to make the main message be as emphasized as the secondary ones, somehow.

3 Likes

I think this would make everyone scream in horror but if the error order could be altered so that the first (often "most immediately actionable") one came last, then it would be read first in most terminals.

4 Likes

This is a bit more advanced, but it's often helpful in larger architectures to distinguish (at the type level) between runtime/recoverable errors and critical/unrecoverable errors. So I guess "having a single error type for everything" may be bad practice.

1 Like

Recovering from errors seems like a lot of work at times; I have been leaving the onus on users to try again if something is too complex to recover from. I have been using these little fellows for all my error types (but maybe I lack the experience to see the benefits of different error types):

enum Language {
    English,
}

enum Message {
    ErrUnexpected,
}

impl Message {
    fn to_string(&self, lang: &Language) -> String {...}
}

enum CustomError {
    Message(Message),
    Messages(HashMap<String, CustomError>),
}

CustomError points to the place where the error occurs and may include additional steps to resolve it, for the user in their own language. It's much nicer to see it in JSON form in Postman.

It's certainly a complex topic but without getting into the weeds too much in this thread, consider a long-running user facing app: for any particular error, do you crash the entire app, or do you throw a message box at the user and keep going? So with this conceptual distinction then, at the type level (that is, this is more about dev ux than user ux): if a function failing means something is fundamentally broken, you a/ should indicate this to the caller somehow and b/ make it almost impossible to ignore that failure. You could panic and document the panic, but that means the caller can't catch that without defining a global handler or doing things with threads. Or you can return a Result<T, CritError>, which is different from a Result<T, NormalError>. Notably you can't ? a crit-fail function within a normal-fail function without explicitly handling that.
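A minimal sketch of that distinction, with made-up names (NormalError, CritError, parse_port, load_state, handle_request). Since there is deliberately no From<CritError> for NormalError, a crit-fail call cannot be ?-ed inside a function returning Result<_, NormalError>; one way to carry both is a nested Result, so the caller has to deal with the critical layer first.

```rust
/// Recoverable: show a message box and keep going.
#[derive(Debug, PartialEq)]
pub enum NormalError {
    InvalidInput(String),
}

/// Unrecoverable: something is fundamentally broken.
#[derive(Debug, PartialEq)]
pub struct CritError(pub String);

/// A normal-fail function.
pub fn parse_port(s: &str) -> Result<u16, NormalError> {
    s.parse()
        .map_err(|_| NormalError::InvalidInput(s.to_string()))
}

/// A crit-fail function (the `corrupt` flag just simulates the failure).
pub fn load_state(corrupt: bool) -> Result<u32, CritError> {
    if corrupt {
        Err(CritError("state file corrupt".to_string()))
    } else {
        Ok(7)
    }
}

/// Outer layer: critical errors. Inner layer: recoverable errors.
pub fn handle_request(
    port: &str,
    corrupt: bool,
) -> Result<Result<u32, NormalError>, CritError> {
    // `?` works here only because the outer error type is CritError.
    let state = load_state(corrupt)?;
    // The recoverable error is returned as a value, not bubbled with `?`.
    Ok(parse_port(port).map(|p| u32::from(p) + state))
}
```

A caller would typically match on the outer Result to decide whether to shut down, and only then on the inner one to decide what to tell the user.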

That particular pattern is a specialisation of a more general pattern of using typestates.

3 Likes

As in just create a from_str / from_whatever function? What's wrong with impl From?

I would actually appreciate if the compiler only spat out a single error message at a time by default. I can't fix multiple errors at the same time, so I will end up rebuilding for each one of them anyway.

5 Likes

Yeah. The practical problems with From with a single call-site are:

  • it's harder to find this call-site, it's not obvious what code exactly is called by ::from or .into
  • it's harder to refactor the code. If, in the future, you need an extra parameter at the call site, you'll first have to change the From into an inherent method.

The philosophical problem with From is that it signifies context-less conversion. The fact that there's a single place where you convert Foo into Bar doesn't mean that this conversion makes sense in general, it may only make sense in that particular context.

This is often the case with errors in libraries. Let's say you have liba, which uses libb internally. liba can add impl From<libb::Error> for liba::Error to make implementing liba easier, because ? now works. However, that means that the user's code in libx can now do this:

fn doesnt_use_liba_at_all() -> Result<(), liba::Error> {
    // Both `?` silently convert libb::Error into liba::Error via From.
    libb::foo()?;
    libb::bar()?;
    Ok(())
}

That is, the user can misuse From to construct liba::Error from libb errors which do not, in fact, originate in liba. Preventing this kind of API hazard is one of the great ideas of snafu.
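Without pulling in snafu, the same idea can be approximated by hand with a named, private constructor plus map_err. This is only a sketch: LibbError, libb_foo, LibaError and liba_fetch are flat stand-ins for the liba/libb items above, not real crates.

```rust
/// Stand-in for `libb::Error`.
#[derive(Debug, PartialEq)]
pub struct LibbError(pub &'static str);

/// Stand-in for `libb::foo` (the flag just simulates a failure).
pub fn libb_foo(fail: bool) -> Result<u32, LibbError> {
    if fail {
        Err(LibbError("foo failed"))
    } else {
        Ok(1)
    }
}

/// Stand-in for `liba::Error`. Note: no `impl From<LibbError>`.
#[derive(Debug, PartialEq)]
pub enum LibaError {
    Fetch { step: &'static str, source: LibbError },
}

impl LibaError {
    // Private named constructor: easy to grep for, and it already takes
    // the extra context parameter that a From impl could not carry.
    fn fetch(step: &'static str) -> impl FnOnce(LibbError) -> LibaError {
        move |source| LibaError::Fetch { step, source }
    }
}

/// Stand-in for a public liba function that uses libb internally.
pub fn liba_fetch(fail: bool) -> Result<u32, LibaError> {
    // The conversion is explicit at the call site; a bare `?` would not compile.
    let n = libb_foo(fail).map_err(LibaError::fetch("foo"))?;
    Ok(n + 1)
}
```

With this layout, a function in another crate returning Result<(), LibaError> can no longer call libb_foo(..)? and get a LibaError by accident; it would have to build one explicitly.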

9 Likes

I am glad I am not the only person to struggle with From. The problem I run into is slightly different: auto completion:

Foo::from_blah(<TAB>
==> IntelliJ starts showing me argument names + types

Foo::from<TAB>
==> IntelliJ shows me from_blah from_cat from_dog ...

Foo::from(<TAB>
==> IntelliJ can't complete for me, and I don't remember from memory all the things we can From from

I reckon it's reasonable when combined with #[cfg_attr].

Another place where it would be nice to use #[path] is with build script code generation:

#[path = concat!(env!("OUT_DIR"), "/generated.rs")]
mod generated;

Unfortunately, this still doesn't work (it was initially said to be implemented in Rust 1.54, but it turned out not to be the case), so you have to write that like this:

mod generated {
    include!(concat!(env!("OUT_DIR"), "/generated.rs"));
}

I'm not sure if this belongs here or in some other place, but if the novice in question is not a total novice, then probably the most common mistake they make is attempting to write JavaScript in Rust, or Python in Rust, or some other language in Rust.
Rust is usually presented as a "multi-paradigm general-purpose programming language", which somehow convinces people who know some other mainstream language that they should be able to just write code in Java or C# and then, somehow, mechanically translate that into Rust.
Somehow C++ people rarely make that mistake, even though many C++ styles don't easily translate to Rust either.
Rust may be a "multi-paradigm general-purpose programming language", but it's also incredibly opinionated and tries very hard to steer you toward "great APIs" (good APIs are easy to use, great APIs are hard to abuse).
Very often that makes direct translation from another language to Rust either hard or impossible.
Besides: if you want to write JavaScript, then why do you want to do that in Rust? JavaScript is a perfectly viable language on its own.

7 Likes

That deserves an answer:

Quite a lot of my Rust code looks like Javascript. Thing is I want to avoid as many as possible of Rust syntax features that are likely weird and alien to those readers of my code that don't know Rust, Javascripters and the like. Even if they don't have any Rust chops to be able to hack on my code I don't want to totally confuse them and frighten them away from Rust at first sight. So, for example if I have to use lifetime tick marks in my code I have failed.

Speed. Often I feel the need for speed. I can't make JS outrun compiled code.

See little example here: Writing Javascript in Rust ... almost

Scale. Javascript is really not suitable when programs get large and there are multiple people working on them. Rust's type checking helps enormously in keeping everything in order. It makes one far more confident when modifying/refactoring code, knowing that Rust will prevent the myriad ways that one can silently break things in other languages.

Politics. Before we had written a line of code, our former employer threatened to sue us, thinking we had stolen the code we wrote for them before they went bust. We had not. I thought it would be prudent to implement the similar functionality we needed in a different language, just in case. So what was JS and node.js became Rust.

On the other hand I write a lot of my Rust code as if it were C. As much as the compiler will allow. Because C like languages are what I understand.

Speaking of "mechanical translation": a vendor supplied us with a couple of thousand lines of C# to show how to communicate with their device, in lieu of documentation. I pretty much did a line-by-line translation of that into Rust. Worked a treat.

4 Likes