Common newbie mistakes or bad practices

For those Rustaceans who have been around for a while, what are some of the common newbie mistakes or bad practices that annoy you?

I'm not looking for issues with the borrow checker or compiler errors. Instead I'm thinking of code that is perfectly valid Rust, but which indicates a deeper lack of understanding or bad practices, particularly those which frustrate you more than they should when you see them.

I've got a couple...

  • Strings are not paths - use &Path and join() instead of &str and format!("{}/{}", dir, filename)
  • Unnecessary indirection - fn foo(title: &String)
  • Don't hard-code where you are reading data from - use fn parse(reader: impl std::io::Read) instead of fn parse(filename: &str)
  • Unnecessary use of Rc<RefCell<T>> because that's how you would write code in a GC'd language
  • Copies followed immediately by a borrow - e.g. function_taking_str(&"...".to_string())
  • Loose typing/stringly typed APIs - e.g. pass around a String instead of an enum
  • Not using ? to propagate errors or None
  • Not knowing the distinction between runtime and compile time - e.g. trying to use include_str!() instead of std::fs::read_to_string()
  • Not implementing Default, FromStr, From, orTryFrom
  • Writing function_taking_slice(&array[..]) instead of function_taking_slice(&array) because you don't know array references coerce to slices
  • Using "" instead of None when you don't have a useful value
  • Not initializing your type with useful values - e.g. use Vec::new() when you construct your type then later populate the field (valid by construction, etc.)

This one isn't necessarily a lack of understanding, but warrants mentioning anyway...

  • Read the bloody error message
Edit: Just to clarify, my intent isn't to rant about beginners. We've all been there and writing terrible code is an important part of learning to write good code... I'm actually bringing people up to speed with Rust at work and was thinking of compiling a list of "bad practices" to help teach what is/isn't idiomatic.
39 Likes

I think this is one of the more recurring issues I've seen both here and on reddit. Returning -1 (when searching for an index) or generally returning "sentinel" values. I think that's because so much of the programming teaching litterature (in mainstream languages) does this, and everyone replicates.

MSDN FindIndex

This method returns -1 if an item that matches the conditions is not found.

12 Likes

Adding a field of type &Foo in to a struct, then following a chain of compiler suggestions to add lifetime annotations to all your structs. Ultimately fail in borrow-checker hell since really Foo should’ve been owned by the field in the first place.

36 Likes

Another one: When you review newbie code and 99% of things you’ll tell them would’ve been suggested by clippy as well.

24 Likes

Definitely guilty of this.

Two other things that has changed:

  1. becoming competent at macro_rules! ; drastically changed the code I write

  2. being able to keep compile time for most crates at < 2s; < 2s vs > 10s is the difference between staying in flow vs reading reddit

12 Likes

Another hang-over from writing a lot of C is using the wrong integer type and getting frustrated because you need to cast to/from usize all the time because those integers were actually used as indices.

14 Likes

I actually think std should support indexing slices and arrays with any unsigned integral primitive type (and I'm not alone, this has been mooted repeatedly on IRLO), so I sympathize with the beginners on this one :‌P

11 Likes

I would extend this to using args() or vars() vs. their non-panicking counterparts, and it's endemic in the wider community, not just beginners. Implicitly encouraged, even.

Others:

  • Using unsafe to just do what you would have in C because you "know it works"
  • Using getters and setters within the crate
  • Expecting dyn Trait to sub-type or be dynamically typed
12 Likes
  • Traits prefixed with I. Types and traits shouldn't be easily confused in sensible code; they don't need a marked naming convention.
  • Matched trait-struct "pairs". Often combined with the above: impl IFoo for Foo. It won't work out the way you want. (Also you probably shouldn't write this in C# either but w/e)
  • Pervasive use of getters and setters. I haven't seen too much of this recently.
  • Manually uniquefying all identifiers (lib_module_struct_method) instead of just using built-in namespaces (lib::module::Struct::method). It's what they're there for!
  • Using #[path] ever.
12 Likes

I get the sense that newcomers to Rust tend to overuse traits in general, when you can get by most of the time defining only ADTs (and using other people's traits). Maybe this is the imprint of object-oriented languages where classes can be both containers for data and abstract interfaces? (Makes me a little apprehensive about the "field in trait" proposals you sometimes see floated.)

7 Likes

7 posts were split to a new topic: Path, OsStr, and supporting non-UTF-8 paths/inputs

Using byte indexing into UTF-8 strings without awareness of char boundaries etc. and the resulting panics.

13 Likes

I think I've only used the #[path] attribute once in legitimate code, and that was because I wanted to share some type definitions between a crate and its helper proc macro without creating a third "common" crate that both depend on.

That's a good point. Although I find the OsStr versions not overly useful because you need to turn it into a String if you want to use it for anything other than a file name.

So your options are to either ignore the non-UTF8 OsStr, panic, bail with an error, or silently corrupt the input by substituting �. Not ideal.

1 Like
  • Measuring performance of unoptimized/debug builds.
  • Using an empty Vec<u8> as the buffer argument when calling Read::read.
  • Calling Read::read or Write::write when read_exact or write_all is needed. (Clippy has a lint for this.)
  • Unnecessarily converting String into something like Vec<char> (for use cases that could be rewritten to avoid char indices).
  • Trying to do an early return from a function without using the return keyword, by just removing a semicolon. (Fortunately, this usually causes a compiler error.)
17 Likes

Putting IO resources inside mutexes.

6 Likes

Question... Why is the example where you initialized a vec with Vec::new() and the populating later a bad thing?

Thanks for posting this, I really like discussions like this. Helps with better understanding the language and becoming a better programmer.

1 Like

I guess it's more of a style thing, but here's an example.

Say you are writing a NLP application and have a dictionary containing all the possible words you can handle. This is one way you could create the dictionary:

let mut dict = Dictionary::new();
// read the file and populate some internal HashMap or Vec
dict.load_from_file("./words.txt")?; 

This means my Dictionary now has two (hidden) states - empty and populated.

All downstream code that uses the Dictionary will assume it's been populated already and write code accordingly. This may include doing things like indexing into the dictionary with dict["word"] which may panic if "word" isn't there.

Now you've opened yourself up to a situation where passing an empty dictionary to code that expects a populated dictionary may trigger a panic.

But that's completely unnecessary. Just make sure the Dictionary is usable immediately after constructing it instead of populating it after the fact.

let dict = Dictionary::from_file("./words.txt")?;

impl Dictionary {
  fn from_file(filename: impl AsRef<Path>) -> Result<Self, Error> {
    let text = std::fs::read_to_string(filename)?;
    let mut words = Vec::new();
    for line in text.lines() {
      words.push(line);
    }
    Ok(Dictionary { words })
  }
}

Internally the Dictionary::from_file() might create an empty Vec and populate it incrementally, but it won't be stored in the Dictionary's words field yet so there is no assumption that it is populated and useful.

23 Likes

Oh wow, I had never considered what you said about two hidden states. `Interesting...Well, now I know. I cant necessarily pull an example right off the top of my head at the moment, but I do know I have been in situations before where I felt initializing with Vec::new() was indeed necessary, so I guess now the question is, where in my code did i put myself in that situation...

I think it also depends on your background or coding style.

FP languages are often completely immutable so you'll fall into this pattern naturally. It's kinda hard to create a half-initialized thing and populate it later when you aren't allowed to mutate anything.

On the other hand, OO languages are much happier to let you initialize an object after it has been constructed, especially because object references can be null by default and they have no qualms about mutability... You could argue this contributes to why OO languages have a propensity for crashing due to an unexpected NullPointerException.

Don't take this all as gospel, though. It's just a way of coding, and there are plenty of places where it's the wrong way to code (e.g. because it unnecessarily constrains your code or information just isn't available until a later part of the program).

5 Likes

Also

  • Attempting self-referencial structs
  • Long-lived reference-based structs generally
7 Likes