Did Rust teach me to use a lot of variables instead of inline expressions?

I sometimes see Rust code that is nearly impossible to read because it avoids variables and/or uses a deeply nested, chained-method 'fluent API' style (which I don't like).
I would say that writing code more sequentially, using variables, is (much) more readable, imho.

I agree that my points apply to functions as much as they apply to variables, but that analogy also goes the other way. You wouldn't write a function

fn some_new_fun() {
    return some_other_func(some_func())
}

unless some_new_fun() has a very clear semantic meaning that can easily be inferred from its name. Similarly, I would argue that

let x = some_func();
some_other_func(x);

can hurt readability unless x has a similarly clear and unambiguous meaning, and in my experience local variables more often than not do not have that. But of course sometimes they do, and when they do then local variables are totally justified.

Yes, that's precisely the point that I am trying to make. And like in the movie, the concern here is not the moral component of "lying" (spies do way worse things than that), but the fact that they make it tempting to burden yourself with more responsibility than you might have to.

True. Fortunately, dynamic → compiled seems to be the trend of our times :blush:

I still don't see how Rust is different from any other compiled and GC-free language in this respect. Would you mind elaborating?

Sure. Think, for example, of a situation where you are consuming a collection by calling into_iter() on it:

let filtered_collection = my_collection.into_iter().filter(/* narrow down the items */).collect::<Vec<_>>();

But later on in your code you also need to check something on the original collection. Something like this would work in pretty much any language other than Rust:

let filtered_collection = my_collection.into_iter().filter(/* narrow down the items */).collect::<Vec<_>>();

if !my_collection.iter().filter(/* Narrow down the items of type B */).collect::<Vec<_>>().is_empty() {
  // Do something if there are items of type B in my_collection
}

But in Rust, since calling into_iter() consumes the original collection, you are not allowed to access it afterwards. You are forced to introduce a variable before calling into_iter() to bind the boolean:

let has_items_of_type_b = !my_collection.iter().filter(/* Narrow down the items of type B*/).collect::<Vec<_>>().is_empty();

let filtered_collection = my_collection.into_iter().filter(/* narrow down the items */).collect::<Vec<_>>();

Given that example, you would understand that you often have to do the same dance when you get a borrow checker error along the lines of cannot borrow `thingy` as mutable because it is also borrowed as immutable.
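A minimal sketch of that reordering, assuming a hypothetical Item enum with A and B variants (the thread never defines the element type) and a hypothetical split_items function. It also swaps the collect-then-is_empty check for Iterator::any, which answers the same question without building a temporary Vec.

```rust
// Hypothetical item type; the thread never defines one.
#[derive(Debug, Clone, PartialEq)]
enum Item {
    A(u32),
    B(u32),
}

/// Returns the type-A items plus a flag saying whether any type-B item existed.
fn split_items(my_collection: Vec<Item>) -> (Vec<Item>, bool) {
    // Borrow first: `iter()` only borrows, so the collection stays usable.
    // `any` stops at the first match instead of collecting into a Vec.
    let has_items_of_type_b = my_collection
        .iter()
        .any(|item| matches!(item, Item::B(_)));

    // Consume last: `into_iter()` takes ownership, so `my_collection`
    // cannot be used after this statement.
    let filtered_collection = my_collection
        .into_iter()
        .filter(|item| matches!(item, Item::A(_)))
        .collect::<Vec<_>>();

    (filtered_collection, has_items_of_type_b)
}
```

Swapping the two statements inside split_items should be rejected by the compiler with a "borrow of moved value" error, which is the situation described above.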

I think you're arguing something slightly off-kilter to the rest of us, though. If I have a good name for x, then I absolutely would pull it out into its own, named, variable (even if it's used precisely once). Similar for some_new_fun - if it has a good name, then it can exist.

And my experience is that, more often than not, being unable to give an intermediate calculation a meaningful name is a sign that you don't actually understand the code you're writing, and should switch your focus from writing code to understanding what you're doing.

Finally, I've maintained enough code bases over the last 30 years to know that when I see code of the form:

some_func(
    some_other_func(some_args)
).next_method(
    method_args_frob(args_to_fn)
)

It's going to be an absolute pain to understand, whereas if someone's used bad names in

let quux = some_other_func(some_args); 
let baz = some_func(quux);
let frobbed_args = method_args_frob(args_to_fn);
baz.next_method(frobbed_args)

it's going to be relatively easy to maintain, because it's nice and easy to see where I can insert my changes, and it's also clear which intermediates are worth dumping out to see how data flows here.
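As a rough illustration of "dumping out" an intermediate, here is a sketch using the standard dbg! macro. The function bodies are hypothetical stand-ins for the placeholder names in the post, and Iterator::take stands in for next_method; only the shape of the code matters.

```rust
// Hypothetical stand-ins for the placeholder names used in the post.
fn some_other_func(n: u32) -> u32 {
    n + 1
}

fn some_func(n: u32) -> Vec<u32> {
    (0..n).collect()
}

fn method_args_frob(n: usize) -> usize {
    n * 2
}

fn pipeline(some_args: u32, args_to_fn: usize) -> Vec<u32> {
    let quux = some_other_func(some_args);
    // `dbg!` prints the expression and its value to stderr and then
    // returns the value, so it can wrap a single named step.
    let baz = dbg!(some_func(quux));
    let frobbed_args = method_args_frob(args_to_fn);
    // `take` plays the role of `next_method` here (an assumption).
    baz.into_iter().take(frobbed_args).collect()
}
```

With the nested form, the same inspection would mean wrapping a sub-expression in the middle of the chain, which is easier to get wrong.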

Very interesting, thank you @moy2010 and @farnz (and everyone else) for your responses!

In this scenario, I'd 100% refactor the other way, and I'm pretty sure I'm not the only one. For example, the Julia community has been discussing for years a syntax for chaining functions the f().g().h() way (which is something that you can't currently do in that language). So maybe the OP is right and it is their exposure to Rust which made them prefer introducing temporaries?

It's small things.

  1. Rust has immutable variables by default, and mutable ones need a special marker.
  2. Rust tracks lifetimes, which means that if I build B from A, then C from B, I can't use A by accident, because it isn't valid anymore!
  3. On top of that, I can combine #1 and #2 above so that instead of A, B and C I have something like data_file, which is first a string validated not to contain .., then is transformed into a std::fs::File (still under the same name!), then processed (into a JSON-parsed object) and, finally, dropped (after which point it cannot be used anymore), all without new scopes or new names!

You can still abuse local variables, but it's harder to do than in many other languages.
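A simplified sketch of the data_file idea from point 3, with assumptions stated up front: validate and describe are hypothetical names, and a PathBuf stands in for the std::fs::File and JSON stages so the example needs no real file. The point is only that one name can carry a value through several types, with each earlier stage becoming unreachable.

```rust
use std::path::PathBuf;

// Hypothetical validation step: reject paths with a `..` component.
fn validate(raw: String) -> Result<String, String> {
    if raw.split('/').any(|part| part == "..") {
        Err(format!("path escapes its root: {raw}"))
    } else {
        Ok(raw)
    }
}

/// Walks one value through several stages under a single name.
fn describe(data_file: String) -> Result<usize, String> {
    // Stage 1: the raw String is moved into `validate`. The new binding
    // shadows the old one, so the unvalidated value is unreachable.
    let data_file = validate(data_file)?;

    // Stage 2: same name, new type. The String is moved into the PathBuf.
    let data_file = PathBuf::from(data_file);

    // Stage 3: compute the result, then drop the value explicitly.
    let component_count = data_file.components().count();
    drop(data_file);
    // Using `data_file` past this point would not compile: it was moved.

    Ok(component_count)
}
```

No binding here is marked mut, and no extra scopes or names were needed to keep the stages apart.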

This then comes down to taste - if I encountered the first form, I'd 100% refactor it into the second form, so that I could make sense of it. I'd then work out how to reasonably name the intermediates, because that actually matters to understanding what it does.

And that's been a consistent part of my programming career from starting out with BASIC, moving to C, learning C with Classes, Haskell, Python, Perl and other languages. Even in assembly, I'd be commenting the code to tell me what the meaning is of each register after a function call, just so that I can keep track of what's going on.

If you chain calls, as in the first form, it's now hard for me to name each of the intermediates, and I have to understand the entire chain in one go; naming the intermediates allows me, as a maintainer, to come in, make sense of the first call in the chain, and then consider the second call separately now that I know what the first call does.

This ties in with reading order; for the first form, I've got a bit of a zig-zag reading order; I'm not reading the code in the same order I read natural language, but zig-zagging around to make sense of it. For the second form, I'm able to read in natural language order (as an English speaker, at least), and that makes it easier to track which bits I have read and understood, and which bits are new.

Uhm ... never a good idea to trust the compiler to fix your code.
This comes from personal experience fixing someone else's C code many, many moons ago. I found the problem within a couple of hours... then spent over a month proving it because they didn't believe what I told them.

W.R.T the OP's question... it depends.

Let's say you have
let x = bar();
foo(x);

vs.
foo(bar());

Is foo(bar()); hard to read?

I don't think so.
I also don't think that the let x=... is any better or provides more meaning.
Also there's the question of scope as to how long the reference exists on the stack.
And there could be reasons why you would break a statement down rather than a convoluted inline expression. So it could go either way.
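On the scope question, here is a small sketch of one difference that does exist in Rust when foo takes a reference: a temporary created inline is dropped at the end of its statement, while a named binding lives until the end of the enclosing block. Noisy, inline_form and named_form are hypothetical names introduced only to make the drop order observable.

```rust
use std::cell::RefCell;

// Hypothetical type that records the moment it is dropped.
struct Noisy<'a> {
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push("drop");
    }
}

fn bar<'a>(log: &'a RefCell<Vec<&'static str>>) -> Noisy<'a> {
    Noisy { log }
}

fn foo(_x: &Noisy<'_>) {}

/// Inline form: the temporary is dropped when the statement ends.
fn inline_form() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        foo(&bar(&log));
        log.borrow_mut().push("after call");
    }
    log.into_inner()
}

/// Named form: `x` lives until the end of the enclosing block.
fn named_form() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let x = bar(&log);
        foo(&x);
        log.borrow_mut().push("after call");
    }
    log.into_inner()
}
```

This only matters when the value has a meaningful destructor (a lock guard, a file handle); for plain data the two forms behave the same.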

Without seeing the actual code, and some context around the comment in the code review...
it's a toss-up.

Note: I'm not a Rust guru by any means. The issue is less about Rust and more about coding style.

There are no absolutes.

this, should be chiseled in stone, absolutely... :slight_smile:

Or perhaps a major motion picture having beloved actor Ewan McGregor screaming it?

Surprisingly enough, that, too, is wrong. Not only is that rule self-contradictory, but that self-contradictedness (is that a word?) gives us a hint about where absolutes exist: they are usually negative.

E.g. there is absolutely no way to decide, ex post facto, whether your program has UB or not (essentially all “nice things” that you may want to ask about your program are undecidable), and that's why we hit absolutes every day. They only make us mad (or sad, sometimes) because it's usually “you absolutely can't have X… but maybe X′ or X′′ would work for you?”

I guess you're ignoring the context of what I was saying?

It's like saying, you never cross the streams.
(It's a Ghostbusters reference. Probably older than you. :thinking: :joy: )

It's like saying "You never do X ..." or "You always need to do Y" when in reality, there's always going to be an edge case where doing X or Y will make sense.

What? The Rust compiler fixes my code a thousand times a day. That is, if rust-analyzer does not fix it first :slight_smile:

Yeah, well, I see the problem.

Exactly.

Statistics prove that 99% of right thinking people are wrong.

Sure, but there is a subtle difference between “You never do X …” and “You could never do X[1]…”, and people often conflate them.

Sure. On purpose. Ignoring the context (which is different from not understanding the context) is a very powerful technique that helps to see whether some assertion has a deeper meaning or not.

In this particular case if you ignore the context you may easily see:

  1. That this particular assertion cannot be true (it's absolute and yet asserts that absolutes don't exist, which is contradictory).
  2. It shows us how we can actually build absolute statements… they are very common, all around us… just usually negative: “you could never do X” (for various kinds of X).

That's a much more educational result than “that's some Ghostbusters reference” (where they crossed the streams and survived to tell the tale, which turned the whole thing into a joke rather than something deep).


  1. Because doing X is simply mathematically impossible.

There are some optimizations that are more reliable than others; inlining of short functions, and removal of variables that are used at most once are two that are very reliable optimizations.

You should be able to depend on the compiler to fix up code that has short functions, and that has single-use variables, just as you depend on the compiler to output machine code that has the same effect as your source code.

You can't rely on the compiler eliding bounds checks for indexing, however (to choose an example where relying on the compiler is not a good move), since to elide the bounds check when indexing, the compiler has to be able to see that the index never goes outside the valid range; this is part of why using iterators is better than indexing, since iterators definitionally never go outside the valid range of the collection.
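A small sketch of the indexing-versus-iterator contrast, using two hypothetical functions, sum_indexed and sum_iter. Whether the bounds check in the indexed version is actually elided depends on what the optimizer can prove in a given build, so treat the comments as the general idea and not as a guarantee about generated code.

```rust
/// Indexed loop: each `values[i]` carries a bounds check unless the
/// optimizer can prove that `i` is always in range.
fn sum_indexed(values: &[u64], count: usize) -> u64 {
    let mut total = 0;
    for i in 0..count {
        total += values[i]; // panics if `count > values.len()`
    }
    total
}

/// Iterator form: the iterator only yields elements that exist, so
/// there is no out-of-range index to check in the first place.
fn sum_iter(values: &[u64], count: usize) -> u64 {
    values.iter().take(count).sum()
}
```

Note that the two differ in behaviour as well as speed: with a count larger than the slice, sum_indexed panics while sum_iter simply stops at the end.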

um what?

That's what a type system is for.

Few points:

Often, people talk about "code smells", and the reason for this terminology is that, as others alluded to, there are no absolutes. As an author of a team's style guide, I have some exceptions listed, but generally, it's understood that not a single line of that is absolute. (Other than "use gofmt", since it's for a Go project, and I'd mandate rustfmt for Rust.)

Also, as is often said, premature optimization is the root of all evil. That doesn't mean use O(n!) algorithms, but if you're worried about whether your variable will be on the stack instead of in a register because you named it, you're doing it wrong.

There are times that performance matters a lot, and only then should you concern yourself with that level of nuance. Generally, trust the compiler. And beware, the compiler might not do what you think it does - I've had attempts at optimization yield worse compiled code. We're far from the days of C being a light sugar on top of assembly. (I then was amazed to see the weird magic it did with the code in a match that I thought was less efficient. In fact it did much better than even if I hand-wrote the assembly.)

Also that doesn't mean disregard performance - it matters. It's too often completely disregarded to ship a chat app that is 100s of MB in size and requires computers with many gigabytes of RAM to do only slightly more than a chat app did 20 years ago with 1/100th of the compute resources and binary size.

And for C++ there's clang-format. These are also very much not absolute, but usually the time saved by avoiding discussions about how best to use spaces is worth the occasional ugliness these tools produce here and there.

Again: no absolutes. I remember a validation tool, written in an inimitable Java-veteran way, that a team had been improving for a year. They had OKRs like “speed it up 10% in a quarter”. I just looked at the whole mess with its bazillion pointers, grabbed Ragel and got a 5x speedup (I would have gotten a 10x speedup if MSVC hadn't balked at the need to process a function with cyclomatic complexity approaching 10000). Theoretically they should have achieved that 5x speedup in 6 years if they had kept those 10% per quarter OKRs, but in practice I'm pretty sure they would have plateaued way before that.

Rule of thumb: don't think in advance about local optimizations, but design long-living structures with as little pointer chasing as possible.

The compiler is very good at local reasoning about code, but usually pretty powerless against badly organized data structures.

Sadly, most SOLID principles, while helping with understanding, also push you in the direction of pointers-to-pointers-to-pointers data structure organization.
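To make "pointer chasing" concrete, here is a sketch with a hypothetical Sample type stored two ways. The loop bodies are identical; only the layout of the data differs, and that is the part the optimizer cannot change for you. No timings are claimed here, since the actual gap depends on data size and hardware.

```rust
// Hypothetical record type, used only for illustration.
#[derive(Clone, Copy)]
struct Sample {
    weight: f64,
    value: f64,
}

/// Pointer-heavy layout: every element is its own heap allocation,
/// so each step of the loop follows a pointer to reach the data.
fn weighted_sum_boxed(samples: &[Box<Sample>]) -> f64 {
    samples.iter().map(|s| s.weight * s.value).sum()
}

/// Flat layout: the elements sit next to each other in one allocation,
/// so the loop walks memory linearly.
fn weighted_sum_flat(samples: &[Sample]) -> f64 {
    samples.iter().map(|s| s.weight * s.value).sum()
}
```

Choosing between Vec<Sample> and Vec<Box<Sample>> is a design decision made once, early, and it affects every loop written over that data afterwards.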

Yes, but as farnz mentioned

I would say that they're about even to read.

foo(bar())

isn't easier or harder to read than

let x = bar();
foo(x)

And the second form would be easier to read if you could name x something semantically meaningful.

But the moment someone realises that it should be foo(quux(bar())), or worse foo(baz(quux() + bar())), the first form encourages a maintainer to keep putting it on one line, while the second encourages it to spread out a little bit:

foo(baz(quux() + bar()))
// or
let x = bar();
let y = quux();
let z = baz(x + y);
foo(z)

I find the second one easier to read, because there are fewer ) symbols in a row - even now, I'm confident that I've typed the second form correctly, but the first one has me concerned that I've misplaced some of the closing parentheses.

Additionally, if I can name any of x, y, or z with a semantically sane name, the second form can be made clearer, whereas the first has me having to deduce the intent from the function names and types. Contrast:

check_thermals(rmag(adc1_wf() + adc2_wf()));

with

let power_a = adc1_wf();
let power_b = adc2_wf();
let combined_power = rmag(power_a + power_b);
check_thermals(combined_power)