What is the basis of this clippy lint?

clippy occasionally yells at me:

warning: unnecessary closure used to substitute value for `Option::None`
   --> src/bin/example.rs:324:19
    |
324 |     let cert_data = secret
    |  ___________________^
325 | |     .data
326 | |     .as_ref()
327 | |     .ok_or_else(|| CertReportError::SecretDataMissing)?
328 | |     .get("tls.crt")
329 | |     .ok_or_else(|| CertReportError::SecretTlsCrtMissing)?
    | |________________________________________________________^
    |
    = note: `#[warn(clippy::unnecessary_lazy_evaluations)]` on by default
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_lazy_evaluations
help: use `ok_or` instead

I.e., ditch the closure, and eagerly construct the variant. I do this when the underlying type is small (a machine word or less), but when it's large I would normally reach for ok_or_else(|| …). I actually thought there was a clippy lint for "you're constructing a large type, do it lazily!", but I can't get it to trigger at the moment.
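For concreteness, the two forms I'm weighing look roughly like this (the error type is a cut-down stand-in for my real one):

enum CertReportError {
    SecretDataMissing,
}

fn get_data(data: Option<&str>) -> Result<&str, CertReportError> {
    // What clippy wants: construct the variant eagerly and pass it by value.
    // data.ok_or(CertReportError::SecretDataMissing)

    // What I habitually write for "large" error types: defer construction.
    data.ok_or_else(|| CertReportError::SecretDataMissing)
}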

The help leaves me unfulfilled:

Why is this bad?

Using eager evaluation is shorter and simpler in some cases.

"in some cases" … [which?], and I don't think this is one of those cases. What's the criteria for this lint?

Putting this in Godbolt, it actually seems like the same code gets generated either way. (A test for the failure case, followed by a jump to a "return error" stub, and the error value is constructed after the jump.)

I guess that's the or_fun_call lint mentioned in the documentation of unnecessary_lazy_evaluations.

Related. As the linked issues demonstrate, there are no fixed criteria.

If the construction of your type doesn't involve function calls, then there is no point in constructing it via a closure. It will be compiled to the same code. A closure in ok_or_else is basically guaranteed to be inlined, and thus optimized to the same code as in ok_or, bar evaluation order.

If you are doing function calls in the closure, then there is obviously a difference, because ok_or(Struct { foo: bar() }) will always call bar(), regardless whether Option is Some or None. This is obviously wrong if bar has side effects (e.g. consider bar() == panic!()), and an optimization issue even if it doesn't (because it won't be optimized unless it's inlined, which is not guaranteed). However, if you are just using a struct literal, then your code is entirely transparent to the compiler, and it will trivially optimize eager copies and large object construction.
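To spell that out with a throwaway bar() that has a visible side effect:

fn bar() -> u32 {
    println!("bar() ran"); // visible side effect
    42
}

struct Struct {
    foo: u32,
}

fn demo(opt: Option<u32>) {
    // Eager: bar() runs (and prints) even when `opt` is Some,
    // because the argument is evaluated before ok_or is entered.
    let _ = opt.ok_or(Struct { foo: bar() });

    // Lazy: the closure body, and therefore bar(), only runs when `opt` is None.
    let _ = opt.ok_or_else(|| Struct { foo: bar() });
}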

I think it's obvious that opt.ok_or(Enum::Variant) is easier to read than opt.ok_or_else(|| Enum::Variant). It's shorter, it has less nesting, and it doesn't require the reader to consider whether lazy evaluation matters. It will also always compile to the same code. Given that, it is never beneficial to call ok_or_else if all you're doing is simple literal object construction. There is a benefit if you do a function call in that expression, but in that case Clippy will always nag you to make it lazy anyway.


Ah; I didn't read that one. (It'd be real nice if those references in the docs were links.) Yeah, that one is sort of in the same vein. Oddly, it uses avoiding a call to String::new as an example of "too heavy", but I was under the impression that String::new should be lighter than initializing my enum, since my enum is wider than a String and so needs more memory initialized. (My understanding is that String::new just writes a few constants, with no allocation: a Vec with zero length, zero capacity, and a dangling pointer.)
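Just to check my own claim, something like this is what I have in mind when I say the enum is wider (the payload-carrying variant is invented for illustration):

use std::mem::size_of;

#[allow(dead_code)]
enum CertReportError {
    SecretDataMissing,
    SecretTlsCrtMissing,
    // Invented variant with a payload; this is what makes the enum
    // wider than a String in my head.
    ParseFailure { path: String, line: u64 },
}

fn main() {
    // A String is three words: pointer, length, capacity (24 bytes on x86-64).
    println!("String:          {} bytes", size_of::<String>());
    println!("CertReportError: {} bytes", size_of::<CertReportError>());
}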

Now that I'm looking again, though, oddly, the Godbolt I did shows that initializing a simple variant (no associated data) literally just writes the variant number into memory:

example::foo:
        mov     rax, rdi
        mov     rcx, qword ptr [rsi]
        ; is the option none?
        test    rcx, rcx
        ; jump if None
        je      .LBB2_2
        mov     rdx, qword ptr [rsi + 16]
        mov     qword ptr [rax + 8], rcx
        mov     qword ptr [rax + 16], rdx
        xor     ecx, ecx
        mov     qword ptr [rax], rcx
        ret
.LBB2_2:
        ; write the variant number for the error type enum
        mov     qword ptr [rax + 8], 4
        ; write the variant number for `Result::Err`
        mov     ecx, 1
        mov     qword ptr [rax], rcx
        ret

I sort of expected the other bits to get initialized, but I guess not! In hindsight, I'm not sure why I expected that. Not initializing the bits that aren't used by a variant seems like an easy optimization that of course is taken.

Thank you, that seems very relevant.

Part of this is that there's always going to be some sort of subjective threshold of what constitutes a "heavy" expression. Mine is based on the size of the type (how many words of memory are we going to need to initialize); yours is "involves function calls". But I think I could come up with some pretty non-trivial expressions that are weighty but nonetheless involve no calls.
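Something like this, say, is a pure literal with no calls at all, but building it eagerly still means initializing a few dozen words (the struct is invented for illustration):

struct ReportDefaults {
    thresholds: [u64; 32],
    labels: [&'static str; 8],
    enabled: bool,
}

fn defaults_for(opt: Option<ReportDefaults>) -> ReportDefaults {
    // No function calls anywhere in the fallback expression, yet
    // evaluating it eagerly fills well over 40 words of memory.
    opt.unwrap_or(ReportDefaults {
        thresholds: [0; 32],
        labels: ["a", "b", "c", "d", "e", "f", "g", "h"],
        enabled: false,
    })
}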

That subjective limit, wherever it is, should get documented.

This is sort of what I was trying to suss out in Godbolt. While true for my case, I don't see why the general claim should be true. The naïve view says it wouldn't be: in the case of ok_or, since the value is passed to ok_or as a parameter, one could reasonably conclude that the value must be constructed always, as it is required as an input to the function call. (It isn't, but that seems to be because at least in my case, the optimizer is doing a good job of realizing that we only need it in one case, and moving the construction into the "or" branch.)

Remember that LLVM still sees what's going on for tiny inlined helpers like ok_or. Once it's SSA'd, it'll still be clear that it doesn't need to initialize those bytes.

(And if it's somehow not clear, then ok_or_else won't really matter either, since in the "no optimizations" case that function has to return the whole Result type, which means copying more bytes than just the error value represents anyway, so it doesn't really end up helping -- there's still lots of copying if it's not smart.)

The way LLVM looks at code is completely different from the mental models humans use when, say, tracing execution in their heads.
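For reference, ok_or is (paraphrasing the standard library source) essentially just a match, so once it's inlined the construction of the error value sits entirely in the None arm, where LLVM can see it's only needed there:

// Paraphrase of Option::ok_or, not the exact standard library source.
fn ok_or<T, E>(opt: Option<T>, err: E) -> Result<T, E> {
    match opt {
        Some(v) => Ok(v),
        // After inlining, whatever computed `err` (assuming no side effects)
        // can be sunk into this arm, since it's unused on the Some path.
        None => Err(err),
    }
}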


Why do you think this is not one of the cases? You keep talking about big values, but your error variant is literally zero-sized: there isn't any information you need to compute lazily in order to construct it.

You don't gain anything by hiding the construction behind a closure, because the only kind of computation that happens there is trivial – storing the constant discriminant somewhere (likely just in a register) won't save you anything even if executed conditionally, due to the superscalar nature of modern processors.

On the contrary, you only make the compiler generate and optimize away more code unnecessarily, slowing down compilation and potentially triggering heuristic code size bail-out limits earlier than necessary for code that could have benefited more from optimizations.

