Did Rust make the right choice about error handling?

The Rust standard library uses Result for basically every action that can fail. Consider a typical situation: we want to construct some complex struct S.
struct S { f1: T1, f2: T2, ... fn: Tn }
So we write a function that builds it, unless anything goes wrong. Our struct is complex, so we want to build it field by field. And of course, since we live in the real world, the construction of each field can fail too.
fn createS(...) -> Result<S, SomeError> { Ok(S { f1: createf1(...)?, /* ... */ fn: createfn(...)? }) }
In this simple scenario each field will be moved from its own Result into the Result of createS. And this can repeat for any of f1, ..., fn if they are constructed in a similar way! In C, for example, we just allocate the (uninitialized) struct outside, pass a pointer to it, and use the error code to determine whether it was initialized.
The Rust standard library moves objects of arbitrary types around VERY freely. Can Rust call itself a "systems programming language" after that?
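For concreteness, here is a compilable sketch of the pattern above; the field types, error type, and constructor bodies are all made up:

#![allow(non_snake_case)] // keep the names from the text above

struct T1(u64);
struct T2(String);
struct S { f1: T1, f2: T2 }
struct SomeError;

// Placeholder constructors; imagine each one can really fail.
fn createf1() -> Result<T1, SomeError> { Ok(T1(1)) }
fn createf2() -> Result<T2, SomeError> { Ok(T2(String::from("two"))) }

fn createS() -> Result<S, SomeError> {
    // Each `?` moves the field value out of its own Result and into S,
    // and the finished S is then moved into createS's Result.
    Ok(S { f1: createf1()?, f2: createf2()? })
}

fn main() {
    let _s = createS().ok();
}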

1 Like

Yes.

And yes.

I don't see anything "loose" about your example. You end up with a struct built from Results, each of which may be an error instead of the expected value. That accurately and exactly describes the situation you have created for yourself, with no ambiguity about special values actually being errors, as in C.

As my friend is fond of saying: "You get what you order."

5 Likes

While the Rust habit of moving things around a lot may be bad, it is still your choice how to do it.
It is always up to you how to design error handling.
You can use the Boost approach of passing a reference to storage for the error as a function argument.

You can even use panic for that purpose if you want (do note that panic has some annoying properties in Rust compared to C++ exceptions).
After all, not all errors are supposed to happen often.
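For example, a minimal sketch of that "caller provides the storage, callee only reports errors" style (all the types here are made up):

struct S { f1: u64, f2: String }
struct InitError;

// The caller owns the storage and passes a mutable reference; on success nothing
// is moved out of the function, on failure only the error travels back.
fn init_s(out: &mut S) -> Result<(), InitError> {
    out.f1 = 42;                    // pretend this step can fail
    out.f2 = String::from("hello"); // and this one too
    Ok(())
}

fn main() {
    let mut s = S { f1: 0, f2: String::new() };
    if init_s(&mut s).is_ok() {
        println!("f1 = {}", s.f1);
    }
}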

TL;DR: Rust is a systems programming language because it gives you enough of the system to actually design performant code.

If you so desired, it's quite possible to use std::mem::MaybeUninit to initialize an expensive struct field by field. Rust still gives you the tools to do these sorts of things (e.g. for performance or because you're initialising something passed to you by C), but these less safe options tend to be less ergonomic as a sort of safety "speed bump".
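A minimal sketch of what that might look like (field types, values, and the error type are made up; this is not a recommendation):

use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

struct S { f1: u64, f2: f64 }

// Sketch of field-by-field initialization through MaybeUninit.
fn create_s() -> Result<S, ()> {
    let mut s = MaybeUninit::<S>::uninit();
    let p = s.as_mut_ptr();
    // SAFETY: every field is written exactly once before assume_init,
    // and neither field has drop glue.
    unsafe {
        addr_of_mut!((*p).f1).write(1);   // imagine this value comes from a fallible call
        addr_of_mut!((*p).f2).write(2.0); // and this one too
        Ok(s.assume_init())
    }
}

fn main() {
    let _ = create_s();
}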

Another thing to consider is, like C and C++, the code you write isn't necessarily the code that'll be executed. With things like Return Value Optimisation and other copy elision optimisations, a good optimising compiler can often find ways to avoid expensive code.

I've been programming Rust since the 1.0.0 release and haven't seen this happen in practice as much as you are suggesting. I also don't think I've ever seen unnecessary memcpy()s causing performance issues... That said, this is a sample size of 1, YMMV.

5 Likes

I don't see any serious performance degradation in your example.

If your field initializer function is cheap enough, it will likely be inlined and the optimizer will make it literally zero cost. If it's not that cheap, the cost of the move itself is marginal compared with the function call, unless the field is kilobytes large. Also, if the return type of the initializer is small enough it will be returned via registers, so the cost of the move becomes next to nothing.

And the advantage is significant. In safe Rust, you can't encounter an uninitialized value even by mistake.

2 Likes

I've been programming Rust since the 1.0.0 release and haven't seen this happen in practice as much as you are suggesting. I also don't think I've ever seen unnecessary memcpy()s causing performance issues... That said, this is a sample size of 1, YMMV.

This actually means that you always allocate arrays on the heap. Only in that case is moving everything cheap.

Return Value Optimization and other copy elisions have nothing to do with this case.

In C we don't return the struct; we pass a pointer to a struct allocated by the caller, and use error codes as return values.

While the Rust habit of moving things around a lot may be bad, it is still your choice how to do it.
It is always up to you how to design error handling.
You can use the Boost approach of passing a reference to storage for the error as a function argument.

I can use any approach for my own code, but this basically means getting rid of the standard library, which uses the mentioned approach everywhere :slight_smile:

Copying/moving anything within size_of::<u64>() * 2 would be at least as cheap as passing by reference/pointer on modern hardware.

So I wouldn't necessarily say all moves are bad.
Do note that with optimizations most of your function calls are inlined anyway, avoiding the extra stack space and all those copies.

As for the standard library, it is actually pretty efficient for the most part, aside from a few badly designed areas.
Do not shy away from moving (copying) small structs; performance-wise it might be better than introducing indirection via a pointer.
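For instance, a small struct like this (made-up example) fits in a couple of registers, so passing and returning it by value is typically no more expensive than passing a pointer:

#[derive(Clone, Copy)]
struct Point { x: f64, y: f64 }

// Point is just two f64s, so it travels in registers under common calling
// conventions; no heap allocation or pointer chasing involved.
fn midpoint(a: Point, b: Point) -> Point {
    Point { x: (a.x + b.x) / 2.0, y: (a.y + b.y) / 2.0 }
}

fn main() {
    let m = midpoint(Point { x: 0.0, y: 0.0 }, Point { x: 2.0, y: 4.0 });
    println!("({}, {})", m.x, m.y);
}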

2 Likes

Instead of talking about hypotheticals, can you give us an example of where you've encountered this expensive initialisation in the wild? Maybe we'd be able to help you find a more idiomatic way of expressing things.

11 Likes

Also note that the most recent few stack frames are likely to be stored in the current CPU core's L1 cache, and tend not to be shared with other threads. Modern CPUs are ridiculously fast at copying values within such a memory region. Studies have revealed that computers are used more like copying machines than computing machines, so the chipmakers put enormous effort into optimizing it.

3 Likes

Doesn't Return Value Optimization exactly counter unnecessary moves?

I'm legitimately asking here. My understanding was that this optimization turns functions which return values by move into functions which take a pointer for where to put the return value. Is this not exactly the situation you're talking about?

In my understanding, if the type is big enough, rustc will transform this:

fn my_func(_: Arg1, _: Arg2) -> Result<BigType, Err>;

into

fn my_func(_: Arg1, _: Arg2, out: *mut Result<BigType, Err>);

And since it does it in an optimization pass, we're free to still write the former code without any performance loss.

Is my understanding of Return Value Optimization incorrect, or are you talking about a different situation?

4 Likes

It is probably a bad idea to mix in RVO from C++ here.

In the case of C++, RVO doesn't work with moves and only applies to copies.

I actually don't think the Rust Reference specifies possible optimizations, so it is hard to judge what the compiler, or LLVM for that matter, would do.
But it probably can optimize the return value by writing it to the outer stack frame instead of creating a temporary variable and returning it with a move.

But as there is no RFC that states it, it would be hard to actually rely on it.

2 Likes

True! It's worth noting that the Rust calling convention is explicitly undefined, though, so the compiler is free to do these optimizations.

Also true.

However, most zero-cost abstractions rust provides are made possible by optimizations, and we rely on them in a similar way. For instance, iterators being faster than any hand-written for loop wouldn't be possible without a ton of inlining that LLVM is always free not to do. Regardless of that, it does reliably inline iterator methods, and we count on that.
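For example, a chain like this (trivial made-up function) only compiles down to a tight loop because all the iterator adapters get inlined:

// Relies on `iter`, `map`, and `sum` all being inlined to become a simple loop.
fn sum_of_squares(xs: &[u64]) -> u64 {
    xs.iter().map(|x| x * x).sum()
}

fn main() {
    assert_eq!(sum_of_squares(&[1, 2, 3]), 14);
}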

2 Likes

The whole assumption about "cheap moves" relies on another assumption: "all arrays are allocated on the heap". This is not realistic in performance-critical code (isn't Rust supposed to be for that?).

RVO doesn't work with the ? operator (because the return types are different).

Could you elaborate?

I don't believe ? is very special in Rust; it is mostly syntax sugar. In particular, it still expands to a regular return statement just like any other.

The exact expansion is the following:

let x = expr?;

// expands to

let x = match Try::into_result(expr) {
    Ok(v) => v,
    Err(e) => return Try::from_error(From::from(e)),
};

where all the Try methods are regular trait methods. It looks fancy, but all it does is return if the underlying thing is Err-like, and evaluate to the value otherwise.

As I understand it, there is only ever one return type, and ? doesn't change that.

Is there something else about ? which prevents RVO, or could you expand on your statement?

1 Like

? unboxes the Result. If f(...) returns Result<T, E>, f(...)? becomes T, so no RVO is possible here.
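In other words, something like this sketch of what the caller ends up doing (all names and types made up):

struct BigType { data: [u8; 1024] }
struct E;

fn my_func(_a: u32, _b: u32) -> Result<BigType, E> {
    Ok(BigType { data: [0; 1024] })
}

// Roughly what `let x = my_func(1, 2)?;` boils down to on the caller's side.
fn caller() -> Result<(), E> {
    let tmp: Result<BigType, E> = my_func(1, 2); // the Result is written to the caller's frame
    let x: BigType = match tmp {
        Ok(v) => v,                 // the BigType payload gets moved out of the Result here
        Err(e) => return Err(e),
    };
    let _ = x.data[0];
    Ok(())
}

fn main() {
    let _ = caller();
}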

1 Like

Ah, so it's on the other side. That makes sense!

Thanks for explaining that - I think I understand the problem now. It's that the calling function has to store a Result<T, E> somewhere, and then it gets the T out, and that necessitates a move.

I wonder if there could still be optimization potential there? Like, if the compiler just kept the Result<T, E> around, i.e. didn't free the local memory which stored the enum's discriminant/tag, it could potentially keep the T in the same place it was inside the Result<T, E> and not have to move it out. Do you think this would be a practical optimization, or does anyone know if the compiler will do this in practice?

2 Likes

I have the impression that Rust borrows too much from functional programming to be efficient. It is fancy, it is correct, but it is not efficient to run on "bare metal" (without a GC, for example).