What does a good Error look like?


#1

First, apologies for the long post and any incoherencies in it. I’ve been meaning to start a discussion about errors for a while already so there has been some build-up of thoughts and text fragments written over many weeks. There are several explicit questions in the text below, but please do share your thoughts and comments in general on the topic too.

The Rust book has a nice chapter about error handling, but it concentrates on the more technical side of it. I’m also interested in the “softer” and more semantic side of error handling, and that’s the theme of this post. There also seems to be some value in having at least some parts defined more precisely for things to work smoothly. An example of that comes up right in the first subsection below.

The Error trait

The std::error::Error trait has three major facets:

  • fn description()
  • trait Display
  • fn cause()

The cause function seems to be the most obvious one when considering what it should do. When to use it seems trickier, but maybe the section about errors from other libraries below covers that, so moving on to description() and Display.

The function description() returns a &str, and to me there seems to be an implication that it probably returns a message from a set of short, basic 'static descriptions. The alternative would be for the error type to own a custom formatted String for every error instance, and that doesn’t sound like it’s usually desirable.

However, when thinking about what the Display implementation should do, I got stumped on a really elementary question: should it contain the text from description() or not? If there’s no common rule to follow, we will end up outputting messages like “Basic description: Basic description: A more detailed description” or just “A more detailed description” without any common text to give context to the details. Both approaches have their merits: leaving the description out gives the user more power over formatting the message, while including it makes life easier for the user by giving fully fleshed-out messages out of the box.

It would be very nice if there was a more specific common guideline for the above part, so that everyone could trust things to work similarly. Or at least to have the policy of explicitly telling the chosen approach in documentation and being consistent with it within a single crate!
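To make the ambiguity concrete, here is a minimal sketch. ConfigError and naive_report are hypothetical names invented for illustration. The type follows the "Display repeats the description" convention, while the caller assumes the opposite one, which produces exactly the doubled prefix described above. Note that newer Rust releases deprecate description() in favour of Display, hence the allow attributes.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical error type, for illustration only.
#[derive(Debug)]
struct ConfigError {
    line: usize,
    key: String,
}

impl Error for ConfigError {
    // Short 'static summary, as the post assumes description() returns.
    #[allow(deprecated)]
    fn description(&self) -> &str {
        "invalid configuration"
    }
}

impl fmt::Display for ConfigError {
    // Convention chosen here: Display is self-contained and repeats the summary.
    #[allow(deprecated)]
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "{}: unknown key `{}` on line {}",
            self.description(),
            self.key,
            self.line
        )
    }
}

// A caller that assumes Display holds only the details ends up
// printing the summary twice.
#[allow(deprecated)]
fn naive_report(err: &dyn Error) -> String {
    format!("{}: {}", err.description(), err)
}
```

Whichever convention a crate picks, documenting it lets callers write a report function that matches.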

Level of detail

How much information should an error structure contain? At one extreme it contains everything that has any relevance to the situation at all, and I’m somewhat worried that the error types can easily grow to be unreasonably large. Maybe I’m just overguesstimating the cost of having relatively large values on the stack, but it also seems kind of disproportionate to have something like Result<u8, SomeReallyLargeErrorType>. Anyway, now that I brought memory usage up, I also feel like allocating memory to construct an error struct is mostly not a good idea, but I’m not sure if there’s all that much reason for that.

At the other end of the continuum the error type is maybe just an enum with no other info on the error except for telling very generally what kind of error happened. With this approach the usefulness of the error type goes down. It wouldn’t be very fun to code in Rust if the compiler just printed out “lifetime error” with no other information.
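The size concern can be checked directly with std::mem::size_of. The sketch below uses two made-up error types (DetailedError and ErrorKind are hypothetical, as are their fields) and compares the size of the whole Result in three cases: the large error stored inline, the same error behind a Box, and a kind-only enum. Boxing keeps the Result small at the cost of the allocation mentioned above, so it is a trade-off rather than a free win.

```rust
use std::mem::size_of;

// Hypothetical "carries everything" error; the fields are invented.
#[allow(dead_code)]
struct DetailedError {
    code: u32,
    offset: u64,
    context: [u8; 256],
}

// Hypothetical "kind only" error with no extra information.
#[allow(dead_code)]
enum ErrorKind {
    NotFound,
    Invalid,
}

// Returns the sizes in bytes of (inline, boxed, kind-only) results.
fn result_sizes() -> (usize, usize, usize) {
    (
        size_of::<Result<u8, DetailedError>>(),
        size_of::<Result<u8, Box<DetailedError>>>(),
        size_of::<Result<u8, ErrorKind>>(),
    )
}
```

The exact numbers depend on the target platform, so the only safe expectation is the ordering: the inline variant is at least as large as its biggest field, and the boxed one is much smaller.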

Errors from other libraries

Let’s say you’re making a library, but you’re not doing everything by yourself, you are using other libraries to do things for you. Do you think it’s okay to pass through errors, therefore exposing implementation details of your library to the user? Think about an error enum whose variants are actually error types from those other libraries. There’s a choice to either make the contained values within the variants public or not.

I’m personally on the fence with this. Hiding the underlying libraries lets me keep the API under my control, but on the other hand, if more detailed error types are the way to go, I would probably end up repeating the error type definitions from the other library, at least partially.

At some point I remembered the Error trait has the cause() function. Maybe the error can be given to the caller as &Error, therefore not actually exposing the specific error type, while still giving them access to it. Of course, accessing any fields of the original error is not possible this way. This also seems to interact with how my own error type should implement description() and Display: when should I just grab them from the original error, and when should I add my own messages? Should I rely on the caller traversing the cause hierarchy to display a stack of error messages?

Finally, there are the error types from the standard library. Is it a good idea to use std::io::Error for I/O-related errors even when the function is not under std:: but it otherwise matches the use case?
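As a rough sketch of the "hand out the underlying error only as a trait object" idea: LoadError, load_port and render_chain below are hypothetical names, and the example uses source(), which newer Rust releases provide as the replacement for cause(). The wrapped type stays a private field, the wrapper's Display adds only its own context, and the caller walks the chain to build a stack of messages.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Hypothetical library error; the wrapped error is a private field.
#[derive(Debug)]
pub struct LoadError {
    field: &'static str,
    inner: ParseIntError,
}

impl fmt::Display for LoadError {
    // Only this layer's context; the underlying message is left to the chain.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "could not load field `{}`", self.field)
    }
}

impl Error for LoadError {
    // Expose the underlying error as a trait object only.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.inner)
    }
}

pub fn load_port(text: &str) -> Result<u16, LoadError> {
    text.parse::<u16>()
        .map_err(|inner| LoadError { field: "port", inner })
}

// Walks the source chain and renders one message per line.
pub fn render_chain(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut next = err.source();
    while let Some(cause) = next {
        out.push_str("\ncaused by: ");
        out.push_str(&cause.to_string());
        next = cause.source();
    }
    out
}
```

One caveat with this design: a caller can still recover the concrete type through downcast_ref on the returned trait object, so the dependency is hidden from the signature but not completely from a determined user.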

Error enumerations

It looks like in many cases the error types are best represented by an enum, with a variant for each major kind of error that can happen. Here too, as with the level of detail above, there are at least two options: the error handling can be more or less detailed. Let’s assume two functions, foo and bar, where foo can fail with either A or B, and bar can fail with either B or C.

Should there be two error enums, {A,B} and {B,C} or just one, {A,B,C}?

fn foo() -> Result<SomeType, AB>
fn bar() -> Result<OtherType, BC>

vs.

fn foo() -> Result<SomeType, ABC>
fn bar() -> Result<OtherType, ABC>

The first option is more precise: the caller has exact knowledge of which errors may happen. Then again, I’m not really sure if it is all that usable for the caller (think about having to write completely separate matches for all Result-returning functions, with no chance for a general error handler of library X’s errors), never mind it being more work for the library author too.
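The two options do not have to be mutually exclusive. A minimal sketch, with all names (FooError, BarError, LibError, foo_then_bar) made up for illustration: each function keeps its precise enum, and From conversions into a crate-wide enum let a caller who wants one general handler use the ? operator. The cost is the extra boilerplate for the library author that the post already worries about.

```rust
// Precise per-function error enums ({A, B} and {B, C} from the post).
#[derive(Debug, PartialEq)]
pub enum FooError {
    A,
    B,
}

#[derive(Debug, PartialEq)]
pub enum BarError {
    B,
    C,
}

// Crate-wide enum ({A, B, C}) for callers that want one handler.
#[derive(Debug, PartialEq)]
pub enum LibError {
    A,
    B,
    C,
}

impl From<FooError> for LibError {
    fn from(e: FooError) -> Self {
        match e {
            FooError::A => LibError::A,
            FooError::B => LibError::B,
        }
    }
}

impl From<BarError> for LibError {
    fn from(e: BarError) -> Self {
        match e {
            BarError::B => LibError::B,
            BarError::C => LibError::C,
        }
    }
}

// Stand-in implementations; the `fail` flags exist only to trigger errors.
pub fn foo(fail: bool) -> Result<u32, FooError> {
    if fail { Err(FooError::A) } else { Ok(1) }
}

pub fn bar(fail: bool) -> Result<u32, BarError> {
    if fail { Err(BarError::C) } else { Ok(2) }
}

// A caller using the general type; `?` applies the From conversions.
pub fn foo_then_bar(fail_foo: bool, fail_bar: bool) -> Result<u32, LibError> {
    let x = foo(fail_foo)?;
    let y = bar(fail_bar)?;
    Ok(x + y)
}
```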

Target audiences

When writing error messages, we should remember that the messages won’t necessarily be seen only by other programmers, but also by end users with less technical knowledge or understanding of the context of the error. Here I think the programmer of the actual user-facing program has a special responsibility to make the error messages good for users, but library authors have some responsibility too. If nothing else, I feel it’s not a good idea to have messages like “I HATE YOU”/“I LOVE YOU” when authenticating with a CVS server. Those messages may and will bubble up to the GUI level, and you end up having your program showcased on The Daily WTF.


#2

I did this with conv (see the documentation on conv's error types).

On the plus side, it allowed me to do some fun things like define extension methods that would “correct” individual problems. So, you could use .saturate() to cut overflow from the set of possible error conditions. Once you’d gotten rid of all error conditions, you can safely .unwrap_ok() to get the error-free value inside.
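For readers who have not seen the pattern, here is a std-only sketch of the general idea. It is not conv's actual API: the traits, RangeError and to_u8 are all hypothetical, written only to show how an extension method can remove error cases from the type, after which unwrapping cannot fail.

```rust
use std::convert::Infallible;

// Hypothetical error set for a narrowing conversion.
#[derive(Debug, PartialEq)]
pub enum RangeError {
    Underflow,
    Overflow,
}

// Extension trait that "corrects" range errors by clamping,
// leaving an error type with no possible values.
pub trait Saturate<T> {
    fn saturate(self) -> Result<T, Infallible>;
}

impl Saturate<u8> for Result<u8, RangeError> {
    fn saturate(self) -> Result<u8, Infallible> {
        match self {
            Ok(v) => Ok(v),
            Err(RangeError::Underflow) => Ok(u8::MIN),
            Err(RangeError::Overflow) => Ok(u8::MAX),
        }
    }
}

// Once no error can occur, unwrapping is statically safe.
pub trait UnwrapOk<T> {
    fn unwrap_ok(self) -> T;
}

impl<T> UnwrapOk<T> for Result<T, Infallible> {
    fn unwrap_ok(self) -> T {
        match self {
            Ok(v) => v,
            // Infallible has no variants, so this arm can never run.
            Err(never) => match never {},
        }
    }
}

// Stand-in conversion that reports which way the value was out of range.
pub fn to_u8(n: i32) -> Result<u8, RangeError> {
    if n < 0 {
        Err(RangeError::Underflow)
    } else if n > 255 {
        Err(RangeError::Overflow)
    } else {
        Ok(n as u8)
    }
}
```

With these definitions, to_u8(300).saturate().unwrap_ok() compiles without any panicking path, because the compiler knows the error type is uninhabited.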

On the down side, it’s a huge amount of work and a massive bloody pain to actually do. It’s also why I went to the trouble of creating a kind of “tower” of errors, allowing the errors to be generalised for the sake of user sanity.

I’m not sure I’d recommend the approach on anything much more complicated, unless you have help and a lot of alcohol.


#3

Is this really the case? As far as I understand, there is no room for i18n in Rust Error and Display, so it is not a good idea to bubble them to the GUI. And oftentimes an Error lacks contextual information necessary for the user. For example io errors don’t include the file name.

I’ve used the following struct type for the user facing errors in the latest binary I wrote:

use std::error::Error;

pub struct Oops {
    message: String,               // user facing
    cause: Option<Box<dyn Error>>, // for logs/debug
    debug: Option<String>,         // for logs/debug
}

When a library routine returns an unrecoverable error, I create an Oops with a message that includes all the necessary call-site info (like the file name). If the original error implements Error, it is stored as the cause. Unfortunately, some errors do not implement Error, and in that case format!("{:#?}", err) is stored in debug.

When I display the Oops to the user, I print the message and, if the debug flag is on, either cause or debug.
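A minimal sketch of how that could fit together, under assumptions: the constructors from_error and from_debug, the render method and the read_failure helper are hypothetical names that are not part of the description above, and the io::Error is constructed by hand rather than coming from a real file operation.

```rust
use std::error::Error;
use std::fmt;
use std::io;

#[derive(Debug)]
pub struct Oops {
    message: String,               // user facing
    cause: Option<Box<dyn Error>>, // for logs/debug
    debug: Option<String>,         // for logs/debug
}

impl Oops {
    // Hypothetical constructor for errors that implement Error.
    pub fn from_error<E: Error + 'static>(message: String, err: E) -> Oops {
        Oops { message, cause: Some(Box::new(err)), debug: None }
    }

    // Hypothetical constructor for values that are only Debug.
    pub fn from_debug<D: fmt::Debug>(message: String, value: &D) -> Oops {
        Oops { message, cause: None, debug: Some(format!("{:#?}", value)) }
    }

    // The message alone, or the message plus cause/debug when the flag is on.
    pub fn render(&self, debug_flag: bool) -> String {
        let mut out = self.message.clone();
        if debug_flag {
            if let Some(cause) = &self.cause {
                out.push_str(&format!("\n  cause: {}", cause));
            } else if let Some(debug) = &self.debug {
                out.push_str(&format!("\n  debug: {}", debug));
            }
        }
        out
    }
}

// Adds the call-site detail (the path) that io::Error does not carry.
pub fn read_failure(path: &str) -> Oops {
    let err = io::Error::new(io::ErrorKind::NotFound, "no such file");
    Oops::from_error(format!("Could not read {}", path), err)
}
```

Keeping the user-facing message separate from the cause also leaves room to translate the message later without touching the diagnostic part.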

I would also like to add a stacktrace attribute to the Oops, but it is unstable now: https://doc.rust-lang.org/std/rt/backtrace/fn.write.html


#4

@DanielKeep: It’s interesting how conv works with errors, and the “built-in support” for handling some errors seems pretty nice. It’s a shame it isn’t easily expandable or generalizable into more complex cases.

@matklad: Very good point! English isn’t my native language, but apparently wearing my (lazy) programmer hat instead of my end user hat when thinking about this made me ignore the existence of other languages. Having stack traces available when handling errors is something I’d like to have too.

So, after being reminded of the internationalization point, it looks like it’s not a good idea to hide any important information behind Error's cause(): all the information needed for formatting the user-facing messages should be directly available. It also got me wondering again about the “exposing third-party errors” part.


#5

Thanks for this post, I would also be interested in the ideas and current solutions.
In my current project I’m kind of reflecting the errors from libraries to present the actual error but not the underlying type. The first problem here is the number of .to_string() calls; the second is that only one of the three described values an Error can have gets passed along.

I would also like to move this towards stack traces, where possible, or some sort of better-defined errors.