How do people handle errors in the first draft of a program?

When I'm writing a new program, I typically start by putting together the very basic user interface (command line handling, etc) and a very simplified version of the logic. I then progressively refine the code, adding functionality and handling edge cases as I improve things.

I've used this approach a lot in languages like Python, and it generally works well for me - I can ignore the details until I have the basic structure in place.

To do this in Rust, though, I feel like I'd be doing an awful lot of unwrapping of results, letting the code panic if there are problems. That's fine (it's the equivalent of having an unhandled exception in Python), and I'd just go back later and add error handling. Having explicit error types makes this much nicer, because it's easy to see where I've skipped over error handling.

But when I do come to add error handling later, the need to change return types feels like it makes the job a lot harder - the ripple effects of deciding I want to log a custom error message and then terminate the program cleanly seem like they are much bigger than (say) in Python defining a custom exception, raising it, and then catching it in my main routine.

How do people do this sort of thing in Rust? Accept the need for big "add error handling" exercises once you've outgrown panics? Include a very basic error handling framework in their initial code? Design error handling up front as a key program structure decision? Leave panics in and use a custom panic handler to at least make them look a bit nicer?
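For reference, the last option (a custom panic handler) might look roughly like the sketch below. It uses only `std::panic::set_hook`; the function name `install_friendly_panic_hook` and the message format are made up for illustration.

```rust
use std::panic;

// Hypothetical helper: replace the default panic report with a shorter,
// user-facing message. Call it once at the start of the program.
fn install_friendly_panic_hook() {
    panic::set_hook(Box::new(|info| {
        // The payload is usually a `&str` or a `String`, depending on how
        // the panic was raised; anything else gets a generic message.
        let payload = info.payload();
        let message = if let Some(s) = payload.downcast_ref::<&str>() {
            (*s).to_string()
        } else if let Some(s) = payload.downcast_ref::<String>() {
            s.clone()
        } else {
            "unknown error".to_string()
        };
        eprintln!("error: {message}");
        if let Some(location) = info.location() {
            eprintln!("  (at {}:{})", location.file(), location.line());
        }
    }));
}
```

This only changes how a panic is reported. The program still unwinds and exits unsuccessfully, so it is a cosmetic step rather than error handling.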

I feel a little bit like I'm reinventing the wheel here, and there "must" be some sort of best practice in this area. Or am I being excessively optimistic? Or is there a way of "gradually" enhancing error handling - moving from a panic to a very general error mechanism, to something more refined as needed - that I'm missing?


I tend to actually "handle" the errors with the ? operator even in early drafts of my program. This is made simpler by crates like anyhow, which declares a "cover all" Error type (and an anyhow::Result shorthand).

So I tend to have full error propagation in an early draft. But for an end-user, an error propagating all the way out to main and terminating the program isn't that different from a panic, so the part I save for later is figuring out where and how to recover from some errors and where to use more specific error types to make recovering easier (or reporting nicer for the user).

In some cases, this means converting the return type of a function from anyhow::Result<Something> to just Something, but the compile-time type checking makes it easy to fix all call sites, and it tends to involve fewer changes this way than the other way around.
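A minimal std-only sketch of this early-draft style, assuming anyhow is not yet a dependency: `Box<dyn Error>` stands in for `anyhow::Error`, and the names `DraftResult`, `parse_port` and `load_port` are hypothetical.

```rust
use std::error::Error;
use std::fs;

// Catch-all result type for a first draft. `Box<dyn Error>` plays the role
// that `anyhow::Error` would play if the anyhow crate were in use.
type DraftResult<T> = Result<T, Box<dyn Error>>;

// Hypothetical helper: parse a port number from text.
fn parse_port(text: &str) -> DraftResult<u16> {
    // `?` converts the ParseIntError into Box<dyn Error> automatically.
    let port = text.trim().parse::<u16>()?;
    Ok(port)
}

// Hypothetical helper: read a port number from a file.
fn load_port(path: &str) -> DraftResult<u16> {
    // An io::Error and a ParseIntError both flow through the same `?`.
    let text = fs::read_to_string(path)?;
    parse_port(&text)
}
```

A `main` declared as `fn main() -> Result<(), Box<dyn Error>>` can call `load_port(...)?` directly. Any error then ends the program with a non-zero exit status and a printed error, which is about as much work as an `unwrap()` would have been.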


I start with anyhow, and if I later decide to move to thiserror I start with an error enum that has one variant Anyhow(#[from] anyhow::Error) and gradually work from there, eventually removing anyhow.
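A hand-written sketch of that intermediate stage, using only the standard library. Here `BoxError` stands in for `anyhow::Error`, the `From` impl is roughly what a `#[from]` attribute would generate, and `AppError`, `parse_number` and `double_input` are hypothetical names.

```rust
use std::error::Error;
use std::fmt;

// Stand-in for `anyhow::Error` in this sketch.
type BoxError = Box<dyn Error + Send + Sync>;

#[derive(Debug)]
enum AppError {
    // First refined variant, added once a caller needs to distinguish it.
    EmptyInput,
    // Catch-all variant wrapping everything not yet refined.
    Other(BoxError),
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::EmptyInput => write!(f, "input was empty"),
            AppError::Other(e) => write!(f, "{e}"),
        }
    }
}

impl Error for AppError {}

// Lets `?` convert the old catch-all error into the new enum.
impl From<BoxError> for AppError {
    fn from(e: BoxError) -> Self {
        AppError::Other(e)
    }
}

// Not yet migrated: still returns the catch-all error.
fn parse_number(text: &str) -> Result<i64, BoxError> {
    Ok(text.trim().parse::<i64>()?)
}

// Migrated: returns the enum, while `?` still accepts the old error type.
fn double_input(text: &str) -> Result<i64, AppError> {
    if text.trim().is_empty() {
        return Err(AppError::EmptyInput);
    }
    Ok(parse_number(text)? * 2)
}
```

Functions can then be moved over one at a time. Once nothing produces the catch-all error any more, the `Other` variant and its `From` impl can be deleted.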


Hmm, anyhow sounds useful, thanks. I've heard of it before, but never really looked into it for this purpose. I've not heard of thiserror, I'll take a look at that too.

Generally, thiserror is for libraries that need to report precise error causes and help users recover from the errors, and anyhow is a catch-all for applications or where you just don't care about the details besides displaying a message to the user.

For libraries I also like quick-error.


Like several other people have mentioned, I normally won't bother with unwrap() and just jump to returning a Result<(), Error>. Using the ? operator is actually easier than unwrapping, plus it opens the door to a more user-friendly experience out of the box.

I think anyhow has two killer features which make it stand out from the rest, though...

You can use the anyhow::Context extension trait to provide more information about the error. That means instead of "file not found", you might have an error that says "unable to parse the file, because we are unable to open foo.txt, because file not found".

The second killer feature is its Debug implementation. When you print an anyhow::Error using {:?} (which is what happens when you return Result<(), anyhow::Error> from main()) it will print out that aforementioned context chain alongside a backtrace when the RUST_BACKTRACE variable is set. For 99% of your functions you'll just be printing the error to the screen and exiting unsuccessfully anyway, so you get more by having a nice Debug implementation than you would if you had custom error types.

Here's a contrived example:

use anyhow::{Context, Error};
use std::{fs::File, io::Read, path::Path};

fn read_contents(filename: &Path) -> Result<String, Error> {
    let mut f = File::open(filename)
        .with_context(|| format!("Unable to open \"{}\"", filename.display()))?;

    let mut buffer = String::new();
    f.read_to_string(&mut buffer).context("Read failed")?;

    Ok(buffer)
}

fn parse_file(path: &Path) -> Result<(), Error> {
    let contents = read_contents(path).context("Unable to read the file")?;

    Ok(())
}

fn main() -> Result<(), Error> {
    let path = Path::new("foo.txt");
    parse_file(path).context("Unable to parse the file")?;

    Ok(())
}


And the error it prints out when backtraces are enabled:

Error: Unable to parse the file

Caused by:
    0: Unable to read the file
    1: Unable to open "foo.txt"
    2: No such file or directory (os error 2)

Stack backtrace:
   0: playground::main
   1: std::sys_common::backtrace::__rust_begin_short_backtrace
   2: std::rt::lang_start::{{closure}}
   3: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
             at /rustc/b41936b92cd8463020207cb2f62a4247942ef2e4/library/core/src/ops/function.rs:259:13
   4: std::panicking::try::do_call
             at /rustc/b41936b92cd8463020207cb2f62a4247942ef2e4/library/std/src/panicking.rs:401:40
   5: std::panicking::try
             at /rustc/b41936b92cd8463020207cb2f62a4247942ef2e4/library/std/src/panicking.rs:365:19
   6: std::panic::catch_unwind
             at /rustc/b41936b92cd8463020207cb2f62a4247942ef2e4/library/std/src/panic.rs:434:14
   7: std::rt::lang_start_internal::{{closure}}
             at /rustc/b41936b92cd8463020207cb2f62a4247942ef2e4/library/std/src/rt.rs:45:48
   8: std::panicking::try::do_call
             at /rustc/b41936b92cd8463020207cb2f62a4247942ef2e4/library/std/src/panicking.rs:401:40
   9: std::panicking::try
             at /rustc/b41936b92cd8463020207cb2f62a4247942ef2e4/library/std/src/panicking.rs:365:19
  10: std::panic::catch_unwind
             at /rustc/b41936b92cd8463020207cb2f62a4247942ef2e4/library/std/src/panic.rs:434:14
  11: std::rt::lang_start_internal
             at /rustc/b41936b92cd8463020207cb2f62a4247942ef2e4/library/std/src/rt.rs:45:20
  12: main
  13: __libc_start_main
  14: _start
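The "Caused by" list above corresponds to following `Error::source` links from the outermost error inwards. A std-only sketch of that idea follows; it is not anyhow's actual implementation, and `ContextError`, `error_chain` and `open_config` are hypothetical names.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical wrapper: a context message plus the underlying cause.
#[derive(Debug)]
struct ContextError {
    message: String,
    source: Box<dyn Error + 'static>,
}

impl fmt::Display for ContextError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.message)
    }
}

impl Error for ContextError {
    // Expose the wrapped error so callers can walk the chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(self.source.as_ref())
    }
}

// Collect the outermost message followed by each cause, outermost first.
fn error_chain(err: &(dyn Error + 'static)) -> Vec<String> {
    let mut messages = vec![err.to_string()];
    let mut current = err.source();
    while let Some(cause) = current {
        messages.push(cause.to_string());
        current = cause.source();
    }
    messages
}

// Hypothetical helper that wraps an I/O failure in one layer of context.
fn open_config(path: &str) -> Result<String, ContextError> {
    std::fs::read_to_string(path).map_err(|e| ContextError {
        message: format!("Unable to open \"{path}\""),
        source: Box::new(e),
    })
}
```

For a missing file, `error_chain` returns the context message first and the operating system's "not found" message second, in the same order as the numbered list in the output above.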

(Emphasis mine)

When discussing error handling we often hear people speak of the "Happy Path" and imply there is a "Sad Path". By "Happy Path" they mean the general flow of the program's logic when all is going well. The "Sad Path" is the messy detail of what happens when things go wrong, that is to say, error handling.

Typically there is a desire to make the Happy Path as clear as possible in the source code. It's the business logic, the algorithm, the whole purpose of the code after all. The happy path is what you are designing.

The desire then is to hide the Sad Path. Get all that messy error handling out of the way. Keep the Happy Path clear and readable. Ultimately this leads to totally ignoring the Sad Path by simply throwing an exception, often with some nebulous idea that the exception is handled elsewhere.

This is quite understandable and is naturally what we do when drafting some new solution to a problem.

However, I have a little problem with this "clean Happy Path"/"hide the Sad Path" notion. Most of my work has been in embedded systems or server-side processes. There it is very important to take care of every possible failure and react to it in some well-planned manner. Things cannot just bomb out with an exception. Most errors are not exceptional anyway; they are a common occurrence, expected behaviour. Files can be missing, inputs can be wrong permanently or temporarily, network connections fail, etc, etc, etc. Sometimes it seems every other line of code can be the source of a "sad" thing that needs dealing with appropriately.
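As a small illustration of treating an expected failure as ordinary control flow, here is a sketch in which a missing file has a planned response while other I/O failures are still reported. The function `load_config` and the default value are hypothetical.

```rust
use std::fs;
use std::io::{self, ErrorKind};

// Hypothetical default used when no configuration file exists.
const DEFAULT_CONFIG: &str = "verbose=false";

// A missing file is an expected situation with a planned response;
// any other I/O failure is still reported to the caller.
fn load_config(path: &str) -> io::Result<String> {
    match fs::read_to_string(path) {
        Ok(text) => Ok(text),
        Err(e) if e.kind() == ErrorKind::NotFound => Ok(DEFAULT_CONFIG.to_string()),
        Err(e) => Err(e),
    }
}
```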

The point I'm coming to is yes, error handling does make the job a lot harder as you say. That is because the Sad Path is in fact 50% of your design effort. It cannot just be brushed aside so easily.

Luckily, Rust does not have exceptions, so totally hiding error handling under the carpet while keeping the Happy Path clean is not so easy. Also luckily, Rust offers very nice alternatives to exceptions, as we see in the suggestions above.

One just has to put the effort in. Because it's a required 50% of your problem.

All in all, I'm not convinced that totally separating the Happy Path and the Sad Path is something we should strive for. The Sad Path is critical to the code's behaviour and should be more prominent in what the reader sees.


In the first draft of my programs, I use expect() with a message that tells me where the error occurred. I also return Result<Foo, Box<dyn std::error::Error>>, where Foo is the return type, on those functions. Here's a snippet of some code I'm working on now:

pub async fn talk_to_client_async_mutex(
    data: Arc<Mutex<Data>>,
    client: &mut Child,
    msg: &[u8],
) -> Result<(), Box<dyn std::error::Error>> {
    let client_pid = client.id().expect("No PID for client");
    let mut to_client = client.stdin.take().expect("No stdin for client");
    let from_client = client.stdout.take().expect("No stdout for client");
    ...
    Ok(())
}

That makes it easier to add proper error handling later. In some, but not all, cases I have to change the return type.
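A sketch of what that later step can look like when the return type is already a `Result`: each `expect` on an `Option` becomes `ok_or(..)?`. The `Client` struct here is a hypothetical stand-in for the real child-process handle, and both function names are made up.

```rust
use std::error::Error;

// Hypothetical stand-in for a child process handle with optional parts.
struct Client {
    pid: Option<u32>,
    stdin: Option<String>,
}

// First draft: panic with a message that pinpoints the failure.
fn describe_draft(client: &mut Client) -> Result<String, Box<dyn Error>> {
    let pid = client.pid.expect("No PID for client");
    let stdin = client.stdin.take().expect("No stdin for client");
    Ok(format!("{pid}:{stdin}"))
}

// Later: same signature, but each `expect` becomes `ok_or(..)?`, so the
// message is returned to the caller as an error instead of panicking.
fn describe(client: &mut Client) -> Result<String, Box<dyn Error>> {
    let pid = client.pid.ok_or("No PID for client")?;
    let stdin = client.stdin.take().ok_or("No stdin for client")?;
    Ok(format!("{pid}:{stdin}"))
}
```

Because the signature does not change between the two versions, no call site has to be touched.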


I think that's a pretty harsh interpretation of what I was saying. I absolutely don't want to ignore error handling - however, I'd prefer to separate the process of designing the core logic from the process of working out how to handle errors. That's precisely to allow me to give each of them the level of consideration they deserve. Having to think about error handling while I'm trying to work out the mainline logic is, in my opinion, a recipe for putting in over-simplistic error handling, precisely because you're trying to focus on the main thread of the logic, and so don't think the error logic through.

Furthermore, when I am designing the error handling, I'd like to abstract it out, so that the error handling code is clearly separated and can be reviewed and verified without the logic being obscured by unrelated details (in this case, the "mainline" code that handles non-error cases, or the "happy path" if you want to think of it like that).

:man_shrugging: I guess you can design your code lots of ways, and keeping error handling and core logic together is a valid approach. I prefer having errors clearly transfer the logic into dedicated code that's visibly doing the job of recovering and continuing (or aborting).

Note that I'm not arguing for exceptions in any of the above. Far from it - Rust's abstraction capabilities and explicit error types seem like they will make writing clean, robust error handling code far easier. Since working with Rust, I find myself very nervous writing Python code precisely because I can't be sure I've dealt with all of the potential error cases. But none of that means I want to try to design mainline and error handling code at the same time - both jobs are hard, and I want to focus on one at a time, that's all.


I think what we're seeing here is a difference in the core values/attitudes for different industries. In the embedded systems that @ZiCog is referring to you don't separate the error path from the main logic, because all the paths are core pieces of your application's logic.

You can imagine that, for something like a rover on Mars, the "sent telemetry back successfully" and "main antenna not responding" paths need just as much thought put into them, and it doesn't make sense to leave them separated or try to abstract out the error handling.


So let me start by saying that I'm still new to Rust, so I may not yet have quite caught on to all the idiomatic ways to do things (and I'm saying this in the best possible sense, it is good to learn a language's idioms!).

So what I do is twofold:

  • for libraries and library-like functions in binary crates, I very heavily rely on the ? operator as described above: almost all of my functions apart from the ones in files containing a main() function return a result
  • for top-level command-line handlers and, well, yeah, most of the functions in a file containing a main() function (not as a rule, but it usually works out this way), I use something like .expect(), but, hm, shameless plug: I actually use my own expect-exit crate; its interface has changed a bit since the first crates.io upload, but I think that it has pretty much matured now (of course, if it happens that this post brings a bit more attention to it and some brave people decide to try it and to point out some obvious problems, I'd only be too happy :))

In both cases, this allows me to do the equivalent of ... in Perl, raise NotImplementedError("oof") in Python, and so on, quite easily: for functions returning a result I place an Err("finish <function name> at some point"), possibly quoting a couple of the arguments to more easily figure out what I need to do when I decide to get to it, and for top-level functions I use an exit(...) expect-exit call the same way. This allows me to both test my error-handling routines (when I've placed an unconditional Err() in a library-like function, I can make sure the error is propagated and reported properly) and leave the actual implementation for later.
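A std-only sketch of the placeholder-error half of this, assuming functions return `Result<_, Box<dyn Error>>`. The names `resize_image` and `make_thumbnail` are hypothetical, and the expect-exit part is not shown.

```rust
use std::error::Error;

// Hypothetical unfinished function: the signature is final, the body is not.
fn resize_image(path: &str, width: u32) -> Result<Vec<u8>, Box<dyn Error>> {
    // Quote the arguments so the report shows what was being attempted.
    Err(format!("finish resize_image at some point (path={path}, width={width})").into())
}

// A caller written against the final signature; it simply propagates
// the placeholder error like any other failure.
fn make_thumbnail(path: &str) -> Result<usize, Box<dyn Error>> {
    let bytes = resize_image(path, 128)?;
    Ok(bytes.len())
}
```

The standard `todo!()` macro is the panicking counterpart. The difference with a placeholder `Err` is that it travels through the same propagation and reporting code that real errors will use.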

Hope this helps! Of course, as noted in the beginning, if any more seasoned Rustaceans would like to point out any problems with this approach, I'd only be too happy to learn more!