Error design: to store the context or to not?

I came across a pretty tough design problem and would love to hear the opinions of wider community.

Consider this code:

// These strings may have come from a configuration file or something
let foo = string_from_user1.parse()?;
let bar = string_from_user2.parse()?;
let baz = string_from_user3.parse()?;
// ...

When a user runs this program on invalid input it can create considerable confusion because it's not clear which string is badly-formatted. Especially if, let's say all strings contain x but only bar disallows x. The user would have to either search in documentation, look into the code or try different things. Neither seems appealing.

One could possibly fix this by instead writing:

// These strings may have come from a configuration file or something
let foo = string_from_user1.parse().map_err(|error| WrappedError(string_from_user1.to_owned()))?;
let bar = string_from_user2.parse().map_err(|error| WrappedError(string_from_user2.to_owned()))?;
let baz = string_from_user3.parse().map_err(|error| WrappedError(string_from_user3.to_owned()))?;
// ...

As you can imagine, this is annoying and error-prone. When I first understood this problem I decided to design all errors in libraries I (co-)maintain like this:

pub struct ParseError {
    input: String,
    invalid_char: char,
}

fn from_str(s: &str) -> Result<Self, Self::Err> {
    if let Some(c) = s.find('x') {
        return ParseError {
            input: s.to_owned(),
            invalid_char: c,
        };
    }

     // rest of the code...
}

fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
    write!(f, "cannot parse {} as foo: the character {} is invalid", self.s, self.c)
}

This looks great at the first sight but has some issues around allocations.
The input string may not be desired in some cases - like in web forms where the string is already visible in the form input, so showing it again is just noise, or it may be processed in different way like conversion to serde::de::Error using Unexpected::Str which takes &str and thus allocation is wasted.

A global feature flag is not great because the same type may be parsed from user and by serde in the same application.

You might say "users are slow, so one allocation is not a big deal". The problem of library code is that the type can be parsed by non-user too - e.g. when parsing Json over some RPC. It may matter in those cases.

Note that similar situation applies for some non-parse errors such - e.g. storing file path. There's actually a discussion about this in std where it also seems difficult.

I'm now really not sure what's the best approach for a library. Ideally I'd want to support all use cases but I don't see how. It looks like these requirements are at odds with each-other:

  • Write idiomatic code using standard Rust traits (FromStr, TryFrom); don't invent my own stuff
  • Make writing user-friendly applications easy
  • Don't allocate when not needed.

Any ideas how to resolve this would be extremely appreciated. Thanks even for reading this!

1 Like

For most of the contexts I work with, I can assume that

  • The error path is pretty uncommon
  • Users only care about performance when on the happy path
  • Most errors are unrecoverable, so making sure the user has enough information to debug what went wrong is very important
  • The context itself is pretty small (i.e. allocations the order of dozens of bytes rather than megabytes)

This matches up quite nicely with the way anyhow::Error is meant to be used (i.e. a root error with a chain of context for debugging), so it's quite common to see code like this:

Note that I use with_context() so we defer constructing the context/error message until an error actually occurs, but once the error occurs we'll allocate as many strings as we need to so users have the tools they need to start troubleshooting.

While that code snippet might come from an application, I think a lot of the underlying assumptions will apply to many other scenarios. Even if you don't abort due to errors, a HTTP server might send the top-level error back to the user as an error message and log the entire error chain to assist with debugging.

It's also not as simple as saying "avoid allocations if you can", and instead ask yourself why we try to avoid allocations. Typically, it is because we want to keep memory usage down (e.g. why copy an image to a new buffer when you can use a smart datastructure that will let you look at a sub-section of pixels) or avoid the performance issues of making calls into a fairly sophisticated set of code (an allocator) in the middle of a hot loop.

TL;DR: Make the allocation if it makes sense. Your users and future self will thank you for it.

5 Likes

Yeah, I think I agree. I've been bitten so may times by bad error messages that I don't want to ever write a bad error message in my life again. I'm still not 100% sure so if anyone else can chime in I'd be happy to hear another viewpoint.

1 Like

Assuming foo, bar, and baz in your example code are all the same type, then you don't have any choice other than appending additional context with map_err() or anyhow's with_context(). Maybe they are different types, and you can afford to give the FromStr impls their own Err associated types to customize the amount of detail each error carries. But I'll consider the former scenario.

It may just be that FromStr is too general to meet your needs. A set of inherent constructors would give your callers the flexibility to choose how much context they need in error messages and abstract over the implementation details of actually adding it. Glossing over the obvious downsides and bikeshedding names, it could be as simple as:

impl MyType {
    // You may have as many constructors as you need...
    // Even if they are only informational or map the error type differently.
    pub fn from_str_with_context(input: &str) -> Result<Self, WrappedError> {
        input.parse().map_err(|err| WrappedError(input.to_string()))
    }
}

Usage examples for completeness:

// The error here does NOT provide extra context.
// Just straight-up string parsing.
let foo = string_from_user_1.parse()?;

// This error will provide extra context.
// The constructor can be more explicit about what it is doing.
let bar = MyType::from_str_with_context(&string_from_user_2)?;

I'm not sure if you would consider this "inventing your own stuff", but I think the compelling argument here is that FromStr just may not be specific enough to your needs. It doesn't offer a mechanism for the caller to choose how much context to provide.


Maybe I misunderstand why allocation is a problem, but it's almost certainly needed in the error case.

You can't control how your callers will want to use the errors (as a library author, anyway). If your error type borrows the string temporarily, they will be unable to wrap it in their own error type as a source. Error sources require dyn Error + 'static. Even worse, the string cannot be dropped before the error, which would limit its usability to mostly static strings (or throwing away your error type before propagating errors to their caller).

In short, you want to allocate for ergonomic reasons.

If I may stretch the "context" further, personally for parsing I wouldn't immediately bail on the first failed parse error (unless its Ok value is actually needed as part of the parsing), and do something like:

let foo = string_from_user1.parse().map_err(|error| ParseError::Foo { context, error });
let bar = string_from_user2.parse().map_err(|error| ParseError::Bar { context, error });
let baz = string_from_user3.parse().map_err(|error| ParseError::Baz { context, error });

if let (Ok(foo), Ok(bar), Ok(baz)) = (foo, bar, baz) {
    // do work
    Ok(())
} else {
    let mut parse_errors = Vec::<ParseError>::new();
    if let Err(parse_error) = foo {
        parse_errors.push(parse_error);
    }
    if let Err(parse_error) = bar {
        parse_errors.push(parse_error);
    }
    if let Err(parse_error) = baz {
        parse_errors.push(parse_error);
    }

    // Then you can propagate all errors to the user in one go,
    // instead of them getting past the first error, re-running,
    // then hitting the next one.
    Err(AppError::ConfigParse(parse_errors))
}

In principle, I try to return as little and as much information in the Error as useful, and only translate it into "english" as close to the boundary as possible (sometimes not even english, but the user's locale).

1 Like

Yes, I consider it "inventing my own stuff". Which may be actually OK, it's just that to my knowledge no such pattern exists in the Rust ecosystem yet and someone has to be the first to start it. So if I were to do it I wonder if others would be interested in using the same convention.

Regarding allocations, yes, allocation may be also the right way.

@azriel91 yes, that's even better and I started doing it lately in some projects. And it can't avoid allocation anyway so might as well drop that requirement.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.