Announcing Failure

https://air.mozilla.org/bay-area-rust-meetup-november-2017/

Skip to 1:17:10

Hmm, I actually watched it on livestream, but think I will watch it again :slight_smile:

I successfully migrated from error_chain to Failure, in the sense that my program compiles, runs correctly, and prints what I expect when I purposely introduce an error. Once I got past my misunderstanding of the example, it took me about 6 hours to finish the conversion of 40 error_chain errors in 25 modules, but I haven't converted any of what error_chain calls foreign_links or links. The migration included changing Result<()> to Result<(), Error> and ErrorKind::Bar to FooError::Bar throughout. In the process, I generated a gazillion compiler errors that I knocked down one by one, which is why it took me so long. A more experienced Rust programmer would certainly have completed the conversion much more quickly. One thing that helped was that I could almost always put what I had in the error_chain display directly into #[fail(display = ...)] for my custom error types.

One disappointment is build time. Since error_chain makes heavy use of macro expansions, it requires a larger recursion limit than the default. I had thought that Failure would build faster, but it doesn't.

I'm now trying to figure out how to use err.cause() and err.context() so I can get the same kind of error report I used to get from chain_err.
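
One way to get a chain_err-style report is to walk the cause chain and print each link. The sketch below is only an illustration under stated assumptions, not failure's API: it uses the standard library's Error::source() as a stand-in for cause(), and ReadError, LoadError and report are hypothetical names.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical low-level error, standing in for an I/O or parse failure.
#[derive(Debug)]
struct ReadError;

impl fmt::Display for ReadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unexpected end of file")
    }
}

impl Error for ReadError {}

/// Hypothetical higher-level error that records what was being attempted.
#[derive(Debug)]
struct LoadError {
    path: String,
    cause: ReadError,
}

impl fmt::Display for LoadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "could not load {}", self.path)
    }
}

impl Error for LoadError {
    // std's `source()` plays the role that `cause()` plays in the thread.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.cause)
    }
}

/// Builds a report with the outermost message first, then each cause in turn.
fn report(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut current = err.source();
    while let Some(cause) = current {
        out.push_str("\ncaused by: ");
        out.push_str(&cause.to_string());
        current = cause.source();
    }
    out
}
```

Calling report on a LoadError yields one line per link in the chain, which is roughly the shape of report that chain_err users are used to.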

I've seen the IntelliJ family of tools automate these kinds of mundane refactorings before. For example the other day I was using CLion and this bit of code was highlighted:

if (vec.size() == 0) {  // CLion highlights this and suggests using .empty() instead.
}

Pressing Alt-Enter in the IDE and applying the quick-fix automatically converts it to:

if (vec.empty()) {
}

I bring this up as an example of where the Rust tooling (even outside of the IntelliJ world) could be a great help. I didn't realize how much time features like this could save until I started using them more.

If porting to failure is a common thing that lots of Rust projects do in the future, perhaps we can make a tool to automate the mundane parts?

Released failure 0.1.1 with some additional features: Failure 0.1.1 released

I'm currently building a library that I migrated from stdlib errors to failure. Currently, I'm returning my own error type and not Error, since it's relatively small and it feels like the right thing to do...

What are your criteria for deciding between the general, boxed Error and specific error types? Is it a pure size thing or more an "applications use Error, libs use specific errors" heuristic?

(Edit - I realized the problem I was having with Context)

I'm going through the process of converting our codebase from error-chain to failure. Overall it has been going very well, and there's a dramatic reduction in the amount of code relating to error handling.

We have been using error-chain pretty successfully in the code, and the macros help reduce a lot of boilerplate error handling, particularly adding type conversions. But it does mean that every crate has its own error type, even if it's only being used to encapsulate and transport errors from elsewhere.

The code is heavily async, and uses a lot of futures-based combinators. One of the most common frustrations with working with that pattern has been keeping track of what error types are where, and converting them to whatever the "ambient" error type for the context is - without the benefit of the ? operator.

The code also uses a fair amount of trait-based genericity, which requires a lot of the traits to have an Error associated type for its implementation to define.

Moving to failure has been a breath of fresh air - by converting everything to a single uniform Error type, we can eliminate the need for error type conversion within chunks of async code, and can simplify all the async traits by eliminating the need for an Error associated type.

Just from a straightforward "make it compile" conversion, I've reduced the number of lines by around 40% (1000 lines added, 1600 removed, over 133 files), before starting to remove all the unneeded (now no-op) conversions.

Not everything has been rosy, however.

Chaining Errors
Our code makes a lot of use of error-chain's .chain_err() combinator so by the time errors bubble up to the top of the stack to be reported, they have a good causal chain which describes not only why the error occurred, but what was going on at the time.

failure has a built-in notion of a cause which is exactly what we want, but I haven't found a good idiom for using it. It also has the .context() method on both Error and Fail, but that doesn't seem to be the same thing - but I'm not really sure.

I'm basically confused by context and cause, and not sure to what extent they're the same thing. The documentation for context talks about it being suitable for user consumption, but that seems out of place - our code is server code, so the "user" is whoever is digging through the log files, and we always want maximum precise detail there if we're trying to debug something.

(If it were actually a user-facing application though, nothing that's coded into the source would ever be directly presented because of localization, etc - I don't think it's appropriate for an error-handling library to try to address UI issues.)

@alexcrichton filed an issue about this, and subsequently closed it, but I'm not sure the matter is actually settled.

Edit - I realized that when using .with_context() and then .downcast() to extract errors, I was downcasting to my error type, not Context<MyType>. Fixing that gives the behaviour I want.

bail!()
failure 0.1.1 introduces the bail!() and ensure!() macros, which look similar to error-chain's. Unfortunately, failure's bail!() 1) compiles cleanly with existing uses, and 2) does something superficially similar but actually quite different. Specifically, it takes its argument and stringifies it, and then returns it as an error message wrapped in an Error. But in error-chain, it will take its argument, convert it to a suitable error for the context and return it. In other words, if you do bail!(MyError) with error-chain it will return it, retaining the type info of MyError - whereas failure's bail!() will simply return err_msg(format!("{}", MyError)), losing the type info.

I think it's a mistake to introduce something like this. It would be better if it were completely incompatible, and simply didn't compile with existing uses so they can be iteratively fixed - either by using a completely different name, or changing the implementation somehow.

I've been using Err(MyError)? as a replacement, and I think it's an overall improvement, so I'd be fine with failure simply not having the bail!() macro.

(issue)
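
The difference between stringifying an error and returning it with its type intact can be sketched with the standard library alone. This is an analogy, not failure's implementation: Box<dyn Error> stands in for failure's Error, and MyError, stringified and typed are hypothetical names.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type.
#[derive(Debug)]
struct MyError;

impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "my error")
    }
}

impl Error for MyError {}

/// Imitates the behaviour described above for failure's bail!(): only the
/// rendered message survives, the concrete type does not.
fn stringified() -> Result<(), Box<dyn Error>> {
    Err(MyError.to_string().into())
}

/// `Err(MyError)?` converts through `From`, so the concrete type is kept
/// inside the box and can be recovered later by downcasting.
fn typed() -> Result<(), Box<dyn Error>> {
    Err(MyError)?
}
```

Both functions produce an error that prints as "my error", but only the one returned by typed can be downcast back to MyError.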

Error does not implement Fail (and so can't be used as a #[cause])
I understand why - Error has a blanket impl<F: Fail> From<F> for Error, which makes things very convenient. If Error also implemented Fail, then that blanket impl would cover From<Error> for Error, which conflicts with the standard library's reflexive impl<T> From<T> for T.

But it also means there's no way to express a generic type for a cause. For example, if I have:

#[derive(Debug, Fail)]
enum MyError {
    #[fail(display = "Conversion of {} failed", _0)]
    Conversion(String, #[cause] ???), // what type?
}

then there's no one type which can handle a number of different conversion failures. The obvious choice would be Error because it's used to wrap up all the other error types. However #[derive(Fail)] requires a #[cause] field to implement Fail - but Error doesn't implement Fail.

If the custom-derive code had type information available to it, then it could implement #[cause] in two different ways - by directly returning a &Fail for Fail-implementing causes, or by calling Error::cause() to get the inner error for Error causes. But my understanding is that it doesn't have type information, so maybe it needs to have #[cause] and #[error_cause] to handle these cases (or something like that).

I've tried to work around this by using Box<Fail> for a cause, but Box<Fail> doesn't implement Fail. I have a hacky local BoxFail type, but it seems like a wart. And it doesn't help with Error unless I have a Fail-implementing wrapper for it. I can get one with .context(""), but that seems like a pretty awful hack.
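
For readers unfamiliar with the wrapper workaround, here is a rough standard-library analogue. The same situation exists in std, where Box<dyn Error> does not itself implement Error, so a newtype is needed to store "any error" as a cause. BoxedCause is a hypothetical name playing the role of the local BoxFail type mentioned above, and parse_port is made up for the example; none of this is failure's API.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical newtype: owns any boxed error and implements the error trait
/// itself, forwarding everything to the inner value.
#[derive(Debug)]
struct BoxedCause(Box<dyn Error + Send + Sync>);

impl fmt::Display for BoxedCause {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Display::fmt(&self.0, f)
    }
}

impl Error for BoxedCause {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.0.source()
    }
}

#[derive(Debug)]
enum MyError {
    /// What was being converted, plus whatever error the conversion produced.
    Conversion(String, BoxedCause),
}

impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MyError::Conversion(what, _) => write!(f, "Conversion of {} failed", what),
        }
    }
}

impl Error for MyError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            MyError::Conversion(_, cause) => Some(cause),
        }
    }
}

/// Hypothetical conversion that can fail with one of many error types.
fn parse_port(text: &str) -> Result<u16, MyError> {
    text.parse::<u16>()
        .map_err(|e| MyError::Conversion(text.to_string(), BoxedCause(Box::new(e))))
}
```

The cost is the one already noted: the wrapper is an extra type that exists only to satisfy the trait bound.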

@withoutboats I'm looking at this, and I think I can see how this improves on existing trait-based error management in Rust.

However my approach to errors has always been enum-based, i.e. something like this:

pub type MyResult<T> = Result<T, MyErr>;

pub enum MyErr {
    // all error variants for the module
}

And then just have error-prone functions and methods return a MyResult, or anything with a From impl for MyErr.
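
A minimal sketch of that enum-based style, filled in with made-up details: the variants Empty and BadNumber and the functions parse_count and describe are hypothetical, chosen only to show a From impl feeding the ? operator and consuming code matching on variants.

```rust
use std::fmt;
use std::num::ParseIntError;

pub type MyResult<T> = Result<T, MyErr>;

#[derive(Debug, PartialEq)]
pub enum MyErr {
    /// The input was empty.
    Empty,
    /// The input was not a valid integer.
    BadNumber(ParseIntError),
}

impl fmt::Display for MyErr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MyErr::Empty => write!(f, "empty input"),
            MyErr::BadNumber(e) => write!(f, "bad number: {}", e),
        }
    }
}

impl std::error::Error for MyErr {}

// The From impl is what lets `?` convert a foreign error into MyErr.
impl From<ParseIntError> for MyErr {
    fn from(e: ParseIntError) -> Self {
        MyErr::BadNumber(e)
    }
}

pub fn parse_count(text: &str) -> MyResult<u32> {
    if text.is_empty() {
        return Err(MyErr::Empty);
    }
    let n = text.trim().parse::<u32>()?;
    Ok(n)
}

/// Consuming code can match on the variants directly.
pub fn describe(text: &str) -> &'static str {
    match parse_count(text) {
        Ok(_) => "ok",
        Err(MyErr::Empty) => "empty input",
        Err(MyErr::BadNumber(_)) => "not a number",
    }
}
```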

In cases where there is a specific need for a consuming crate/module to be able to define its own error variants, I can definitely see the advantages of trait based error management.
I can also see the advantage when the module author has a strong reason to hide the specific nature of the errors from the consuming code. That said, I prefer to expose the errors as enum variants as it allows consuming code to match on it.
But if neither of those things is necessary, would there still be a compelling reason e.g. for my projects to switch from enum-based error management?

Thank you for thinking of us #![no_std] and no heap users. Unlike std::error::Error and even the best plans for error-chain, failure is actually usable!

I would still recommend implementing Fail for your custom error so that other people using your library can integrate it with the whole ecosystem around failure. Also, imagining that some of your variants refer to other error types, it is probably valuable for your users for you to implement the cause method.

I've encountered a minor inconvenience with Failure. Say that I have some function

fn foo(&self) -> Result<(), Error> { ... }

that is called from some other function

fn bar(&self) -> Result<(), Error> { self.foo() }

If I add a context in the most obvious way,

fn bar(&self) -> Result<(), Error> { self.foo().context("bar") }

it won't compile because I'm returning failure::Context instead of failure::Error. The fix is obvious

fn bar(&self) -> Result<(), Error> { Ok(self.foo().context("bar")?) }

but slightly annoying. (Well, the most annoying part is that I can never remember to add the Ok(...) until the compiler tells me I have to, but I'd rather put the blame somewhere else.)
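
The reason the Ok(...?) wrapper is needed can be shown with a standard-library imitation. In this sketch Box<dyn Error> stands in for failure::Error, and Context, ResultExt and Worker are hypothetical types written for the example; they only mimic the shape of failure's API.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical context wrapper: a message plus the error it was attached to.
#[derive(Debug)]
struct Context {
    context: &'static str,
    cause: Box<dyn Error>,
}

impl fmt::Display for Context {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.context)
    }
}

impl Error for Context {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.cause)
    }
}

/// Hypothetical extension trait adding `.context()` to results.
trait ResultExt<T> {
    fn context(self, context: &'static str) -> Result<T, Context>;
}

impl<T> ResultExt<T> for Result<T, Box<dyn Error>> {
    fn context(self, context: &'static str) -> Result<T, Context> {
        self.map_err(|cause| Context { context, cause })
    }
}

struct Worker;

impl Worker {
    fn foo(&self) -> Result<(), Box<dyn Error>> {
        Err("disk full".into())
    }

    // `self.foo().context("bar")` on its own has type Result<(), Context>,
    // which does not match the declared return type. `?` converts the error
    // through `From`, and `Ok(...)` rewraps the success value.
    fn bar(&self) -> Result<(), Box<dyn Error>> {
        Ok(self.foo().context("bar")?)
    }
}
```

The error returned by bar prints as "bar" and still carries "disk full" as its source.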

Same here.
Also, I think .context() and .with_context() are bad names, as they refer only to the error case, but they appear to append a generic context to the execution, not to the error.

In error chain, it was .chain_err() which is much clearer, IMO.

Maybe .wrap_err() or something like this, to be in line with .map_err()?

.err_context(), perhaps?

Is there a way to use a reference with a non-static lifetime in a custom Fail type? If so, I can't figure out the syntax to specify the lifetimes.

Currently, Fail requires that the type is 'static, which means you can't have lifetimes in it. This is so that other code can assume that errors don't have lifetimes, but we might loosen the restriction if it seems like it was a mistake.

What is your use case where you want to have lifetimes?

I only have references for some of the things I want to include in the error. If there was a way to specify lifetimes, I wouldn't have to clone them.

The problem usually is that the error gets returned up outside of the scope those lifetimes refer to, so ultimately you'd get lifetime errors when you try to return the error. If you're going to handle the error immediately, it's probably fine if it just doesn't implement Fail or any error trait at all.

Unless you think the error's going to occur very frequently, cloning them is probably fine. If it does occur frequently, you may need shared ownership.

But not always. In my case, I'm writing parsers where each error in the chain is a tagged enum consisting of production name + chunk of input it couldn't parse (from more specific ones to more generic ones). In this case, I just want all of them to point to the original &str so that on the consumer side I could either print one of them or calculate positions within a string etc. - something not possible with a cloned copy.

Of course, I could split each error into a tuple of production and chunk of input, but that would be rather inconvenient to use, especially given that this change would be required just to work around limitations in the failure crate which, as you said, might be lifted later.
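
To make the parser use case concrete, here is a small sketch of an error that borrows from the input and lets the consumer compute a position in the original string. Everything in it is hypothetical (ParseError, offset_in, parse_digits), and the type deliberately implements no error trait, so it sidesteps the 'static question rather than answering it.

```rust
/// Hypothetical parser error that borrows the chunk of input it could not parse.
#[derive(Debug, PartialEq)]
struct ParseError<'a> {
    production: &'static str,
    chunk: &'a str,
}

impl<'a> ParseError<'a> {
    /// Byte offset of the failing chunk within `input`, or None if the chunk
    /// does not lie inside `input`. This only works because the chunk is a
    /// borrowed subslice; a cloned String would live at a different address.
    fn offset_in(&self, input: &str) -> Option<usize> {
        let start = input.as_ptr() as usize;
        let chunk = self.chunk.as_ptr() as usize;
        if chunk >= start && chunk + self.chunk.len() <= start + input.len() {
            Some(chunk - start)
        } else {
            None
        }
    }
}

/// Hypothetical production: a run of ASCII digits at the start of the input.
fn parse_digits(input: &str) -> Result<&str, ParseError<'_>> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        Err(ParseError { production: "digits", chunk: input })
    } else {
        Ok(&input[..end])
    }
}
```

With a cloned chunk the production name and text would still be available for printing, but offset_in would have nothing to measure against.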

Is this in an application or a library? If the latter, what about when your users want to throw the error further up? If the former, what benefit do you see in implementing Fail for this error type?
