Error handling in library: Enum or Trait type

Hi,

I had a few questions about error handling when implementing a trait in Rust.
I'm writing a library around hypervisors to wrap their "introspection APIs", meaning an API to query the virtual hardware state from the VMs and listen for hardware events such as interrupts or pagefaults.

The library provides multiple drivers, which can be enabled/disabled at build time:

  • Xen
  • KVM
  • VirtualBox
    etc..

All of these drivers implement the same trait:

pub trait Introspectable {
    // read VCPU registers
    fn read_registers(&self, _vcpu: u16) -> Result<Registers, Box<dyn Error>> { unimplemented!(); }
    ...
}

Our naïve implements used Box<dyn Error>, but this is not the best approach, as I understand, so I had a couple of solutions to implement proper error handling:

Solution 1 - returning an Error trait

Basically declare an Error trait, deriving from std::error::Error, to keep a good level of abstraction from the API and the drivers implementation.

pub trait DriverError: Error {}

Update our Introspectable trait functions definition:

pub trait Introspectable {
	fn read_registers(&self, _vcpu: u16) -> Result<Registers, Box<dyn DriverError>> { unimplemented!(); }
}

Implement a custom error in each driver, based on thiserror, and implement the from trait to convert them to the Box<dyn DriverError> type:

// defining custom Xen driver error
[derive(thiserror::Error, Debug)]
pub enum XenDriverError {
    #[error("no pending event channel ports")]
    NoPendingChannel,
    #[error(transparent)]
    TryFromIntError(#[from] TryFromIntError),
    #[error(transparent)]
    IoError(#[from] IoError),
    #[error(transparent)]
    XcError(#[from] XcError),
    #[error(transparent)]
    NixError(#[from] nix::Error),
    #[error(transparent)]
    ForeignMemoryError(#[from] XenForeignMemoryError),
}

impl DriverError for XenDriverError {}

impl From<XenDriverError> for Box<dyn DriverError> {
    fn from(err: XenDriverError) -> Box<dyn DriverError> {
        Box::new(err)
    }
}

impl Introspectable for Xen {

	fn read_registers(&self, vcpu: u16) -> Result<Registers, Box<dyn DriverError>> {
		...
		   .map_err(XenDriverError::from)?;
	}
	
}

Solution 2 - Returning an Enum error

I also watched this talk about error handling:

"RustConf 2020 - Error handling Isn't All About Errors by Jane Lusby" by @yaahc

"We don't know of users will handle our errors, ... , so we need error type that are maximaly flexible, ... and we want our errors to be an enum, so they can be reacted to easily"

Therefore I implement this solution too:

#[derive(thiserror::Error, Debug)]
pub enum DriverError {
    #[cfg(feature = "xen")]
    #[error(transparent)]
    Xen(#[from] XenDriverError),
}

pub trait Introspectable {
	fn read_registers(&self, _vcpu: u16) -> Result<Registers, DriverError> { unimplemented!(); }
}

And in the Xen driver:

#[derive(thiserror::Error, Debug)]
pub enum XenDriverError {
    #[error("no pending event channel ports")]
    NoPendingChannel,
    #[error(transparent)]
    TryFromIntError(#[from] TryFromIntError),
    #[error(transparent)]
    IoError(#[from] IoError),
    #[error(transparent)]
    XcError(#[from] XcError),
    #[error(transparent)]
    NixError(#[from] nix::Error),
    #[error(transparent)]
    ForeignMemoryError(#[from] XenForeignMemoryError),
}

impl Introspectable for Xen {
	
	fn read_registers(&self, vcpu: u16) -> Result<Registers, DriverError> {
		...
		   .map_err(|err| DriverError::from(XenDriverError::NoPendingChannel(err)))?
	}
}

So I'm confused about which way to go, and I'm looking for guidance about what is expected from a library, in terms of errors types, in 2021 rust ?

  • Error enum, because it is well defined, but the downside is that the DriverError is composed of multiple kind of driver specific error, so we tend to break the abstraction here ? (just my feeling)
  • Error trait, which keeps a good level of abstraction from the driver implementation, but at the same time, what about the user error handling ?
  • What about API stability ? as I understand in the case of enum Errors, using #[non_exhaustive] does the trick
  • last point: performance: dynamic downcasting has a bit of a runtime cost, and i would like to avoid that if possible.

Thank you for your help !

@Wenzel my first instinct here would be to add a generic associated type to your Introspectable trait

pub trait Introspectable {
    type Error;
	fn read_registers(&self, _vcpu: u16) -> Result<Registers, Self::Error> { unimplemented!(); }
}
2 Likes

Definitely do not lump together all possible kinds of errors in the same enum – this not only makes the abstraction leaky, it also makes it impossible to introduce new error types for future 3rd-party backends.

Typically, I think case-based error handling is overrated. Most of the time, there's nothing meaningful to do about an error except to propagate it or display it or panic. So I would say that most libraries could reasonably get away with creating a struct Error { message: String }.

Don't make your custom Box<dyn ErrorTrait>, though, unless needed for something else. The charm of std::error::Error is exactly that it is standard, you don't have to create it. If you do, then you get no such short-term writing ergonomics benefit from a boxed trait object, and it will be harder to use for consumers, because supertrait coercions don't yet exist, so unless you explicitly provide a conversion to &dyn std::error::Error, they won't be able to use it in many cases where one is expected.

In some less frequent cases, an error contains more machine-readable information other than just "thing X went wrong". In these situations, and if the error is designed to be recoverable, then it would be wise to make it into an enum indeed. However, it should be local to the backend, then, as illustrated above.

1 Like

Thanks for your replies.
@yaahc I like the associated error type solution, I think we will try this road.

@H2CO3
Definitely do not lump together all possible kinds of errors in the same enum – this not only makes the abstraction leaky, it also makes it impossible to introduce new error types for future 3rd-party backends.

Were your referring to the DriverError enum type ?

#[derive(thiserror::Error, Debug)]
pub enum DriverError {
    #[cfg(feature = "xen")]
    #[error(transparent)]
    Xen(#[from] XenDriverError),
}

I understand that it leaks the abstraction by revealing all the driver error types, but I don't understand how it would prevent adding new error types in the future ?
could you be more specific ?
Adding a Hyper-V error kind would be as simple as

#[derive(thiserror::Error, Debug)]
pub enum DriverError {
    #[cfg(feature = "xen")]
    #[error(transparent)]
    Xen(#[from] XenDriverError),
    #[cfg(feature = "hyperv")]
    #[error(transparent)]
    HyperV(#[from] HyperVDriverError),
}

?

If someone else were to implement a 3rd-party renderer, they couldn't add their own variant to the DriverError enum.

1 Like

@yaahc coming back to your proposed solution,
it's actually impossible to implement for us, because we have an init function that is supposed to instantiate a given driver and return it as Box<dyn Introspectable>.

let mut drv: Box<dyn Introspectable> = microvmi::init(domain_name, DriverType::Xen, init_option);

But it can return any kind of driver, so we cannot specify the concrete error type in the function signature ??

So I'm confused about how to update my function signature here:

   Compiling microvmi v0.1.10 (/home/wenzel/Projets/rust/libmicrovmi)
error[E0191]: the value of the associated type `DriverError` (from trait `Introspectable`) must be specified
   --> src/lib.rs:30:14
    |
30  | ) -> Box<dyn Introspectable> {
    |              ^^^^^^^^^^^^^^ help: specify the associated type: `Introspectable<DriverError = Type>`
    | 
   ::: src/api.rs:180:5
    |
180 |     type DriverError;
    |     ----------------- `DriverError` defined here

Any tips ?

Thanks.

Would that DriverType::Xen imply that it will be a XenDriverError that comes back? I'm not sure exactly what your constraints are but if you're okay with adding generic parameters to init then you could do something like this:

trait DriverType {
    type Impl;
}

struct Xen;

impl DriverType for Xen {
    type Impl = <the Introspectable type for Xen>;
}

fn init<D>(domain_name: DomainName, driver_type: D,  init_option: InitOption) -> Box<dyn Introspectable<DriverError = D::Impl::DriverError>>
where
    D: DriverType,
{ // ... }

The idea is that you need some way to derive the error type from one of the inputs. I did this by changing driver_type from an enum to a trait, where the trait is just a marker that ties back to the implementation that gets used for that driver type. Then in the return type you can reference an associated type of the input to propagate the error type.

In practice, there's no reason you'd need to even have an argument here, since it's a zst here, you really only want to specify the type, so if you removed that you could still call init like this let mut drv = microvmi::init::<Xen>(domain_name, init_option);

1 Like

Thanks for your reply !

I'm not sure exactly what your constraints

I think i need to give you more details about this init function.

The desired function signature is the following:

pub fn init(domain_name: &str, driver_type: Option<DriverType>, init_option: Option<DriverInitParam>) -> Result<Box<dyn Introspectable>, MicrovmiError>

As you can see, the DriverType enum is optional, in order to enable 2 behavior:

  • Some(): where we return a Box<dyn Introspectable> of the requested driver
  • None: in which case we will iterate over all the fields in the enums, and try to instantiate the driver, to return the first successful one. (not that this is not implemented yet)

So your proposed solution would be interesting for the first case, but for the second case, i'm not sure how to handle it with generic types here.

Now I'm wondering if it makes sense to have a unique init function, handling both cases, or you think it will be easier (in terms of implementation) to offer 2 distinct function to probe for available drivers, and another one to initialize a specific driver.

However, I like the idea of having this unique entrypoint into the library, as it's simplifying the API for the end user.

Thank you very much.

I don't think that if you do it with a unique function or two distinct ones that it makes a huge difference in terms of how much work it is to implement. I think the biggest difference is how much flexibility and convenience you want in the final interface. Thinking through everything you've told me I think the best idea for you would be to provide a builder style API. That will make it much easier to support the complex setup and driver specific behavior, regardless of whether or not you do it with traits and the type system or enums and runtime checks.

https://rust-lang.github.io/api-guidelines/type-safety.html?highlight=builder#builders-enable-construction-of-complex-values-c-builder

This should be a good resource for how to design builder APIs if you're not already familiar. Going back to your original comment I think having an enum that wraps all the other possible error enums is fine.

I don't think this is bad in a general sense but I think its also perfectly reasonable if you want to avoid this API, it would make matching on specific errors a bit more annoying. Not much tbh, Rust pattern matching syntax is good at this kinda thing.

That said, if you want to return an Introspectable that only returns the exact error type for the actual drive type then you'd want to add a default type parameter to the builder and add an extra drive type that abstracts over the others and returns your DriverError enum which is then set as the default for new builders. As far as I can tell it isn't possible to dynamically pick the drive kind but statically return the correct type. Though you could probably add some cool downcast methods that will potentially cast the return type to the exact one on the trait object if the type behind the trait actually implements that exact version of Introspectable.

Let me know if you want me to sketch out what I mean, I'm worried I didn't explain this very well...

1 Like

I ended up getting nerdsnipped and implemented an example of what I mean.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=0b6ef3b91cfeaa75a74278f297044930

One thing to note, I did end up adding a Send + Sync + 'static bound to Introspectable to help deal with some lifetime issues that otherwise pop up in init. If you need to support trait objects that aren't thread safe or contain lifetimes that can be supported without too much additional effort.

1 Like

Hi,

I'm working together with @Wenzel on this library. First of all, @yaahc thank you for taking the time to put together such a detailed example. I've played around a little with your suggestion and I really like the idea of using a builder pattern. However, there are some things that are not clear to me right away. For example, how exactly did you envision the AnyDriver? I suppose it has to wrap all the other concrete driver types, right? That's why I have created an AnyDriver enum and implemented the interface by doing pattern matching on self. I don't particularly like the fact that this creates another layer of indirection though. Was this the same route that you wanted to take or did you have something entirely different in mind?

1 Like

That's pretty much what I was imagining, yea. You need some sort of way to handle all the runtime cases in the type system. An enum is what I imagined though you might be able to instead use boxed trait objects. The only problem is each of the underlying Driver types have different error types, so you'd need some sort of adapter still to map the error type to the enum version so the different implementations have the same type when boxed.