Pattern for unwrap_or with debug assertion

I often find myself in situations where there is an Option or Result that in "normal execution" should be safe to unwrap, but for which certain inputs could cause a None or Err variant. The situations include:

  • Deserializing machine-readable values (Bincode/Postcard). Normally, I expect data being deserialized to be valid bytes from the serialize function, but if my API doesn't know for sure that the bytes are trusted, they could potentially be invalid or malicious.
  • Fuzz-testing an API that processes byte inputs that are typically valid but could be invalid (say, an API that takes a byte slice and wants to make it into an str).

To handle such situations, my choices are typically:

  1. Just call unwrap. Not great because it may make it trivial to cause a program to crash in production.
  2. Bubble the error case up through return values. Not great because it makes users of an API think about how to handle a case that may be distant from what they're doing and should not happen in "normal" situations.
  3. Mark the function unsafe. Not great because unsafe should be used judiciously, and it may be difficult for the caller to reason about the safety invariant.
  4. Check the invariant elsewhere in the code where errors are already being propagated. Not great because users need to pay for validation cost, even though normal execution should never hit these situations.
  5. Call unwrap_or and provide some reasonable default fallback value, like an empty string. Not great because it may obscure actual bugs in code.

A pattern I've found that seems to be the "lesser evil" is a variation of 1 and 5: call unwrap in debug mode, and unwrap_or in release mode. This means that during unit testing, we are more likely to find actual bugs when they exist, but in release mode, we call back to "GIGO" behavior.

I find myself writing this a lot:

debug_assert!(result.is_ok());
let value = result.unwrap_or("fallback");

It would be convenient if the standard library offered a function like debug_unwrap_or that did this behavior.

What do people think? What's the most Rusty way to handle this? Should we make a proposal to add debug_unwrap_or to the standard library?

The Rusty way is to bubble up errors using the ? operator.

That is precisely what Result is for. The Err branch must account for all errors you can deal with (other than cosmic rays I guess). If the user cares, then the user can wrap your Err with their own an add their own handling. If they don't, well they YOLO it and call unwrap. But a library writer must not do it - either return an Ok signalling it's okay or Err signalling its broken - in whatever way it may be.

2 Likes

(2) probably, or maybe (4). Result, ?, and so on are there to encourage you to think about and acknowledge the unhappy path. Not handling the non-"normal" situations and not validating inputs is the source of a great many bugs and security vulnerabilities. These situations need to be handled correctly, even if it's not the funnest part to tackle or the part of the program which captures most of your concentration.

If it's 100% you won't have an unhappy path, you should just have a T, not a Result<T, _>. That you acknowledge the panics find bugs in testing demonstrates that it's not 100%. Any publicly exposed API (say) will be subject to a great number of non-"normal" interactions.

2 Likes

Let's focus on the deserialization example. I have a struct that has invariants that are expensive to verify. Here's what happens with each option:

(1) Code crashes if I am passed invalid bytes.
(2) The getter functions on the struct become fallible, when they shouldn't be because the struct has invariants that are upheld in every case except for the deserialization constructor.
(3) User needs to reason about safety invariants and may be error-prone.
(4) Code gets slower for everyone.
(5) Possibly GIGO behavior.

The reason I like (5) is because often there is a default fallback value. However, if I hit that path during debugging, I may want to know about it. This is what leads me to the debug_unwrap_or suggestion.

The deserialization should fail when the invariants are not upheld. This is basically a better version of #4. "Make invalid state unrepresentable".

Are you using serde? I believe it's not too difficult to replace the derived Deserialize with the expanded version, where you can then add relevant cross-field validations.

1 Like

I agree, failing in the serde::Deserialize impl is the best option, when checking the invariant is cheap. The problem is that checking the invariant is not always cheap; it could either require memory allocations or else be a polynomial-time operation. Clients understandably don't want to pay that cost.

In general I'm trying to reduce the number of invariants or make the invariants cheap to verify, but it's not always possible to do that.

In that case you can have two versions - one version which panics on wrong input while another that returns a Result.
It is a bit like the slice::get which is functionally same as the index operator but the former is failable while the later panics on out-of-index access.

If you want this for yourself, you can make a trait that adds debug_unwrap_or to all result types.

However, I don't think release-mode On Error Resume Next fits Rust. Rust tends to be strict about error handling, and something either is an error, and can be relied to fail, or isn't. The only exception to this in Rust is handling of arithmetic overflow, but that's only a compromise, because of how big performance impact universal overflow checks would have.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.