My new macro crate, throwing

Svizel_pritula · September 14, 2023, 5:44pm

I'm looking for feedback on my new crate, throwing.

The source code is available here: GitHub - SvizelPritula/throwing

It defines a #[throws] macro that allows you to declare what errors a function might return, similarly to Java's throws keyword. It generates an enum under the hood.

Example usage:

use std::io::{self, stdout, Write};
use serde::Deserialize;
use throwing::throws;

#[derive(Clone, Deserialize)]
struct Summary {
    extract: String,
}

#[throws(reqwest::Error | serde_json::Error)]
fn fetch_extract() -> String {
    let url = "https://en.wikipedia.org/api/rest_v1/page/summary/Rabbit";
    let response = reqwest::blocking::get(url)?;

    let summary = response.text()?;
    let summary: Summary = serde_json::from_str(&summary)?;

    Ok(summary.extract)
}

#[throws(reqwest::Error | serde_json::Error | io::Error | break FetchExtractError)]
fn main() {
    let extract = fetch_extract()?;
    writeln!(stdout(), "{extract}")?;

    Ok(())
}
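
For anyone wondering what "generates an enum under the hood" might amount to, here is a rough hand-written sketch using only the standard library. It is not the crate's actual expansion, which may differ in naming and detail; ReadNumberError and read_number are hypothetical names chosen to mirror the "function name + Error" pattern discussed in this thread.

```rust
use std::fmt;
use std::io;
use std::num::ParseIntError;

// Hypothetical error enum, written by hand; one variant per listed error type.
#[derive(Debug)]
pub enum ReadNumberError {
    Io(io::Error),
    ParseInt(ParseIntError),
}

impl fmt::Display for ReadNumberError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Delegate to the wrapped error's own message.
        match self {
            Self::Io(e) => write!(f, "{e}"),
            Self::ParseInt(e) => write!(f, "{e}"),
        }
    }
}

impl std::error::Error for ReadNumberError {}

// These From impls are what let `?` convert each inner error automatically.
impl From<io::Error> for ReadNumberError {
    fn from(e: io::Error) -> Self {
        Self::Io(e)
    }
}

impl From<ParseIntError> for ReadNumberError {
    fn from(e: ParseIntError) -> Self {
        Self::ParseInt(e)
    }
}

// Roughly what a function annotated with
// #[throws(io::Error | ParseIntError)] and returning u32 could stand for.
pub fn read_number(input: &str) -> Result<u32, ReadNumberError> {
    if input.is_empty() {
        let err = io::Error::new(io::ErrorKind::UnexpectedEof, "empty input");
        return Err(err.into());
    }
    Ok(input.trim().parse::<u32>()?)
}
```

The point of the sketch is only that the caller gets a concrete, matchable type instead of an opaque boxed error.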

ScratchCat458 · September 15, 2023, 12:34am

This library fits an odd use case but works decently well for it.

One thing that I'm quite happy with is that the function signature is shown correctly on rustdoc:

use std::{fs, io};

use throwing::throws;

#[throws(io::Error)]
pub fn get_file() -> String {
    let content = fs::read_to_string("Cargo.toml")?;
    Ok(content)
}

There are two primary cases when errors need to be worried about, application code and library code.
In application code, we use functions to encapsulate sections of logic that need to be reused, and in most cases we just want to keep propagating errors with ? until they reach fn main. For this case, libraries like eyre and anyhow are generally what you would want: essentially better, more effective versions of Result<T, Box<dyn Error>>.
In library code, the intent is to model everything that can go wrong using types and state. Generally, libraries will either have a single error type or split it into smaller error types based on functions/methods with similar functionality. For this, I like using thiserror to help implement fmt::Display and From conversions.
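
To make the application-code case concrete, here is a minimal sketch of the plain Result<T, Box<dyn Error>> style that anyhow and eyre improve on. The function parse_port is a hypothetical example, not something from either crate.

```rust
use std::error::Error;

// Application-style propagation: every error is boxed and passed up with `?`.
// The signature no longer says which errors can occur.
fn parse_port(text: &str) -> Result<u16, Box<dyn Error>> {
    // ParseIntError is converted into Box<dyn Error> by `?`.
    let port: u16 = text.trim().parse()?;
    if port == 0 {
        // A plain string message can be boxed as an error too.
        return Err("port must be non-zero".into());
    }
    Ok(port)
}
```

A caller that wants to react to a specific failure has to downcast, for example with err.downcast_ref::<std::num::ParseIntError>(), which is the kind of guesswork a dedicated error enum avoids.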

In throwing, a new enum is created for every function definition, even for functions that produce the same errors. This causes further issues because the error types cannot have docs added to them unless a type alias is created or the macro is expanded prior to compilation, which defeats the purpose of a proc macro. In application code this doesn't provide much benefit other than an explicit declaration of the types of errors that can occur, but that can also be done by creating a single application error type that aggregates all of them. For library developers using crates built with this, who may be creating a single global error type, this further complicates the process of aggregation.

As someone who started learning software development with Java before using Rust (and never looking back), I like the concept of bringing this over. However, it should be noted that with Java throws, the pattern causes the list of exceptions to cascade upward, where the possible exceptions of functions used in the body of another are a subset of the parent function's exceptions and you can see all unhandled exceptions that may occur at the top level function. Unfortunately without this cascading property, the concept becomes much less useful.

Still, you have done quite good work on this and I'm interested to see how it develops!

H2CO3 · September 15, 2023, 6:05am

While we are at the question of generating enums:

Naming the enum simply after the function, eg. GetFileError, is Bad^TM, because it can easily cause name collisions with perfectly legitimate user-defined errors. I mean, GetFileError is a wholly conceivable error type name; I would be pretty upset if a library silently generated that name and caused my code to fail compilation with a mysterious error message, which is hard to debug because the culprit is hidden behind a macro.

Adding a bit of name mangling (eg. leading underscores) should improve matters. By the way, it is not clear to me based on the documentation whether the name of the generated enum can be specified, or only the names of the individual variants.

However, doing this the "right" way – via name mangling – results in another problem: the errors will be impossible to match on reliably! This becomes an issue when active handling of the errors is desired instead of simple propagation.

All in all, I would advise you against trying to force the style of other languages in Rust when it comes to error handling. The Result system is pretty well thought-out and mature, and there are established libraries for streamlining the creation of custom error types, such as thiserror. Visible, nameable, matchable error types should usually be preferred.

SkiFire13 · September 15, 2023, 6:19am

The fact you have to specify break FetchExtractError in order to allow the conversion between them is a deal breaker for me.

Svizel_pritula · September 15, 2023, 9:45am

This crate was mainly created as an experiment. I've found that current error handling solutions for Rust often provide great ergonomics for passing errors around and logging them, but fall short when it comes to handling them. For example, you might want to display an error message to a user, but if you don't know all possible errors, you're forced to display something like "Unknown error" for some or most of them. Today's crates also often have quite opaque error types, so you usually can't do better than "Something went wrong with the database." or "There was some syntax error in your JSON, somewhere".
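
As an illustration of the "handling" side, here is a small sketch of what knowing every possible error buys you: an exhaustive match that produces a specific user-facing message for each case. LoadConfigError and user_message are hypothetical names made up for this example.

```rust
use std::io;
use std::num::ParseIntError;

// Hypothetical error type listing every way loading a config can fail.
#[derive(Debug)]
enum LoadConfigError {
    Io(io::Error),
    ParseInt(ParseIntError),
}

// Because all variants are known, no "Unknown error" fallback arm is needed.
fn user_message(err: &LoadConfigError) -> String {
    match err {
        LoadConfigError::Io(e) if e.kind() == io::ErrorKind::NotFound => {
            "The config file does not exist.".to_string()
        }
        LoadConfigError::Io(e) => format!("Could not read the config file: {e}"),
        LoadConfigError::ParseInt(e) => {
            format!("The port in the config file is not a valid number: {e}")
        }
    }
}
```

If a new variant is added later, the compiler flags this match as non-exhaustive, so the message table cannot silently fall out of date.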

Errors can cascade upward. Due to Rust limitations, you have to explicitly list the names of all error enums of functions you call with the break keyword, but if you do so, the generated code will match on them and "rethrow" each variant:

#[throws(FooError)]
fn first() {
    foo()?;
    Ok(())
}

#[throws(FooError | BarError | break FirstError)]
fn second() {
    first()?;
    bar()?;
    Ok(())
}

#[throws(FooError | BarError | BazError | break SecondError)]
fn third() {
    second()?;
    baz()?;
    Ok(())
}
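
For readers who want to see the "rethrow" idea without the macro, here is a hand-written sketch of the first two functions. It assumes FooError and BarError are simple unit structs and that the enums are named FirstError and SecondError; the code the crate really generates may look different.

```rust
#[derive(Debug, PartialEq)]
struct FooError;
#[derive(Debug, PartialEq)]
struct BarError;

#[derive(Debug, PartialEq)]
enum FirstError {
    Foo(FooError),
}

#[derive(Debug, PartialEq)]
enum SecondError {
    Foo(FooError),
    Bar(BarError),
}

impl From<FooError> for FirstError {
    fn from(e: FooError) -> Self {
        FirstError::Foo(e)
    }
}

impl From<FooError> for SecondError {
    fn from(e: FooError) -> Self {
        SecondError::Foo(e)
    }
}

impl From<BarError> for SecondError {
    fn from(e: BarError) -> Self {
        SecondError::Bar(e)
    }
}

// The part that `break FirstError` asks for: match on every variant of the
// inner enum and map it to the matching variant of the outer one.
impl From<FirstError> for SecondError {
    fn from(e: FirstError) -> Self {
        match e {
            FirstError::Foo(inner) => SecondError::Foo(inner),
        }
    }
}

// Stand-ins for the foo() and bar() calls in the example above.
fn foo() -> Result<(), FooError> {
    Err(FooError)
}

fn bar() -> Result<(), BarError> {
    Ok(())
}

fn first() -> Result<(), FirstError> {
    foo()?;
    Ok(())
}

fn second() -> Result<(), SecondError> {
    first()?; // FirstError::Foo is flattened into SecondError::Foo
    bar()?;
    Ok(())
}
```

The flattening is why the outer enum has to list FooError itself: the inner enum is unpacked rather than nested as a single variant.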

Rustdoc is a good point, I'll probably make it possible to attach arbitrary attributes and documentation to the generated errors.

Svizel_pritula · September 15, 2023, 9:50am

The errors are given nice names since they aren't hidden: they will appear in the function's signature and can be matched on. In fact, if you never match on any of the errors, there is no reason to use this crate over anyhow. Naming specific errors after the function that throws them isn't uncommon in the Rust ecosystem. As shown in the docs, it is possible to specify a name explicitly using #[throws(type ErrorName = Variants)].

Svizel_pritula · September 15, 2023, 9:52am

I'm not too happy about that either, but I don't think there is a way around it without specialization. It might be possible with some shenanigans involving rewriting the function to change all ? usages to something else.

H2CO3 · September 15, 2023, 10:57am

That's not my problem; I know the naming conventions of the ecosystem, thank you very much. The problem is that the generated enums are invisible in the source code, and anyone unintentionally declaring something else with the generated name will have a hard time figuring out why there are mysteriously duplicated items.

Svizel_pritula · September 15, 2023, 11:08am

Not really? rustc is perfectly capable of identifying and reporting naming conflicts, as well as tracing macro-generated stuff back to the relevant macro.

For example this code:

use throwing::throws;
use std::num::ParseIntError;

#[throws()]
fn parse_int() { Ok(()) }

Will yield a very readable error, as well as error squiggles under the macro with rust-analyzer:

error[E0255]: the name `ParseIntError` is defined multiple times
 --> src/main.rs:4:1
  |
2 | use std::num::ParseIntError;
  |     ----------------------- previous import of the type `ParseIntError` here
3 |
4 | #[throws()]
  | ^^^^^^^^^^^ `ParseIntError` redefined here
  |
  = note: `ParseIntError` must be defined only once in the type namespace of this module

(Other errors follow)

H2CO3 · September 15, 2023, 12:59pm

People don't expect attributes on an item to randomly influence another item, though. The macro expansion isn't shown, so the fact that the macro is underlined won't necessarily make anything clearer. If you look around on this forum, you'll see people confused by way more obvious errors.

Heliozoa · September 15, 2023, 4:11pm

I think that's just a documentation problem, as long as the crate docs make it clear this is what's happening it should be fine. The widely used derive_builder crate does the same by generating StructNameBuilder, and with the way this is highlighted as the very first thing in the docs, I don't have a problem with it: derive_builder - Rust

oooutlk · September 17, 2023, 10:49am

Your project reminds me that I published a similar crate, cex, several years ago.

#[cex]
fn fetch_extract() -> Result!( String throws reqwest::Error, serde_json::Error ) {
//omitted 
}

system · December 16, 2023, 10:49am

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.
