Why does my Rust file reading and zipping logic feel so long compared to examples I’ve seen?

I’ve been playing with some file-related functionality in Rust, and I noticed that my way of reading files and preparing output feels really long, even for simple tasks. I use std::fs::read_to_string, or sometimes BufReader if I need line-by-line reading. I wrap almost everything in functions that return Result<_, std::io::Error> and match errors inside the function instead of just using ?. I do this to print custom error messages depending on what went wrong — like file not found, or permission denied. It works fine, but when I compare it to other examples people post, they often just use ? in main() and let it crash or bubble the error with minimal handling. Is it really better in practice to avoid all this extra error matching?

When I'm reading multiple files in a loop, I also check if the file path is valid, readable, and sometimes try to clean the input manually before opening it. I made helper functions like read_file, validate_file, and clean_path just to break things up. I like the control, but I’m worried that I’m making it too complex for what Rust expects. Does anyone here avoid this kind of splitting and just keep everything inline in a for loop with some ?? I'd really like to understand how other users approach this.

One part that actually made me think more about how to simplify was when I added ZIP creation using the zip crate. I got the idea from using a free file zipping website that lets you drop multiple files and get one ZIP file in return. It felt very natural to do something like that locally in Rust. I read each file into memory, then added it to the ZIP with ZipWriter::start_file and write_all. That process worked better than expected, and I was happy with it. It also made me wonder if I'm overthinking other parts of the code.

My question is mainly about structure. Is there a preferred way in Rust to deal with things like file I/O, zipping, and error messages without making everything look like a full-blown error-handling system? Do people usually go minimal and let the system print generic errors, or is it common to match on every possible ErrorKind manually? And when it comes to organizing file logic, do you separate your reading, checking, and writing into separate functions even for short tools, or just do it inline? I’d like to see how others approach this when you're reading several files, transforming or bundling them, and writing some kind of result.

Also, is it considered weird in the Rust community to avoid things like anyhow or thiserror when I’m just writing simple tools and want to stick with std::io::Error and basic matching? I keep it plain on purpose, but I wonder if I’m just doing more work for no real benefit. Would really appreciate if others could share how they handle this kind of situation.

  • Example code is often just propagating the error because the question of exactly what to do with the error is irrelevant to what the example is teaching. ? is sufficient to indicate “an error could happen here”.

  • Different kinds of programs need different kinds of error handling — printing, in particular, is completely inadequate in a GUI program where the user will not see what was printed.

  • Often, ? is the right thing to do with an error because doing anything else with the error is the responsibility of the caller.

  • Also, a lot of production code really is just written with a complete lack of interesting error handling. This can be just sloppiness, but also, there are a lot of cases — servers, in particular — where the program is meant to run unattended and if it encounters any errors, the solution looks more like “restart the server/machine” or “debug the bug that led to this error” than responding to the specific error with a specific action.

Checking whether a file path is valid and readable before opening it is bad practice: the filesystem can change state between the “check” and the “open”, so the check itself creates bugs. What you should do instead is just open the file, and then examine the error you get back if it fails. That error can tell you if the file is not readable, so there is no need for a separate, inaccurate readability check first.
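As a rough sketch of "open first, then look at the error": the helper names `read_input` and `describe` below are made up for illustration, and the message wording is only an assumption about what you might want to print.

```rust
use std::fs::File;
use std::io::{self, ErrorKind, Read};
use std::path::Path;

/// Hypothetical helper: reads a file without any check beforehand and
/// turns a failure into a short description.
fn read_input(path: &Path) -> Result<String, String> {
    // Attempt the open directly; the OS reports why it failed, if it did.
    let mut file = File::open(path).map_err(|error| describe(path, &error))?;
    let mut contents = String::new();
    file.read_to_string(&mut contents)
        .map_err(|error| describe(path, &error))?;
    Ok(contents)
}

/// Hypothetical helper: maps the common error kinds to a custom message.
fn describe(path: &Path, error: &io::Error) -> String {
    let reason = match error.kind() {
        ErrorKind::NotFound => "file not found",
        ErrorKind::PermissionDenied => "permission denied",
        _ => "could not be read",
    };
    format!("{}: {} ({})", path.display(), reason, error)
}
```

The `Result<String, String>` return type is only there to keep the sketch short; a real tool would more likely return a proper error type.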

I don’t know what you might mean by “clean the input”.

It is good to add details like “what file did the error happen to”. However, from the way you describe it, it sounds like you are doing this in an extremely verbose way. You can, and should, keep the error context code very short by using wrapping error types and .map_err():

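use std::{fs::File, io, path::PathBuf};

enum MyError {
    OpenInputFile {
        path: PathBuf,
        error: io::Error,
    },
}

...

for path in paths {
    // Attach the path to the error so the message can say which file failed.
    let file = File::open(&path)
        .map_err(|error| MyError::OpenInputFile { path, error })?;
    ...
}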

We’re still using ?, but we're making sure there is sufficient context included by mentioning the path (and perhaps also what was being attempted, but that description would be more application-specific, so I didn't include one). If just adding the paths and operations is sufficient, then the fs-err library can help with that, but the benefit of defining your own errors (or using anyhow) is that you can also report non-IO errors using the same error type.
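For reference, here is one way the fragment above could be filled out into something that compiles with only std. The `ReadInputFile` variant, the `read_all` function, and the message wording are my own additions for illustration, not anything prescribed.

```rust
use std::error::Error;
use std::fmt;
use std::fs::File;
use std::io::{self, Read};
use std::path::PathBuf;

#[derive(Debug)]
enum MyError {
    OpenInputFile { path: PathBuf, error: io::Error },
    // Hypothetical second variant, to show one type covering several steps.
    ReadInputFile { path: PathBuf, error: io::Error },
}

impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MyError::OpenInputFile { path, .. } => {
                write!(f, "failed to open input file {}", path.display())
            }
            MyError::ReadInputFile { path, .. } => {
                write!(f, "failed to read input file {}", path.display())
            }
        }
    }
}

impl Error for MyError {
    // Expose the underlying io::Error so the caller can print the cause.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            MyError::OpenInputFile { error, .. } | MyError::ReadInputFile { error, .. } => {
                Some(error)
            }
        }
    }
}

/// Hypothetical function: reads every file, adding the path to any error.
fn read_all(paths: Vec<PathBuf>) -> Result<Vec<String>, MyError> {
    let mut contents = Vec::new();
    for path in paths {
        let mut file = File::open(&path).map_err(|error| MyError::OpenInputFile {
            path: path.clone(),
            error,
        })?;
        let mut text = String::new();
        file.read_to_string(&mut text)
            .map_err(|error| MyError::ReadInputFile { path, error })?;
        contents.push(text);
    }
    Ok(contents)
}
```

Note that the Display impl only describes what was being attempted; the io::Error itself is reachable through source(), so whoever prints the error decides whether to show it.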

In most cases of error reporting (that is, the error is shown to the user, not reacted to by the program itself), the thing that is most valuable to do is adding context, not customizing how the ErrorKind is printed.

There is no one answer here — it depends on the kind of code you are writing.

  • If you are writing a library, then you should be defining your own error types that tightly specify what kinds of errors can arise from each of your library functions. You might or might not use thiserror to help you define those error types, but the important thing is how those types are defined, not what tools you use to do it.
  • If you are writing an application, particularly the sort you are talking about, where you operate on user-provided files and report errors directly to the user, then you should be adding context to almost all of your errors, and there is less value in having precise types for errors. What anyhow is good at is making it easy to add ad-hoc context and propagate untyped errors.

I should also mention snafu, which is an alternative aimed at helping with both of those tasks; I have not yet tried using it myself.

But, in any of these cases, you should generally keep your error printing separate from the code that is performing the actual work. That code should add context to the error and return it, and only the top-most levels of your program should be concerned with how errors are printed.
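A minimal std-only sketch of that separation, assuming a made-up task (summing the sizes of some text files) and the hypothetical names `total_bytes` and `report`:

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::process::ExitCode;

/// Does the work and returns errors with context; it prints nothing.
fn total_bytes(paths: &[PathBuf]) -> Result<usize, io::Error> {
    let mut total = 0;
    for path in paths {
        let text = fs::read_to_string(path).map_err(|error| {
            // Keep the original kind, but record which file was involved.
            io::Error::new(error.kind(), format!("{}: {error}", path.display()))
        })?;
        total += text.len();
    }
    Ok(total)
}

/// The only place that decides how results and errors are shown.
fn report(paths: &[PathBuf]) -> ExitCode {
    match total_bytes(paths) {
        Ok(total) => {
            println!("{total} bytes read");
            ExitCode::SUCCESS
        }
        Err(error) => {
            eprintln!("error: {error}");
            ExitCode::FAILURE
        }
    }
}
```

A `fn main() -> ExitCode` would then just collect the paths (for example from std::env::args) and return `report(&paths)`. Swapping the printing for a dialog box or a log line only touches `report`.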

The reasons why these libraries exist and why std is not already all you need are:

  • std doesn't yet have anything for printing error source() chains, which is absolutely essential for your kind of program.
  • std doesn’t provide any generic “with added context” error wrapper type like anyhow does.

These are both easy enough to write yourself, without very much code, but a lot of people would still rather use a library than rewrite those things in every program.
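To give an idea of how little code that is, here is a sketch of both pieces using only std. `WithContext`, `AddContext`, `add_context`, and `format_chain` are names I invented for this example; they are not from std or any crate, and the output layout is just one possible choice.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical wrapper: a message plus the error that caused it.
#[derive(Debug)]
struct WithContext {
    context: String,
    source: Box<dyn Error>,
}

impl fmt::Display for WithContext {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.context)
    }
}

impl Error for WithContext {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.source)
    }
}

/// Hypothetical extension trait so any `Result` can gain context in one call.
trait AddContext<T> {
    fn add_context(self, message: impl Into<String>) -> Result<T, WithContext>;
}

impl<T, E: Error + 'static> AddContext<T> for Result<T, E> {
    fn add_context(self, message: impl Into<String>) -> Result<T, WithContext> {
        self.map_err(|error| WithContext {
            context: message.into(),
            source: Box::new(error),
        })
    }
}

/// Formats an error followed by every error in its `source()` chain.
fn format_chain(error: &dyn Error) -> String {
    let mut output = error.to_string();
    let mut current = error.source();
    while let Some(cause) = current {
        output.push_str("\n  caused by: ");
        output.push_str(&cause.to_string());
        current = cause.source();
    }
    output
}
```

With this in place, something like `File::open(&path).add_context("failed to open input")?` stays a one-liner, and the top level of the program can print `format_chain(&error)`.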

Also worth mentioning is eyre/color-eyre. It is an expanded version of anyhow, with colors, but also the ability to add custom sections, and suggestions.

I found this useful in a project where I used an embedded scripting language, allowing me to attach the stack trace from the user-provided script. Not needed for most projects, but when you need it, it is great.

However, anyhow-style libraries do have a downside. I have found that I often need to manually handle one or two errors specially in my program (out of the entire program), and since everything is type-erased with anyhow, that doesn't work. As a result, you end up with one place where you use an error enum of the interesting cases plus an Other(anyhow::Error) variant. Not ideal, but it mostly works.
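Roughly, the shape I mean looks like the following. To keep the sketch std-only I used `Box<dyn Error + Send + Sync>` as a stand-in for anyhow::Error, and `AppError`, `MissingInput`, and `load_or_default` are hypothetical names for illustration.

```rust
use std::error::Error;
use std::fmt;
use std::fs;
use std::io;

#[derive(Debug)]
enum AppError {
    /// A case the program reacts to specifically.
    MissingInput(io::Error),
    /// Everything else, type-erased; stands in for `Other(anyhow::Error)`.
    Other(Box<dyn Error + Send + Sync>),
}

impl From<io::Error> for AppError {
    fn from(error: io::Error) -> Self {
        // Only the interesting case keeps its own variant.
        if error.kind() == io::ErrorKind::NotFound {
            AppError::MissingInput(error)
        } else {
            AppError::Other(Box::new(error))
        }
    }
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::MissingInput(error) => write!(f, "missing input: {error}"),
            AppError::Other(error) => write!(f, "{error}"),
        }
    }
}

impl Error for AppError {}

/// Hypothetical caller: recovers from the one special case and passes
/// every other error along untouched.
fn load_or_default(path: &str) -> Result<String, AppError> {
    match fs::read_to_string(path).map_err(AppError::from) {
        Err(AppError::MissingInput(_)) => Ok(String::new()),
        other => other,
    }
}
```

Only the code that cares about the special case ever matches on the enum; everything else keeps treating errors as opaque.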