Thoughts on resource management and program structure



TL;DR: I am looking for a pattern or language feature that makes it easier to handle multiple I/O operations that partly build on one another.

Continuing the discussion from "Is there a good way to roll back partial io (or resource allocation) operations?"

When programming with I/O resources (where any operation may fail), I notice that I end up writing many functions like this:

fn do_some_io_operation(arguments: Arguments) -> Result<Value, Error> {
    // prepare the operation (nothing here can fail), e.g. set up crit_resource

    // do the operation itself, which might fail with -> Result<?, Error>
    match io_operation(crit_resource) {
        Err(e) => {
            // clean up whatever the preparation set up
            Err(e) // pass the original error on
        },
        Ok(v) => {
            // hand crit_resource down to subsequent operations
            match do_next_io_operation(new_arguments) {
                Ok(v) => {
                    // handle results
                    // clean up if necessary
                    Ok(something) // everything went ok, continue
                },
                Err(e) => {
                    // unwind if necessary, e.g. delete or kill crit_resource
                    Err(e) // pass the original error on
                },
            }
        },
    }
    // at this point, crit_resource is either "finished",
    // i.e. in the state it should be in at the end,
    // or it is back to what it was before entering this function
}

This structure guarantees that either every operation finished successfully or, if an error occurred, all operations were rolled back to the state before the function was entered.

In some cases the match blocks can be simplified with try!() (or the newer ? operator), but that doesn’t help much if you have to clean up on errors.
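To make that concrete, here is a minimal sketch using the ? operator (the newer spelling of try!()). The names io_operation, leaky and careful, and the simulated failure flag, are made up for illustration. The early return in leaky skips the cleanup, so careful has to fall back to explicit error handling again.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for an I/O step that can fail.
fn io_operation(dir: &Path, fail: bool) -> io::Result<()> {
    if fail {
        return Err(io::Error::new(io::ErrorKind::Other, "simulated failure"));
    }
    fs::write(dir.join("data.txt"), b"payload")
}

/// Uses `?` directly: on failure the function returns early and the
/// directory created during preparation is left behind.
fn leaky(dir: &Path, fail: bool) -> io::Result<()> {
    fs::create_dir(dir)?;
    io_operation(dir, fail)?; // early return skips any cleanup
    Ok(())
}

/// Cleans up by hand before propagating the error.
fn careful(dir: &Path, fail: bool) -> io::Result<()> {
    fs::create_dir(dir)?;
    if let Err(e) = io_operation(dir, fail) {
        // Best-effort rollback; the original error is the one reported.
        let _ = fs::remove_dir_all(dir);
        return Err(e);
    }
    Ok(())
}
```

With a failing io_operation, leaky leaves the directory on disk while careful removes it, at the cost of bringing back the branching that ? was supposed to remove.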

I usually build this so that crit_resource only lives inside one of these functions. So in addition to RAII (resource acquisition is initialization), I do something I’ll call RAIFM (resource acquisition is full resource management) here.

Take, for example, a RAIFM function

  • creating a temporary folder. It is then responsible for deleting the folder if any subsequent I/O operation fails.
  • spawning a child process. It is then responsible for killing the process if any subsequent I/O operation fails. It is also responsible for catching the exit status and parsing it.
  • creating a file. It is then responsible for closing the file, or for deleting it if any subsequent operation fails.
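The "creating a file" case could be sketched roughly like this. The name with_new_file and the closure parameter next (standing in for "all subsequent operations") are my own choices for illustration, and using create_new so that an existing file is never touched is an assumption about what rollback should mean here.

```rust
use std::fs::{self, File, OpenOptions};
use std::io;
use std::path::Path;

/// Hypothetical RAIFM-style function: owns the file for its whole lifetime.
/// `next` stands in for all subsequent operations that use the file.
fn with_new_file<T, F>(path: &Path, next: F) -> io::Result<T>
where
    F: FnOnce(&mut File) -> io::Result<T>,
{
    // Acquire the resource. `create_new` fails if the file already exists,
    // so there is nothing to roll back when this step fails.
    let mut file = OpenOptions::new().write(true).create_new(true).open(path)?;

    // Run the subsequent operations, then flush to disk on success.
    let result = next(&mut file).and_then(|value| file.sync_all().map(|_| value));

    if result.is_err() {
        // Roll back: close the handle first, then delete the file.
        // Deletion is best effort; the original error is the one reported.
        drop(file);
        let _ = fs::remove_file(path);
    }
    // On success the file is closed when `file` goes out of scope.
    result
}
```

The caller only sees the result of next, and the file either exists in its finished state or does not exist at all.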

So a RAIFM function is always the scope (and thus the lifetime) of an object representing an I/O resource. To keep the code readable, it always handles exactly one resource, never more.

I don’t like this approach, for the following reasons:

  1. I end up writing the same code over and over again.
  2. The parameter lists keep growing. If you nest many of these functions, you end up with e.g. 10 parameters. Usually each function only needs one or two of them and passes the rest through. This hurts readability and makes refactoring hard.
  3. I get many of these functions nested inside each other, so there is no easy way of seeing the bigger picture.
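A toy version of the nesting shows points 2 and 3. Everything here is hypothetical: the function names are invented, and a log of strings stands in for the real I/O so that the example stays small. Each level uses exactly one parameter and forwards all the others.

```rust
// All names here are hypothetical; the log stands in for real I/O.
type Log = Vec<String>;

// Innermost step: the only place that actually uses `payload`.
fn run_d(payload: &str, log: &mut Log) -> Result<usize, String> {
    if payload.is_empty() {
        return Err("empty payload".to_string());
    }
    log.push(format!("write {}", payload));
    Ok(payload.len())
}

// Uses `command`; `payload` is only passed through.
fn with_process_c(command: &str, payload: &str, log: &mut Log) -> Result<usize, String> {
    log.push(format!("spawn {}", command));
    let result = run_d(payload, log);
    let verb = if result.is_ok() { "join" } else { "kill" };
    log.push(format!("{} {}", verb, command));
    result
}

// Uses `path_b`; `command` and `payload` are only passed through.
fn with_file_b(path_b: &str, command: &str, payload: &str, log: &mut Log) -> Result<usize, String> {
    log.push(format!("create {}", path_b));
    let result = with_process_c(command, payload, log);
    let verb = if result.is_ok() { "close" } else { "delete" };
    log.push(format!("{} {}", verb, path_b));
    result
}

// Uses `path_a`; everything else is only passed through.
fn with_file_a(
    path_a: &str,
    path_b: &str,
    command: &str,
    payload: &str,
    log: &mut Log,
) -> Result<usize, String> {
    log.push(format!("create {}", path_a));
    let result = with_file_b(path_b, command, payload, log);
    let verb = if result.is_ok() { "close" } else { "delete" };
    log.push(format!("{} {}", verb, path_a));
    result
}
```

The outermost function already takes five parameters for only four resources, and the overall order of operations can only be reconstructed by reading all four functions.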

For example I’d like to see this structure at once (no reading through tens of functions):

  1. Create and open file A
  2. Create and open file B
  3. Start subprocess or thread C
  4. Start async I/O operation D

  N-3. Finish (e.g. flush) D, handle possible errors
  N-2. Join or kill C, handle exit status or results
  N-1. Close or delete file B
  N. Close or delete file A
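For comparison, plain RAII already gives part of this shape, because Rust drops local variables in reverse order of declaration. The following is only a sketch under that observation, not a full answer: Step, start, finish, new_log and pipeline are hypothetical names, and a shared log again stands in for the real resources.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical shared log standing in for real side effects.
type Log = Rc<RefCell<Vec<String>>>;

fn new_log() -> Log {
    Rc::new(RefCell::new(Vec::new()))
}

/// Hypothetical guard: rolls back on drop unless `finish` was called.
struct Step {
    name: &'static str,
    done: bool,
    log: Log,
}

impl Step {
    fn start(name: &'static str, log: &Log) -> Step {
        log.borrow_mut().push(format!("start {}", name));
        Step { name, done: false, log: Rc::clone(log) }
    }

    fn finish(&mut self) {
        self.done = true;
    }
}

impl Drop for Step {
    fn drop(&mut self) {
        let action = if self.done { "close" } else { "roll back" };
        self.log.borrow_mut().push(format!("{} {}", action, self.name));
    }
}

/// The whole sequence is visible in one function body.
fn pipeline(fail_at_d: bool, log: &Log) -> Result<(), String> {
    let mut a = Step::start("A", log);
    let mut b = Step::start("B", log);
    let mut c = Step::start("C", log);
    if fail_at_d {
        // Early return: c, b and a are dropped in that order and roll back.
        return Err("D failed".to_string());
    }
    let mut d = Step::start("D", log);
    d.finish();
    c.finish();
    b.finish();
    a.finish();
    Ok(())
    // Locals are dropped in reverse declaration order: d, c, b, a.
}
```

This does not cover the "handle possible errors" and "handle exit status" parts of the list: drop cannot return a value, so an error that occurs during cleanup has nowhere to go in this sketch.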

Then I want to hide all implementation details, including rollback on errors and parameter passing, somewhere else.

Now my questions:
Have you run into this problem before? How did you handle it?
Do you know of any programming patterns, language features or library functions that address it?
Do you know of any (academic) research on this issue?