How to avoid naming a type in an API when it only acts as an intermediary value

Hey folks,

I am very stuck on something that I swear should be possible to do, but I have
literally spent days trying to get this to work, and I just can't seem to
figure out the right incantations to do what I need. It's possible I'm missing
something basic, but I'm really hoping one of the Rust experts out there can
help. I'm just exhausted and my brain hurts. Thank you in advance for any
advice you can give!

So... I am trying to make a type that holds a hierarchical list of non-critical
errors that occurred during a program run. When a function calls a subfunction,
I would like the caller to be able to easily create an "error sublist" that the
subfunction can push its own errors into, and when the subfunction is done
running the sublist should be pushed into the caller's error list using a
mapping function provided by the caller when it invoked the subfunction.
(I use Drop so this happens automatically when the subfunction ends).

Here is an example of what the usage would look like (ideally):

enum UpperError {
    Upper,
    Lower(ErrorList<LowerError>),
}

enum LowerError {
    Lower,
}

fn lower(mut errors: ErrorSublist<'_, LowerError>) {
    errors.push(LowerError::Lower);
}

#[test]
fn basic() {
    let mut errors = ErrorList::default();
    errors.push(UpperError::Upper);
    // The arg to sublist must be a function that converts
    // ErrorList<LowerError> to UpperError. It will be called when the
    // ErrorSublist is dropped at the end of lower() (if any errors were pushed).
    lower(errors.sublist(UpperError::Lower));
    errors.push(UpperError::Upper);

    //  `errors` should look like this after:
    //
    //  ErrorList [
    //      UpperError::Upper,
    //      UpperError::Lower(ErrorList [
    //          LowerError::Lower,
    //      ]),
    //      UpperError::Upper,
    //  ]
}

This is my first attempt to implement this. It works, but I don't want the
lower function to have to explicitly name the error type of the upper one:

pub struct ErrorList<E> {
    errors: Vec<E>,
}

impl<E> ErrorList<E> {
    pub fn push(&mut self, error: E) {
        self.errors.push(error);
    }
    pub fn sublist<'a, S>(
        &'a mut self,
        map_fn: fn(ErrorList<S>) -> E,
    ) -> ErrorSublist<'a, S, E> {
        ErrorSublist::new(self, map_fn)
    }
}

// Not shown: Default for ErrorList<E>, plus Deref and DerefMut from
// ErrorSublist to ErrorList<E> via `self.list`
pub struct ErrorSublist<'a, E, Parent /* dammit! */> {
    list: ErrorList<E>,
    parent: Option<&'a mut ErrorList<Parent>>,
    map_fn: fn(ErrorList<E>) -> Parent,
    push_fn: fn(&mut ErrorList<Parent>, fn(ErrorList<E>) -> Parent, ErrorList<E>),
}

impl<'a, E, Parent> ErrorSublist<'a, E, Parent> {
    fn new(
        parent: &'a mut ErrorList<Parent>,
        map_fn: fn(ErrorList<E>) -> Parent,
    ) -> Self {
        fn push_fn<Src, Tgt>(
            target: &mut ErrorList<Tgt>,
            map_fn: fn(ErrorList<Src>) -> Tgt,
            list: ErrorList<Src>,
        ) {
            let parent_error = map_fn(list);
            target.errors.push(parent_error);
        }

        ErrorSublist {
            list: ErrorList::default(),
            parent: Some(parent),
            map_fn,
            push_fn,
        }
    }
}

impl<'a, E, Parent> Drop for ErrorSublist<'a, E, Parent> {
    fn drop(&mut self) {
        if !self.list.errors.is_empty() {
            let list = std::mem::take(&mut self.list);
            let parent = self.parent.take().unwrap();
            (self.push_fn)(parent, self.map_fn, list);
        }
    }
}

As mentioned, I don't want the ErrorSublist to have to explicitly name the
upper error type. Looking at the types involved in ErrorSublist, we can see
that actually knowing the type Parent is not necessary -- both map_fn and
parent are pointer-sized, and the only important property about them is that
they can be passed as arguments to push_fn.

If Rust had some way to say something like for<Parent>, I think that would
pretty much solve my problem.

Instead, I managed to make it work by erasing the types:

pub struct ErrorSublist<'a, E> {
    list: ErrorList<E>,
    parent: Option<*mut ()>,
    map_fn: fn(),
    push_fn: fn(*mut (), fn(), ErrorList<E>),
    phantom: std::marker::PhantomData<&'a ()>,
}

impl<'a, E> ErrorSublist<'a, E> {
    fn new<Parent>(parent: &'a mut ErrorList<Parent>, map_fn: fn(ErrorList<E>) -> Parent) -> Self {
        fn push_fn<Src, Tgt>(
            target: &mut ErrorList<Tgt>,
            map_fn: fn(ErrorList<Src>) -> Tgt,
            list: ErrorList<Src>,
        ) {
            let parent_error = map_fn(list);
            target.push(parent_error);
        }

        let push_fn: fn(&mut ErrorList<Parent>, fn(ErrorList<E>) -> Parent, ErrorList<E>) = push_fn;

        ErrorSublist {
            list: ErrorList::default(),
            parent: Some(parent as *mut _ as *mut ()),
            map_fn: unsafe { std::mem::transmute(map_fn) },
            push_fn: unsafe { std::mem::transmute(push_fn) },
            phantom: std::marker::PhantomData,
        }
    }
}

impl<'a, E> Drop for ErrorSublist<'a, E> {
    fn drop(&mut self) {
        if !self.list.errors.is_empty() {
            let list = std::mem::take(&mut self.list);
            let parent = self.parent.take().unwrap();
            (self.push_fn)(parent, self.map_fn, list);
        }
    }
}

I don't know if it's possible to monomorphize a function with another function,
but since map_fn is known at compile time, it would be great to be able to
roll it into push_fn to avoid one of the function pointers. Something like
this (imaginary):

fn push_fn<Src, Tgt, F>(
    target: &mut ErrorList<Tgt>, list: ErrorList<Src>
)
where
    F: fn(ErrorList<Src>) -> Tgt
{
    let target_error = F(list);
    target.push(target_error);
}

Either way, the code above works, but it requires unsafe. I can probably
convince my coworkers that it's worth doing if there's no better way, but it
seems like there oughta be some way to express this using trait objects. The
issue with trait objects
is going to be storing the map_fn passed in by the upper function. We can't
directly store it in ErrorSublist because we'd have to name the upper error
type again.

So the obvious next course of action would be to make a new type that stores
the fn(ErrorList<E>) -> T and the &mut ErrorList<T> and store it in the
sublist as a boxed trait object.

That also works:

pub struct ErrorList<E> {
    errors: Vec<E>,
}

impl<E> ErrorList<E> {
    pub fn sublist<'a, S: 'a>(
        &'a mut self,
        map_fn: fn(ErrorList<S>) -> E,
    ) -> ErrorSublist<'a, S> {
        ErrorSublist {
            list: ErrorList::default(),
            sink: Some(Box::new(ErrorListPush {
                target: self,
                map_fn,
            })),
        }
    }
}

pub struct ErrorSublist<'a, E> {
    list: ErrorList<E>,
    sink: Option<Box<dyn ErrorListSink<E> + 'a>>,
}

impl<'a, E> Drop for ErrorSublist<'a, E> {
    fn drop(&mut self) {
        if !self.list.errors.is_empty() {
            let list = std::mem::take(&mut self.list);
            let sink = self.sink.take().unwrap();
            sink.sink(list);
        }
    }
}

trait ErrorListSink<E> {
    fn sink(self: Box<Self>, list: ErrorList<E>);
}

struct ErrorListPush<'a, Src, Tgt> {
    target: &'a mut ErrorList<Tgt>,
    map_fn: fn(ErrorList<Src>) -> Tgt,
}

impl<'a, Src, Tgt> ErrorListSink<Src> for ErrorListPush<'a, Src, Tgt> {
    fn sink(self: Box<Self>, list: ErrorList<Src>) {
        let target_error = (self.map_fn)(list);
        self.target.push(target_error);
    }
}

This is also fine, but it requires heap allocation, which seems like a
step down from the type-erased-but-unsafe solution above.
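
For comparison, here is a minimal sketch of one way the Box might be avoided while keeping the trait object: the caller keeps the sink as a plain value on its own stack and hands the sublist a `&mut dyn ErrorListSink<E>`. The `sink` helper on ErrorList, `ErrorSublist::new`, and `demo` are hypothetical names introduced for illustration, and the trait method takes `&mut self` instead of `Box<Self>`. The trade-off is that the call site becomes two steps instead of one.

```rust
#[derive(Debug, PartialEq)]
pub struct ErrorList<E> {
    errors: Vec<E>,
}

impl<E> Default for ErrorList<E> {
    fn default() -> Self {
        ErrorList { errors: Vec::new() }
    }
}

impl<E> ErrorList<E> {
    pub fn push(&mut self, error: E) {
        self.errors.push(error);
    }

    // Hypothetical helper: builds the sink as a plain value that the
    // caller stores in a local variable (no heap allocation).
    pub fn sink<S>(&mut self, map_fn: fn(ErrorList<S>) -> E) -> ErrorListPush<'_, S, E> {
        ErrorListPush { target: self, map_fn }
    }
}

pub trait ErrorListSink<E> {
    // `&mut self` rather than `Box<Self>`, so a borrowed sink is enough.
    fn sink(&mut self, list: ErrorList<E>);
}

pub struct ErrorListPush<'a, Src, Tgt> {
    target: &'a mut ErrorList<Tgt>,
    map_fn: fn(ErrorList<Src>) -> Tgt,
}

impl<'a, Src, Tgt> ErrorListSink<Src> for ErrorListPush<'a, Src, Tgt> {
    fn sink(&mut self, list: ErrorList<Src>) {
        let target_error = (self.map_fn)(list);
        self.target.push(target_error);
    }
}

// Only the lower error type appears in the sublist's signature.
pub struct ErrorSublist<'a, E> {
    list: ErrorList<E>,
    sink: &'a mut dyn ErrorListSink<E>,
}

impl<'a, E> ErrorSublist<'a, E> {
    pub fn new(sink: &'a mut dyn ErrorListSink<E>) -> Self {
        ErrorSublist { list: ErrorList::default(), sink }
    }

    pub fn push(&mut self, error: E) {
        self.list.push(error);
    }
}

impl<'a, E> Drop for ErrorSublist<'a, E> {
    fn drop(&mut self) {
        // Forward the collected errors to the parent only if there are any.
        if !self.list.errors.is_empty() {
            let list = std::mem::take(&mut self.list);
            self.sink.sink(list);
        }
    }
}

#[derive(Debug, PartialEq)]
pub enum LowerError {
    Lower,
}

#[derive(Debug, PartialEq)]
pub enum UpperError {
    Upper,
    Lower(ErrorList<LowerError>),
}

fn lower(mut errors: ErrorSublist<'_, LowerError>) {
    errors.push(LowerError::Lower);
}

pub fn demo() -> ErrorList<UpperError> {
    let mut errors = ErrorList::default();
    errors.push(UpperError::Upper);
    {
        // The sink's type mentions UpperError, but only here in the caller.
        let mut sink = errors.sink(UpperError::Lower);
        lower(ErrorSublist::new(&mut sink));
    }
    errors.push(UpperError::Upper);
    errors
}
```

The sink has to outlive the sublist, which is why it lives in a local variable in the caller rather than inside the value returned by a single `sublist(...)` call.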

I squinted at this a bit. Suppose I were able to roll the map_fn into the
push_fn as discussed above (maybe using a macro if there is no better way).
Then this could be a thing:

struct ErrorListPush<'a, Src, Tgt> {
    target: &'a mut ErrorList<Tgt>,
    push_fn: fn(&mut ErrorList<Tgt>, ErrorList<Src>),
}

Now, of course, this screams "trait object", so okay -- Fair enough:

pub trait Push<Src> {
    fn push(&mut self, list: ErrorList<Src>);
}

impl<E> ErrorList<E> {
    pub fn sublist<'a, S: 'a>(
        &'a mut self,
    ) -> ErrorSublist<'a, S>
    where
        Self: Push<S>
    {
        ErrorSublist {
            list: ErrorList::default(),
            pusher: Some(self),
        }
    }
}

pub struct ErrorSublist<'a, E> {
    list: ErrorList<E>,
    pusher: Option<&'a mut dyn Push<E>>,
}

impl<'a, E> Drop for ErrorSublist<'a, E> {
    fn drop(&mut self) {
        if !self.list.errors.is_empty() {
            let list = std::mem::take(&mut self.list);
            let pusher = self.pusher.take().unwrap();
            pusher.push(list);
        }
    }
}

// Now we do this to our custom error types instead:
impl Push<LowerError> for ErrorList<UpperError> {
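    fn push(&mut self, list: ErrorList<LowerError>) {
        // Resolves to the inherent ErrorList::push (inherent methods take
        // priority over trait methods), so this does not recurse.
        self.push(UpperError::Lower(list));
    }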
}

This also works, but it's more inconvenient and it loses the property that
the same inner error might be mapped to different outer errors depending on
context. (For example, an io::Error might be wrapped to
Error::ReadDatabaseFailed at one point in the code, and
Error::WriteJsonFailed in some other part of the code.)

So... I'm stuck. The solution I want is basically the one that I achieved with
the type-erased code above or the boxed trait code above, but without the
unsafe code or heap allocation.

Here is a link to the type-erased version on Rust Playground.

And again, thanks for taking the time to read through this pile of words. :sweat_smile:

Does this help with your first problem? By introducing a trait for ErrorSublist, you can make Parent an associated type, which doesn't have to be named at the usage site. I called it IErrorSublist, but there's probably a better name.
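
The code from this reply is not reproduced in the thread text, so the following is only a sketch of what such a trait might look like. The name IErrorSublist comes from the reply; the trait's generic parameter, its single `push` method, and the `demo` function are assumptions made for illustration. The point it shows is that `lower` accepts `impl IErrorSublist<LowerError>` and never spells out the parent error type.

```rust
#[derive(Debug, PartialEq)]
pub struct ErrorList<E> {
    errors: Vec<E>,
}

impl<E> Default for ErrorList<E> {
    fn default() -> Self {
        ErrorList { errors: Vec::new() }
    }
}

impl<E> ErrorList<E> {
    pub fn push(&mut self, error: E) {
        self.errors.push(error);
    }

    pub fn sublist<S>(&mut self, map_fn: fn(ErrorList<S>) -> E) -> ErrorSublist<'_, S, E> {
        ErrorSublist { list: ErrorList::default(), parent: self, map_fn }
    }
}

// The concrete type still carries Parent, as in the first attempt.
pub struct ErrorSublist<'a, E, Parent> {
    list: ErrorList<E>,
    parent: &'a mut ErrorList<Parent>,
    map_fn: fn(ErrorList<E>) -> Parent,
}

impl<'a, E, Parent> Drop for ErrorSublist<'a, E, Parent> {
    fn drop(&mut self) {
        if !self.list.errors.is_empty() {
            let list = std::mem::take(&mut self.list);
            let parent_error = (self.map_fn)(list);
            self.parent.push(parent_error);
        }
    }
}

// Assumed shape of the trait: Parent becomes an associated type, so users
// of the trait never have to name it.
pub trait IErrorSublist<E> {
    type Parent;
    fn push(&mut self, error: E);
}

impl<'a, E, P> IErrorSublist<E> for ErrorSublist<'a, E, P> {
    type Parent = P;
    fn push(&mut self, error: E) {
        self.list.push(error);
    }
}

#[derive(Debug, PartialEq)]
pub enum LowerError {
    Lower,
}

#[derive(Debug, PartialEq)]
pub enum UpperError {
    Upper,
    Lower(ErrorList<LowerError>),
}

// No mention of UpperError here.
fn lower(mut errors: impl IErrorSublist<LowerError>) {
    errors.push(LowerError::Lower);
}

pub fn demo() -> ErrorList<UpperError> {
    let mut errors = ErrorList::default();
    errors.push(UpperError::Upper);
    lower(errors.sublist(UpperError::Lower));
    errors.push(UpperError::Upper);
    errors
}
```

Because `lower` is generic over the implementing type, it is monomorphized per caller; no unsafe code or heap allocation is involved.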

I don't have the time at the moment to look into your second question, but if I understood correctly, you should be able to avoid storing function pointers at runtime by introducing type parameters for all the functions. After monomorphization, function items passed this way are zero-sized and have no overhead.
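
A rough sketch of that idea, assuming the sublist gains a type parameter F for the mapping function (the parameter name and the helper functions `stored_sizes` and `pushed_count` are made up for this illustration). Passing `UpperError::Lower` directly gives F a zero-sized function item type, while coercing it to a `fn(...)` pointer first costs a pointer-sized field. Note that F then becomes part of the sublist's type, so this only helps when that type is hidden from the subfunction, for example behind a trait.

```rust
use std::mem::size_of_val;

pub struct ErrorList<E> {
    errors: Vec<E>,
}

impl<E> Default for ErrorList<E> {
    fn default() -> Self {
        ErrorList { errors: Vec::new() }
    }
}

impl<E> ErrorList<E> {
    pub fn push(&mut self, error: E) {
        self.errors.push(error);
    }

    // F is a type parameter, so a function item can be stored without
    // being turned into a function pointer.
    pub fn sublist<S, F>(&mut self, map_fn: F) -> ErrorSublist<'_, S, E, F>
    where
        F: Fn(ErrorList<S>) -> E,
    {
        ErrorSublist { list: ErrorList::default(), parent: self, map_fn }
    }
}

pub struct ErrorSublist<'a, E, Parent, F>
where
    F: Fn(ErrorList<E>) -> Parent,
{
    list: ErrorList<E>,
    parent: &'a mut ErrorList<Parent>,
    map_fn: F,
}

impl<'a, E, Parent, F> Drop for ErrorSublist<'a, E, Parent, F>
where
    F: Fn(ErrorList<E>) -> Parent,
{
    fn drop(&mut self) {
        if !self.list.errors.is_empty() {
            let list = std::mem::take(&mut self.list);
            let error = (self.map_fn)(list);
            self.parent.push(error);
        }
    }
}

pub enum LowerError {
    Lower,
}

pub enum UpperError {
    Upper,
    Lower(ErrorList<LowerError>),
}

// Returns (size with a function item, size with a function pointer).
pub fn stored_sizes() -> (usize, usize) {
    let mut errors: ErrorList<UpperError> = ErrorList::default();
    let with_item = size_of_val(&errors.sublist(UpperError::Lower));
    let as_ptr: fn(ErrorList<LowerError>) -> UpperError = UpperError::Lower;
    let with_ptr = size_of_val(&errors.sublist(as_ptr));
    (with_item, with_ptr)
}

// Checks that the Drop-based push still works with the generic F.
pub fn pushed_count() -> usize {
    let mut errors: ErrorList<UpperError> = ErrorList::default();
    {
        let mut sub = errors.sublist(UpperError::Lower);
        sub.list.push(LowerError::Lower);
    } // `sub` is dropped here and pushes UpperError::Lower(..) into `errors`.
    errors.errors.len()
}
```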

Thank you! I ended up using this in my finished product, and it works great. It actually encouraged me to rethink the way I designed my API around your idea: rather than trying to design my types so that they don't contain their parent type in their type signature, it makes much more sense to just provide subfunctions with a "limited view" of the type through the interface you suggested. I ended up using this trick to abstract away even more details from the subfunctions about what the type that's passed to them is actually capable of!

So, overall, your suggestion both fixed my problem and led to better design. :slight_smile:

I admit that it does still bother me a bit that I couldn't really figure out how to make this work with my original types using traits when I was able to make it work by erasing their types, but it doesn't bother me enough to keep messing around with it more when you've come up with such a good alternative.

Thanks again!