Trivial inductive Fn trait-bound-solving in recursive functions

This is an issue that came up while I was writing a pretty terrible parser, so I'll give a simplified reproduction case here.

Basically, consider a recursive enum defined like this:

enum Expr<V, T> {
    Var(V),
    Int(T),
    Add(Box<Self>, Box<Self>),
}

One of the things that I think is relatively obvious to want to do is to treat a type of this form like a bifunctor, so I wrote a method (basically trying to replicate lmap / first from the functional world) like this:

impl Expr<V, T> {
    fn map_vars<F, U>(self, op: F) -> Expr<U, T>
    where 
        F: Fn(V) -> U,
    {
        match self {
            Expr::Int(int) => Expr::Int(int),
            Expr::Var(var) => Expr::Var(op(var)),
            Expr::Add(lhs, rhs) => Expr::Add(
                Box::new(lhs.map_vars(&op), 
                Box::new(rhs.map_vars(&op))
            ),
        }
    }
}

This seemed to work –– cargo check said it was fine –– but when I actually tried to build the binary, I hit the recursion limit. The error elided 128 redundant requirements, and then said:

required for `&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&...` to implement Fn<(&str,)>

The actual type name (which had to be written to a file) was a closure preceded by 129 &s. This was a horrific diagnostic to receive, so I immediately gave up and went to change my implementation :upside_down_face:.

As it turned out, the immediate solution for me was to make op a function pointer (fn(V) -> U) and to remove the borrows in the Expr::Add branch (since function pointers are Copy). A slightly better solution (I think) would be to make op a &F, since this prevents the recursive borrowing in the original case.

As far as I can tell, the issue here was that

  1. the Fn trait has a blanket impl such that if op implements Fn, then &op also implements Fn, and;
  2. on each recursive invocation of map_vars, a new & is prepended to the argument, so;
  3. the trait-solver has to endlessly check that op, &op, &&op, &&&op, and so on (up to infinity) are implementors of the Fn trait.

So, obviously, increasing the recursion limit wouldn't have helped, and the solutions (using function pointers and &Fs) also fall out fairly immediately from this description.

My main question is this: why can't the trait solver handle this kind of situation? It seems like a really simple case of proof by induction, and my understanding is that the trait solver already has some tools for e.g. coinductive solving. Is there a safety reason that induction cannot be used? Are situations like this too niche to have specific compiler support?

Remember that generic functions are duplicated for each type they're called with. Let's say you call map_vars with MyFnType. Then due to your recursive call, the compiler must generate the following infinite list of different implementations of map_vars:

  • Expr::map_vars::<MyFnType>
  • Expr::map_vars::<&MyFnType>
  • Expr::map_vars::<&&MyFnType>
  • Expr::map_vars::<&&&MyFnType>
  • Expr::map_vars::<&&&&MyFnType>
  • ...

You will not be able to convince the compiler to generate infinitely many functions. I recommend defining a private helper function that takes &F so that it can call itself without adding another layer of indirection.

4 Likes

The fact that cargo check doesn't fail means that the trait solver accepted the code.

As @alice hinted at, the part of the compiler that doesn’t handle this (and cannot in principle) is monomorphization, the process of turning a generic function (with type parameters) into as many concrete functions as necessary to cover all uses (different substitutions for the type parameters) in your code base. If the code needs “infinitely many”[1] different versions of a function, that cannot possibly work anymore.

This makes for a general (minor) downside of the implementation strategy of “monomorphization” for generics: you can’t have patterns that recursively go through an unbounded number of different types.


While in your use case, this would have been undesirable anyways (since piling up &&&…&s and dereferencing them has some real overhead, so run-time for traversing highly unbalanced trees could even degrade from 𝓞(𝑛) to 𝓞(𝑛²)), in general this kind of recursion has some niche but interesting actual use-cases, too, in languages that do support it (e.g. Haskell), see also Polymorphic recursion - Wikipedia.


  1. even if any given run-time input would only ever use finitely many versions; so “infinite” here only needs to mean that at compile-time, there is no possible upper bound to the number ↩︎

3 Likes

Using a bound of F: Copy + Fn(V)->U also works, if you get rid of the recursive references. That lets function pointers and closures that only capture Copy types be used directly— These will be the vast majority, as there’s no need to use a move closure here.

If callers really need to capture non-Copy values, they can pass &move |x|… explicitly.

1 Like

This is a nice summary of the problem, thanks!

I admit, most of my confusion comes from the totally opaque error that cargo gives you for this kind of problem –– it's quite a departure from the much nicer error messages that you would usually get. I'd love if there was a rustdoc / clippy lint for this, though I have no idea to contribute one.

Adding Copy bounds on function arguments is an anti-pattern imo.

1 Like

May I ask why? In this case, it seems to perfectly capture the required semantics: The ability to both call f as a function and be able to pass f as an argument to multiple recursive calls.

I suppose you could use Clone instead, but that runs the risk of letting the copy cost overwhelm the actual logic when there's an easy way to avoid it (passing a shared reference as f).

I don't know what @alice had in mind, but what comes to mind for me is that:

  • A naĂŻve user seeing the Copy bound (or an error mentioning it) may think “I have to contort my closure to be Copy” rather than “I can just add an &”.
  • If the closure happens to capture by value a large piece of data, then you have the risk of it being actually copied many times.

Neither of these problems applies if you require &F.

Another option in this situation, which I would consider to be the “cleanest API possible”, is to write a public non-recursive function taking F and passing on &F to the private, recursive function. But this is overkill if the function is not public.

6 Likes

Yeah, that's pretty much what I had in mind.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.