Recursive Parser Causes Trait Resolution Overflow

I'm building a recursive parser where different syntax structures reference each other. Each parser implements try_construct, which merges recovery contexts using context.merge_with(self.recovery_context()).

The issue arises when calling try_construct of FunctionSignatureParser inside TypeParser. This causes Rust to overflow while evaluating trait bounds:

error[E0275]: overflow evaluating the requirement `core::CombinedRecovery<&DummyRecovery, DelimitedRecoveryStragery<'_, seperated::SeparatedBy<ParameterParser>>>: Sized`
    |
    = help: consider increasing the recursion limit by adding a `#![recursion_limit = "256"]` attribute to your crate (`skullbrain_parser`)
note: required for `CombinedRecovery<&CombinedRecovery<&DummyRecovery, ...>, ...>` to implement `core::RecoveryStrategy`
   --> skullbrain-parser/src/core/mod.rs:142:16
    |
142 | impl<'a, A, B> RecoveryStrategy for CombinedRecovery<&'a A, B>
    |          -     ^^^^^^^^^^^^^^^^     ^^^^^^^^^^^^^^^^^^^^^^^^^^
    |          |
    |          unsatisfied trait bound introduced here
    = note: 128 redundant requirements hidden
    = note: required for `CombinedRecovery<&CombinedRecovery<&CombinedRecovery<&..., ...>, ...>, ...>` to implement `core::RecoveryStrategy`
    = note: the full name for the type has been written to '/home/user/SkullBrain/target/debug/deps/skullbrain_parser-8bf45239dc460134.long-type-1258188400081423112.txt'
    = note: consider using `--verbose` to print the full type name to the console

error[E0275]: overflow evaluating the requirement `core::CombinedRecovery<&DummyRecovery, TypeParserRecoveryStragery>: Sized`
    |
    = help: consider increasing the recursion limit by adding a `#![recursion_limit = "256"]` attribute to your crate (`skullbrain_parser`)
note: required for `CombinedRecovery<&CombinedRecovery<&DummyRecovery, TypeParserRecoveryStragery>, ...>` to implement `core::RecoveryStrategy`
   --> skullbrain-parser/src/core/mod.rs:142:16
    |
142 | impl<'a, A, B> RecoveryStrategy for CombinedRecovery<&'a A, B>
    |          -     ^^^^^^^^^^^^^^^^     ^^^^^^^^^^^^^^^^^^^^^^^^^^
    |          |
    |          unsatisfied trait bound introduced here
    = note: 127 redundant requirements hidden
    = note: required for `CombinedRecovery<&CombinedRecovery<&CombinedRecovery<&..., ...>, ...>, ...>` to implement `core::RecoveryStrategy`
note: required for `CombinedRecovery<&CombinedRecovery<&CombinedRecovery<&..., ...>, ...>, ...>` to implement `core::Recovery`
   --> skullbrain-parser/src/core/recovery.rs:4:9
    |
4   | impl<T> Recovery for T
    |         ^^^^^^^^     ^
5   | where
6   |     T: RecoveryStrategy,
    |        ---------------- unsatisfied trait bound introduced here
    = note: the full name for the type has been written to '/home/user/SkullBrain/target/debug/deps/skullbrain_parser-8bf45239dc460134.long-type-14150472290417344823.txt'
    = note: consider using `--verbose` to print the full type name to the console

However:

  1. Increasing the recursion limit does not fix the issue. (even up to 2024)
  2. Calling try_construct of TypeParser inside FunctionSignatureParser works, but the reverse (calling FunctionSignatureParser::try_construct in TypeParser) does not. (Possibily due to some recursiveness???)
  3. Merging recovery contexts itself does not cause issues; the error only occurs when actually calling try_construct inside TypeParser.

Syntax Example:

For reference, the expected syntax (using TokenKind) is something like :

TYPE = TokenKind::Type OR TokenKind::Ident OR FUNC_SIG;
FUNC_SIG = ( TokenKind::Ident : TYPE ) RETURN TYPE;

This shows that TYPE and FUNC_SIG are mutually recursive.

Code Structure:

pub enum TokenKind { 
  Type, Ident
///.....
}
pub(crate) struct TypeParser(FunctionSignatureParser);

pub(super) struct FunctionSignatureParser {
    pub(super) parameters: Parameters,
}

pub(crate) struct Parameters(Rc<DelimitedContentParser<ParameterList>>);

type ParameterList = SeparatedBy<ParameterParser>;

pub(crate) struct ParameterParser(pub(super) Rc<RefCell<TypeParser>>);

Utility structs:

pub(crate) struct SeparatedBy<T: SyntaxConstructor> {
    separator: TokenKind,
    content_parser: T,
}

pub(crate) struct DelimitedContentParser<T: SyntaxConstructor> {
    content_parser: T,
}

pub(crate) struct CombinedRecovery<A, B> {
    primary: A,
    secondary: B,
}


/// Literally only used to pass some value for the recovery_context to the try_construct method, and defines no recovery point 
struct DummyRecovery;

Recovery strategy:

pub(crate) trait RecoveryStrategy {
    fn is_recovery_point(&self, kind: &TokenKind) -> bool;

    fn merge_with<T: RecoveryStrategy>(&self, other: T) -> CombinedRecovery<&Self, T> {
        CombinedRecovery {
            primary: self,
            secondary: other,
        }
    }
}

impl<'a, A, B> RecoveryStrategy for CombinedRecovery<&'a A, B>
where
    A: RecoveryStrategy,
    B: RecoveryStrategy,
{
    fn is_recovery_point(&self, kind: &TokenKind) -> bool {
        self.primary.is_recovery_point(kind) || self.secondary.is_recovery_point(kind)
    }
}

And a simplified try_construct for the parsers:

impl SyntaxConstructor for TypeParser {
    fn recovery_context<'a>(&'a self) -> impl RecoveryStrategy + 'a {
          /// Usually either a unit-struct or a wrapper around self
          todo!()
    }

  fn try_construct(
      &self,
   //   stream_state: &mut TokenStreamState,
   //   syntax_tree: &mut RawSyntaxTree,
      recovery_ctx: &impl RecoveryStrategy,
   //   parent_id: NodeId,
  ) -> RecoveryResult {
      let recovery_ctx = recovery_ctx.merge_with(self.recovery_context()); // This works fine
      // Some parsing code that advances the stream_state
      // With some parsers containing a 
     //  self.some_other_parser.try_construct(stream_state, syntax_tree, &recovery_ctx, parent_id) 
  }
}

Initialization Code:

pub(self) fn initialize_type_parser() -> (TypeParser, FunctionSignatureParser) {
    // Create shared reference to be used by both parsers  (a new-type around `Rc<RefCell<Option<T>>>`
    let mut type_parser_ref = CircularRef::default();

    // Create parameters parser with reference to type parser (which is None for now)
    let parameters: Parameters = parameters(CircularRef::clone(&type_parser_ref));
    let function_signature = FunctionSignatureParser {
        parameters: parameters,
    };

    // Create type parser with reference to function signature
    let type_parser = TypeParser::new(&function_signature);

    // Now set the type parser in the shared reference
    type_parser_ref.set(type_parser.clone());

    (type_parser, function_signature)
}


pub(crate) fn parameters(type_ref: CircularRef<TypeParser>) -> Parameters {
    // DelimitedContentParser::parentheses omitted as its not important 
    let inner = DelimitedContentParser::parentheses(parameter_list(type_ref));
    Parameters(Rc::new(inner))
}

My Understanding:

I know that impl Trait just hides the concrete type, so a recursive problem is possible. However, I expected this not to be an issue since try_construct (of FunctionSignatureParser in TypeParser) is only called when the correct tokens are found in stream_state, meaning it should execute in a bounded manner rather than infinitely expanding at compile time. (right??)

Questions:

  1. Why does calling try_construct on FunctionSignatureParser inside TypeParser cause a compile time error but the reverse works? (For the latter I am guessing that it completes the circular cycle)
  2. Is there a way to restructure my code to avoid this issue or is this a limitation of my current design?

Thanks in advance!

Just from the error, it does look like there's recursive call that effectively translates into having an infinitely long type (e.g. because the type length depends on the runtime depth of the recursion), which is not supported. That's just my intuition based on the error text though.

You may get better help if you can cook up a minimal reproduction.

In return position (-> impl Trait), yes. In argument position (as with recovery_ctx in try_construct), it hides a generic type parameter to the method.

(This may not be relevant to the problem.)

Do you mean, correct tokens are found at run time? How would the compiler know what will be found at run time?


  1. started with all your code and added stuff until I gave up ↩︎

  2. started with the apparently problematic method and added stuff, but it doesn't fail ↩︎

2 Likes

Understood. I apologize for the oversight. I will provide a minimal reproducible example as soon as I return from school.

Ya thats what I meant! For instance if next token.kind is a ( then parse it as a FUNC_SIG. I just thought that it might work.

To make this work, I suspect that you’ll need to erase a type somewhere with Box<dyn Trait> so that the compiler doesn’t have to reason about the arbitrarily-large compound type as a single unit. Where, exactly, is the best place to do this would require a deeper analysis than I have time for at the moment, though.

I found that boxing the returned recovery context (Box<&'a dyn RecoveryStrategy>) does help with the recovery_ctx merge issue, but it leads to a different recursion error instead:

Minimal Examples

  • Original example with the overflow error in trait resolution:
    Playground
  • Boxing the returned recovery strategy does not help:
    Playground
  • However, making the return type of recovery_context asBox<&'a dyn RecoveryStrategy> does not fix the recursion entirely, leading to a different error:
    Playground

New Error After Boxing

After using Box<&'a dyn RecoveryStrategy>, the error shifts to:

error: reached the recursion limit while instantiating `<TypeParser as SyntaxConstructor>::try_construct::<CombinedRecovery<&CombinedRecovery<&..., ...>, ...>>`
  --> src/main.rs:83:27
   |
83 |         let type_parser = self.parameters.0 .0.borrow().as_ref().unwrap().try_construct(&[], &recovery_ctx);
   |                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
note: `<TypeParser as SyntaxConstructor>::try_construct` defined here
  --> src/main.rs:64:5
   |
64 |     fn try_construct(&self, _: &[TokenKind], parent_recovery_ctx: &impl RecoveryStrategy) -> RecoveryResult {
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   = note: the full type name has been written to '/playground/target/debug/deps/playground-b0f7a9e23d80b917.long-type.txt'

Also, storing a TypeParser inside FunctionSignatureParser does not solve the issue either (I already tested that).

If we give the impl types names, there is indeed a recursion which is unbounded as far as the compiler can tell.

// main calls
TypeParser::try_construct::<DummyRecovery>
// which calls
FunctionSignatureParser::try_construct::<
    CombinedRecovery<&DummyRecovery, TypeParserRecoveryStrategy>
>
// which calls
TypeParser::try_construct<
    CombinedRecovery<
        &CombinedRecovery<&DummyRecovery, TypeParserRecoveryStrategy>,
        FunctionSignatureRecoveryStrategy,
    >
>
// and now we're looping

The compiler sure was unhelpful in pointing out what parts of the code were actually problematic though!

The problem isn't only potentially infinite size in bytes, it's that infinitely long types are straight-up not supported. So you need type erasure (like dyn Trait), not just boxing. (Or some larger redesign that avoids infinite types from the compiler's perspectie.)

The original RecoveryStrategy is not dyn-compatible due to having a type generic method, so I'm not sure how useful it will be to fix this, but let's give it a shot.[1] I modified your playground like so:

-impl<'a> RecoveryStrategy for Box<&'a dyn RecoveryStrategy> {}
+impl<T: ?Sized + RecoveryStrategy> RecoveryStrategy for &T {}
+impl<T: ?Sized + RecoveryStrategy> RecoveryStrategy for Box<T> {}

 pub trait SyntaxConstructor {
     fn try_construct(
         &self,
         stream_state: &[TokenKind],
         recovery_ctx: &impl RecoveryStrategy,
     ) -> RecoveryResult;

-    fn recovery_context<'a>(&'a self) ->Box<&'a (dyn RecoveryStrategy + 'a)>;
+    fn recovery_context<'a>(&'a self) -> Box<dyn RecoveryStrategy + 'a>;
+    // and made all the implementations match
 }

Side note: why all the references everywhere? If I was confident they are not somehow required in your actual code base, I probably would have ripped a lot more out. For example, try_construct may be able to take a impl RecoveryStrategy and not a &impl RecoveryStrategy.

Anyway, the error remained essentially the same, so I went to the error site[2] and applied type erasure there:

 impl SyntaxConstructor for FunctionSignatureParser {
     fn try_construct(&self, _: &[TokenKind], recovery_ctx: &impl RecoveryStrategy) -> RecoveryResult {
         let recovery_ctx = recovery_ctx.merge_with(self.recovery_context());
+        let recovery_ctx: Box<dyn RecoveryStrategy> = Box::new(recovery_ctx);
         let type_parser = self.parameters.0 .0.borrow().as_ref().unwrap().try_construct(&[], &recovery_ctx);
         Ok(())
     }

And that compiles (but overflows the stack -- perhaps it wouldn't in your actual code base).

You may want to consider changing the merge_with method to return a Box<dyn RecoveryStrategy + '_>. If you also took a Box<dyn RecoveryStrategy + '_> as an argument, that would make merge_with a dyn-compatible method (so you could perhaps add it back to the RecoveryStrategy trait). On the other hand, if you need to access both "halves" of the CombinedRecovery, type erasure may present some significant hurdles.[3]

The other -> Box<dyn ...> changes you included may or may not be necessary if you take that approach (I didn't try it).


  1. I see your Helper trait but didn't try to figure out if it solves everything for you. ↩︎

  2. which at least the compiler told me about this time ↩︎

  3. Perhaps you could build CombinedRecovery-like functionality into the trait, but that's just a guess; I don't have a good enough feel for the overall design. ↩︎

2 Likes

Your suggested approach works perfectly (and I will mark it as the answer). And FYI boxing fn recovery_context<'a>(&'a self) -> impl RecoveryStrategy + 'a; is not needed.

trait SyntaxConstructor {
    fn try_construct<R: RecoveryStrategy>(
        &self,
        recovery_ctx: &R,
    );

    fn recovery_context<'a>(&'a self) -> impl RecoveryStrategy+ 'a;
}
let recovery_ctx: Rc<dyn RecoveryStrategy> = 
    Rc::new(CombinedRecovery::new(recovery_ctx, self.recovery_context()));

This also works with Box or other indirection types.

1 Like