Rust AST parser & emitter: what do you use?

In Python I use builtin ast; in Java I use javaparser; in C I wrote a custom parser/emitter; in TypeScript I use builtin functions; …

What should I use in Rust? - In the past I've hacked away with nom, syn, quote, and prettyplease… it would be nice to see a higher level API that enables:

  • parsing—without evaluation—and emission of the following:
    • derives and other macros
    • imports (use)
    • structs
    • impl blocks
    • fn prototypes (I don't care about body-parsing, just want to understand the interface and be able to replace the interface and push the body back verbatim)
    • global const / static bindings
    • comments
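
To make the "replace the interface and push the body back verbatim" item concrete, here is a minimal std-only sketch. The names `split_fn` and `replace_signature` are hypothetical and do not come from any library. It assumes the input is a single `fn` item and that the first `{` opens the body, which fails for signatures that themselves contain braces or comments containing `{`.

```rust
/// Split a single `fn` item into (signature, body).
/// `body` starts at the opening `{` and is returned byte-for-byte.
/// Assumption: the first `{` in `src` opens the body.
fn split_fn(src: &str) -> Option<(&str, &str)> {
    let brace = src.find('{')?;
    Some((src[..brace].trim_end(), &src[brace..]))
}

/// Swap in a new prototype and re-attach the original body unchanged.
/// Returns `None` when there is no body (e.g. a trait method declaration).
fn replace_signature(src: &str, new_sig: &str) -> Option<String> {
    let (_, body) = split_fn(src)?;
    Some(format!("{} {}", new_sig.trim_end(), body))
}
```

For example, calling `replace_signature` on `fn add(a: i32, b: i32) -> i32 { ... }` with the new prototype `pub fn add(a: i64, b: i64) -> i64` leaves everything from the `{` onward, including comments and indentation, untouched. A real tool would locate the body through a parser rather than a string search.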

Thanks

Maybe tree-sitter?

Yeah, tree-sitter isn't bad. I tried contributing back in 2022, to no avail. Here I'm asking only about Rust, not about any other language. But yes, you are correct, tree-sitter would still be a viable option.

(keeping this forum thread open so others can contribute further suggestions)

I've linked the Rust syntax implementation for tree-sitter above. Is tree-sitter's generic tree structure insufficient, or too clunky to use, in your scenario? I haven't built anything with tree-sitter yet (hopefully in the near future though; lately I've been trying hard to formulate problems at work in terms of context-free grammars, just so I can use a parser to solve them), so I'm curious about your setup and how tree-sitter fits as a possible solution.

TL;DR - tree-sitter is decent, but very generic. So you could take it as the base for a more specific API (say, a mutating visitor pattern) that would actually be usable.

Look at the other language AST libraries I linked and what functions they expose; compare this to what tree-sitter exposes. Very different.

Tree-sitter was originally created for IDE purposes (think JSON-RPC LangServ stuff like: rename, move, syntax-highlighting SQL inside a string in a different programming language, etc.); whereas my interest is in understanding structs and function prototypes, then transforming these between each other and to and from JSON-schema and OpenAPI more broadly.
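
As a rough illustration of the struct-to-JSON-schema direction, here is a std-only sketch. The function names `json_type` and `struct_schema` are hypothetical, and the type table is an assumption covering only a few primitive types; a real mapping would also need `Option<T>`, nested structs, enums, and so on.

```rust
/// Map a Rust type name (as written in source) to a JSON Schema `type` keyword.
/// Assumption: anything not listed here is treated as a nested object.
fn json_type(rust_ty: &str) -> &'static str {
    match rust_ty {
        "bool" => "boolean",
        "String" | "&str" | "char" => "string",
        "f32" | "f64" => "number",
        "i8" | "i16" | "i32" | "i64" | "isize" => "integer",
        "u8" | "u16" | "u32" | "u64" | "usize" => "integer",
        t if t.starts_with("Vec<") => "array",
        _ => "object",
    }
}

/// Emit a JSON Schema object for (field name, Rust type) pairs.
/// Field names are assumed to need no JSON escaping.
fn struct_schema(fields: &[(&str, &str)]) -> String {
    let props: Vec<String> = fields
        .iter()
        .map(|(name, ty)| format!("\"{}\": {{\"type\": \"{}\"}}", name, json_type(ty)))
        .collect();
    format!(
        "{{\"type\": \"object\", \"properties\": {{{}}}}}",
        props.join(", ")
    )
}
```

The field list is passed in by hand here; extracting it from source is the part that needs an actual parser.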

TBH, if you're emitting things that you want to be more stable, consider instead starting from an IDL and generating the rust code, rather than the other way around. That's often much easier.


Agreed, you always want your Source of Truth for interface types to be in a language with no macro support at all.

Yeah @scottmcm @riking that would certainly make things easier. But my goal is bidirectional synchronisation.

  • From OpenAPI to target language
  • Into OpenAPI from target language
  • Synchronisation of orthogonal types intralanguage


Examples (Rust)

Synchronisation

Synchronisation of orthogonal types within language, e.g.:

OpenAPI → Rust

  1. SQL migration files in Rust for diesel, likely as simple as some CREATE TABLEs; then using
  2. dsync (Wulf/dsync on GitHub: generate Rust structs & query functions from diesel schema files) to generate associated functions; then
  3. utoipa::path decorated actix-web routes
  4. Tests and mocks for end-to-end and unit testing

OpenAPI ← Rust

  • (dynamic) produced from utoipa
  • (static) from new parsing [static analysis] of codebase, focussing on diesel structs and utoipa::path decorated actix-web routes

So with that in mind, much like my Python and C implementations, my Rust implementation must be whitespace- and comment-aware, yet work at a high enough level to be ergonomic.
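
To show what "whitespace- and comment-aware" means in practice, here is a std-only sketch of a lossless split: every byte of the input ends up in exactly one piece, so concatenating the pieces reproduces the source. `Piece`, `pieces`, and `render` are hypothetical names. The sketch assumes only `//` line comments matter; block comments and `//` inside string literals are not handled.

```rust
#[derive(Debug, PartialEq)]
enum Piece<'a> {
    Code(&'a str),
    Comment(&'a str),    // `// ...` up to, but not including, the newline
    Whitespace(&'a str), // spaces, tabs, newlines
}

/// Split `src` into pieces whose concatenation is exactly `src`.
fn pieces<'a>(src: &'a str) -> Vec<Piece<'a>> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < src.len() {
        let rest = &src[i..];
        if rest.starts_with("//") {
            let len = rest.find('\n').unwrap_or(rest.len());
            out.push(Piece::Comment(&rest[..len]));
            i += len;
        } else if rest.starts_with(char::is_whitespace) {
            let len = rest.find(|c: char| !c.is_whitespace()).unwrap_or(rest.len());
            out.push(Piece::Whitespace(&rest[..len]));
            i += len;
        } else {
            // Code runs until the next whitespace or the start of a comment.
            let len = rest
                .char_indices()
                .find(|&(j, c)| c.is_whitespace() || rest[j..].starts_with("//"))
                .map(|(j, _)| j)
                .unwrap_or(rest.len());
            out.push(Piece::Code(&rest[..len]));
            i += len;
        }
    }
    out
}

/// Re-emit the source; untouched pieces come back byte-for-byte.
fn render(pieces: &[Piece<'_>]) -> String {
    pieces
        .iter()
        .map(|p| match p {
            Piece::Code(s) | Piece::Comment(s) | Piece::Whitespace(s) => *s,
        })
        .collect()
}
```

An edit then replaces individual `Code` pieces and leaves the comment and whitespace pieces alone, which is roughly the property that lossless syntax trees provide at a much higher level.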

Thanks for any and all [continued] suggestions.

FWIW: I open-source everything

PS: I was just asking in the rust-analyzer channel on the official Zulip and one of the maintainers suggested their library is in fact relevant.


Q: […] the main things I need are:

  • Function params + function decorator(s) + function docstring → JSON
  • JSON → function params + function decorator(s) + function docstring
  • struct fields + docstring ↔ JSON
  • Merge JSON into Rust, handling structs—by confirming same number of fields + field names + types—if exists else generating new ones

Interested in other things later but yeah this would get me >85% of the way there.
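
The merge rule in the list above (same number of fields, same names, same types, otherwise generate) can be sketched with std only. `Field`, `Merge`, `emit_struct`, and `merge` are hypothetical names for illustration. The comparison here is order-sensitive, and reporting a mismatch as a conflict is an assumption about the desired behaviour.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    ty: String, // Rust type as written in source, e.g. "i32"
}

#[derive(Debug, PartialEq)]
enum Merge {
    Unchanged,        // struct exists and matches the incoming fields
    Conflict,         // struct exists but differs in count, names, or types
    Generate(String), // struct does not exist; here is its source
}

/// Emit a plain struct definition for the given fields.
fn emit_struct(name: &str, fields: &[Field]) -> String {
    let mut out = format!("pub struct {} {{\n", name);
    for f in fields {
        out.push_str(&format!("    pub {}: {},\n", f.name, f.ty));
    }
    out.push('}');
    out
}

/// Compare incoming fields (e.g. decoded from JSON) with an existing struct.
fn merge(name: &str, existing: Option<&[Field]>, incoming: &[Field]) -> Merge {
    match existing {
        None => Merge::Generate(emit_struct(name, incoming)),
        // Slice equality checks length, then each (name, ty) pair in order.
        Some(old) if old == incoming => Merge::Unchanged,
        Some(_) => Merge::Conflict,
    }
}
```

Docstrings and decorators are left out; they would be extra fields on `Field` and on a struct-level record.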


You can parse and extract information from the AST. SyntaxEditor allows you to make mutations.

You can study the API of the syntax crate that provides everything related to syntax in rust-analyzer.

For higher-level understanding of the AST this file can be valuable. Most AST declarations are autogenerated from it.
— Chayim Refael Friedman

Note that you can use syn to parse raw strings, so it may be a good option to get going quickly, especially as it's already set up to quasi-quote etc. with quote.
