Options for working on a Rust dialect

I am thinking about trying out a slightly different syntax for Rust on my own, for fun. One could certainly write a source-to-Rust compiler, but given the scope I have in mind, and that the difference is purely syntactic, I believe it would be more straightforward to modify the parser in the rustc source itself.

I used the word "dialect" in the title because that's the terminology used in Racket, although I am using it in the reduced sense of changing only the syntax. In Racket you put #lang dialect at the top of a file to specify which one you want. I would prefer to use a different file extension. If that's somehow not possible, I'd take an approach similar to Racket's: if the file starts with // lang: [name], it's parsed using the custom syntax; otherwise, it's normal Rust.
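To make the header idea concrete, here is a minimal sketch of the detection step using only the standard library. The names `Syntax` and `detect_syntax` are made up for illustration and nothing like this exists in rustc; a real fork would need to hook a check of this kind in before the file reaches the parser.

```rust
/// Which parser a source file should be handed to.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq, Eq)]
pub enum Syntax<'a> {
    /// Ordinary Rust.
    Rust,
    /// A custom dialect, identified by the name given in the header comment.
    Dialect(&'a str),
}

/// Looks only at the first line of `source`. If that line has the form
/// `// lang: <name>` with a non-empty name, the file is treated as a dialect;
/// anything else is treated as normal Rust.
pub fn detect_syntax(source: &str) -> Syntax<'_> {
    let first_line = source.lines().next().unwrap_or("");
    match first_line.trim().strip_prefix("// lang:") {
        Some(name) if !name.trim().is_empty() => Syntax::Dialect(name.trim()),
        _ => Syntax::Rust,
    }
}
```

Since the header is an ordinary line comment, a file carrying it still lexes as plain Rust for any tool that doesn't know about dialects.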

  • Has anyone played around with this idea before?
  • Is it as straightforward as tweaking rustc_parse?
  • If you have examples in mind, could you point me to PRs on GitHub that significantly change the syntax? That would give me an idea of which files are touched and the effort involved.

Thanks in advance


Alternative syntaxes would be cool. I've asked for it before:

I think an interesting avenue would be to extend proc-macros to support arbitrary syntax or loading of modules from other files. This way you wouldn't need to modify rustc, and instead you could parse your syntax in your macro.


From my limited experience using Rust, proc macros are not as nice as writing the code directly, especially for discoverability. Notably, they make code much more difficult to search and to "jump-to-definition" in the editor. A few years ago I read The Grep Test article and it stuck with me, even more so after I went down the road of code generation through proc macros on a few projects of mine and found it unpleasant because finding code became harder. In short, I am not a fan of generating non-boilerplate code through macros; there seem to be too many drawbacks for the development experience. They feel fine for "pure DSLs" such as parser combinators, but otherwise I would rather not have them in my code.

Proc macros aside, I'd like to point out again that I am scoping my idea to syntax changes or small extensions, not a whole different language with possibly different semantics. That is all the more reason why I thought it would make sense to change the compiler directly, given that there seems to be no first-class support for this "dialect" idea.

Something else I find relevant is having the Rust tooling integrate seamlessly with this notion of a dialect.

  • rust-analyzer: uses rustc_lexer, and that, at least, would need to be changed.
  • rustfmt: I'd have to check whether it has token-specific rules that would misbehave on a syntax that lacks those tokens.
  • clippy: uses rustc_lexer, and possibly other syntax-sensitive crates.

Considering the manageable number of moving parts listed above, and estimating that the effort of maintaining the changes would not be massive, it does feel like tweaking the compiler crates is the best way to go about this.

Given that scoping, I'd say you should just work in your personal fork of the source code.

https://rustc-dev-guide.rust-lang.org/ should be helpful.


Proc macros can run arbitrary Rust code, so there's nothing stopping one from reading a filename out of its arguments, reading that file from disk as further instructions, and injecting the result as the macro expansion.
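A rough sketch of the read-and-translate step such a macro would perform, written as plain functions so it runs outside a proc-macro crate. Everything here is an assumption for illustration: `translate_dialect` and `expand_from_file` are invented names, and the "dialect" is a toy that only spells `fn` as `func`. Inside a real proc macro, the returned string would then be turned into tokens with `.parse::<proc_macro::TokenStream>()`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Toy stand-in for a real dialect parser: rewrites the made-up keyword
/// `func` to `fn` wherever it appears as a space-separated word.
/// A real implementation would parse the dialect properly.
pub fn translate_dialect(source: &str) -> String {
    source
        .lines()
        .map(|line| {
            line.split(' ')
                .map(|word| if word == "func" { "fn" } else { word })
                .collect::<Vec<_>>()
                .join(" ")
        })
        .collect::<Vec<_>>()
        .join("\n")
}

/// Reads the dialect file at `path` and returns ordinary Rust source text.
/// In a proc macro, `path` would come from the macro's arguments.
pub fn expand_from_file(path: &Path) -> io::Result<String> {
    let dialect_source = fs::read_to_string(path)?;
    Ok(translate_dialect(&dialect_source))
}
```

Note that going through a string like this throws away all position information from the original file.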

On a related note, can a proc macro be invoked as an inner attribute of a module? It would be nice to be able to write #![my_proc_macro_crate::transform_module] at the top of a file.


Currently there's a showstopping usability problem with doing this through proc macros: they can't create Spans that refer to the correct file, so any errors from the transformed code would lack correct locations.

As for inner attributes: it's a chicken-and-egg problem, because the attribute sits inside the very code that is supposed to be parsed with the proc macro's syntax.
