My Experience on writing demo about compiling techniques(1)

Recently, I 've leaned some basic compiling techniques concepts, such as NFA, DFA, LL etc.
and need write some code to demo them.

To be honest, rustlang is not my first choice since I need quick prototyping these concepts mentioned above. A traditional dynamic language with VM (tranditional means they are both developed in 1990s) should be compatiable for this job.

As a picky customer, I go through plenty of “language shops” to pick the most compatiable good.

1st try: Java (not really static as it has a VM supplying dyn support)

Java has been widely used on almost everywhere, including my favourite parser generator: Antlrv4.
And many tech post writen demo with it.

I just give up because pure OOP makes code logic not straight and hard to understand for newer, and there are large labour works which are supposed to be done by compiler. In other words, writing code using Java (exclude the springboot since has been redeveloped ) just like a manally compiler
to generate some low level code.

2nd try: Python (favourite language in my early programming times)

No close closure, No macro (differ from meta class programming), No symbol type.
Nowadays, sometimes Python acts like even not as good as Java (>=1.8).
(You can never belive that I have to use a Class for close closure workaround in python which reminds me java1.7 using anonymous class for function first workarounde, that's terrible)

3rd try: Hy on Python( Lisp + Python )

Lisp supports macro support, Python ecosystem is sweet, that sounds a good combination.
However, after writen almost 1_000 lines hylang code, I found I have spent most of time to read the document and source code of hylang to identify what's the std lib macro, how it works, if it do the right for now etc... REPL and jupyter notebook aren't enough for efficent coding.
Hy >1.0.0 has become mature, but it's on alpha state, one time I found a mistake in std lib's function and later it's just be removed, all left to me is time go by.

4th just research actrually:

Other Lisp

For Common Lisp: complete macro support, beautiful error message mechanism, all in repl, functional style OOP. These let me dived into it deeply. However it lacks essential stateful data structure and emacs always distract my attention to study (oh my grateful vim style emacs, if i have triple lifetime, one year for learn vim, almost double lifetime for emacs)
So, I don't want to reinvente the wheel since I have too many things to learn and no triple lifetime (even no double lifetime)

For Clojure: I have searched clojure repos on github, most of them are just wrapper of Java, and I don't think it's really diffrent from hy.

For Rackets: en, all rackets community repos what i need are just under not matained status. so I dicede not to dive into it.


After every thing concerned, JS maybe top 3 chioce, but I prefer a more decent language.


Template Haskell is good, Haskell is also an alternative choice, but I really need random access some times.

5th It's Rust:

I have used rust for writing some algrithm such as Boyer-Moore string pattern match, AC automaton based crossed linked list, Optimal Suffix Array Sorting etc. as a instread of C++. I'm not sure if it's all ok for more complicated structure designed with OOP style.

The result is amazing, thanks to rust-analyzer by matklad supply a detailed and greateful language server which very very friendly to newer,

  1. type hint just saving plenty of my time from look through document,
  2. greate document saving time from look through forums,
  3. good forums saving time from open the source code

Using declarative macro defined by macro_rules and enum a little like Haskell data
make things straight.

define productions for grammar G3

pub fn grammar_demo3() -> Gram {
    declare_nonterminal! {S, A, B, C};
    declare_terminal! {a, b, d, g, h};

        | A C B;
        | C b b;
        | B a;

        | d a;
        | B C;

        | g;
        | ε;

        | h;
        | ε;

And calculate First sets and Follow sets for it.

        let g3 = grammar_demo3();
        let g3_fsts_expected = vec![
            First! {S| d g h ε b a },
            First! {A| d g h ε },
            First! {B| g ε },
            First! {C| h ε },
        let g3_folls_expected = vec![
            Follow! {S| NUL },
            Follow! {A| h g NUL },
            Follow! {B| a NUL h g },
            Follow! {C| b g NUL h }

        let g3_first_sets = g3.first_sets();
        let g3_follow_sets = g3.follow_sets(&g3_first_sets);
        for (sym, firstset) in g3_fsts_expected.iter() {
            debug_assert_eq!(g3_first_sets.get(sym).unwrap(), firstset);
        for (sym, follset) in g3_folls_expected.iter() {
            debug_assert_eq!(g3_follow_sets.get(sym).unwrap(), follset);

If Rust can supply some way to export symbols created in declarative macro scope:

export_syms! { a, b, c }

Then we can write in mixed in coding style

macro_rules! mymixin {
    () => {
        // define a
        // define b
        export_syms!(a, b);

fn foo() {
    // in other function 

Generally, declarative macro is highly restricted, proc macro is more flexiable, however, writen in another project makes proc macro embarassed to use: Imaging that, I have to split code (struct, enum function defined myself) releated this proc macro to another project to avoid circle dependency.

And annoying explicit lifetimes did bother me, any data struct has explicit lifetime would infection other data struction which composed by it, then lifetimes annotaion will expread like virus until you can't handle it: usually it's caused by such error xxx lifetime not compatiable with xxx lifetime. There must be some workaround designs caused that!

All in all, Rust is my favourite choice for writing demo to learn about compiling techniques, it's not prefect, but modern and advanced enough to defeat against other competeters.

Just Hope for Rust2 can resolve or improve these issues!

1 Like

FWIW, although I agree that Python's support for closures with early captures is very lack luster, do know that sometimes there may be a slightly shorter way to write one without having to commit to a full class or having to define a function:

x = 42
f = lambda: print(x)
x += 27
f()  # Does not print 42!

# But:

x = 42
f = (lambda x: (lambda: print(x))(x)
#           ^ local variable that shadows the outer `x`: "early bound var"
x += 27
f()  # Does print 42

But yes, snippets such as the following one:

for f in ((lambda: print(i)) for i in range(10)_:
    f()  # prints `0`, `1`, …, `9`

for f in [(lambda: print(i)) for i in range(10)]:
    f()  # prints `9` each time 😭

are very disheartening.

Yes, Any tricky solution just supplys a lexical scope substitute: Class attribute or Lambda scope.

As for single expression limit of Lambda of Python, I' ve read that
In Hackers && Painters ch13-Appendix Power by Paul Graham

Python users might legitimately ask why they can’t just write ...(multiple-line statement)
and my guess is that they probably will, one day.

the book is published in 2009 :joy: