How to get started with working on ASTs

Hi, another !rust-specific question, but I'm asking here anyways because you people are so awesome.

I'm a self taught programmer. I know some basic structures like Arrays, Vectors, Linked lists, Stacks, Queues, Disjoint sets, Red-Black trees, AVL trees, Adjacency list and matrix etc. My ultimate goal is to work with abstract syntax trees. But I don't know to implement algorithms like Shortest Path, Kruskal, Prim, Aho-corasic, Greedy, KMP, Boyer-Moore etc. What book should I pick up?

I've been reading SICP (JS adaption because too lazy to learn scheme) but I am not sure it's going to help. I want to work with data structures like piece-table, want to implement my own text editor with modern features, I want to do something like the xi-editor. I just want to roll my own GTK based editor with built-in terminal panel (I know GTK), so I have to know about parsing, lexing ASTs.

I've been told to pick up CLRS, but practically speaking, finishing up CLRS will take me a very long time. I don't mind investing time but I want to gain and practice relevant structures and architectures.

Thanks in advance.

I'm not sure if shortest path or Kruskal are crucial algorithms for working with abstract syntax trees? If you wish to learn e.g. Kruskal's algorithm, I recommend you just look it up and try to implement it in practice.

I don't really know any books on the matter, but maybe you should just try implementing it and see if you get stuck?

Where did you learn these algorithms from? From uni?

I guess Kruskal's algorithm won't be all that necessary but I don't know a thing about Lexing/Parsing. Maybe I'll have to pick up the legendary dragon book which kinda intimidates me :sweat_smile:

I didn't study CS, but math. It sounds like you already have a book on the subject? Just give that book a try then, and remember, you learn by doing.

I don't think the Dragon Book is a very good intro, even though it's the standard. If you're new to compilers, check out which is an excellent intro and assumes no prior knowledge.

You might also be interested in #lang-dev on the Discord:

Kruskall's algorithm is for graphs, you don't need it for parsing. Look up LR, LALR, and GLR parsing if you're interested in more of the theory.


Hey, it seems like you are on the right track, generally! I'd say that to get really great at programming it's equally important to both understand the theory and do implement some practical applications.

And the rub is that you don't have to understand the theory to do the practice. If you want to implement an editor, it's definitely a great idea to try to implement it using only the stuff you already now. Like, you can start with storing all the text as an array of characters, and try to invent fancier datastrcutures yourself. You probably won't re-discover state of the art stuff, but you'll learn much more than if you just directly implement an algorithm directly from paper.

For theory about alorithms, CLRS is definitely a great reference, but reading it from cover to cover will take a lot of time. I can recommend Dasgupta/Papadimitriou/Vazirani's "Algorithms" as a shorter alternative (or, really, a compliment).

For theory about compilers and programming languages, I'd recommend this course (this is how I got into the field mainly), but it maybe makes sense to start with algorithms first.


After decades of programming, me neither really. Compilers and such are still a black art to me.

However, some years ago I did manage to write compiler for for a very simple C/pascal like language that generated code for Intel x86 and the Propeller micro-controller from Parallax Inc. It was terribly inefficient, no optimization at all, but I was very proud of myself having managed to get from text to executable binaries.

That was inspired by finding the article "Let's Build a Compiler" by Jack Crenshaw:

His explanation is done with Pascal, I wrote mine in C. No AST used but a lot of useful hints and tips to be picked up along the way.

Perhaps it time to redo that in Rust...

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.