I’m working on a compiler for a class assignment, and I’d like to give it a fuzz tester for at least the parser bits. Something that generates a lot of random input, either valid or invalid, and checks whether or not the parser responded correctly to it.
What crate do you recommend I start with? I have a complete definition of my language’s grammar that I can use to feed the fuzzer’s generator as well.
Update: I guess what I want is not quite a fuzzer (which I’ve now made work with libFuzzer!). I want a way to construct valid and invalid input, and see if my parser accepts valid inputs and rejects invalid ones.
Depending on the language, you could always use an existing corpus of programs to test your parser against. For example, if I was writing a Rust compiler I’d throw the existing test suite (especially the parser tests) at my compiler and check that I get similar results. Often standardised languages and formats will already have published some sort of “conformance test suite” you could use.
I guess you could always write an implementation of quickcheck::Arbitrary for your AST/token stream and then use quickcheck to make sure the parser can parse valid inputs and error out on invalid ones. But that would be logically equivalent to writing a second parser implementation (your AST generator) and testing one against the other, and what’s to guarantee the generator can correctly generate invalid and valid input?
Dang. Well, it is indeed a made-up language (an extension of a very small one called Tiny-C).
And you’re right about the Arbitrary implementation being not really much better than building a second parser.
Hmm. How could you test this? I guess it’s partially doable with a grammar enumerator of sorts, which generates from the grammar itself valid inputs to it. How to generate clearly invalid input, I don’t know though.
You can get pretty far by manually adding tests as you go.
In the past I’ve written my own compile-test utility which will try to parse or execute a corpus of files on disk and make sure they finish with the desired outcome. You could probably put together a pretty wide range of test programs by sitting down with a text editor for half an hour and writing short programs which should either be valid or invalid for your language.
At work we needed to write programs to go on a motion controller in the company’s proprietary variant of BASIC and for fun I started my own parser and runtime, complete with its own set of syntax-pass parser tests. It’s also possible to write tests that construct the AST by hand, but that gets pretty boring pretty quickly.
I think this is the big reason why you don’t hear about people using fuzzers for generating parser tests. It’s easy enough to enumerate all possible AST variants, but generating a known invalid stream of tokens seems like a difficult problem.
That sounds like something serde would be perfect for. Just store a JSON dump of your AST next to the test file (e.g. the AST from foo.c gets written to foo.json) then you can periodically regenerate the files when you’ve made a change to the parser and verified there are no regressions.
That’s pretty much how rustc's UI test suite works. They’ll run rustc on a file then compare stdout to a copy that’s been manually verified in the past and use that to make sure error messages are done properly.
Last idea, I think: fuzzing (and also testing the acceptance of input) of the parser can be achieved in a better way by using bnf’s Generators.
I’m using Pest to build Parse Trees. After that, I feed the parse trees to my AST builder. Since Pest is (I assume) correct, and I can generate syntactically valid input with bnf, I should be able to test that my AST builder always works on valid input.
The rejection of invalid input should happen at the Pest level, and bugs in that step I can fix with my grammar spec. But valid input should never be rejected nor crash the program at the AST building level. This approach would serve as a fuzzer on steroids for that part.
This of course, doesn’t check that the built AST is correct. But test cases and everything else we’ve talked about should be fine for that