Looking for Interesting Libraries for Program Synthesis


I'm Yoshiki, a PhD student at CMU.

My colleagues and I have developed a program synthesis algorithm tailored to Rust, and we are currently looking for interesting libraries to to test our techniques on.

For those of you who are not familiar with program synthesis, please think of it as a (pretty limited) "AI programmer" that can generate programs based on some machine-readable specification.

In particular, I am looking for 3 broad categories of libraries.

  1. A small, intuitive library to demonstrate our program (say geometry, matrix operations , etc.)
  2. A very large library involving polymorphism to demonstrate scalability (at least several hundred methods/functions, preferably a few thousand)
  3. A library that either contains unsound behavior on purpose, or is likely to contain unsound behavior. For this category, we will be running this through Miri+stack borrows.

Due to limitations on our side, it will be great if the libraries are free of async functions.

I have a few idea for 1, but I am completely lost for 2 and 3. Any help will be greatly appreciated.

Thank you for your time.

I think it'd be cool (and really meta) to use program synthesis to generate a brainfuck interpreter. This lends itself to your first category in that an interpreter only needs to understand memory (a big byte array), some current cursor variable, and do a switch-case on the instruction under the cursor.

It'd also be quite easy to write a fitness function because you can compare the contents of memory and the cursor value with what you expect.

For the more advanced project, what about synthesising something which can parse markdown (i.e. pulldown-cmark? This is also fairly easy to implement because you can create some sort of diffing function (maybe built on the difference crate?) comparing pulldown-cmark's output with your own.

In terms of soundness, this one could be a bit tricky... One option would be to create wrappers for native libraries, but it'd be tricky to create some sort of fitness function and compare different syntheses. Maybe you could give it the signature for a native function and the desired signature for the result, and make sure it writes the correct shim code for ferrying inputs and results between Rust and C?

1 Like

Hi @Michael-F-Bryan.

Thank you for the ideas.

I haven't thought of using it as an interpreter, but it should be doable. One idea might be generate words or geometric pictures from the interpreted bytes. Reminds me of people who made art/music using some DSL subset of Haskell.

String parsing is one of the classic applications of program synthesis, and I think those libraries will come in handy.

I agree with you on the difficulties of the soundness side. I am not familiar with FFI, but there isn't much excuse not to learn it :slight_smile: .

In addition, perhaps, I can dig up older versions of libraries with known bugs and see if the synthesis can expose that vulnerable call pattern, although new bugs will be nicer from an academic standpoint.


One thing that came to mind is using program synthesis for developing salsa queries for incremental computation, salsa

However it wasn't clear to me if you were looking for libraries for use with synthesized programs, or making a synthesized implementation of a libraries API, this idea being the former...

As far as unsafe, I know termite has shown some amount of program synthesis for drivers. There are plenty of rust OS projects, not exactly a library but perhaps a synthesized version of a driver which is certain to have some unsafe while being decently specifiable.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.