DARPA TRACTOR program: C to Rust conversion

Back in July 2024, DARPA announced it was accepting proposals for reliable, automated C to Rust conversion. The program is called TRACTOR, short for Translating All C to Rust. A year later, there are some contracts with academia. Here's one, for $5 million:

Trying to find more of the contractors.


This sounds like an intractorble problem to me (see what I did there).

I was wondering, though. If they manage to create some monster code analysis/AI translator that can translate C to Rust and shake all the UB out of it, do they need Rust at all? Instead they could:

  1. Use that monster to generate good, UB free, C from the original C.

  2. Then, if there are still teams of C devs working on some code base, they could use that monster as a clippy for C to ensure all new code and changes they make are safe.

I would worry that such a monster tool would produce Rust that nobody could read or would want to work on. Firstly because the C devs would have to learn Rust, secondly because my observation of such language translations is that the output is pretty tortuous and convoluted.

Still, I did have good results getting my AI friend to translate a 500 line C program to Rust a few months back, including dealing with some gnarly C macros.

The object of this is to get good Rust out. It's a DARPA R&D project. They know it's hard.

There's C2Rust, but the Rust that comes out is awful. They just define unsafe types that have exactly the semantics of C pointers, and transliterate C to unsafe Rust.

The two key problems are figuring out how big arrays are, and turning pointer arithmetic into slices if at all possible. Solve both of those, and halfway decent Rust should come out.
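To make those two problems concrete, here is a minimal sketch of what I mean. Both functions are hypothetical illustrations I made up, not actual C2Rust output: `sum_raw` is roughly the shape a mechanical transliteration of `int sum(const int *p, size_t n)` takes, and `sum_slice` is what becomes possible once something has worked out that `p` and `n` together describe one array.

```rust
/// Transliterated style (hypothetical): a raw pointer plus a separate length,
/// walked with pointer arithmetic, just like the C original.
///
/// # Safety
/// The caller must guarantee that `p` points to at least `n` valid `i32`s.
pub unsafe fn sum_raw(p: *const i32, n: usize) -> i32 {
    let mut total = 0;
    let mut i = 0;
    while i < n {
        // Nothing here checks that `i` is in bounds; the compiler cannot help.
        total += unsafe { *p.add(i) };
        i += 1;
    }
    total
}

/// Idiomatic style: the (pointer, length) pair has been recognized as one
/// array, so it collapses into a slice that carries its own length.
pub fn sum_slice(values: &[i32]) -> i32 {
    values.iter().sum()
}
```

Calling `sum_slice(&data)` needs no `unsafe` at the call site, while every caller of `sum_raw` has to pass `data.as_ptr()` and `data.len()` and vouch for them by hand. The hard part for a translator is proving that the `n` in the C signature really is the length of `p`, which usually means looking at every caller.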


It can't be that hard, just run the clang frontend to get LLVM IR, then run the rustc frontend backwards, right? 🤡

I'll say what I'm sure I said the last time this came up: actually "fixing" C programs to be idiomatic Rust isn't a local transformation. You need to analyze the global data flow to extract the ideal types to represent the permitted transitions and access patterns, something that LLMs currently seem deeply unsuited to, given their very limited context windows.
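As a rough illustration of "types that represent the permitted transitions", here is a small typestate sketch. Everything in it is hypothetical: imagine a C `struct conn { int is_open; int bytes_sent; }` where `conn_send` is only valid while `is_open != 0`. That rule is written nowhere in `conn_send` itself; it only shows up in how the callers behave, which is why recovering it takes a whole-program view.

```rust
/// A connection that has not been opened yet (or has been closed again).
pub struct Closed;

/// An open connection. `send` exists only on this type, so calling it on a
/// closed connection is a compile error rather than a runtime check.
pub struct Open {
    bytes_sent: usize,
}

impl Closed {
    /// Consumes the closed handle and returns an open one.
    pub fn open(self) -> Open {
        Open { bytes_sent: 0 }
    }
}

impl Open {
    /// Records a payload as sent and returns the running byte total.
    pub fn send(&mut self, payload: &[u8]) -> usize {
        self.bytes_sent += payload.len();
        self.bytes_sent
    }

    /// Consumes the open handle, so it cannot be used after closing.
    pub fn close(self) -> Closed {
        Closed
    }
}
```

With this shape, `Closed.open()` gives back an `Open` value you can `send` on, and after `close()` the old handle is gone, so a use-after-close does not compile. A translator can only pick this design if it has seen enough of the program to know that no caller ever sends on a closed connection.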

I would never say this is impossible, even in the short term: even now I can think of a couple things worth looking into to make this more feasible, but it's the sort of thing where we haven't even gotten a firework to work reliably and they're trying to land on the moon.


It can't be that hard, just run the clang frontend to get LLVM IR, then run the rustc frontend backwards, right?

Yeah, just a SAT solver 😏

Well, if you manage to translate it to unsafe-free Rust, then the UB-free guarantee comes from Rust, not from that monster. And if it can't manage everything, it can at least handle 99% of the codebase, while a human solves the hard problems like "this tool can't change this, so I will make my change and let it run next".
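A small sketch of how that guarantee could be pinned down in practice, assuming the translated code lands in its own module (the module and function names here are made up). The `unsafe_code` lint is a real rustc lint; with `forbid`, any `unsafe` block or `unsafe fn` inside the module is a compile error, so it no longer matters how the code was produced.

```rust
/// Hypothetical home for machine-translated code. If the translator ever
/// emits `unsafe` in here, the build fails instead of silently accepting it.
#[forbid(unsafe_code)]
pub mod translated {
    /// Bounds-checked replacement for a C-style `p[i]` read:
    /// an out-of-range index yields `None` instead of undefined behavior.
    pub fn nth(values: &[i32], i: usize) -> Option<i32> {
        values.get(i).copied()
    }
}
```

The usual caveat applies: this covers the module itself, and still relies on any `unsafe` inside the standard library and dependencies being sound. The parts the tool cannot express safely would live outside such a module, which is where the human comes in.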

Hey, today I learned a thing! That's infinitely better than most days!
