I'd like to announce a tool that's meant to help with rewriting C codebases in Rust. It's a source-to-source translator: it takes a C file and produces roughly the same code with Rust syntax.
Unlike Corrode, it doesn't even try to preserve correct C semantics, so the generated code won't compile until type errors and C-isms are fixed manually.
The goal here is to produce readable code that's a starting point for refactoring. Citrus helps with the boring syntax conversions when you're manually converting a C codebase to Rust bit by bit. For example, you may be able to copy&paste some C expressions, loops, or small functions to Rust without having to manually correct type var to var: type over and over again.
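To make the "type var to var: type" point concrete, here is a small hand-written sketch of that kind of syntax conversion. It is not actual Citrus output, and the function `scale` is a made-up example; the C original is shown in the comment.

```rust
use std::os::raw::c_uint;

// C original (hypothetical example, for reference):
//
//     unsigned int scale(unsigned int value, unsigned int factor) {
//         unsigned int result = value * factor;
//         return result;
//     }

// Syntax-only conversion: `type var` becomes `var: type`, and the final
// `return` becomes a tail expression.
// Semantics are not preserved exactly: unsigned overflow wraps in C, but
// panics in a Rust debug build. Fixing that (e.g. with `wrapping_mul`) is
// left as a manual step.
pub fn scale(value: c_uint, factor: c_uint) -> c_uint {
    let result: c_uint = value * factor;
    result
}
```

The declarations line up one-to-one with the C version, which is the boring part a translator can take over; deciding what the overflow behaviour should be is still up to you.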
It's like bindgen, but also includes function bodies.
So, give it a try. I haven't got decent error reporting yet, so start with small, simple files. There are still a lot of missing bits and cut corners, so please file issues.
Back story: it's the third version of this program. I've learned the hard way that:
Parsing of C is not fun (typedefs, dangling elses, preprocessor quirks, and compiler-specific headers), so I've had to use Clang here.
It can't be done with the stable libclang. Its view of the AST is just too vague and incomplete. It seems like it's 90% there, but the last 10% is impossible (e.g. you can't distinguish between for(a;b;) and for(;b;c)). I've had to use LLVM/Clang internal C++ functions for disambiguation. Unfortunately, this means building the project is a pain. I've got pre-built binaries for you.
It can't be done well in one pass. I've built a PHP to JavaScript converter as an exercise to figure out a working approach. Getting a simple, flexible AST first, and then cleaning it up gradually in multiple passes is the way to go!
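As a rough illustration of the "simple AST first, clean up in passes" idea, here is a minimal sketch of one such cleanup pass over a toy AST. The `Expr` and `Stmt` types and the `recognize_ranges` function are hypothetical and much simpler than anything a real translator needs; they are not Citrus's actual data structures.

```rust
// Toy expression type (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Var(String),
    Int(i64),
    Less(Box<Expr>, Box<Expr>),
}

// Toy statement type (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub enum Stmt {
    // Direct image of C's `for (var = start; cond; var++) { body }`.
    CFor { var: String, start: Expr, cond: Expr, body: Vec<Stmt> },
    // Rust-style `for var in start..end { body }`.
    ForRange { var: String, start: Expr, end: Expr, body: Vec<Stmt> },
    Call(String),
}

// One cleanup pass: rewrite counted loops whose condition is `var < end`
// into range loops, and leave every other statement untouched.
pub fn recognize_ranges(stmts: Vec<Stmt>) -> Vec<Stmt> {
    stmts
        .into_iter()
        .map(|stmt| match stmt {
            Stmt::CFor { var, start, cond, body } => {
                // Clean up nested loops first.
                let body = recognize_ranges(body);
                match cond {
                    Expr::Less(lhs, end) if *lhs == Expr::Var(var.clone()) => {
                        Stmt::ForRange { var, start, end: *end, body }
                    }
                    // Unrecognized condition: keep the C-style loop as is.
                    cond => Stmt::CFor { var, start, cond, body },
                }
            }
            other => other,
        })
        .collect()
}
```

Each pass takes a tree and returns a slightly tidier tree, so further passes can be chained the same way without any of them needing to know about the others.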
It does recognize a standard for as for i in 0..x, and falls back to while otherwise. It rewrites switch without fall-through to match (and generates wrong code otherwise). goto is rewritten as break 'label, but it doesn't really make sense. If you have tricky flow control, I suggest first refactoring the C to have boring flow control.
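For reference, here is what those three rewrites look like when written by hand in Rust. This is an illustrative sketch with made-up function names, not actual Citrus output; the C shape each one corresponds to is given in the comments.

```rust
// C: for (i = 0; i < n; i++) total += i;
pub fn sum_below(n: u32) -> u32 {
    let mut total = 0;
    for i in 0..n {
        total += i;
    }
    total
}

// C: a switch where every case ends in `break` (no fall-through),
// so each case maps to exactly one match arm.
pub fn kind_name(kind: i32) -> &'static str {
    match kind {
        0 => "dense",
        1 => "sparse",
        _ => "unknown",
    }
}

// C: `goto done;` used to jump out of nested loops. A labeled break
// covers this forward-jump case; arbitrary gotos have no direct equivalent.
pub fn contains(rows: &[Vec<i32>], needle: i32) -> bool {
    let mut found = false;
    'search: for row in rows {
        for &cell in row {
            if cell == needle {
                found = true;
                break 'search;
            }
        }
    }
    found
}
```

A goto that jumps backwards or into the middle of a block can't be expressed this way, which is why refactoring the C first is the safer route.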
pub struct hllhdr {
    pub magic: [i8; 4],
    pub encoding: u8,
    pub notused: [u8; 3],
    pub card: [u8; 8],
    pub registers: [u8], /* "HYLL" */
    /* HLL_DENSE or HLL_SPARSE. */
    /* Reserved for future use, must be zero. */
    /* Cached cardinality, little endian. */
    /* Data bytes. */
}
Macro expansions also get interleaved with function comments:
pub static HLL_P_MASK: c_long =
/* Given a string element to add to the HyperLogLog, returns the length
* of the pattern 000..1 of the element hash. As a side effect 'regp' is
* set to the register index this element hashes to. */
/* Count the number of zeroes starting from bit HLL_REGISTERS
* (that is a power of two corresponding to the first bit we don't use
* as index). The max run can be 64-P+1 bits.
*
* Note that the final "1" ending the sequence of zeroes must be
* included in the count, so if we find "001" the count is 3, and
* the smallest count possible is no zeroes at all, just a 1 bit
* at the first position, that is a count of 1.
*
* This may sound like inefficient, but actually in the average case
* there are high probabilities to find a 1 after a few iterations. */
((1 << 14) - 1);
pub static HLL_REGISTERS: u64 =
/* Register index. */
/* Make sure the loop terminates. */
(1 << 14);
pub unsafe fn hllPatLen(mut ele: *mut u8, mut elesize: usize,
                        mut regp: &mut c_long) -> c_int {
    ...
}