Transcribing C code to Rust

I'd like to transcribe this little C program to Rust : http://fractal.math.unr.edu/~ejolson/pi/tatami/src/limited.c

Not for any particularly useful reason. It is so far the fastest solution to a certain problem involving the tiling of Tatami mats in a little coding challenge going on here: Liberation through Computer Literacy - Page 67 - Raspberry Pi Forums

The actual challenge is Problem number 256 of the Project Euler #256 Tatami-Free Rooms - Project Euler

My old Rust entry is lagging behind, something has to be done!

That code has some self contained functions that are easy enough to deal with.

But the rest of it is juggling a bunch of global variables and arrays in typical C fashion.

Would it be wise to put that data in a struct and have those functions as methods?

Or would it be better to keep them free standing and pass the arrays in by reference?

Or something else?

Any ideas?

1 Like

I'd suggest a combination of them both, the struct would contain the configuration, and the slices would contain the data, so as to be able to use the same configuration with multiple pieces of data.

If a method takes &self, then technically there's no meaningful difference between foo.bar() and bar(&foo), so do whatever makes most sense for the code.

You can try converting the code with c2rust to see if you'll actually get the performance you expect before you invest into writing nicer Rust code.

1 Like

There are some slight differences between method syntax and function syntax where one or the other option will affect whether multiple mutable references are allowed.

Similarly, I also sometimes keep items apart instead of in a struct, so I can easily have a mutable reference to one and multiple immutable references to another.

There’s an art to it but I’m usually happy with the results in the end.

Good, I was just coming around to the idea that might be the case.

Interesting. I tried it. It actually produced Rust that compiles and runs. After I fixed a couple of places where the output had type mismatches. The C passes ints as parameters to a function expecting chars and c2Rust did not fix that.

As best I can tell it runs at the same speed as the original C !

Thanks for the heads up on that.

Now I will get down to writing a proper Rust version.

I suspect because truncation wouldn't have "fixed" it, it would have merely hidden the error (that is hidden in C by default).

1 Like

Yes, I think so.

I don't begin to understand how that C code works but I know they are calculating with ints but storing them in a huge array of chars. They never go out of range.

I thought the fact that that code is written in pre-ANSI style (It compiles and runs on a PDP 11) might trip up c2rust but that did not phase it at all.

2 Likes

I managed to translate that Tatami solution C code in the OP to Rust.

My approach was:

  1. Find lumps of related data in those globals and wrap them up as structs.
  2. Find related functions and attach them as methods to the structs.
  3. Nothing gets created until it is needed (almost).
  4. Except #includes become global consts
  5. Added some passing of structs to methods as required.
  6. Fix up names to keep clippy happy and run through rust-fmt. All clean!

Amazingly I had no hassle from the borrow checker, despite the recursive weirdness going on there.

I still have no idea how that algorithm does what it does!

The result: Crude timing shows it to be 15 to 20 percent faster than the original C code!

$ gcc -Wall -O3 -o limited limited.c
$ time ./limited
T(85765680)=200

real    0m0.920s
user    0m0.891s
sys     0m0.016s
$ cargo build --release
...
$ time target/release/tatami_rust
T(85765680)=200

real    0m0.759s
user    0m0.719s
sys     0m0.016s

Thanks for the hints everyone.

Oh, the Rust version is here if anyone is up to taking a peek and commenting:

1 Like

Unfortunately the execution time for this Rust transcription is about half the speed of the original C version when they are run on the ARM of a Raspberry Pi 3. Which is a shame as this was all motivated by a Raspi Pi forum discussion.

This is the second or third time I have noticed such a massive reversal in performance of equivalent Rust/C code between my x86-64 desktop and the 32 bit ARM Pi. Both Debian both. GCC 8.3 on the Pi, 8.2 on the PC. Both using today's nightly Rust.

Is there a reason for this in particular? Is it because of the old 32 bit ARM of the Pi? Are there any build options for the Pi to improve the situation?

Building the C version with clang on the Pi yields results closer to GCC. So I would expect that Rust can do similar.

$ gcc -Wall -O3 -o limited limited.c
pi@aalto-1:~/tatami_rust $ time ./limited
T(85765680)=200

real    0m4.624s
user    0m4.621s
sys     0m0.000s
$ clang  -Wall -O3 -o limited limited.c
$ time ./limited
T(85765680)=200

real    0m5.123s
user    0m5.123s
sys     0m0.000s
$ cargo build --release
    Finished release [optimized] target(s) in 0.37s
pi@aalto-1:~/tatami_rust $ time target/release/tatami_rust
T(85765680)=200

real    0m7.772s
user    0m7.758s
sys     0m0.010s

What does the fox profiler say? I noticed that there's a lot of array indexing going on in the Rust transcription. I'm not sure if that comes from the original C code, but maybe bounds checking has a fair share in the slowdown? I'd assume ARM's branch prediction capabilities are in general significantly less sophisticated than those of X86, so it kind of makes sense to me, but…

1 Like

Yes there is a lot of array indexing going on.

To the most part all of that is exactly the same as the original C version. I don't understand how it works so I had to duplicate the method as closely as possible. Despite having restructured things into structs and methods.

I have some C++ code here for which I found that as I tweaked it around to optimize for my PC it got slower on the ARM. And vice versa. There seem to be some code that Intel liked and other code that ARM liked. I never did pin it down.

What I did worry about is all the indirection going on in the Rust version now that there are no globals. Perhaps that ends up being more expensive on ARM.

Yikes!

Edit: Yikes indeed. Seems I was reporting improved timing after breaking the code and it giving the wrong results! All deleted here.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.