I'd suggest a combination of both: the struct would contain the configuration and the slices would contain the data, so the same configuration can be used with multiple pieces of data.
If a method takes &self, then technically there's no meaningful difference between foo.bar() and bar(&foo), so do whatever makes the most sense for the code.
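As a minimal sketch of that equivalence (with a made-up `Config` type, not anything from the actual program), the two call forms are interchangeable:

```rust
struct Config {
    scale: i32,
}

impl Config {
    // Takes &self, so it only needs a shared borrow of the receiver.
    fn apply(&self, v: i32) -> i32 {
        v * self.scale
    }
}

fn main() {
    let cfg = Config { scale: 3 };
    // Method syntax and fully qualified function syntax do the same thing:
    assert_eq!(cfg.apply(2), Config::apply(&cfg, 2));
}
```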
You can try converting the code with c2rust to see if you'll actually get the performance you expect before you invest in writing nicer Rust code.
There are some slight differences between method syntax and function syntax, where choosing one or the other can affect whether multiple mutable references are allowed.
Similarly, I also sometimes keep items apart instead of in a struct, so I can easily have a mutable reference to one and multiple immutable references to another.
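A small sketch of that trade-off, with made-up field names: borrowing two fields of a struct directly is fine, because the borrows are disjoint, whereas a method taking `&mut self` would lock the whole struct at once.

```rust
struct State {
    config: Vec<i32>, // read-only during the update
    data: Vec<i32>,   // mutated during the update
}

fn main() {
    let mut s = State {
        config: vec![1, 2],
        data: vec![10, 20],
    };
    // Field accesses borrow disjoint parts of `s`, so an immutable borrow
    // of `config` can coexist with a mutable borrow of `data`.
    let cfg = &s.config;
    let data = &mut s.data;
    for (d, c) in data.iter_mut().zip(cfg) {
        *d += *c;
    }
    assert_eq!(s.data, vec![11, 22]);
}
```

If `config` and `data` were instead updated through a single `&mut self` method, the borrow checker would see one big exclusive borrow, which is exactly why keeping the items apart can be convenient.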
There’s an art to it but I’m usually happy with the results in the end.
Good, I was just coming around to the idea that that might be the case.
Interesting. I tried it, and it actually produced Rust that compiles and runs, after I fixed a couple of places where the output had type mismatches: the C passes ints as parameters to a function expecting chars, and c2rust did not fix that.
As best I can tell it runs at the same speed as the original C!
Thanks for the heads up on that.
Now I will get down to writing a proper Rust version.
I don't begin to understand how that C code works, but I know it calculates with ints while storing them in a huge array of chars. The values never go out of range.
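For reference, the Rust equivalent of that pattern (a sketch with made-up values, not the actual program) computes in `i32` and narrows on store:

```rust
fn main() {
    // Sketch of the pattern: compute with full-width integers, store the
    // results in a byte table (like C's `char` array), relying on the
    // values staying within 0..=255.
    let mut table = vec![0u8; 8];
    let a: i32 = 100;
    let b: i32 = 55;
    let sum = a + b; // arithmetic done in i32
    table[0] = sum as u8; // narrowed on store, like int -> char in C
    assert_eq!(i32::from(table[0]), 155);
}
```

Note that `as u8` silently truncates if a value ever exceeds 255, which matches C's behavior for narrowing to unsigned char rather than panicking.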
I thought the fact that the code is written in pre-ANSI style (it compiles and runs on a PDP-11) might trip up c2rust, but that did not faze it at all.
Unfortunately the execution time of this Rust transcription is about half the speed of the original C version when they are run on the ARM of a Raspberry Pi 3, which is a shame, as this was all motivated by a Raspberry Pi forum discussion.
This is the second or third time I have noticed such a massive reversal in the relative performance of equivalent Rust/C code between my x86-64 desktop and the 32-bit ARM Pi. Both run Debian, with GCC 8.3 on the Pi and 8.2 on the PC, and both use today's nightly Rust.
Is there a particular reason for this? Is it because of the Pi's old 32-bit ARM? Are there any build options for the Pi that would improve the situation?
Building the C version with clang on the Pi yields results close to GCC's, so I would expect that Rust can do similarly.
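One build-configuration experiment worth trying (an assumption on my part, not something measured on a Pi 3 here): by default rustc targets a generic baseline for the platform, so naming the exact CPU may help. The Pi 3's cores are Cortex-A53, so something like:

```
RUSTFLAGS="-C target-cpu=cortex-a53" cargo build --release
```

Adding `lto = true` and `codegen-units = 1` under `[profile.release]` in Cargo.toml is also worth a shot, since that brings the optimization setup closer to a whole-program `gcc -O3` build.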
$ gcc -Wall -O3 -o limited limited.c
pi@aalto-1:~/tatami_rust $ time ./limited
T(85765680)=200
real 0m4.624s
user 0m4.621s
sys 0m0.000s
$ clang -Wall -O3 -o limited limited.c
$ time ./limited
T(85765680)=200
real 0m5.123s
user 0m5.123s
sys 0m0.000s
$ cargo build --release
Finished release [optimized] target(s) in 0.37s
pi@aalto-1:~/tatami_rust $ time target/release/tatami_rust
T(85765680)=200
real 0m7.772s
user 0m7.758s
sys 0m0.010s
What does the fox profiler say? I noticed that there's a lot of array indexing going on in the Rust transcription. I'm not sure if that comes from the original C code, but maybe bounds checking has a fair share in the slowdown? I'd assume ARM's branch-prediction capabilities are in general significantly less sophisticated than those of x86, so it kind of makes sense to me, but…
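One quick way to test that theory (a sketch with made-up functions, not the actual program) is to compare an indexed loop against an iterator loop, since iterators give the compiler nothing to bounds-check:

```rust
// Indexed form: each `table[i]` is bounds-checked unless LLVM can prove
// the index is always in range.
fn sum_indexed(table: &[u8]) -> u32 {
    let mut total = 0u32;
    for i in 0..table.len() {
        total += u32::from(table[i]);
    }
    total
}

// Iterator form: no index at all, so there are no bounds checks to elide.
fn sum_iter(table: &[u8]) -> u32 {
    table.iter().map(|&b| u32::from(b)).sum()
}

fn main() {
    let t = [1u8, 2, 3, 4];
    assert_eq!(sum_indexed(&t), sum_iter(&t));
}
```

If rewriting the hot loops in iterator style closes the gap with C on the Pi, bounds checking is the likely culprit; if not, the cause is probably elsewhere.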
For the most part all of that is exactly the same as in the original C version. I don't understand how it works, so I had to duplicate the method as closely as possible, despite having restructured things into structs and methods.
I have some C++ code here for which I found that as I tweaked it to optimize for my PC it got slower on the ARM, and vice versa. There seems to be some code that Intel likes and other code that ARM likes. I never did pin it down.
What I did worry about is all the indirection going on in the Rust version now that there are no globals. Perhaps that ends up being more expensive on ARM.