@DoumanAsh Thanks. For "just fine" do you mean it works but has bugs?
Neon instructions are not stable in Rust yet, so you're better off with C++ if you want to benefit from them.
Although in practice, if you're using C++ libraries like Eigen, I doubt you'll get an equally performant alternative in Rust right now.
Well then I have to use C++...? I do hope I can use Rust since it is so wonderful...
I do not think it had bugs, but it was more about getting around some specific issues when it came to libpng actually, not opencv itself. So I think it should be really ok to use.
Well then I have to use C++
Nothing wrong with using C++ actually especially if you have C++ dependencies.
It really depends on what you want to speed up though.
Both OpenCV and Tensorflow provide C bindings, so they are not an issue, but Eigen is specifically a C++ library and I'm not sure you can get an equally performant alternative.
Not to mention that right now Rust doesn't give you Neon instructions on the stable toolchain.
If you are dropping down to assembly anyway, you will see little benefit from Rust's type safety. You can definitely write the array primitive operations separately, and then create a Rust abstraction on top of them.
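To illustrate the "primitive operations plus a Rust abstraction on top" idea, here is a minimal sketch. The names `add_i32_raw` and `add_i32` are hypothetical, and the raw kernel is written in plain Rust only as a stand-in; in a real project it would presumably be a hand-tuned C or assembly routine declared through `extern "C"`.

```rust
/// Stand-in for a hand-tuned primitive (hypothetical; in practice this could be
/// C or assembly reached through `extern "C"`).
///
/// # Safety
/// `a`, `b` and `out` must each be valid for `len` elements, and `out` must not
/// overlap `a` or `b`.
unsafe fn add_i32_raw(a: *const i32, b: *const i32, out: *mut i32, len: usize) {
    unsafe {
        for i in 0..len {
            *out.add(i) = (*a.add(i)).wrapping_add(*b.add(i));
        }
    }
}

/// Safe wrapper: all length and aliasing requirements are checked or guaranteed
/// here, so callers never touch raw pointers.
pub fn add_i32(a: &[i32], b: &[i32]) -> Vec<i32> {
    assert_eq!(a.len(), b.len(), "inputs must have equal length");
    let mut out = vec![0i32; a.len()];
    // SAFETY: all three buffers hold exactly `a.len()` elements, and `out` is a
    // freshly allocated Vec, so it cannot alias the inputs.
    unsafe { add_i32_raw(a.as_ptr(), b.as_ptr(), out.as_mut_ptr(), a.len()) };
    out
}
```

The unsafe part stays confined to one small function with a documented contract, and the rest of the program only sees `add_i32`.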
Nothing wrong with using C++ actually especially if you have C++ dependencies.
Well something is wrong: Rust is more beautiful
If you are dropping down to assembly anyway, you will see little benefit from Rust's type safety.
Sorry, I am not quite sure about that. I will heavily use the OpenCV binding. It provides a lot of functions around the Mat class (i.e. a multidimensional array, a matrix) in OpenCV. But IMHO OpenCV is mostly safe and does not have memory problems: it is quite widely used, and people using it (in C++) do not complain about memory problems. On the other hand, when I write my own C++ code, it can be dangerous unless I am very careful, so Rust helps here. Therefore, why would I see little benefit from Rust's type safety? Thanks!
Regarding SIMD instructions being non-stable: as I understand it, the issue is only about being able to call them explicitly. The optimizer is still able to generate them, even on stable Rust.
I don't understand the problem, then. If you are using OpenCV, then it is written in C++, and Rust's lack of optimizations won't affect its performance. So you may call into the OpenCV C interface from Rust. Is that not the case?
@alice Several of my use cases involve writing dynamic programming code by hand. You know, things like dp[i][j][k] = some_expressions. So I guess Rust is not smart enough to optimize them automatically and needs a human to do so. Please correct me if I am wrong.
@H2CO3 My use case contains both (1) using OpenCV algorithms, and (2) handwritten algorithms such as dynamic programming on matrices. So yes, when using OpenCV, I am not worried about Rust's performance optimization. But for DP, I have to care about it.
It sounds like you need to play around on Godbolt and look at the assembly generated by rustc to see for yourself whether LLVM is smart enough to optimise a piece of code or automatically use SIMD.
I would also find it extremely likely that any optimizations that generate SIMD instructions would be in LLVM, so if C++ can do it, Rust should be able to too.
Basically the toy code is the following. It seems that Rust does not use SIMD here. A human could use SIMD for the val = val.min(...) line.
    pub fn f(XLEN: usize, YLEN: usize, SLEN: usize, cost_curvature: &[i32]) {
        let mut dp = vec![0; XLEN * YLEN * SLEN];
        let compute_index = |x, y, s| x * YLEN * SLEN + y * SLEN + s;
        for x in 1..XLEN {
            for y in 1..YLEN {
                for s in 0..SLEN {
                    let mut val: i32 = 2147483647;
                    for s_prev in 0..SLEN {
                        val = val.min(
                            dp[compute_index(x - 1, y - 1, s_prev)]
                                + cost_curvature[s * SLEN + s_prev],
                        );
                    }
                    dp[compute_index(x, y, s)] = val;
                }
            }
        }
    }
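One thing that may be worth trying before reaching for explicit SIMD: the indexed accesses in the inner loop carry bounds checks, and those can get in the way of LLVM's auto-vectorizer. Below is a sketch of the same recurrence written over slices and iterators instead. The name `f_slices`, the returned `Vec`, and the length assertion are my additions, not part of the toy code. Whether this actually vectorizes depends on the compiler version, the target and the optimization flags, so the generated assembly still needs to be checked on Godbolt.

```rust
/// Same recurrence as the toy `f`, but walking slices so the inner loop has no
/// per-element bounds checks. Returns the table so the work is observable.
pub fn f_slices(xlen: usize, ylen: usize, slen: usize, cost_curvature: &[i32]) -> Vec<i32> {
    // Assumption: the cost table is a dense slen x slen matrix in row-major order.
    assert_eq!(cost_curvature.len(), slen * slen);
    let mut dp = vec![0i32; xlen * ylen * slen];
    if slen == 0 {
        return dp;
    }
    let plane = ylen * slen; // number of elements per fixed x
    for x in 1..xlen {
        // Split so plane x-1 can be read while plane x is written.
        let (before, rest) = dp.split_at_mut(x * plane);
        let prev_plane = &before[(x - 1) * plane..];
        let cur_plane = &mut rest[..plane];
        for y in 1..ylen {
            let prev = &prev_plane[(y - 1) * slen..y * slen];
            let cur = &mut cur_plane[y * slen..(y + 1) * slen];
            // One row of the cost matrix per output state s.
            for (out, costs) in cur.iter_mut().zip(cost_curvature.chunks_exact(slen)) {
                *out = prev
                    .iter()
                    .zip(costs)
                    .map(|(&p, &c)| p + c)
                    .fold(i32::MAX, i32::min);
            }
        }
    }
    dp
}
```

The inner `zip`/`fold` is a plain reduction over two equal-length slices, which is the shape the optimizer tends to handle best.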
The NDK docs note that Neon support is not guaranteed on armv7, so if you support armv7, the conservative thing to ensure your code will run on all devices is to not emit those instructions.
Moreover, imho the best way may be to do some runtime checks and then happily use neon on most devices.
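A rough sketch of what such a runtime check could look like in Rust. Some caveats: this assumes a newer toolchain where the aarch64 Neon intrinsics and `std::arch::is_aarch64_feature_detected!` are available on stable; as far as I know the equivalent detection macro for 32-bit ARM was still unstable, so on armv7 (and on any other target) this sketch simply takes the scalar path. The function names are made up for illustration.

```rust
/// Scalar fallback: minimum of the element-wise (wrapping) sums.
fn min_sum_scalar(a: &[i32], b: &[i32]) -> i32 {
    a.iter()
        .zip(b)
        .map(|(&x, &y)| x.wrapping_add(y))
        .fold(i32::MAX, i32::min)
}

/// Neon version, compiled only for aarch64.
///
/// # Safety
/// The caller must have verified that the CPU supports Neon.
#[cfg(target_arch = "aarch64")]
#[target_feature(enable = "neon")]
unsafe fn min_sum_neon(a: &[i32], b: &[i32]) -> i32 {
    use std::arch::aarch64::{vaddq_s32, vdupq_n_s32, vld1q_s32, vminq_s32, vminvq_s32};
    let n = a.len().min(b.len());
    let chunks = n / 4;
    unsafe {
        // Four running minima, one per lane.
        let mut acc = vdupq_n_s32(i32::MAX);
        for i in 0..chunks {
            let va = vld1q_s32(a.as_ptr().add(i * 4));
            let vb = vld1q_s32(b.as_ptr().add(i * 4));
            acc = vminq_s32(acc, vaddq_s32(va, vb));
        }
        // Reduce the four lanes, then handle the leftover tail elements.
        let mut best = vminvq_s32(acc);
        for i in chunks * 4..n {
            best = best.min(a[i].wrapping_add(b[i]));
        }
        best
    }
}

/// Picks the Neon path at runtime when available, otherwise the scalar one.
pub fn min_sum(a: &[i32], b: &[i32]) -> i32 {
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            // SAFETY: Neon support was just checked.
            return unsafe { min_sum_neon(a, b) };
        }
    }
    min_sum_scalar(a, b)
}
```

Callers only use `min_sum`; the check happens on each call here, which is fine for a sketch, though a real implementation would probably cache the chosen function.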
In addition, I ran readelf -A xxx.so on many commercial applications that I downloaded from the store using an old phone that only supports armv7, and saw the following: