Using Rust for AI

Hello, I'm just getting started with Rust and would like to know if Rust is a good option for artificial intelligence and machine learning.

Here's a resource I heard about via TWIR. I can't vouch for it or answer your question directly, but hopefully it's useful.

Rust might lack resources for machine learning.
Almost all of the papers are written in Python.
Several of them bind to high-performance C/C++ libraries.
None of them are concerned with "memory safety".

Why do you want to use Rust for machine learning?
Actually, almost all of the performance-related code is interacting with the GPU. Changing the main library from Python to Rust would not speed up the model.
If you really want a high-performance framework, I strongly suggest MXNet, which is faster than PyTorch and TensorFlow, at least on my machine.

If you really want Rust, you will probably end up interacting with CUDA code, for which C/C++ might be a better choice.


It depends on what you mean by "AI". If you merely mean machine learning, let me offer a perhaps unpopular point of view. (If you don't mean ML, please elaborate.)

The official/majority PoV seems to be that "Rust is not good for machine learning because there are no libraries, you should just use Python", and that "memory/type safety doesn't matter for ML anyway because data sets are messy and dynamic".

Let me attempt to counter both points below.

As for the lack of libraries and usability: it is unfortunately true that Rust doesn't have nearly as many ML libs as Python. However, I would argue that this doesn't matter as much as one might think. Why? Because most of the pain, tedium, and complicated maths lies in the data cleaning / training / feature selection / hyperparameter tuning stage, which you only do once, or at least very infrequently. And apart from huge deep neural networks (which I don't know much of anything about), training is usually fast enough, even in Python, and even if you need to re-do it a couple of times (say for cross-validation or because you discovered a bug in data cleaning).

In contrast, most ML models are stupidly simple to evaluate. For instance:

  • A support vector machine literally requires a dot product and a vector addition.
  • Logistic regression likewise, plus a call to the logistic sigmoid (or equivalently a rescaled tanh) at the end.
  • A decision tree evaluation is similar to the binary search that 1st-year CS students are expected to be able to reproduce.
  • Common preprocessing steps, like normalization to zero mean and unit standard deviation, or PCA, are also trivial (we do have an excellent ndarray crate and numerous linear algebra libraries).

This in turn means that, even if you have a complicated procedure for training a model in Python using some of the excellent and highly nontrivial-to-reproduce libraries, you can still whip up an evaluator in Rust quickly, using basically nothing but the stdlib, maybe ndarray, and some basic algorithmic thinking.
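
To make this concrete, here is a minimal, purely illustrative sketch of such an evaluator using nothing but the standard library; the model kind, weights, bias, and feature values are all invented, not taken from any real project.

// A minimal, illustrative evaluator for a linear SVM and a logistic-regression
// model, using only the standard library. All parameter values are made up.

/// Dot product of a weight vector and a feature vector.
fn dot(weights: &[f64], features: &[f64]) -> f64 {
    weights.iter().zip(features).map(|(w, x)| w * x).sum()
}

/// Linear SVM decision function: sign(w · x + b).
fn svm_predict(weights: &[f64], bias: f64, features: &[f64]) -> i8 {
    if dot(weights, features) + bias >= 0.0 { 1 } else { -1 }
}

/// Logistic regression: sigmoid(w · x + b).
fn logistic_predict(weights: &[f64], bias: f64, features: &[f64]) -> f64 {
    let z = dot(weights, features) + bias;
    1.0 / (1.0 + (-z).exp())
}

fn main() {
    let weights = [0.8, -1.2, 0.3];
    let bias = 0.1;
    let features = [1.0, 0.5, 2.0];
    println!("SVM class: {}", svm_predict(&weights, bias, &features));
    println!("P(y = 1):  {:.3}", logistic_predict(&weights, bias, &features));
}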

In fact, this is exactly what I did for a friend's project a couple of months ago. The part of the project I was working on involved some heavy signal processing and classification, and it was integrated into the web frontend of a moderately big system. Thus, Python was basically out of the question. Of course, I did the initial prototyping of the necessary signal processing algorithms and the training of the model in Python. However, after having exported the model parameters as JSON, I was able to write a specialized evaluator for it in Rust, in the course of a single day, and compile it to WebAssembly. And then for the non-ML part of the rest of the code (which is plenty), I still got to enjoy all the benefits of Rust's strong type system instead of losing it all to Python.
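
For a flavour of what that JSON round-trip can look like, here is a hedged sketch using the serde and serde_json crates; the field names and JSON layout are assumptions for illustration, not the actual format that project used.

// Deserialize exported model parameters from JSON and compute a linear score.
// Assumes serde (with the "derive" feature) and serde_json as dependencies.
use serde::Deserialize;

#[derive(Deserialize)]
struct ModelParams {
    weights: Vec<f64>,
    bias: f64,
    feature_names: Vec<String>,
}

fn evaluate(model: &ModelParams, features: &[f64]) -> f64 {
    assert_eq!(features.len(), model.weights.len(), "feature dimension mismatch");
    model.weights.iter().zip(features).map(|(w, x)| w * x).sum::<f64>() + model.bias
}

fn main() -> Result<(), serde_json::Error> {
    let json = r#"{"weights":[0.8,-1.2,0.3],"bias":0.1,"feature_names":["a","b","c"]}"#;
    let model: ModelParams = serde_json::from_str(json)?;
    println!("features: {:?}", model.feature_names);
    println!("score = {:.3}", evaluate(&model, &[1.0, 0.5, 2.0]));
    Ok(())
}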

To address point #2, the lack of need for type safety: I think that's completely false, basically. Yes, real raw data is messy and full of missing values and comes from different sources in different formats, etc. But at the end of the day, nobody will use that raw data! The key thing to realize is: the data always needs to be cleaned, and it must be put in a form that is basically strongly-typed, tabular, with statically-typed columns, in order to stand a chance to be used for training.

If a machine learning model is fit on a 2-variable numerical dataset, there is no way it will work on a 3-column dataset of categorical features. So a model intrinsically knows the dimensionality and the types of its input space, and it needs to be evaluated on exactly that kind of data. In fact, in the evaluator I mentioned above, I ended up modifying the training Python script so that it emitted not only the trained model, but also the dimensionality and the feature names as JSON. I then wrote a procedural macro that generated a statically-typed evaluator purely at compile time from the JSON description, verifying its internal consistency as well as its consistency with the parameters and hyperparameters of the model.

This was a huge time-saver, because I now never need to manually check whether the model and any input data match after re-training or changing feature extraction, for example (since feature extraction affects both the number of features and the name of the feature variables). I can be sure that if my evaluator compiles, it is correct, in the sense that it gets the data it expects in the correct format and in the right order.
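
To illustrate the idea without the macro machinery, here is a hand-written sketch of the kind of statically-typed interface such generated code could provide; the struct, field names, and dimensions are invented for illustration.

// A feature struct fixes the number, names, and order of inputs at compile time,
// so the evaluator cannot silently drift out of sync with feature extraction.
struct Features {
    mean_amplitude: f64,
    peak_frequency: f64,
    zero_crossings: f64,
}

struct Model {
    weights: [f64; 3], // must match the number of fields above
    bias: f64,
}

impl Model {
    fn predict(&self, x: &Features) -> f64 {
        let xs = [x.mean_amplitude, x.peak_frequency, x.zero_crossings];
        self.weights.iter().zip(xs.iter()).map(|(w, v)| w * v).sum::<f64>() + self.bias
    }
}

fn main() {
    let model = Model { weights: [0.8, -1.2, 0.3], bias: 0.1 };
    let x = Features { mean_amplitude: 1.0, peak_frequency: 0.5, zero_crossings: 2.0 };
    println!("score = {:.3}", model.predict(&x));
}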

There is one more thing that someone mentioned above, which is that Python calls out to high-performance C and C++ (and sometimes Fortran) libraries, so Rust is not needed. Well, I don't think that's a fair thing to say. Having to use one language (Rust) instead of two (Python and C) is a big relief in any project, as it alleviates cross-language build system grievances. Not to mention that C and C++ are especially a pain in the neck to build, let alone cross-compile to WebAssembly, so in the aforementioned project, I would have used Rust over C anyway, if only for wasm-bindgen and wasm-pack, because the Rust toolchain is simply a lot easier to use than C toolchains.
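
For concreteness, here is a hedged sketch of roughly what exposing such an evaluator through wasm-bindgen looks like; the function name and hard-coded parameters are placeholders, and the crate would be built as a cdylib with wasm-pack.

// Expose a tiny evaluator to JavaScript. Requires the wasm-bindgen crate and
// the cdylib crate type; `wasm-pack build` produces the JS glue code.
use wasm_bindgen::prelude::*;

const WEIGHTS: [f64; 3] = [0.8, -1.2, 0.3];
const BIAS: f64 = 0.1;

/// Called from JavaScript with a Float64Array of features.
#[wasm_bindgen]
pub fn predict(features: &[f64]) -> f64 {
    let z = WEIGHTS.iter().zip(features).map(|(w, x)| w * x).sum::<f64>() + BIAS;
    1.0 / (1.0 + (-z).exp())
}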


So, should you use Rust for machine learning? I think the answer is yes! Use it for evaluating pre-trained models in a type-safe and high-performance manner, and for integrating them into other code bases that benefit from Rust's memory model. And be prepared to write reusable, high-quality Python scripts for reproducible data exploration / cleaning / training sessions.


That's true, but that might not be the answer, since simple evaluators can be written in any language.

And things become tricky as the complexity of the program increases.

E.g., if your model has to receive a data stream and update itself online, you cannot simply stop your old Rust model and move all the received data into a Python or R trainer.

What's more, unless you move the whole evaluator into WebAssembly, the most practical FFI is the C FFI, which is easy to work with from C but not so easy from Rust.

I know that Rust is a very good tool.
But for the question of whether Rust is a good choice,
the answer might be "yes and no".

if you love Rust { yes }
else if you want to practice language skills { yes }
else if the program is easy to write { choose whatever you like, since writing a small program is almost always safe }
else if tutorials exist in another language { following those tutorials could be a better choice }
else { probably no }

Is Rust a good option for X and Y? (where X: Task, Y: Task)

My own p.o.v. currently is: Rust is good whenever you want a compiled language that allows a high degree of abstraction/reusability, safety, and speed. I personally don't care so much about the ecosystem as long as I'm capable of writing interfaces to other libraries where (and when) needed. And doing FFI doesn't seem so difficult (with a few surprises regarding safety).

Specifically, I think AI and machine learning are subjects where Rust could play out its key features.

I've been programming a (multivariate) Estimation of Distribution Algorithm for fun, and it was really nice how rayon allowed me to execute calculations in parallel while my code stayed clean and easy to read.
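
As a hedged illustration (not the actual EDA code), this is the kind of change rayon makes trivial: evaluating a population of candidate solutions in parallel by swapping iter() for par_iter(). The fitness function here is a placeholder.

// Parallel fitness evaluation with rayon. Requires the rayon crate.
use rayon::prelude::*;

// Placeholder objective: negative squared distance from the origin.
fn fitness(candidate: &[f64]) -> f64 {
    -candidate.iter().map(|x| x * x).sum::<f64>()
}

fn evaluate_population(population: &[Vec<f64>]) -> Vec<f64> {
    // The only difference from the sequential version is `par_iter()`.
    population.par_iter().map(|c| fitness(c)).collect()
}

fn main() {
    let population: Vec<Vec<f64>> = (0..1000)
        .map(|i| vec![i as f64 * 0.01, 1.0 - i as f64 * 0.01])
        .collect();
    let scores = evaluate_population(&population);
    let best = scores.iter().cloned().fold(f64::MIN, f64::max);
    println!("best fitness: {best}");
}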

(Edit: Of course, when targeting the GPU, things might not be that easy and require less abstract interfaces.)

Maybe Rust isn't the best choice if you need to use a particular library A or need a particular framework B. But isn't that true for any language?

And maybe Rust isn't always the best choice when you look for a language that has ready-to-use libraries for a particular subject area S or T. I would say whether Rust is a good option for AI and machine learning really depends on the programmer/team and the particular task. If the goal is to call existing libraries in that field, then maybe Rust isn't the best choice (but still might be, if you're willing to do FFI!). If the goal is to create something from scratch (maybe even something that is new and hasn't been done/implemented before), then I'd definitely do it in Rust. But that's just my personal opinion.

I don't see how that is relevant to my point. The question was not about "any language", it was about Rust.

That seems overly general to me. Of course things become tricky when the complexity of the program increases. This is neither specific to, nor the fault of, Rust. In particular, this doesn't make Rust any more difficult to use for any specific problem, ML included.

I don't think that needs to be pointed out. I specifically wrote about the offline scenario, i.e. when training can be done once and then only evaluation is needed – which is the typical situation. I do acknowledge that my workaround doesn't apply to online learning algorithms.

I don't follow. First off, Rust has no problem interacting with C FFI at all, it's actually one of the strengths of the language. But more importantly, I don't see how this is relevant here. I don't think it can be stated in general that "the best option is C FFI" in the context of ML. Neither the ease of use of generally-available, high-level Python libraries, nor the safety features of Rust apply to C libraries, with very few exceptions.

The alternative to Rust is "any language", so I compare Rust with "any language".

This is NOT a matter of course, since there may already be a published package in some non-Rust language.
For example, in deep learning the high-performance packages such as TVM, TF-Lite, TensorRT, ... are mainly written in C/C++, and most of the tutorials therefore use C/C++.
Porting them to Rust is possible, but painful, since most people do not use Rust.

This can be a real question for an ML learner: if you use some existing, older code, it might just work. But once you want to chase bleeding-edge techniques, you have to face the new problems alone.

The point of bringing up "online" learning is to show that machine learning is not as simple as it was several decades ago.

That's true, but why Rust?

I'll give you a simple example: communicating with a famous piece of statistics software, R.

R has official C support, so it is easy for R users to write C extensions (and even Fortran extensions).

Here's what I found:

  • https://github.com/r-rust/gifski : languages C 74.9%, R 24.8%, Rust 0.3%
    (description on CRAN) Multi-threaded GIF encoder written in Rust
  • https://github.com/r-rust/hellorust : languages Rust 41.9%, C 37.6%, R 20.5%
    (description in its README file) Minimal Example of Calling Rust from R using Cargo

If you have to go through C anyway, why use Rust?
(It is not very easy to write R extensions in C, since there are C macros like R_protect(*ptr) and R_unprotected(3). This is why Rust code needs a lot of C wrappers. C++ might be better at dealing with such things, but I'm unfamiliar with writing R extensions in C++.)

If we had plenty of packages, Rust could be better, since we can parallelize code more easily than with #pragma omp parallel.
But at least for now, the competition is unfair.

If you claim that Rust can win even in this unfair situation,
then Rust should already have won several years ago.


The most encouraging thing for me this year is that some Rust code is now in the Linux kernel.
That should be seen as a good start, not as a flag indicating victory.

I believe that Rust will win.
But the time is definitely not now.

I will focus now on deep neural networks, as that's the part I'm mostly interested in. For the broader ML area we have linfa in the rust-ml group, but I'm not actively working on it, so I won't argue how helpful Rust is for ML.

For neural networks I personally do prefer Rust over the common Python + C++ combo. A few years back I was working on merging a convolution layer with dropout. Looking at the code to find out which C++ and which Python lines I had to adjust to add it was a pure nightmare. I was almost happy when I was told not to work on upstreaming it anymore, as they wanted to use it on their own. Having a codebase in a single language would be a dream. For performance reasons we can probably only pick Fortran, C++, and Rust for that, with Rust (obviously, given where we are) being my favorite.

Having a single code base is not only good for maintainers, who no longer have to write wrappers. It's also nice for users. Even if it's possible to use both C++ and Python in a project, that causes friction and costs time, effectively splitting the field. With Rust, you can have the great performance and the nice interface, together with all the projects from crates.io.

Next, even if you want to run neural networks on GPUs most of the time, developers also have to implement a CPU-only version. That's something that can be done nicely in Rust. The GPU side is weak (currently), but there are some cool projects underway.

Finally, we also have the stronger type system. I've had some discussions with another Rust dev who wanted to write a deep learning library in Rust that is able to catch more errors at compile time. With PyTorch and the like, you can define a lot of neural networks whose dimensions don't work out. That's something a compiler should tell you about. Personally, I would also add that I'm often annoyed by having to calculate the input dimension for a layer where the compiler could just use the output dimension of the previous one.
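
To sketch what that could look like (this is a toy illustration with const generics, not that dev's actual library), a dense layer can carry its input and output sizes in its type, so a mismatched chain simply doesn't compile:

// Layer dimensions as const generics: the compiler rejects mismatched chains.
struct Dense<const IN: usize, const OUT: usize> {
    weights: [[f32; IN]; OUT],
    bias: [f32; OUT],
}

impl<const IN: usize, const OUT: usize> Dense<IN, OUT> {
    fn forward(&self, input: &[f32; IN]) -> [f32; OUT] {
        let mut out = self.bias;
        for (o, row) in out.iter_mut().zip(self.weights.iter()) {
            *o += row.iter().zip(input.iter()).map(|(w, x)| w * x).sum::<f32>();
        }
        out
    }
}

fn main() {
    let layer1 = Dense::<4, 3> { weights: [[0.1; 4]; 3], bias: [0.0; 3] };
    let layer2 = Dense::<3, 2> { weights: [[0.2; 3]; 2], bias: [0.0; 2] };
    let input = [1.0_f32; 4];
    let hidden = layer1.forward(&input);
    let output = layer2.forward(&hidden);
    // Feeding `hidden` (length 3) into a layer expecting length 5 would be a
    // compile-time error instead of a runtime shape mismatch.
    println!("{:?}", output);
}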

So I think yes, there is a lot to gain by moving to Rust for such libraries.

Very interesting, @H2CO3. I've read a little about SIMD (single instruction, multiple data): how Helium for mobile and the Scalable Vector Extension (SVE2), which scales up to the Fugaku TOP500 supercomputer, are available on ARM architectures nowadays, with simpler programming than traditional Intel SSE/AVX, using ACLE (Arm's C Language Extensions). Dan Iliescu and Francesco Petrogalli describe how ACLE can be applied to machine learning. The simplest example is a scalar versus a vector loop, and it then goes on to multiplying matrices. Is one of you able to write this code in Rust so that it gets translated to ARM SVE?

// Scalar version.
void add_arrays(double *dst, double *src, double c, const int N) {
  for (int i = 0; i < N; i++)
    dst[i] = src[i] + c;
}

// Vector version
void vla_add_arrays(double *dst, double *src, double c, const int N) {
  for (int i = 0; i < N; i += svcntd()) {
    svbool_t Pg = svwhilelt_b64(i, N);
    svfloat64_t vsrc = svld1(Pg, &src[i]);
    svfloat64_t vdst = svadd_x(Pg, vsrc, c);
    svst1(Pg, &dst[i], vdst);
  }
}

the source is here: https://developer.arm.com/-/media/Arm%20Developer%20Community/PDF/Arm-scalable-vector-extensions-and-application-to-machine-learning.pdf

I can give you an answer.
It's possibly not entirely correct, but it might be useful for you:

Nowadays, compilers are capable of optimizing code on their own, including enabling optimizations such as AVX, for example via target features:

#[target_feature(enable = "avx2")]
unsafe fn test() {
    // Safety: the `unsafe` marker is only needed because of
    // #[target_feature(enable = "avx2")], which tells the compiler to enable
    // AVX optimizations; the body itself is safe code.
    let a = vec![1; 100];
    let mut b = vec![2; 100];
    b.iter_mut().zip(a.iter()).for_each(|(b, a)| *b += a);
    println!("done")
}
fn main() {
    unsafe { test() }
}

which shows:

	vmovdqu	(%rbx,%rdx,4), %ymm0
	vmovdqu	32(%rbx,%rdx,4), %ymm1
	vmovdqu	64(%rbx,%rdx,4), %ymm2
	vmovdqu	96(%rbx,%rdx,4), %ymm3
	vpaddd	(%r14,%rdx,4), %ymm0, %ymm0
	vpaddd	32(%r14,%rdx,4), %ymm1, %ymm1
	vpaddd	64(%r14,%rdx,4), %ymm2, %ymm2
	vpaddd	96(%r14,%rdx,4), %ymm3, %ymm3

That is actually AVX2 code, and we did almost nothing to get it.

As for your program:

// Remember: this is conceptually a safe function; the `unsafe` is just a
// workaround so that #[target_feature] can be used on the playground.
#[target_feature(enable = "avx2")]
#[target_feature(enable = "fma")]
unsafe fn add_arrays(dst: &mut [f64], src: &[f64], c: f64) {
    // No need to pass an extra N: the slices carry their lengths.
    dst.iter_mut().zip(src.iter()).for_each(|(d, s)| *d = *s + c);
}
// For test purposes:
fn main() {
    let mut dst = vec![0f64; 1000];
    let src = vec![1f64; 1000];
    let c = 10f64;
    unsafe { add_arrays(&mut dst, &src, c) }; // could be dst.add_arrays(&src, c) if written as a trait method
}

Again, %ymm* registers show up.

I have no doubt that, several years from now, you will be able to write non-SVE code and still benefit from SVE acceleration.

(Although not right now.)

Try compiling your Rust code with -C opt-level=3 -C target-cpu=native. In the aforementioned project, I found that all performance-sensitive loops of the signal processing code compiled with the above settings have been autovectorized by LLVM.

Do not derail the discussion. Obviously, this is not true: the very alternative that this thread is about is well-established ML languages (Python and C/C++/Fortran) versus Rust, not "any language" vs. Rust.

That's not what I am talking about. Of course there can be (and there are) libraries in other languages. I did explicitly acknowledge that Rust lacks many libraries that Python has for ML. The point is exactly that most of the time (assuming offline learning), this is not an insurmountable issue.

I will not attempt to perform general advocacy for Rust in this thread. You can browse around on https://doc.rust-lang.org for that kind of content.

The only way Rust (and most if not all native languages) can do FFI is by pretending to be C. If it's easy to write C extensions to R, then it's equally easy to write Rust extensions by declaring your wrapper functions as #[no_mangle] extern "C".
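
As a minimal, hedged sketch (the function name is made up): a Rust function with a C ABI that R can call through its .C() interface, which passes every argument as a pointer. A real wrapper would add validation, and anything touching SEXPs would still need the R API, either through a thin C shim or generated bindings.

// Compile as a cdylib; R can then dyn.load() the shared library and call
// this via .C("rust_add", as.double(1), as.double(2), result = double(1)).
#[no_mangle]
pub extern "C" fn rust_add(a: *const f64, b: *const f64, result: *mut f64) {
    // .C() passes pointers to double vectors; a production wrapper should
    // still check lengths and for NULL on the R side.
    unsafe {
        if !a.is_null() && !b.is_null() && !result.is_null() {
            *result = *a + *b;
        }
    }
}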

You are asking this question for the second time in the same post. The reasons (and my answer) are identical in this case. There are plenty of reasons to use a language other than Python. If someone likes/needs the safety features and strong typing, the convenient toolchain, or the ecosystem of Rust, that is itself good reason to use the language. If that need comes up in the context of machine learning, it may be (and for me, it is often) worth the tradeoffs even if it's not the mainstream language for ML.

I am definitely not going to engage in futile speculation.

If you look at something beyond signal processing, you will find that not all code is autovectorized by LLVM.
I happen to know one case where autovectorization fails:

*madd_epi16

which is quite difficult for the compiler to use:

// taken from https://github.com/JohannesBuchner/paq/blob/master/paq8l/paq8l.cpp, published under GPL v2 or later
int dot_product(short *t, short *w, int n) {
  int sum=0;
  n=(n+7)&-8;
  for (int i=0; i<n; i+=2)
    sum+=(t[i]*w[i]+t[i+1]*w[i+1]) >> 8;
  return sum;
}

It can easily be rewritten as:

int dot_product(short *t, short *w, int n) {
  __m512i sum={0,0,0,0,0,0,0,0};
  for (int i=0; i<n; i+=32){
    sum=_mm512_add_epi32(sum,_mm512_srai_epi32(_mm512_madd_epi16(*(__m512i*)(t+i),*(__m512i*)(w+i)),8));
  }
  return _mm512_reduce_add_epi32(sum);
}

When using Rust, this kind of optimization is not applied automatically:

#[inline(never)]
#[target_feature(enable = "avx2")]
unsafe fn dot_product(t: &[i16], s: &[i16]) -> i32 {
    t.chunks(2)
        .zip(s.chunks(2))
        .map(|(t, s)| (t[0] as i32 * s[0] as i32 + t[1] as i32 * s[1] as i32) >> 8)
        .sum()
}
fn main() {
    unsafe {
        println!("{}", dot_product(&[1, 2, 3, 4, 5, 6, 7, 8, 9, 8, 7, 32000, 5, 4, 3, 2], &[1, 2, 32000, 4, 5, 6, 7, 8, 9, 8, 7, 6, 5, 4, 3, 2]));
        println!("{}", dot_product(&[1, 21234, 3, 4, 5, 6, 7, 8, 9, 8, 7, 32000, 5, 4, 3, 2], &[1, 32000, 3, 4, 5, 6, 7, 8, 9, 8, 7, 6, 5, 4, 3, 2]));
    }
}

No vpmaddwd shows up, even though _mm256_madd_epi16 is a valid AVX2 binding:

__m256i _mm256_madd_epi16 (__m256i a, __m256i b)
#include <immintrin.h>
Instruction: vpmaddwd ymm, ymm, ymm
CPUID Flags: AVX2
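
For reference, here is a hedged sketch of what using the intrinsic explicitly from Rust looks like via std::arch; it is manual and unsafe, which rather illustrates the point that you do not get vpmaddwd for free. The helper name is made up, the tail of slices whose length is not a multiple of 16 is ignored, and a recent toolchain is assumed (where the immediate shift count is passed as a const generic parameter).

// Explicit use of _mm256_madd_epi16 (vpmaddwd) via std::arch.
use std::arch::x86_64::*;

#[target_feature(enable = "avx2")]
unsafe fn dot_product_madd(t: &[i16], w: &[i16]) -> i32 {
    let mut acc = _mm256_setzero_si256();
    // Process 16 i16 lanes (256 bits) per iteration.
    for (tc, wc) in t.chunks_exact(16).zip(w.chunks_exact(16)) {
        let tv = _mm256_loadu_si256(tc.as_ptr() as *const __m256i);
        let wv = _mm256_loadu_si256(wc.as_ptr() as *const __m256i);
        // Multiply adjacent i16 pairs, add them into i32 lanes, then shift
        // right by 8 as in the original C code.
        acc = _mm256_add_epi32(acc, _mm256_srai_epi32::<8>(_mm256_madd_epi16(tv, wv)));
    }
    // Horizontal sum of the eight i32 lanes.
    let mut lanes = [0i32; 8];
    _mm256_storeu_si256(lanes.as_mut_ptr() as *mut __m256i, acc);
    lanes.iter().sum()
}

fn main() {
    if is_x86_feature_detected!("avx2") {
        let t = [300i16; 32];
        let w = [100i16; 32];
        println!("{}", unsafe { dot_product_madd(&t, &w) });
    }
}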

The example above shows that even if you have a good C program, converting it into Rust is not so easy, which ends up costing a lot of time.
I agree this is not an insurmountable issue, but then rewriting the whole ecosystem in Rust is not an insurmountable issue either.

  • the safety features and strong typing
    When you have to call into C code, things look quite different from your view.
    • As for "strong typing": remember that these are offline programs; there is no interface that would send a different type of data to the same function.
      "If a machine learning model is fit on a 2-variable numerical dataset" -- how would anyone even feed a 3-column dataset of categorical features into the pre-trained model?
    • As for "safety features": if you have to write C code, the safety of your program depends on how well you write that C code, which is not very safe in practice.
  • the convenient toolchain
    Not really convenient, since you still have to deal with some C code. I know pure Rust is cool, but Rust mixed with C is not so cool.
  • the ecosystem of Rust
    If there is a useful tool written in Rust, then Rust can be a choice.
    The problem is that most tools have no Rust FFI, so you end up writing C code anyway.

So don't say that "writing Rust is as easy as writing C", because #[no_mangle] extern "C" cannot deal with C macros.

Here is the sample C code:

// In C ----------------------------------------
#include <R.h>
#include <Rinternals.h>

SEXP add(SEXP a, SEXP b) {
  SEXP result = PROTECT(allocVector(REALSXP, 1));
  REAL(result)[0] = asReal(a) + asReal(b);
  UNPROTECT(1);

  return result;
}

# In R ----------------------------------------
add <- function(a, b) {
  .Call("add", a, b)
}

The problem is NOT #[no_mangle] extern "C"; the problem is that you have to expand all the C macros yourself, since Rust cannot consume C-style macros.

I agree this is not an insurmountable issue, since what a macro actually expands to rarely changes, so working out what a macro means can help.

I once tried to change a predefined macro value from the rather small number 128 to a larger one; the attempt seemed to be accepted, but the value has remained unchanged for several months.

It may not, since you over-estimate the pros (e.g., preventing wrong data from being sent to a model, which is not actually an issue in machine learning) and under-estimate the cons (e.g., it is not easy to use C FFIs from Rust, since C FFIs (the .h files and libraries) are designed mainly for C, not Rust; what's more, when dealing with SIMD code, C lets you write it easily, while Rust has to fight many unexpected things):

error[E0658]: the target feature `avx512bw` is currently unstable
 --> src/main.rs:7:18
  |
7 | #[target_feature(enable = "avx512bw")]

(I know solving that is not difficult. But which would you choose: a well-written SIMD library, or translating the whole library into Rust?)

Not to interrupt a nice discussion, but I thought it would be on topic to reference the Rust Machine Learning Working Group. The ML book is still quite terse, but it highlights the use and functionality of the linfa crate for building ML models.


I've used Rust for my personal AI project (https://github.com/lukaszwojtow/primeclue) and I'm reasonably pleased. It's actually better than TensorFlow for certain tasks.
