Can Rust, one day, unify the math design and processing languages?

Hello all,

Currently, the most used programming language for exploring and designing engineering systems is Matlab, with Python in second place and R, Julia and Mathematica further behind.

But when you need to implement these designs on the target machine or in real time, the two languages of choice are C++ and C.

This creates a kind of two-language problem, an impedance mismatch between the two worlds, and a duplication of effort. It is also currently not a fully open source solution.

This two-worlds split is typical in DSP and algorithm design, in game algorithms, in sonar, in radar, in audio processing and in many other fields of engineering. It was also common in the early days of ML algorithms.

Could Rust be the language that solves this two-language problem for numerical math in engineering and for math in general?

What could be done to close the gap between the current reality and this objective?

Should it be one of the priorities in the community of Rust developers?

Thank you,

Best regards,
João Carvalho

I don't understand the thesis of your argument?

What is the "it" referencing here?

To be honest, the best thing you can do here is to write more Rust code in the field you care about or contribute to existing projects.

There are actually several projects working to improve things (ndarray for N-dimensional arrays, the nalgebra project for linear algebra, the nphysics project which implements a physics engine on top of nalgebra, etc.), so maybe it's just a case of building things and letting the wider community know these things already exist?

Hello Erelde,

"It" refers to whatever could be done to close that gap: work that would encourage and guide more programmers to use Rust for engineering applications that are heavy on the math side, ending the two-language problem and the duplicated effort in the design and implementation of systems.

Python has its drawbacks in terms of the speed of code written in pure Python, but with the heavy work put into NumPy, SciPy and some other packages, they made it work. They drew some engineering users from Matlab to Python. But the goal would be to have a unified, fully functional code base that could be used for exploration, for prototyping, and for final use on the target machine. C++ or C can't do it, Python for different reasons certainly can't do it, so maybe Rust could do that unification.

The current state of things is that the great majority of engineering books are written for Matlab, because it's the "standard" in industry. Because the books are written for it, colleges teach Matlab in their courses, and because student licenses are cheap, the majority of courses don't make the transition to Python. Python also isn't the solution for the target.

So if, one day, there is a good infrastructure of engineering math libs in Rust, like there is in Matlab or Python (to different degrees), there could be a real possibility of a transition from the combination of Matlab and C++/C to Rust, with the advantage of unifying both parts of the process.

But as @Michael-F-Bryan said, maybe the best thing one can do is to contribute to the development, usage and documentation of Rust libs for this. And if the tools are good, people will transition naturally.

Thank you,

Best regards,
João Carvalho

TBH that doesn't matter. If you really understand the math and algorithms behind whatever you are trying to do, then you can 1. use the relevant libraries in another language, and 2. write the missing pieces if there are any. And if you are not at that level of understanding, you're better off learning the underlying important stuff instead of copy-pasting MATLAB code without knowing what exactly it's doing (and so what its limitations are!).

In my primary field of interest (data science), many people primarily use Python, because that's where the libraries are. I also use Python for the same reason. However, "Python is fast enough with NumPy" is simply not true in, say, about 10% of the real-life problems I encounter. So for that reason, I often find myself rewriting Python prototypes in Rust – I have done this about 3 times in the past year. Once because the code was DSP heavy (feature extraction from, and classification of, some biological signals), another time when I needed to process bioinformatic data (100s of GBs of DNA sequences), and a third time when I needed to write a domain-specific compression format that substituted an inefficient ASCII text format.

If Rust had more ready-made libraries, especially related to DSP, statistics/machine learning and bioinformatics, I would certainly reach a lot more often for Rust at work by default. So I don't think this is a fundamental problem, and I think Rust can get there eventually, since numeric computation doesn't really require DSLs.
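
As a rough illustration of the point that numeric code does not need a DSL, here is a minimal sketch of a causal FIR filter in plain Rust, using only the standard library. The function name `fir_filter` and the sample values are made up for this example; a real project would more likely build on an existing crate.

```rust
/// Apply a causal FIR filter: y[n] = sum over k of taps[k] * x[n - k].
/// Samples before the start of the signal are treated as zero.
fn fir_filter(signal: &[f64], taps: &[f64]) -> Vec<f64> {
    (0..signal.len())
        .map(|n| {
            taps.iter()
                .take(n + 1) // only taps that reach back to existing samples
                .enumerate()
                .map(|(k, &h)| h * signal[n - k])
                .sum::<f64>()
        })
        .collect()
}

fn main() {
    // A 2-tap averaging filter applied to a short ramp (illustrative values).
    let taps = [0.5, 0.5];
    let signal = [2.0, 4.0, 6.0];
    println!("{:?}", fir_filter(&signal, &taps));
}
```

The iterator chain reads fairly close to the summation formula, although it is still more verbose than a one-line `filter` or `conv` call in MATLAB.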

One situation where the dynamic/scripting languages are still more convenient is prototyping. One has to be careful and principled, however, when performing such prototyping, so as not to end up with a prototype in production that is slow and brittle due to the lack of native compilation, a flat memory model, and a strong type system.

I think in time Rust will be chosen instead of C or C++, due to Rust programs/libraries "behaving better" (not crashing due to memory allocation problems, etc.) and running as fast as C/C++.

However I doubt Rust will replace languages such as Matlab, Python, R etc.

Hello @geebee22,
I have no doubt that C and C++ will eventually be replaced by Rust for new projects because, like you said, there are many advantages in the Rust ecosystem. But what I really wanted was a unified usage of the two, with a path in Rust towards more math-oriented libs, like Python has. I'm not against the separation between code for prototyping and code for target systems, but at least they should use the same language and the same libraries.
Regarding Python, as you all know, Python is only a convenient end-user glue, because below its surface it's all C, C++ and some Fortran. Just like Matlab is.
If you look at the code for PyTorch, underneath it is C++ and CUDA, and if you look at the audio processing lib for PyTorch, you will see that it's lots and lots of wrappers written in C++ around SOXLib, which is written in C :slight_smile:
Turtles below turtles all the way to infinity hehehe
Julia is more interesting in that regard. I think the majority of their libs are written from a clean slate, from first principles, and that is how I would like those Rust math/engineering libs to be made.

Best regards,
João Carvalho

Honestly? I doubt it.

Rust is a language that has a lot of friction for doing things the wrong way. This is great for people who care about the quality of code. Scientists, as a class, don't. And they shouldn't have to! That's not their job. But it's harder to write readable, modular, efficient code than it is to slap things together and call it good until the day it runs too slow. Rust is, as a whole, really well-designed because it has friction in the right places to make you think a little harder and try a little more to find a better (more obvious, more modular, more efficient...) solution. It frontloads that long-term work.

MATLAB and Python are both languages with very little of this kind of friction, and I think that's why they work so well in the design space. Not every little Python script you write in the design phase will actually turn out to be used in actual simulation / implementation. There are scenarios where writing in Python now, and rewriting in another language six months to a year from now is the correct business decision. I don't think this has much to do with libraries. Your first sloppy implementation probably won't even be written by the same people who write the final version. Those are different skill sets.

If any language can bridge the gap in this respect (that is, be a language that's good both for sloppy initial implementation and for long-term development), it's probably Julia. Rust really isn't trying. Rust leans hard into the programmer's side of things, and I think that's good. There need to be languages like Rust. But it won't ever be the language taught to undergraduate physics majors. That would be a waste of their time.

What would that imply? Changing the language to make it as frictionless as Python? That's a fool's errand; besides, most of the friction is there for a good reason. Writing/linking linear algebra libraries? Yes, let's do that! But let's not imagine that any amount of library code, short of a proc macro that interprets a whole new mini-language, will make ndarray as simple as MATLAB.

Incidentally, I think you're wrong here:

I am almost certain that the most used programming language for scientific programming, simulation, and data analysis in engineering design is Excel. And if you're going to say "That's a spreadsheet, not a programming language!" you've missed the point. People use Excel not because it's the best tool for the job (except when it is, of course), but because it's straightforward. A bright 13 year old (or a subject matter expert in a field unrelated to software development) can learn how a spreadsheet works in a day. MATLAB is more complicated on a level, but the principle applies. There will always be people who know linear algebra, have a problem they need to solve with a computer, and have the patience to learn MATLAB or Python or Julia, but not Rust, because Rust (by design!) requires attention to a whole different level of detail.

As a sidenote, I'd like to disagree with this because it could perhaps work in an ideal world, but it turns out to be a tremendous problem in practice. The idealized situation of scientists making a rough prototype (that only has to care about the pure core computation) and then Real Software Engineers™ making it beautiful, robust, and fast (while not needing to understand the science behind it) simply doesn't happen.

What usually happens instead is once the rough "prototype" is ready, it is deployed right into production, because there isn't enough time/domain knowledge/willingness/money on the part of IT/SE to actually rewrite it, so they just shove it in a Docker container (at best) and pray it doesn't break. If the code is Python, it may even run fine for some time. If it's written in Matlab, well, then good luck setting up a full Matlab environment in a way that it can be run unsupervised from e.g. an HTTP handler. And don't even dream about it being fast.

And if the prototype was already written in C or C++ to begin with, then goodness save you from using it. One of the leading pieces of bioinformatics software (Phylobayes), a tool for Bayesian analysis of phylogenetics, contains a trivial buffer overflow error that I discovered by running into it with a not-even-close-to-big input (a couple hundred 500-byte sequences). I immediately fixed it by moving a couple of stack allocations to the heap, and submitted a PR. The PR remains untouched, uncommented, unmerged, and generally completely ignored to this day, which means that a critical correctness bug has not been fixed for at least 5 years, even though the repository has regularly been pushed to. The software in question is routinely used for performing research that eventually ends up in leading peer-reviewed papers of the field. Go figure.

What I am trying to say here is: there is no way to perform the idealized splitting of research and development if someone were to take themselves seriously. The only way to write correct, robust, fast, and generally good-quality scientific software is to hire someone with both the required scientific background knowledge and actual industrial experience with software engineering good practice. Unfortunately, this is a really rare species of Pokémon, so usually a team of underpaid graduate students coupled with a team of equally unmotivated software developers it is. However, the fact that this is how it generally ends up being doesn't mean that it's ideal or any good at all.

You're preaching to the choir :slight_smile: I have dealt with all this kind of thing, from unreviewed Perl scripts becoming an essential design flow step to broken C code in production for years (the accepted solution for one bug was "just re-run it until it works"). Things do occasionally get rewritten, but it's not something that happens all the time.

My point is certainly not that this is the "ideal world" solution; quite the opposite. The reality is that the prototype will certainly not be written by a software engineer, and it will probably go into production with minimal or no rewriting (until the point where the sloppiness of the initial implementation actually becomes a bottleneck - I've done that, too).

What I mean by "they shouldn't have to" is that you can't make scientists care about code quality, so it's naive to think the solution is teaching them all to be software engineers. The economic pressures don't work out; it would be like trying to teach politicians not to take bribes. Sloppy prototypes are a fact of life. Rust is a barrier to sloppiness, so it'll never be the preferred prototyping language. If there's a solution to the "two language problem" as OP frames it, it's definitely not "let's teach all the scientists Rust."

Rust needs file system access, networking, string parsing, enums, looping, threading, async, etc.
Rust, I believe, is also used to make DSLs.

A language (DSL) that is made for math or science needs only what it takes to do the math or science. At most, Rust will eventually replace C as the base. But it will never replace the DSLs, because they are specific for a reason. Calculating complex math should not require file system access, for example.

I'm actually a researcher (currently unaffiliated) working in the fields of biology and geology, and I've published around a dozen peer-reviewed papers. I'm now on Fedora Silverblue 37 pre-release. For a few days I've been struggling to install matplotlib; it seems like it doesn't work with Python 3.11.0rc1. This is never an issue in Rust, as libraries from all Rust Editions are backwards and forwards compatible and interoperable. As I can't set up Python on my Fedora 37 pre-release, last night I gave R a try. It installed without issues, though it downloaded an amazing number of packages. So I tried to learn to plot in R, but around half of the code samples I copy-pasted from R tutorials on the internet did not work. I don't know why they don't work. It may be due to some backwards compatibility breaks, or some other reason I couldn't figure out, even though R is supposed to be simple and user-friendly.

I'm sorry to say it, but the list of programming languages that truly work for me is very short, and Rust is perhaps the only programming language I have tried in my life that I have never had an issue of any kind with. Wherever I install Rust (even on distro pre-releases), it works flawlessly. And I will talk to other scientists about my good experience with Rust.

But, as I understand it, my co-authors in the current project are going to plot in Excel. I'm honestly not sure I know anyone who would see a point in learning a programming language. I also got to know a PhD student from one of the most important scientific institutes in Europe who was interested in installing Ubuntu in his room. He had a very unpleasant conversation with the director of the institute about installing "odd" operating systems! That's the reality.

If Rust grows in popularity for the next century and the US adopts the metric system next month for worldwide commonality, then yes, with better than 50% certainty. Otherwise... expect disappointment.

I love Rust and I think I get your pain. Mathematica as slow-setting glue for tech mismatches. And some part of me leans into the idea of importing a Rust crate called "every_r_function" or "all_wolfram_language" even as my brain recoils in horror and some part screams "why follow?" Organic growth pushing Rust is probably better than a synthetic corruption of Rust's goals in order for it to be an engineered unifier.

In my mind Rust is great. But not perfect. And that's ok - Rust prevents mistakes & makes parallel processing easy & is empowering & enjoyable & rewarding. But Python can be learned by a pre-teen in eight hours. And Mathematica is perfect when you know a few thousand of Mathematica's functions and have one hour to write a program that will only ever be run once and would take a thousand lines of Rust code. Matlab is circularly very useful because everyone uses it and it has a history of being useful and that trend is likely to continue.

Math and engineering centric languages make the math really clear. Rust prioritizes careful type conversion so mixed datasets turn horizontal tight mathematics into columns of off-screen math and type conversions - and my memory tends to recall what I intended / not what I wrote and this leads me toward computational errors. Your personal kryptonite may vary.
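
To make the type-conversion point concrete, here is a small sketch, assuming 16-bit integer samples and a 32-bit float gain (both invented for illustration, as is the name `scaled_mean`). What would be roughly `gain * mean(samples)` in MATLAB needs every conversion spelled out in Rust:

```rust
/// Scale the mean of raw integer samples by a floating-point gain.
/// An empty slice yields NaN, because the mean is computed as 0.0 / 0.0.
fn scaled_mean(samples: &[i16], gain: f32) -> f64 {
    // i16 -> i64: widen first so the sum cannot overflow for realistic lengths.
    let sum: i64 = samples.iter().map(|&s| i64::from(s)).sum();
    // i64 -> f64 and usize -> f64: explicit casts, nothing is implicit.
    let mean = sum as f64 / samples.len() as f64;
    // f32 -> f64: lossless, but it still has to be written down.
    f64::from(gain) * mean
}

fn main() {
    let samples: [i16; 4] = [100, 200, 300, 400];
    println!("{}", scaled_mean(&samples, 0.5));
}
```

Each cast is a place where the language forces a decision about range and precision, which is the friction described above and also the reason the conversions cannot go wrong silently.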

Rust in digital signal processing / embedded... umm. Every few months a new embedded chipset appears and the vendors create a C++ hardware abstraction library that makes the C++ programmer think nothing has changed when actually nothing is the same. Did the vendors publish any Rust? Is that friction?

I'm pretty sure you just summarized all corporate software development in one sentence.

That is my experience over many years. Resulting in years of endless bug fixing and/or unhappy customers.

Rust helps a lot here. Even if your prototype is a trash pile of unintelligible, unorganised, unoptimised code, it most likely will produce the results you want after a little testing and is most unlikely to suffer from all those pesky mistakes (null pointers, data races, memory leaks, etc.) that tend to slip through even a lot of testing. As a result it runs very reliably.

This continues to amaze me. Code that I wrote only a couple of weeks after starting to learn Rust sure enough found its way into production. It has been running fine ever since.
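
A minimal sketch of one of those guarantees, the absence of null pointers: a function that may have no result returns an `Option`, and the compiler will not let the caller use the value without handling the empty case. The function `peak_index` is a hypothetical example, not taken from any library.

```rust
/// Return the index of the largest sample, or `None` for an empty slice.
/// On ties, the first occurrence wins.
fn peak_index(samples: &[f64]) -> Option<usize> {
    let mut best: Option<usize> = None;
    for (i, &s) in samples.iter().enumerate() {
        match best {
            // Keep the current best if it is at least as large.
            Some(b) if samples[b] >= s => {}
            _ => best = Some(i),
        }
    }
    best
}

fn main() {
    // The caller has to handle both cases; there is no null to forget about.
    match peak_index(&[0.1, 0.9, 0.4]) {
        Some(i) => println!("peak at index {}", i),
        None => println!("no samples"),
    }
}
```

Even sloppy prototype code written this way cannot dereference a missing result by accident, because the missing case is part of the return type.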

I don't think that's accurate or at least it's not the full story. "They can write code that doesn't raise a SyntaxError" is not a great definition of knowing the language or knowing how to write correct software. The ease of Python is an illusion: it delays most of the painful errors, but it can't prevent people from writing them. And those writing complex software in Python will end up suffering later, due to runtime errors, whereas Rust programmers will suffer upfront, due to compile errors.

IMO delaying the errors to runtime is not really the better choice. It only seems better if you view the problem on the very superficial level of "getting the code to work". But even code that runs may contain logical errors that a strong type system could have caught (earlier, or at all). And I have to ask: what's better, knowing that your code is incorrect because it doesn't compile, or not knowing about a critical bug that silently causes your computations to yield garbage? One would think that the second option is completely unacceptable in scientific research, and that researchers should strive to minimize the chances it happens.
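
One hedged illustration of a logical error that a type system can catch: wrapping raw numbers in unit types so that mixing them up fails to compile. The types `Meters` and `Seconds` and the function `speed` are hypothetical names for this sketch.

```rust
use std::ops::Add;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(f64);

#[derive(Debug, Clone, Copy, PartialEq)]
struct Seconds(f64);

// Only Meters + Meters is defined; there is no Meters + Seconds.
impl Add for Meters {
    type Output = Meters;
    fn add(self, rhs: Meters) -> Meters {
        Meters(self.0 + rhs.0)
    }
}

/// Average speed in metres per second.
fn speed(distance: Meters, time: Seconds) -> f64 {
    distance.0 / time.0
}

fn main() {
    let total = Meters(3.0) + Meters(4.5);
    let t = Seconds(2.5);
    println!("{:?} in {:?} -> {} m/s", total, t, speed(total, t));

    // Both of these are rejected at compile time instead of yielding garbage:
    // let wrong = total + t;        // no `Add<Seconds>` for `Meters`
    // let wrong = speed(t, total);  // arguments in the wrong order
}
```

With bare `f64` values, both commented-out lines would compile and run, and the mistake would only show up, if at all, as a suspicious number in the results.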

To be or not to be? Programmers and scientists have choices... at least as long as they are willing to use the available technology and not wait for some far off perfect day.

Perhaps like you, I thought at one point how nice it would be if I had not learned coding patterns that "suffer later" - if only I had been introduced to Rust first. And because I'm a scientist and have a handy preteen subject, I set out to teach them Rust over idle pandemic hours as a first (wait - second, does Scratch count?) programming language. Gifted my child a new laptop. And what was my experiment's conclusion? That they picked up CircuitPython first, on their own time, with a Texas Instruments calculator, and mostly ignored the laptop I gave them until it had some games installed. And that's so insane... my kid would willingly press the calculator's alpha-shift key hundreds of times to program in CircuitPython rather than spend time with dad joyfully being force-fed large-keyboard skills and the Rust programming language at the same time. Python isn't one of my favorites - but it is common in the sciences maybe exactly because it is easy to learn and delays the pain that comes with multiple CPU threads for as long as it can.

To dodge the answer to your last question - knowing or not knowing: In the sciences, good enough keeps people employed and perfection is so expensive it really isn't ever tolerated. Blame capitalism.

Rust is great for many projects today. And Rust could be a best-choice engineering unifier someday. But right now, if I see old Python code driving servos that needs a minor edit, I mostly know I'll be writing more Python.

All the best to you,
D

Coming from Python, my advice to you is (1) learn how to set up a virtual env for your Python project, with pipenv or poetry, and (2) don't use the latest Python rc unless you absolutely need it; packages tend to lag behind the latest Python release. R is a nice language if you care about advanced statistics and all the plotting. I personally dislike it for many reasons, some of which you have already hinted at.
Rust is great, but for scientific applications it needs more work. Rust and PyO3 seem to be the way to go if you want the best of the two.
On a side note, Rust's current state reminds me of Python in the early 2000s: it was actually NumPy, SciPy and later pandas that started to give Python momentum.

My personal opinion is that Rust can become a universally usable language (including areas where Python is currently used). Maybe it already is, though the learning curve is still a bit steep. I feel like this is the greatest challenge for adoption by many people and organizations.

There may still be languages that serve a better purpose for certain niches, though.

I think the learning curve of Rust can be improved by

  • better documentation in some regards,
  • improving APIs and removing inconsistencies (likely that's very hard to achieve due to required backwards compatibility).

But I believe that Rust might become "the" standard language in the future.

Python comes with a lot of pitfalls too, e.g. regarding mutability, and I don't think all these __special__ __methods__ in Python are easy to keep track of. I really like Rust's concept of traits (with a few features that I'm missing, such as bringing certain sealed traits automatically into scope).

Thanks! I'm not coming to Rust from Python. I'm coming to Python from Rust. And the only reason I gave Python a try is that I wanted a good tool for plotting. But languages such as Python and R, which break backwards compatibility, are severely broken by design. :expressionless:

Backwards compatibility is particularly important for science. You programmers do a lot of work "maintaining" your software. But a scientific paper cannot be "maintained" or "updated"; it's published in a journal and stays there forever, archived, available unchanged for future generations. At best, new research can be published expressing a new viewpoint on a particular scientific problem, and this is how science works.

Thus, software used in science must also stay there, in perpetuity, easily downloadable, backwards and forwards compatible forever. Two universal principles of all fields of science are parsimony and falsifiability. Once an experiment can no longer be reproduced (because the programming language evolved...), the work will lose part of its validity, as it becomes difficult to test.

I'm not saying it's not possible to publish research based on software written in Python (most scientists don't really care; for their career, all they need is to have new papers published each year), but I don't think this is a correct approach for science. Things should be done well, and what we need for science is Rust.