Data-Oriented Programming in Rust

I am reading Data-Oriented Programming: Reduce software complexity by Yehonathan Sharvit

Requirements

  • Immutable Functions (immutable functions return an updated version of the data instead of changing it in place)
  • Persistent Data Structure
  • lodash like functionality

Is there immutable functions (or libraries with immutable functions) in rust? How can I implement immutable functions in Rust?

DOP can make good use of Persistent Data Structure. I found in Persistent Data Structure Support - #2 by asymmetrikon that:

I also found GitHub - orium/rpds: Rust Persistent Data Structures . So what do you recommend I use. Do I actually need the rpds library or Native Rust is Enough?

DOP also need lodash like functionality. I could not find any good implementation of lodash in rust. Is it because all lodash stuff in already built in rust? What is the deal here?

Note that a huge part of the magic of Rust is that it lets you use mutable data structures without most of their usual downsides.

The mutability itself isn't the problem. The problem is the uncertainty of what's going to happen -- if I pass it to something, did it get modified? If I have two lists, could they ever be the same list so that appending to one actually makes the other one longer? If I have something that I didn't pass to something else, could that function modify it anyway because it happened to have kept a reference to it?

AKA the problem is the combination of aliasing and mutability, which is exactly why Rust's model is about allowing one or the other, but not both at the same time.

As Manish wrote in The Problem With Single-threaded Shared Mutability - In Pursuit of Laziness,

Aliasing with mutability in a sufficiently complex, single-threaded program is effectively the same thing as accessing data shared across multiple threads without a lock

So another way of looking at Rust's rules is "If nobody but me can tell that I modified something, then it's clearly fine because they can't possibly care".

And thus you'll often see immutability-inspired APIs that nevertheless use mutation internally for efficiency. And this can generally work pretty well without much extra support stuff. But there are always crates for persistent (aka immutable) data structures should you wish to use them.

16 Likes

That is basically provided by the standard Iterator trait's methods.

1 Like

Persistent data structures are useful if you want to be able to operate on different versions of the same datastructure. For example, you have a set of elements X, add a few items to it to get Y = X + {a, b, c}, and use both X and Y later on.

If you don't need such versioning functionality then don't use them.

Functional programming uses persistent data structures by default as a way of not having to deal with shared ownership of mutable objects, but like @scottmcm said, Rust deals with this issue in a different, more efficient way.

4 Likes

Vanilla Rust should be all you need for situations like this.

It's perfectly fine to use persistent datastructures (the im crate is a really good implementation!) but the reason for doing that is more because it makes sense in the context of your application rather than being a workaround for the way your language works (e.g. you want structural sharing because that's an efficient way of concurrently updating an AST).

As one example of how Rust's type system helps, lodash and someArray.map() need to return new lists because 1) they are eager functions (whereas Rust's Iterators are lazy) so they need to perform the operation immediately and give you the result, and 2) everything is mutable and aliased in JavaScript by default and you don't want to accidentally modify the caller's list.

On the other hand, if I see a function with the signature, fn(&Foo) -> Foo, I know the inputs haven't been modified and the Foo being returned is an updated version.

5 Likes

The only question remaining is:

I feel like the author of this book (and you) discovered a new toy and didn't pause to look around. There are useful generalizations to be made, and for that you generally have to let things sink in.

You implement immutable data structures in Rust the same way as you would in any other language: by not providing any mutating API.

The details depend on the kind of data structure and operations you want to implement.

In Rust if you have a shared reference (&) to an object, it's normally immutable to you, and mutating functions typically take a unique reference (&mut).

I think it doesn't make much sense to apply any philosophy such as this "Data Oriented Programming" in the abstract, you can start working on a some specific code and then see if those ideas apply to it.

4 Likes

I don't think so. Note that topicstarter talks about lodash. That's Javascript library. Which explains a lot of things.

There are no immutable functions and, more importantly, there are no need to have immutable functions.

The deal here is that Rust represent the next logical step after data-oriented programming.

The problem that data oriented programming tries to fix is shared mutable data: all these stubborn and hard to fix errors which happen when you modify some data in one part of your program without properly changing some other data in another part of your program.

Data-oriented programming solves that issue with very heavy hammer: it just makes shared mutable data impossible because it makes mutable data impossible. That's the same idea which lies in the basement of other functional languages, too.

It's safe style but not very efficient: very often you need to copy a lot of data when you are doing very tiny modification.

Rust does half-step back: the central idea of Rust is that you don't need to forbid mutable data to make mutable shared data impossible. If you can, somehow, ensure that you can only modify data when nobody else may look on it you get the same benefits (two different parts of programs can not disagree about what they are observing if there are no such parts and if there are only one owner of data), but it's more efficient (and often easier to read and write, too).

You can not meaningfully employ this style in Javascript because ensuring that there are only ever one owner of some data without compiler help is very error-prone.

But in Rust it's just most natural style! The whole language is built around that idiom! Just use it and you would get most of benefits of data-oriented programming without even need to think about that term.

But there are one important exception: interior mutability. If you use that one then you get most (although not most) problems of shared mutable state. But it's hard to employ that by accident: compiler would never make anything internally mutable implicitly, all these constructs are very explicit.

P.S. As for lodash… I don't think there are anything like that for Rust. Because most common operations are, indeed, built into standard library and there are many crates which provide extensions for it, but they are, usually, small, focused, crates, not large ones which include a lot of tools, like lodash.

10 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.