Persistent Data Structure Support

cljgorust · August 23, 2019, 10:56pm

While Rust has immutability by default, is performance the reason Rust does not use

Are there Persistent Data Structure ways to always do the same thing? For example, update Vector. Is that NOT idiomatic?

In a way in that case would that not promote mutabiltiy vs immutability?
Please refer Idiomatic Rust favors Functional or Imperative Style? - #10 by clojure2rust
for context of this question.

asymmetrikon · August 23, 2019, 11:05pm

There's crates like im which provide structures like these. As the im documentation says, though:

While immutable data structures can be a game changer for other programming languages, the most obvious benefit - avoiding the accidental mutation of data - is already handled so well by Rust's type system that it's just not something a Rust programmer needs to worry about even when using data structures that would send a conscientious Clojure programmer into a panic.

Rust uses mutable data structures by default because they're a good general fit - they're often more efficient, and because of Rust's rules they provide much of the safety and utility gotten from immutable structures in other languages. However, there are totally cases where immutable structures are better in Rust, it's just down to a case-by-case basis.

scottmcm · August 24, 2019, 12:11am

Persistent data structures always have overhead if you're not using their persistent-ness. If you do need the persistent-ness, then they're amazing, but spending O(log(n)) instead of O(1) for an update is pretty impactful if the old one isn't needed. Also, the APIs for persistent tend to be less nice since they have an extra output.

cljgorust · August 24, 2019, 3:59am

So we have variables immutable by default and data structures mutable by default.
Is this not in conflict even though I understand that it is safe?
Would that not promote mutability based implementations? Is Functional Style not dependent on Promoting Immutability?
We still consider writing Functional style in Rust across functions Idiomatic as stated in Idiomatic Rust favors Functional or Imperative Style?

Sorry for so many questions but they seem to be related hence wanted to put them together. Thanks in advance.

Cerber-Ursi · August 24, 2019, 4:17am

Mutable structure stored in immutable variable is effectively immutable, it seems that this is the point. You must declare variable as mutable to use the structure's mutability, but it is always here, just blocked when you don't need it.

cljgorust · August 24, 2019, 5:20am

Well I think default immutability in Rust is there for a reason - it promotes Functional Programming.
But

just blocked when you don't need it.

Makes it possible to use Mutability whenever you want it.
I guess while it is safe in Rust we should refrain from using Mutability all over hence the variable by default is Immutable, thus promoting Functional Style.

So use Mutability locally limiting to function scopes as far as possible and Immutability with Functional Programming in general. Goes back to the conclusion in Idiomatic Rust favors Functional or Imperative Style.

ZiCog · August 24, 2019, 6:02am

From time to time over the many years I have tried to understand the obsession with "functional" programming and immutable data.

I can appreciate that things get much easier to reason about if your functions have no state and no side effects and they predictably produce the same output for the same input every time.

I really cannot fathom why a function should not be allowed to have local data that is mutable. It's not a side effect that can upset something else "out there", it's not visible to the outside world at all. So why not have it?

I cannot fathom how one is actually supposed to do anything useful without mutable data. The simplest thing in the world is a counter. I can have a "functional" function that helps counting by outputting whatever it's input is by 1. Fine, but where do I keep the count that I need?

How do I write a "functional" PID controller?

My observation is that immutable data will always eat memory and suck performance. If it were mandatory in Rust that would bar rust from use in a huge area of applications. Not a good idea.

My feeling at the moment is that Rust's rules about data ownership, anti-aliasing, lifetimes etc upset all these paradigms, OOP, Functional, etc. Rust is making me look at my data a lot more carefully before I start.

cljgorust · August 24, 2019, 7:06am

Yes, that is why locally using mutability is fine but if it goes across functions it would be a concern in general.

True that Rust makes it safe but mutability across functions is still adding code complexity. The code complexity is what we are trying to avoid besides side effects which Rust actually guards very well.

So Persistent Data Structures produce new structures after a change and is done efficiently without cloning each time. In general when updating key-value maps, vectors, sets it follows creation of new structure with Persistent Data Structures. The original structure is unchanged. However, in few cases, like application state as in counter example, even Clojure for example uses an atom which allows mutability.

H2CO3 · August 24, 2019, 8:49am

It seems to me that you regard this is a black-and-white, hard "either this XOR that" question.

That's pretty much not the case. I never liked the sort of extremist "thou shalt only use the One True Paradigm™" point of view that certain programming languages (and/or their creators) seem to want to impose on users. For reasons of clarity, performance, maintainability, compatibility, and several others, it's pretty much necessary to mix and match paradigms, and deciding when to use each particular style is part of the tasks of a professional programmer. Yes, sometimes it's hard and there are good arguments in favor of both sides. That's why it is nice to have the choice.

(As an aside, systems programming languages are all about giving and having control anyway. If we have control over ostensibly low-level details such as when to allocate memory, then why not have control over the style as well?)

I also disagree that the ability to mutate std data structures "promotes mutability". Since Rust has value semantics and transitive immutability (for example, you can mutate something through &mut &mut, but not through &&mut), the fact that bindings and references are immutable by default means that the instances of your data structures, or rather, the variables through which you access them, also end up being immutable by default. Rust taught us that the best way to reason about mutability in type systems is not by some sort of "intrinsic mutability" of a type or a value, but by the mutability of a specific binding (variable, pattern, …) to that value.

To that end, providing classical vectors, hash tables, etc. doesn't at all hurt or contradict the default of functional and immutable style. And indeed, these compact implementations often have superior performance characteristics (at least in a single-threaded context) compared to the usually tree-based, pointer-hopping persistent data structures. Thus, in my opinion they are the right choice for a systems programming language as practical as Rust.

One other thing that came to my mind is this. I don't think Rust's goal or nature is primarily about promoting functional programming per se. Rust wants to help us write correct, high-quality, easily testable and maintainable software. Functional programming and immutability is not the goal – it is a tool. It did prove to be a very practical means of avoiding certain classes of errors, but it is not the only such technique.

And if we can avoid errors even in the mutable, imperative style by devising clever type checking, and in some cases the imperative code is easier to understand or change, for example, then I think it's completely justified to choose it over immutability. We want to achieve correctness, and we do it in different ways. Sometimes, functional programming and immutability are the best way. In other cases, mutability and imperative programming are the best way. Pick whichever one is warranted by the problem you are solving.

ratmice · August 24, 2019, 9:11am

What you (ZiCog) describe is generally referred to as a benign effect,
There are in fact languages capable of inferring when effects are benign.
See the paragraphs surrounding the fib3 example here, for one such language.
https://koka-lang.github.io/koka/doc/kokaspec.html#sec-runst

however many uses of effects are not benign, and outwardly observable.
persistent structures can and do leverage benign effects to perform in place mutable updates of
unshared structures when it is safe to do so.

Yandros · August 24, 2019, 10:54am

"Classical" languages
Mutation is easy / intrinsic, so sharing is dangerous;
- ⟹ mutable data structures
"Functional" languages
Sharing is great when there is no mutation, so let's disallow / disincentivize mutation.
- ⟹ persistent data structures
Rust (ignoring Interior Mutability)
Get mutation when not sharing (&mut _), get sharing when no mutation (& _).
- ⟹ your choice

ZiCog · August 24, 2019, 11:13am

ratmice,

I'm not sure I understand what you are trying to tell me there.

The 'benign" effect I see there in koka is that it's variables are on some kind of heap which is getting mutated. But the compiler knows that heap space is not visible and so proves to itself there are no side effects. Seems like business as usual. How is that different than local variables in C or such? Apart from the proof part perhaps.

I'm familiar with the idea of immutable data structures, if not the implementations, that one can mutate something without your changes impacting other users of the structure, effectively getting a new version that contains mostly the old data plus "diffs" from the original.

If everything in Rust worked like that it would not be able to fulfill it's advertised role as a systems programming language. It would just be yet another functional programming language with nothing new to offer and not have attracted the attention it has.

I'm pretty sure that a purely functional Rust would alienate 90% of programmers anyway simply due to the fact that the source code to pretty much any program is then incomprehensible

Aside: My first Rust ever program finds anagrams in the dictionary files you find on Linux machines. I was amazed to find that it was faster than my former C++ attempt and it is currently the fastest single core solution in a little programming challenge between languages going on here: Liberation through Computer Literacy - Page 44 - Raspberry Pi Forums

The latest entry in that challenge is in Haskell, it's an order of magnitude slower and so far I do not understand the source!

After being attracted to Rust for it's safety it was this result that convinced me it was worthy of much deeper investigation.

If anyone can beat that using a purely functional style Rust and immutable data structures, or any other pure functional programming language, I might start to listen to the pure functional evangelists.

ratmice · August 24, 2019, 11:49am

ZiCog,

I guess what i'm trying to say is that given destructive updates/benign effects, you only need to pay the cost of persistence 'old data plus "diffs"', as you say, when the data structure is used persistently. The OP coming from clojure is likely aware of this (where they have very good support for it) using the terminology transience.

I.e. sometimes you only need pay the cost of persistence when using the data structure persistently.
an option a non-persistent datastructure doesn't give you in the first place.

About your program and performance, sometimes you can achieve better performance using persistent structures typically when building the resulting structure in parallel rather than your single-core case.
for instance when they are not modified after sharing, and you never pay additional overhead for non-destructive updates.

I'm not really trying to convince you of anything, not that you should always use them, nor ever use them.
But there are cases where they are useful and nice whether you like them or not

ZiCog · August 24, 2019, 1:23pm

If there is a way to make a faster anagram finder in Rust, or the functional language of your choice, and immutable data structures and multi-core I would love to see it.

Perhaps some functional, multi-core guru here would like to contribute a solution to the challenge linked above.

I don't really have an opinion about pure functional programming and immutable data structures, perhaps I'm too dumb to see the attraction.

I can see the attraction of using them to achieve "undo" in applications.

alanhkarp · August 24, 2019, 6:00pm

There is at least one counterexample, Sisal, a functional language designed for scientific computing. At the time the project ended, programs written in SISAL had 95% of the performance of comparable FORTRAN programs.

That seems impossible. How can you modify the individual elements of a matrix without either copying the whole thing or building enormously long linked lists? The key insight was that the compiler could almost always prove that an in-place update would not violate the functional programming rules. In other words, the language has functional semantics, but the compiler is free to introduce local data that is mutable.

Having said that, I do not believe that a purely functional language can meet the requirements Rust supports.

ZiCog · August 24, 2019, 6:20pm

Interesting.

SISAL looks like it's even readable by normal humans! SOmewhat remarkable for a purely functional language. http://www2.cmp.uea.ac.uk/~jrwg/Sisal/01.Introduction.html

And it got some modernization in 2018.

cljgorust · August 25, 2019, 12:56am

Code speaks better ( please see at the end of this post). So let me give an example to show that mutable data structures don't necessarily promote imperative style and we can restrict mutability to a single function.
Below, main function is passing immutable and treats the result as immutable data.
Anyone can explain why does Rust make this so easy? So it is the variable binding that is controlling mutability. I like this since mutability is now only in push_to_numbers function.

Thanks to @Cerber-Ursi to state this as

Mutable structure stored in immutable variable is effectively immutable, it seems that this is the point.

And to @H2CO3 to state this:

I also disagree that the ability to mutate std data structures "promotes mutability".

And to @Yandros to state this

Get mutation when not sharing ( &mut _ ), get sharing when no mutation ( & _ ).

"Code is below-->"

fn main() {
        let numbers = vec![1i32, 2, 3];
        println!("Immutable numbers {:?}", numbers);
        let with_number_pushed = push_to_numbers(numbers, 6);
        println!("Immutable with_number_pushed {:?}", with_number_pushed);
    }


    fn push_to_numbers(mut num: Vec<i32>, n: i32) -> Vec<i32> {
        println!("Mutable initial vector: {:?}", num);
        num.push(n);
        println!("Mutable final vector: {:?}", num);
        num
    }

ZiCog · August 25, 2019, 6:11am

Wow! You are giving this Rust noob a headache.

One can then pass an immutable through intermediate functions that can see but not touch it on the way through:

fn main() {
    let numbers = vec![1i32, 2, 3];
    println!("Immutable numbers {:?}", numbers);
    let with_number_pushed = push_1(numbers, 6);
    println!("Immutable with_number_pushed {:?}", with_number_pushed);
}

fn push_1(numbers: Vec<i32>, n: i32) -> Vec<i32> {
    println!("push_1: Immutable intermediate vector: {:?}", numbers);
    let with_number_pushed = push_2(numbers.to_vec(), n);
    println!("push_1: Immutable intermediate vector: {:?}", with_number_pushed);
    with_number_pushed
}

fn push_2(mut numbers: Vec<i32>, n: i32) -> Vec<i32> {
    println!("push_2: Mutable initial vector: {:?}", numbers);
    numbers.push(n);
    println!("push_2: Mutable final vector: {:?}", numbers);
    numbers
}

Which is interesting.

Yandros · August 25, 2019, 9:14am

Rust mutability is based* on whether there is unique or shared access to the value we wish to mutate.

Since owning a T implies having unique access to that T, owned stuff is always** mutable. Not declaring a binding (a variable) mut is just a "lint", to tell Rust: Hey, I know I own this thing, so I technically have unique access to it, but please do as if didn't have such unique access. So this is a property that only holds for the duration of that binding: one can always rebind it with mut, be it within another function as you did, or be it within the same scope!

fn main ()
{
    // immutable?
    let v = vec![42];
    dbg!(&v);
    let mut v = v;
    v.clear();
    dbg!(&v);
}

* ignoring Interior Mutability, see this blog post

** However unicity of the access does not have to be transitive w.r.t. to indirection, since an owned variable can point to a shared element, such as `T = &'_ U`, or `T = Rc<U>` (not possible to go from `&mut T` to `&mut U`).

Thus, as a counter example to the snippet of code shown above, imagine a function taking a &Vec<i32> parameter:

fn foo (v: &Vec<i32>) // shared access to `v`
{
    // Not matter how hard you try,
    // there is no sound way to mutate
    // the vector `v` refers to.
}

ZiCog · August 25, 2019, 9:59am

I'm wondering where use of "mut" in that position shows up in the Rust documentation, can find in anywhere.

fn push_to_numbers(**mut** num: Vec<i32>, n: i32)

I might never have guess one could do that unless I did it by accident.

Is that in anyway different from rebinding a non-mut parameter a mut local variable:

fn push_2(numbers: Vec<i32>, n: i32) -> Vec<i32> {
    let mut numbers = numbers;
    numbers.push(n);
    numbers
}

Topic		Replies	Views
Data-Oriented Programming in Rust	8	6144	October 18, 2022
Idiomatic Rust favors Functional or Imperative Style? help	19	7612	August 29, 2019
[Im]mutable variable vs [Im]mutable data type vs compile time [un]known data size	4	1743	April 6, 2023
Is immutability by default worth the hassle?	69	5890	November 25, 2022
Im crate; vector.append... does not look immutable to a newbie help	6	1482	June 14, 2022

Persistent Data Structure Support

* ignoring Interior Mutability, see this blog post

** However unicity of the access does not have to be transitive w.r.t. to indirection, since an owned variable can point to a shared element, such as T = &'_ U, or T = Rc<U> (not possible to go from &mut T to &mut U).

Related topics

** However unicity of the access does not have to be transitive w.r.t. to indirection, since an owned variable can point to a shared element, such as `T = &'_ U`, or `T = Rc<U>` (not possible to go from `&mut T` to `&mut U`).