"Recursive" lazy list processing in Rust?

SICP introduced the stream paradigm, implemented by lazy lists. I want to understand whether Rust is expressive enough to capture the same idea easily.

The Rust book discusses iterators, which can be used to model lazy lists. However, that machinery is linear, and it does not obviously cover the full power of streams. It is comparable to control flow without looping structures.

To illustrate, consider the lazy list of the Fibonacci sequence. It can, of course, be implemented linearly by an iterator, as Rust by Example illustrates (a minimal sketch follows below); essentially, one iterates the step (a,b) -> (b,a+b). However, there is another point of view: we can describe the Fibonacci lazy list f as follows: its first two terms are 1,1, and the lazy list starting from the third term, f[2..], is the term-wise sum of f and f[1..]. It can also be described by a system of "updating" constraints:

f = Cons(1,g) // The first term of f is 1, and the rest is g
g = Cons(1,add_list(f,g)) // The first term of g is 1, and the rest is the term-wise sum of f and g

Or expressed in Haskell, let fib = (1:1:zipWith (+) fib (tail fib)).
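For reference, here is a minimal sketch of the linear iterator version mentioned above (my own rendering of the Rust by Example approach, with a hand-rolled state struct):

struct Fib { a: u64, b: u64 }

impl Iterator for Fib {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        // Step the state (a, b) -> (b, a + b) and yield the old a.
        let out = self.a;
        let next = self.a + self.b;
        self.a = self.b;
        self.b = next;
        Some(out)
    }
}

fn fib_linear() -> impl Iterator<Item = u64> {
    Fib { a: 1, b: 1 } // fib_linear().take(10): 1, 1, 2, 3, 5, 8, ...
}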

Let me point out how this differs from the paradigm given in the Rust book. In the examples given in the Rust book, the iterators are passed consecutively through "devices" such as map and filter; here, we have "feedback loops". In terms of the constraints, a term appearing on the left-hand side (f in the preceding Fibonacci system) can also appear on the right-hand side (as in add_list(f,g)).

Generally, we consider a system of "updating" constraints, and we update the lists through these constraints repeatedly. To be more precise, a constraint looks like f = h(g_1, g_2, ...), where h is a computable function. The adjective "updating" means that we use the lists g_1, g_2, ... to update f. That is to say, the system describes a computational process, iteratively updating the lists on the left-hand side, rather than a system of constraints to "solve".

Here is a somewhat nontrivial example:

a = Cons(-1,-2,3,4,5,-6,b) // The first terms of a are -1,-2,3,4,5,-6, and the rest is b
b = add_list(c,d) // b is the term-wise sum of c and d
c = filter(>0,a) // c consists of the strictly positive terms in a
d = filter(<0,a) // d consists of the strictly negative terms of a

There are a lot of abuses of notation, but I hope this is comprehensible. At the first stage, we have a=-1,-2,3,4,5,-6,..., c=3,4,5,..., d=-1,-2,-6,... and b=2,2,-1,..., where we denote by ... the terms unknown at the moment. Then we update a to -1,-2,3,4,5,-6,2,2,-1,..., c to 3,4,5,2,2,..., d to -1,-2,-6,-1,... and b to 2,2,-1,1,..., and then we update a to -1,-2,3,4,5,-6,2,2,-1,1,... and c to 3,4,5,2,2,1,.... If we proceed further, we fall into an infinite recursion: the next term of b needs a fifth negative term of a, which can only come from b itself. This shows that, in general, a system of "updating" constraints might not yield a terminating computational process. However, we can still compute the first 10 terms of a. This system can be expressed in Haskell: let v = ((-1):(-2):3:4:5:(-6):zipWith (+) (filter (>0) v) (filter (<0) v)).
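For concreteness, here is a hand-translated, direct Rust sketch of this particular system (my own illustration, not the general translator I ask about below; the function name is mine). It appends terms of a while advancing two cursors over the positive subsequence c and the negative subsequence d:

fn first_n_terms(n: usize) -> Vec<i64> {
    // Seed with the explicit prefix of `a`.
    let mut a: Vec<i64> = vec![-1, -2, 3, 4, 5, -6];
    // Cursors over the positive terms (`c`) and the negative terms (`d`).
    let (mut pos, mut neg) = (0, 0);
    while a.len() < n {
        while pos < a.len() && a[pos] <= 0 { pos += 1; }
        while neg < a.len() && a[neg] >= 0 { neg += 1; }
        if pos == a.len() || neg == a.len() {
            // `c` or `d` has no next term until `a` grows, but growing `a`
            // needs that very term: the infinite-recursion case above.
            break;
        }
        let next = a[pos] + a[neg]; // the next term of b = add_list(c, d)
        a.push(next);
        pos += 1;
        neg += 1;
    }
    a
}

fn main() {
    println!("{:?}", first_n_terms(10)); // [-1, -2, 3, 4, 5, -6, 2, 2, -1, 1]
}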

I wonder whether such a system can be expressed in Rust (or better, whether we can meta-program a translator from "updating" constraints to Rust programs), without any asymptotic sacrifice of time and space, and hopefully more efficiently than direct representations in Haskell or Lisp.

I would like to add that this is mostly theoretical, although it might have some applications in simulating signal-processing systems. In practice, especially for concrete cases, one should avoid this kind of representation, because the control flow, and consequently the time and space complexity, is very vague in these representations. At least for me, it is not at all obvious that the first Fibonacci example, described in this way, leads to a computation with linear time and constant space.

I haven't really thought about the more general case yet, but here is what I have written to emulate Haskell's let fib = (1:1:zipWith (+) fib (tail fib)):

#![deny(bare_trait_objects)]

fn fib () -> impl Iterator<Item = u64>
{
    use ::core::iter::once;

    once(1).chain(once(1))
        .chain_lazy(|| Box::new(
            Iterator::zip(
                fib(),
                fib().skip(1), // tail
            ).map(|(x, y)| x + y)
        ) as Box<dyn Iterator<Item = u64>>)
}
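chain_lazy is not a standard-library method; I have omitted its definition here. A minimal sketch of such an extension trait, matching the boxed call above, could be:

use ::core::iter::once;

trait ChainLazy: Iterator + Sized {
    /// Chain `self` with an iterator that is only constructed on demand.
    fn chain_lazy<F>(self, f: F) -> Box<dyn Iterator<Item = Self::Item>>
    where
        Self: 'static,
        F: FnOnce() -> Box<dyn Iterator<Item = Self::Item>> + 'static,
    {
        // `once(f)` yields the closure exactly once; `flat_map` calls it
        // only when `self` is exhausted, which is what defers the recursion.
        Box::new(self.chain(once(f).flat_map(|f| f())))
    }
}

impl<I: Iterator> ChainLazy for I {}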

Thanks, but will this suffer from an efficiency problem? I am a newbie, but it seems to me that fib() is called redundantly. As a consequence, the time cost is exponential.

I tried to take 1000 terms (I modified the code so that it gives the result modulo 65536), which yields a timeout.

let fib = (1:1:zipWith (+) fib (tail fib)) is very magical. The feedback of fib does not result in a re-computation, hence the time cost is linear. The GC ensures that the space cost is constant. If I want to do this in Rust, even as a naive emulation, it seems that I need to fight against the ownership system. In this case the dependency is simple (to compute a term, only a constant number of preceding terms is needed), so it still seems controllable, but the later example shows that the dependencies can be complicated.

This magic is exactly the reason why one should usually avoid this kind of representation. One needs to examine the time and space cost of such a computation very carefully; it is not obvious at all.


I'm not sure that it's Rust's ownership system per se that's the issue here. I think you're running into the practical differences between Haskell and languages like C and Rust. To put it broadly: In Rust you describe the steps the machine should take. In Haskell you describe the algorithm.

Rust provides abstractions, but they're relatively thin abstractions over the machine's API (aka machine code). On the other hand, a lot of work has been put into Haskell compilers so they can convert the algorithms you describe into efficient machine code.

It might be possible to implement Haskell-like abstractions in Rust (perhaps using macros?) but it would be a lot of work to get right.


[Incidentally I'd be tempted to implement (a,b)->(b,a+b) using successors and then map, instead of explicitly making my own iterator struct. But that's completely beside the point.]
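For instance, a sketch of that successors-based version (using checked addition so the sequence ends gracefully at u64 overflow):

use std::iter::successors;

// Iterate the state (a, b) -> (b, a + b) and keep only the first component.
fn fib() -> impl Iterator<Item = u64> {
    successors(Some((1u64, 1u64)), |&(a, b)| {
        a.checked_add(b).map(|sum| (b, sum)) // stop instead of overflowing
    })
    .map(|(a, _)| a) // fib().take(10): 1, 1, 2, 3, 5, 8, 13, 21, 34, 55
}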

Yes, this kind of abstract recursive approach is definitely not the most efficient one, since, for instance, it requires heap-allocated indirection (as recursive types do).


Functional languages' memory sharing seems to have the side effect of memoizing the list elements, and a memoized Fibonacci sequence indeed has a linear cost (in time).

My example was aiming instead at expressing the recursive definition within Rust, without sharing or memoization, hence an exponential cost.

I have managed to get the following standalone (no Iterator for the definition) abstraction, sugared with a macro:

// (The `List` type and the `list!` macro definitions are omitted here.)
fn fib () -> List<u64>
{
    list![1 ;; 1 ;; fib() + fib().tail()]
}

#[allow(unused_must_use)] // false positive
fn main ()
{
    // with internal iteration
    let mut elems = ::arrayvec::ArrayVec::<[_; 30]>::new();
    fib().try_for_each(|elem| elems.try_push(elem));
    dbg!(elems);
}

I don't think that in this special case one really depends much on Haskell. This is why I referred to SICP, which teaches Scheme, a language without as much abstraction as Haskell. This is where SICP and Scheme are awesome: they taught me that simple abstractions are enough to formulate such a concept. If we look at Yandros' code closely, you see that it is very similar to how Scheme mimics the lazy list, including the use of Box, except that in Yandros' code, fib() should be replaced by a pointer to fib to reuse the computed results, which is where we need to worry about ownership.
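To make that "pointer to reuse the computed results" idea concrete, here is my own illustration (not code from this thread): sharing the computed prefix behind Rc<RefCell<...>> makes the time linear, at the cost of linear space.

use std::cell::RefCell;
use std::rc::Rc;

// Share the computed prefix so the "feedback" reads cached values
// instead of recomputing them.
fn fib_shared() -> impl Iterator<Item = u64> {
    let cache: Rc<RefCell<Vec<u64>>> = Rc::new(RefCell::new(vec![1, 1]));
    (0..).map(move |i: usize| {
        let mut c = cache.borrow_mut();
        while c.len() <= i {
            let next = c[c.len() - 1] + c[c.len() - 2];
            c.push(next);
        }
        c[i] // (this sketch ignores u64 overflow past the 93rd term)
    })
}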

I feel like I'm missing something fundamental about this topic, so I might be overlooking an important property, but I took a shot at translating both examples into Rust.

Fibonacci
nontrivial


The time cost of this implementation to take n terms of the Fibonacci sequence is Θ(n^2).

I was not very clear about the "updating" process. It is, in fact, appending.

I decided to try the "elegant" recursive-definition approach again, but this time aiming for linear cost. The trick here is modelling Haskell's lazy values in Rust. Although there are great crates for that, I have implemented a sketchy version of it (since all the computations are single-threaded, I have used Rc + RefCell, but using Send/Sync wrappers instead would make it usable in a multi-threaded scenario):

fn fib () -> LazyList<u64>
{
    // notice the fib instance is shared within the sum.
    LazyList![1 ;; 1 ;; { let fib = fib(); fib.tail().clone() + fib }]
}
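The core building block is the lazy value itself. Here is a minimal single-threaded sketch of it (my reconstruction for exposition, not necessarily the exact code behind the LazyList! macro above): a thunk that is forced at most once, with the result shared through Rc.

use std::cell::RefCell;
use std::mem;
use std::rc::Rc;

enum State<T> {
    Pending(Box<dyn FnOnce() -> T>),
    Forced(Rc<T>),
    Poisoned, // placeholder while the suspended computation runs
}

struct Lazy<T>(Rc<RefCell<State<T>>>);

impl<T> Clone for Lazy<T> {
    fn clone(&self) -> Self {
        // Clones share the same cell, so forcing one clone forces them all.
        Lazy(Rc::clone(&self.0))
    }
}

impl<T> Lazy<T> {
    fn new(f: impl FnOnce() -> T + 'static) -> Self {
        Lazy(Rc::new(RefCell::new(State::Pending(Box::new(f)))))
    }

    fn force(&self) -> Rc<T> {
        // Fast path: already forced.
        if let State::Forced(v) = &*self.0.borrow() {
            return Rc::clone(v);
        }
        // Take the suspended computation out, releasing the borrow before
        // running it, so that it may force *other* thunks.
        let f = match mem::replace(&mut *self.0.borrow_mut(), State::Poisoned) {
            State::Pending(f) => f,
            _ => unreachable!("thunk forced itself"),
        };
        let v = Rc::new(f());
        *self.0.borrow_mut() = State::Forced(Rc::clone(&v));
        v
    }
}

A lazy cons list is then a Lazy cell whose forced value is either an end marker or a head together with another such list; zipWith-style operations build new pending thunks out of the old cells without forcing them.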

Thanks for all the help. unsafe seems necessary, especially for performance reasons. Yesterday, I learned that this general paradigm has something to do with functional reactive programming, which I knew nothing about before. That is to say, this paradigm seems to have been considered before and does have several applications. I see several crates: reactive_rs and carboxyl (https://aepsil0n.github.io/carboxyl/carboxyl/index.html), though I cannot tell whether they fully support the recursive constructions mentioned above.

After getting a much better understanding of how these lists work, I was able to write the non-trivial example in constant space and as a normal iterator.

Of course, the non-trivial example could be written out in a direct representation, which would then use pretty much the same algorithm as Haskell would, modulo garbage collection. Playground

When the infinite recursion is reached, it will blow up the stack as expected.
