When to be careful with recursion --blowing up the stack

This is not particularly Rust-specific question but Rust forum are the most friendly towards newbie question in my opinion so here goes.

pub struct Quickunion {
    ids: Vec<u64>,
}

impl Quickunion {
    pub fn new(size: u64) -> Self {
        Self {
            ids: (0..size).collect(),
        }
    }

    pub fn get_root_of_recursive(&self, id: u64) -> u64 {
        if self.ids[id as usize] == id {
            id
        } else {
            self.get_root_of_recursive(self.ids[id as usize])
        }
    }

    pub fn get_root_of_iterative(&self, id: u64) -> u64 {
        let mut i = id;
        while self.ids[i as usize] != id {
            i = self.ids[i as usize];
        }
        i
    }
}

In the two implementation of the same method, why should I worry about blowing up the stack? If I see the compiler output, it's largely similar, with the iterative version with one extra mov instruction.

Does the compiler implement TCO for you? And what are the signs to avoid recursion in general?

LLVM can perform TCO although it's not particularly clever as far as I know. Last time I read about this, there were ongoing efforts to implement TCO in the Rust compiler itself, although I didn't follow them closely so I don't know if there's been any progress on those.

In general, I wouldn't worry about recursion by default. Sometimes recursion is the natural solution – go for it if you feel it makes intent clear and the code more readable.

If you ever encounter a case where recursion would be more elegant but it ends up blowing up your stack, you can always just rewrite the offending code in the iterative style. Until then – don't try to optimize prematurely.

1 Like

Some time ago I wrote this (with great help from this forum). Feel free to use it (licensed under MIT / Apache blah blah blah).

use std::iter::IntoIterator;

pub enum Thunk<R> {
    Value(R),
    Continuation { generate: Box<dyn FnOnce() -> Thunk<R>>, fold: Box<dyn FnOnce(R) -> R> },
}

/// example usage:
/// ```rust
/// use genotick::exec::thunk::execute;
/// use genotick::exec::thunk::Thunk;
/// fn fib(n: u64) -> Thunk<u64> {
/// if n == 2 || n == 1 {
///        Thunk::Value(1)
///    } else {
///        let fold = Box::new(move |r| r + execute(fib(n - 2)));
///        let generate = Box::new(move || fib(n - 1));
///        Thunk::Continuation { generate, fold }
///    }
/// }
///
///
/// fn main() {
///    println!("{}", execute(fib(10)));
/// }
/// ```

pub fn execute<R>(t: Thunk<R>) -> R {
    let mut folders = vec![];
    let mut next = t;
    let value = loop {
        match next {
            Thunk::Value(v) => break v,
            Thunk::Continuation { generate, fold } => {
                folders.push(fold);
                next = generate();
            }
        }
    };
    folders.into_iter().rev().fold(value, |value, folder| folder(value))
}

#[cfg(test)]
mod test {
    use crate::exec::thunk::{execute, Thunk};

    #[test]
    pub fn test_thunk() {
        let fib_20 = execute(fib(20));
        assert_eq!(fib_20, 46499);
    }

    #[test]
    pub fn test_thunk_order() {
        let recursive = sub_rec(10);
        let thunksive = execute(sub_thunk(10));
        assert_eq!(recursive, thunksive);
    }

    #[test]
    pub fn test_stack_overflow() {
        let sum = execute(sub_thunk(1_000_000));
        assert_eq!(sum, 500000);
    }

    fn fib(n: u64) -> Thunk<u64> {
        if n == 3 || n == 2 || n == 1 {
            Thunk::Value(1)
        } else {
            let fold = Box::new(move |r| r + execute(fib(n - 2)) + execute(fib(n - 3)));
            let generate = Box::new(move || fib(n - 1));
            Thunk::Continuation { generate, fold }
        }
    }

    fn sub_rec(n: i32) -> i32 {
        if n == 0 {
            0
        } else {
            n - sub_rec(n - 1)
        }
    }

    fn sub_thunk(n: i32) -> Thunk<i32> {
        if n == 0 {
            Thunk::Value(0)
        } else {
            let generate = Box::new(move || sub_thunk(n - 1));
            let fold = Box::new(move |r| n - r);
            Thunk::Continuation { generate, fold }
        }
    }
}
1 Like

Do you know the performance impact of your workaround?

Unfortunately, it allocates so it's rather slow (comparing to normal recursion).
At the same time, its usage seems simpler to me: it doesn't need "magic" accumulator parameter for TCO recursion.

1 Like

If you're implementing the disjoint set data structure, you should be aware that for a correct implementation, you need to flatten the path to the root every time you call get_root, and in fact you should merge in the direction that minimizes either rank or size (see wiki).

This change would make TCO impossible as it needs to do work on the way back, but since the merging behaviour results in paths of O(inverse ackermann(n)) length, you're going to need an n inconceivably larger than what fits in memory to hit the recursion limit.

1 Like

paths of O(inverse ackermann(n)) length

I guess it can only be bounded by O(log n) (and only O(n) if union direction isn't controlled) because path compression only contributes to amortized time?