When to be careful with recursion --blowing up the stack

h_a · December 23, 2019, 7:54pm

This is not particularly Rust-specific question but Rust forum are the most friendly towards newbie question in my opinion so here goes.

pub struct Quickunion {
    ids: Vec<u64>,
}

impl Quickunion {
    pub fn new(size: u64) -> Self {
        Self {
            ids: (0..size).collect(),
        }
    }

    pub fn get_root_of_recursive(&self, id: u64) -> u64 {
        if self.ids[id as usize] == id {
            id
        } else {
            self.get_root_of_recursive(self.ids[id as usize])
        }
    }

    pub fn get_root_of_iterative(&self, id: u64) -> u64 {
        let mut i = id;
        while self.ids[i as usize] != id {
            i = self.ids[i as usize];
        }
        i
    }
}

In the two implementation of the same method, why should I worry about blowing up the stack? If I see the compiler output, it's largely similar, with the iterative version with one extra mov instruction.

Does the compiler implement TCO for you? And what are the signs to avoid recursion in general?

H2CO3 · December 23, 2019, 8:06pm

LLVM can perform TCO although it's not particularly clever as far as I know. Last time I read about this, there were ongoing efforts to implement TCO in the Rust compiler itself, although I didn't follow them closely so I don't know if there's been any progress on those.

In general, I wouldn't worry about recursion by default. Sometimes recursion is the natural solution – go for it if you feel it makes intent clear and the code more readable.

If you ever encounter a case where recursion would be more elegant but it ends up blowing up your stack, you can always just rewrite the offending code in the iterative style. Until then – don't try to optimize prematurely.

lwojtow · December 23, 2019, 11:40pm

Some time ago I wrote this (with great help from this forum). Feel free to use it (licensed under MIT / Apache blah blah blah).

use std::iter::IntoIterator;

pub enum Thunk<R> {
    Value(R),
    Continuation { generate: Box<dyn FnOnce() -> Thunk<R>>, fold: Box<dyn FnOnce(R) -> R> },
}

/// example usage:
/// ```rust
/// use genotick::exec::thunk::execute;
/// use genotick::exec::thunk::Thunk;
/// fn fib(n: u64) -> Thunk<u64> {
/// if n == 2 || n == 1 {
///        Thunk::Value(1)
///    } else {
///        let fold = Box::new(move |r| r + execute(fib(n - 2)));
///        let generate = Box::new(move || fib(n - 1));
///        Thunk::Continuation { generate, fold }
///    }
/// }
///
///
/// fn main() {
///    println!("{}", execute(fib(10)));
/// }
/// ```

pub fn execute<R>(t: Thunk<R>) -> R {
    let mut folders = vec![];
    let mut next = t;
    let value = loop {
        match next {
            Thunk::Value(v) => break v,
            Thunk::Continuation { generate, fold } => {
                folders.push(fold);
                next = generate();
            }
        }
    };
    folders.into_iter().rev().fold(value, |value, folder| folder(value))
}

#[cfg(test)]
mod test {
    use crate::exec::thunk::{execute, Thunk};

    #[test]
    pub fn test_thunk() {
        let fib_20 = execute(fib(20));
        assert_eq!(fib_20, 46499);
    }

    #[test]
    pub fn test_thunk_order() {
        let recursive = sub_rec(10);
        let thunksive = execute(sub_thunk(10));
        assert_eq!(recursive, thunksive);
    }

    #[test]
    pub fn test_stack_overflow() {
        let sum = execute(sub_thunk(1_000_000));
        assert_eq!(sum, 500000);
    }

    fn fib(n: u64) -> Thunk<u64> {
        if n == 3 || n == 2 || n == 1 {
            Thunk::Value(1)
        } else {
            let fold = Box::new(move |r| r + execute(fib(n - 2)) + execute(fib(n - 3)));
            let generate = Box::new(move || fib(n - 1));
            Thunk::Continuation { generate, fold }
        }
    }

    fn sub_rec(n: i32) -> i32 {
        if n == 0 {
            0
        } else {
            n - sub_rec(n - 1)
        }
    }

    fn sub_thunk(n: i32) -> Thunk<i32> {
        if n == 0 {
            Thunk::Value(0)
        } else {
            let generate = Box::new(move || sub_thunk(n - 1));
            let fold = Box::new(move |r| n - r);
            Thunk::Continuation { generate, fold }
        }
    }
}

Phlopsi · December 24, 2019, 9:25am

Do you know the performance impact of your workaround?

lwojtow · December 24, 2019, 9:40am

Unfortunately, it allocates so it's rather slow (comparing to normal recursion).
At the same time, its usage seems simpler to me: it doesn't need "magic" accumulator parameter for TCO recursion.

alice · December 24, 2019, 10:40am

If you're implementing the disjoint set data structure, you should be aware that for a correct implementation, you need to flatten the path to the root every time you call get_root, and in fact you should merge in the direction that minimizes either rank or size (see wiki).

This change would make TCO impossible as it needs to do work on the way back, but since the merging behaviour results in paths of O(inverse ackermann(n)) length, you're going to need an n inconceivably larger than what fits in memory to hit the recursion limit.

qnighy · December 25, 2019, 1:21am

paths of O(inverse ackermann(n)) length

I guess it can only be bounded by O(log n) (and only O(n) if union direction isn't controlled) because path compression only contributes to amortized time?

system · March 24, 2020, 1:21am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
[Solved] Simulate recursion with stack and closures help	8	644	January 12, 2023
Alternative way of thinking a [Recursion] algorithm to avoid stack overflow help	3	4396	January 12, 2023
Do I get TCO here? (And is it bad style?) code review	3	789	June 12, 2022
On guarding against stack overflows while using recursion	10	1480	January 10, 2023
What to say to a C programmer when they say … community	27	3511	April 11, 2020

When to be careful with recursion --blowing up the stack

Related Topics