Using async/await for recursion without stack overflows

I've more than once needed to transform a simple recursive algorithm (such as visiting a DOM tree) into explicit iteration with a manually maintained stack of pending work (e.g. nodes to visit) to avoid stack overflows (e.g. with a deeply nested DOM tree).

I seems like async/await syntax could be used to make this transformation much easier, with a simple executor which pushes futures onto a stack.

I have a crude proof of concept (playground) which lets me write something like this:

async fn sum(node: Rc<Node<usize>>) -> usize
{
    let mut result = node.val;
    for child in &node.children {
        result += TreeExecutor::recurse(sum, Rc::clone(child)).await;
    }
    result
}

fn main() {
    let t = Tree {
        root: Some(Rc::new(Node {
            val: 77usize,
            children: vec![
                Rc::new(Node {
                    val: 123,
                    children: vec![],
                }),
                Rc::new(Node {
                    val: 312,
                    children: vec![],
                }),
            ],
        }))
    };

    let result = TreeExecutor::map(sum, Rc::clone(t.root.as_ref().unwrap()));
    println!("Got result {}", result);
}

That kind of works, but needs the data to all be 'static - hence the Rc<Node> and a lot of clones. I'd really like to be able to have a scoped version which would let you traverse a local data structure without needing anything 'static. I think that means threading a reference to the executor (or something related) through the recursive calls, and it seems like the lifetime handling would be quite similar to how arenas can be implemented but I think with extra lifetimes.

Here's my attempt below (playground link), which doesn't compile:

error[E0597]: `te` does not live long enough
   --> src/main.rs:160:36
    |
160 |     let result = TreeExecutor::map(&te, sum, Rc::clone(t.root.as_ref().unwrap()));
    |                                    ^^^ borrowed value does not live long enough
161 |     println!("Got result {}", result);
162 | }
    | -
    | |
    | `te` dropped here while still borrowed
    | borrow might be used here, when `te` is dropped and runs the destructor for type `TreeExecutor<'_, '_>`

I'm hoping that people here can provide the missing insight and/or point out the silly lifetime errors. :slight_smile:

I have just come across the async-recursion from this thread which I'll also look at for inspiration.

use std::fmt::Debug;
use std::future::Future;
use std::pin::Pin;
use futures::future::FutureExt;
use futures::task::noop_waker_ref;
use std::cell::RefCell;
use std::task::{Poll,Context};
use std::rc::Rc;
use std::marker::PhantomData;

#[derive(Debug)]
struct Node<T: Debug> {
    val: T,
    children: Vec<Rc<Node<T>>>,
}
#[derive(Debug)]
struct Tree<T: Debug> {
    root: Option<Rc<Node<T>>>,
}

// Lifetimes:
// 'data: lifetime of the data we want to traverse.
// 'te: lifetime of the TreeExecutor
type TETask<'t> = Pin<Box<dyn Future<Output=()>+'t>>;
struct TreeExecutor<'te, 'data> {
    tasks: RefCell<Vec<TETask<'te>>>,
    phantom: PhantomData<RefCell<&'data str>>,
}

impl<'te, 'data: 'te> Default for TreeExecutor<'te, 'data> {
    fn default() -> TreeExecutor<'te, 'data> {
        TreeExecutor {
            tasks: RefCell::new(vec![]),
            phantom: PhantomData,
        }
    }
}

struct PendingFuture<R> {
    result: Rc<RefCell<Option<R>>>,
}

impl<R> PendingFuture<R> {
    pub fn new() -> Self {
        PendingFuture {
            result: Rc::new(RefCell::new(None)),
        }
    }

    pub fn get_result_ptr(&self) -> Rc<RefCell<Option<R>>> {
        self.result.clone()
    }
}

impl<R> Future for PendingFuture<R> {
    type Output = R;

    fn poll(self: Pin<&mut Self>,
                _cx: &mut Context<'_>) -> Poll<Self::Output> {
        match self.result.borrow_mut().take() {
            None => Poll::Pending,
            Some(r) => Poll::Ready(r),
        }
    }
}

impl<'te, 'data> TreeExecutor<'te, 'data> {
    pub fn map<T, R, F, Fu>(te: &'te TreeExecutor<'te, 'data>, f: F, arg: T) -> R
    where Fu: Future<Output=R>,
          F: Fn(&'te TreeExecutor<'te, 'data>, T) -> Fu
    {
        let mut fut = f(te, arg).boxed_local();

        loop {
            match fut.as_mut().poll(&mut Context::from_waker(noop_waker_ref())) {
                Poll::Pending => {
                    println!("Future not ready yet");
                }
                Poll::Ready(r) => {
                    return r;
                }
            }
            let (num_tasks, mut task) = {
                let mut tasks = te.tasks.borrow_mut();
                let num_tasks = tasks.len();
                (num_tasks, tasks.pop().unwrap())
            };
            // Try to poll the task
            match task.as_mut().poll(&mut Context::from_waker(noop_waker_ref())) {
                Poll::Pending => {
                    println!("New task not ready yet");
                    // Put it back, before any new ones which may have been added.
                    {
                        te.tasks
                          .borrow_mut()
                          .insert(num_tasks-1, task);
                    };
                }
                Poll::Ready(()) => {
                    println!("New task ready!");
                }
            }
        }
    }

    pub fn recurse<T, R, F, Fu>(te: &'te TreeExecutor<'te, 'data>, f: F, arg: T) -> Pin<Box<dyn Future<Output=R>>>
    where Fu: Future<Output=R>,
          F: Fn(&'te TreeExecutor<'te, 'data>, T) -> Fu,
          F: 'static,
          R: 'static,
          T: 'static
    {
        // Instead of actually recursing, we create a future
        // and push it onto a queue, but return a different
        // future which just stalls.
        let result = PendingFuture::<R>::new();

        let result_ptr = result.get_result_ptr();

        let task = async move {
            let res = f(te, arg).await;
            *result_ptr.borrow_mut() = Some(res);
        }.boxed_local::<'te>();

        {
            te.tasks.borrow_mut().push(task);
        }

        result.boxed_local()
    }
}

async fn sum<'te, 'data>(te: &'te TreeExecutor<'te, 'data>, node: Rc<Node<usize>>) -> usize
{
    let mut result = node.val;
    for child in &node.children {
        result += TreeExecutor::recurse(te, sum, Rc::clone(child)).await;
    }
    result
}

fn main() {
    let t = Tree {
        root: Some(Rc::new(Node {
            val: 77usize,
            children: vec![
                Rc::new(Node {
                    val: 123,
                    children: vec![],
                }),
                Rc::new(Node {
                    val: 312,
                    children: vec![],
                }),
            ],
        }))
    };

    let te: TreeExecutor = Default::default();
    let result = TreeExecutor::map(&te, sum, Rc::clone(t.root.as_ref().unwrap()));
    println!("Got result {}", result);
}

What you are trying to do here is harder than just using a manually maintained stack, to be honest.

While this isn't what you asked for, I have started writing programs where I immediately spawn a new thread with a large stack size 1GB in my case) where all the programs actual work is done. This deals with the fairly small default stack size.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.