How Does Rust Handle Stack Management and Continuations?

I recently read a paper titled Stacks and Continuations, which discusses different stack management strategies as follows:

  1. Contig - Stack frames are allocated in a large, fixed-size contiguous region, which is the native stack discipline used by C.
  2. Resize - Stack frames are allocated in a contiguous region that is initially small. The stack is copied to a new region that is twice as large when the stack overflows.
  3. Segment - Stack frames are allocated in contiguous fixed-size segments, which are linked together.
  4. Hybrid - A hybrid scheme that uses a resizable stack until the stack grows to the size of a segment, at which point the call stack is managed as a segmented stack.
  5. Linked - Each individual stack frame is allocated as a mutable heap object, and the stack is represented as a linked list of frames.
  6. CPS - A direct translation of the first-order CPS representation where return continuations are represented by immutable, heap-allocated continuation closures.

I’m curious, which strategy does Rust follow?

Thank you!

1 Like

Rust thread stacks are allocated by the operating system (same as a program written in C or any other language that does not specially intervene) and are, in this vocabulary, “contig”.

If you need unbounded stack space, you can achieve that with stacker, which allocates new stack segments from the heap.

Rust async code works in an entirely different way — space for local variables of async blocks (whose scope crosses an await suspension point) is allocated in the manner of an enum, which has the effect that the exact amount of “stack” space needed is known statically (except for calls of Boxed futures, which extend into the heap).

Neither normal nor async Rust supports continuations (unless you count passing a partially-executed Future around, but you cannot do very much with that), so no special mechanisms are needed to track when stacks may be freed; they are freed after the thread or async task that was using them terminates.

2 Likes

The first.

It had segmented stacks of some form pre-1.0, but no more.

Thank you for your detailed explanation ! This makes things much clearer. :smile:

This isn't quite accurate. Only the variables which are held over .await points become part of an enum. This can easily include all locals if one isn't careful, since locals are normally dropped at the end of the scope, but one can write async code in a way which minimizes stored state. The variables which are not held over an .await point are processed normally, on the system stack. In particular, an async fn which doesn't .await is almost indistinguishable from an ordinary function (there is a tiny bit of async state for bookkeeping, but basically the whole computation would happen on the system stack as usual).

You are correct; my statement was too strong and I have corrected it.

But regarding the memory usage characteristics of async blocks, it’s worth noting that in the current implementation, all variables that are in scope of the await, and haven't been moved out of, get stored, even when not doing so is an obvious optimization. For example:

pub async fn example() {
    other().await;
    let x = 1;
    println!("{x}");
    other().await;
}
pub async fn other() {}
$ cargo +nightly rustc --lib -- -Zprint-type-sizes
...
print-type-size type: `{async fn body of example()}`: 8 bytes, alignment: 4 bytes
print-type-size     discriminant: 1 bytes
print-type-size     variant `Unresumed`: 0 bytes
print-type-size     variant `Suspend0`: 1 bytes
print-type-size         local `.__awaitee`: 1 bytes, type: {async fn body of other()}
print-type-size     variant `Suspend1`: 7 bytes
print-type-size         local `.__awaitee`: 1 bytes, type: {async fn body of other()}
print-type-size         padding: 2 bytes
print-type-size         local `.x`: 4 bytes, alignment: 4 bytes
print-type-size     variant `Returned`: 0 bytes
print-type-size     variant `Panicked`: 0 bytes
...

There is no need to store the integer x — in particular, it has no destructor, so it is not needed at the end of the block — but it is stored anyway, so if you want a memory-efficient async fn, you may need to add blocks so that no unnecessary variables are in scope at the await points:

pub async fn example() {
    other().await;
    {
        let x = 1;
        println!("{x}");
    }
    other().await;
}
$ cargo +nightly rustc --lib -- -Zprint-type-sizes
...
print-type-size type: `{async fn body of example()}`: 2 bytes, alignment: 1 bytes
print-type-size     discriminant: 1 bytes
print-type-size     variant `Unresumed`: 0 bytes
print-type-size     variant `Suspend0`: 1 bytes
print-type-size         local `.__awaitee`: 1 bytes, type: {async fn body of other()}
print-type-size     variant `Suspend1`: 1 bytes
print-type-size         local `.__awaitee`: 1 bytes, type: {async fn body of other()}
print-type-size     variant `Returned`: 0 bytes
print-type-size     variant `Panicked`: 0 bytes
...

(I had to learn about this behavior when trying to make some async code fit on a machine with a small stack; even though these Futures are conceptually moving data off the stack, they still have to be returned by value from their async fns, and the total size can quickly add up when those futures manipulate large values and call other futures that do the same.)

3 Likes