let mut b = false;
async { // 1
    if b {
        println!("hello world");
    }
    b = true;
    async { // 2
        async { // 3
            some_future.await
        }.await
    }.await
}.await
Let's say some_future.await returns Pending. Control then returns up the chain: some_future ---> 3 ---> 2 ---> 1 ---> executor, and 1 is put on the wait queue.
When some_future is woken, 1 is scheduled again, because 1, 2, 3 and some_future all share the same waker.
OK, then 1 begins to run. My question is: why isn't hello world printed this time?
It seems that only saving the program counter can achieve this.
But the magic of saving the program counter would also have to include saving all other registers and all values on the stack into some struct, and restoring everything to the stack before rescheduling. That is almost the same as a stackful coroutine.
I know that is not how async blocks are implemented. But how are they really implemented?
Polling a future is performed by the runtime, and the runtime polls all futures which are not yet finished in a loop (according to some more or less sophisticated scheduling policy).
1, 2 and 3 are all futures too, and none of them has finished either. Only one of 1, 2, 3 and some_future should be considered by the runtime for rescheduling. I think it should be the topmost one, 1, which calls 2 and 3 and finally reaches some_future (or something like calling 2 and 3 until some_future is reached, like restoring the PC and stack).
If some_future were called directly by the runtime, how would the stack stay balanced after returning to 3? In fact it can't return to 3; in that case it returns to the runtime, and the code after it would not execute.
That's basically what happens, except that it is heavily optimized. For example, you don't need to save the entire stack, you need to save only the local variables that are live over an .await point. And you don't need to save the program counter, you only need to save the next entry point of a coroutine. Thus you have a state machine, where states correspond to the start, exit and all .await points in the async block, and the data of the state is all live local variables of that async block (local borrows mean that this state enum is usually self-referential). There is only a finite number of states (a single await within a loop corresponds to a single state, the loop's execution is entirely determined by the current local variables). The size of each state is also finite, since there is a finite number of finite-size local variables at that state. So the entire state enum is finite, and its size is known at compile time, since the number of states and the sizes of all local variables are known at compile time.
This means that the entire execution of the state machine is encoded in the fixed-sized Future object, which can be allocated anywhere, stack or heap. Basically the "stack" of the coroutine is that allocation (plus some temporary usage of the program stack for temporary local variables).
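To make this concrete, here is a minimal hand-written sketch of roughly what such a state machine could look like for block 1 from the question. It is not the compiler's actual output: `Block1`, `YieldOnce` and `poll_once` are hypothetical names invented for illustration, blocks 2 and 3 are collapsed into a single inner future for brevity, and `YieldOnce` stands in for `some_future` (it returns Pending once, then Ready).

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for `some_future`: Pending on the first poll, Ready on the second.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref(); // ask to be polled again
            Poll::Pending
        }
    }
}

/// Hand-written approximation of block 1: one variant per entry point,
/// each carrying the variables that are live at that point.
enum Block1 {
    Start { b: bool },
    Awaiting { b: bool, inner: YieldOnce },
    Done,
}

impl Future for Block1 {
    type Output = bool; // returns the final value of `b`, so the result can be inspected
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<bool> {
        let this = self.get_mut(); // fine here: every field is Unpin
        loop {
            match &mut *this {
                Block1::Start { b } => {
                    // Code before the first `.await`; runs only when entered from Start.
                    if *b {
                        println!("hello world");
                    }
                    *this = Block1::Awaiting { b: true, inner: YieldOnce { yielded: false } };
                }
                Block1::Awaiting { b, inner } => {
                    let polled = Pin::new(inner).poll(cx);
                    match polled {
                        // Suspend: the state already records where to resume.
                        Poll::Pending => return Poll::Pending,
                        Poll::Ready(()) => {
                            let result = *b;
                            *this = Block1::Done;
                            return Poll::Ready(result);
                        }
                    }
                }
                Block1::Done => panic!("polled after completion"),
            }
        }
    }
}

/// Waker that does nothing; enough for polling by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical helper: poll a future exactly once.
fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}

fn main() {
    let mut fut = Block1::Start { b: false };
    assert!(poll_once(&mut fut).is_pending()); // runs the prefix, then suspends
    assert_eq!(poll_once(&mut fut), Poll::Ready(true)); // resumes at Awaiting
}
```

The second poll enters at `Awaiting`, so the `if b` check from the start of the block is never re-run, even though `b` is true by then. No registers or stack frames are restored; the enum variant is the "program counter".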
Yes. The code is completely transformed to be a bunch of enums. Variables in the function become data fields in the enum. What is executed by poll doesn't look anything like your async function.
The related term for this is "CPS transformation" with CPS standing for continuation passing style, except Rust uses a single function with enums instead of multiple closures/lambdas.
Then are the local variables touched by the load/store instructions in the async block (or in poll()) located on the stack, or in the fixed-size Future object?
If they are on the stack, then even variables that needn't be captured would still have to occupy a place in the new stack when the future gets called again, because the instructions only know one stack layout.
That would mean the future object saves the stack offset of each variable, leaving holes in the stack for variables that are not captured but still have a slot in the old stack.
Variables that are carried across an await are stored in the future, not on the stack (except temporarily, up to the optimizer). The future's data layout is chosen to store those variables; there is no need for it to match a stack frame layout.
No. The local variables which are not live over an await point are not stored in the future. Also, the executed code is quite different from the "linear" code in the async block, as you can see from kornel's example.
The compiler always knows which variables need to be stored and which are transient, and whether the stored variables need to be copied between the future and the stack.
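One way to observe this is to compare the sizes of two futures that differ only in whether a large buffer is used after the `.await`. This is a small sketch, not a specification: `keeps_buffer` and `drops_buffer` are made-up names, and the exact sizes are a compiler implementation detail that the language does not guarantee, so the checks below only compare against the buffer size.

```rust
use std::future::{ready, Future};
use std::mem::size_of_val;

/// `buf` is read after the await, so it is live across the suspension point
/// and has to be stored inside the future.
fn keeps_buffer(seed: u8) -> impl Future<Output = u8> {
    async move {
        let buf = [seed; 1024];
        ready(()).await;
        buf[0]
    }
}

/// `buf` goes out of scope before the await; only `first` is carried across.
fn drops_buffer(seed: u8) -> impl Future<Output = u8> {
    async move {
        let first = {
            let buf = [seed; 1024];
            buf[0]
        };
        ready(()).await;
        first
    }
}

fn main() {
    // The futures are never polled; only their sizes are inspected.
    let big = size_of_val(&keeps_buffer(1));
    let small = size_of_val(&drops_buffer(1));
    println!("keeps_buffer: {big} bytes, drops_buffer: {small} bytes");
    assert!(big >= 1024); // must have room for the whole buffer
    assert!(small < 1024); // the buffer is not part of the saved state
}
```

In `drops_buffer` the array still exists while that part of `poll` runs, but only as a temporary on the ordinary program stack during that one call.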
I think the search keyword for this question is CPS, the abbreviation of continuation-passing style. The short answer is given above: the code is separated into several states.