Twiggy reports join3 as garbage, but why and can this code be eliminated?

In an embedded project with embassy, I have an async fn that ends with

async fn ..... {
    // ....
    let _ = join3(a_fut, b_fut, c_fut).await;

This is to get the async code parts to run, but they basically will never end.

To look at the memory use of my project, I ran twiggy garbage on my build and found:

 Bytes │ Size % │ Garbage Item
  6784 ┊  0.64% ┊ <embassy_futures::join::Join3<Fut1,Fut2,Fut3> as core::future::future::Future>::poll::h1d435c3442504560
  2620 ┊  0.25% ┊ Subroutine[0][12388]

I now wonder why join3 is reported as dead code? And is there a way to remove it or replace it by other means that get the async parts to run?

I'm not familiar with twiggy, but going by its documentation, I think it works like a linker. I'm not sure how to interpret the report, but if it means Join3::poll is not reachable, my guess is that maybe you didn't spawn your async fn onto any executor?

see embassy_executor::SpawnToken and embassy_executor::task:

you typically get an embassy_executor::Spawner from the main task:

The async parts are all running as they should. I see this in the (debug) output they produce. Two of the futures are async blocks, a bit like this...

pub async fn some_task(...) {
    // ....
    let a_fut = async {
        loop {
            // infinite loop with some .awaits in there
        }
    };
    let b_fut = async {
        loop {
            // infinite loop with some .awaits in there
        }
    };
    let _ = join3(a_fut, b_fut, c_fut).await;
}

... and the task is spawned from main.

again, I don't know the precise meaning of the twiggy report. but if all the futures are spawned and polled as expected, my guess is that twiggy traces reachability at basic-block granularity (as opposed to function granularity), in which case the report could simply reflect the fact that the future never resolves (its poll() always returns Pending).

I don't know the exact details, but the join3 should be implemented in a way similar to the following (pseudo code)

fn poll(self: Pin<&mut Self>, cx: &mut Context) -> Poll<(A, B, C)> {
    // a, b, c should be some "fused" wrapper future types
    let (a, b, c) = pin_project(...);
    let a = a.poll(cx);
    let b = b.poll(cx);
    let c = c.poll(cx);
    if a.is_ready() && b.is_ready() && c.is_ready() {
        Poll::Ready((a.get(), b.get(), c.get()))
    } else {
        Poll::Pending
    }
}

given that some of the futures always return Pending because they are in infinite loops, the condition of the if would never be true, so the basic block of the true branch would be dead code.
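to make this guess concrete, here is a self-contained sketch using only the standard library. it is not the embassy_futures code: Slot, Join3, NoopWake and poll_join are names made up for illustration, and the children are boxed so that no pin projection is needed. the third child stands in for an async block with an infinite loop.

```rust
use std::future::{self, Future};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// One child of the join: still running, or finished with its output stored.
enum Slot<T> {
    Running(Pin<Box<dyn Future<Output = T>>>),
    Done(Option<T>),
}

impl<T> Slot<T> {
    fn new(fut: impl Future<Output = T> + 'static) -> Self {
        Slot::Running(Box::pin(fut))
    }

    /// Polls the child unless it already finished; returns true once it is done.
    fn poll_slot(&mut self, cx: &mut Context<'_>) -> bool {
        let output = match self {
            Slot::Running(fut) => match fut.as_mut().poll(cx) {
                Poll::Ready(v) => v,
                Poll::Pending => return false,
            },
            Slot::Done(_) => return true,
        };
        *self = Slot::Done(Some(output));
        true
    }

    /// Takes the stored output; only valid after `poll_slot` returned true.
    fn take(&mut self) -> T {
        match self {
            Slot::Done(v) => v.take().expect("output already taken"),
            Slot::Running(_) => panic!("child has not finished"),
        }
    }
}

/// Hypothetical stand-in for a three-way join (an assumption, not embassy's type).
struct Join3<A, B, C> {
    a: Slot<A>,
    b: Slot<B>,
    c: Slot<C>,
}

impl<A: Unpin, B: Unpin, C: Unpin> Future for Join3<A, B, C> {
    type Output = (A, B, C);

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.get_mut();
        // Poll every unfinished child on every call, then check whether all are done.
        let a = this.a.poll_slot(cx);
        let b = this.b.poll_slot(cx);
        let c = this.c.poll_slot(cx);
        if a && b && c {
            // Never reached if any child loops forever.
            Poll::Ready((this.a.take(), this.b.take(), this.c.take()))
        } else {
            Poll::Pending
        }
    }
}

/// Waker that does nothing; enough for polling by hand in this sketch.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls the join up to `n` times and returns its output if it ever completes.
fn poll_join(n: usize, third_finishes: bool) -> Option<(u8, u8, u8)> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let c = if third_finishes {
        Slot::new(future::ready(3u8))
    } else {
        // Models an async block that never returns.
        Slot::new(future::pending::<u8>())
    };
    let mut join = Join3 {
        a: Slot::new(future::ready(1u8)),
        b: Slot::new(future::ready(2u8)),
        c,
    };
    (0..n).find_map(|_| match Pin::new(&mut join).poll(&mut cx) {
        Poll::Ready(out) => Some(out),
        Poll::Pending => None,
    })
}
```

with third_finishes = false the Ready branch is never taken no matter how often the join is polled, which is the situation described above; with true the join completes on the first poll.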

if my guess is correct, I don't think there's anything to worry about; the LLVM optimizer should in theory also detect the dead code and eliminate it. what optimization flags are you using to build the code?
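for comparison, a size-oriented release profile in Cargo.toml often looks roughly like the fragment below. the values are only an assumption for illustration, not your actual settings, and whether they change what twiggy reports would have to be checked on the real build.

```toml
[profile.release]
opt-level = "s"     # optimize for size ("z" is more aggressive)
lto = true          # link-time optimization across crates
codegen-units = 1   # a single unit gives the optimizer a wider view
```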

in your example code shown above, is there a reason you cannot make a_fut, b_fut or c_fut tasks themselves and spawn them on the executor?

although the join() combinator has a similar effect of "running" the futures "concurrently", its intended usage is to wait for all of the futures to resolve and return their Output values at once.

in the fork-join concurrency model, the fork primitive starts execution of tasks, and the join primitive waits for the tasks to finish. although the join part implies an implicit fork, it feels a bit off to use join to start concurrent executions.

besides that, join may be less efficient than an actual executor: each time any of the futures is woken, all of them (three in the case of join3) are polled, even those that were NOT woken by their wakers. well-behaved futures should handle spurious wakeups, but it is less efficient. an executor, by contrast, only schedules the task that was actually woken by its waker.
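a small model of that difference, using only the standard library. CountingPending, PollAll3, NoopWake and poll_counts are hypothetical names for this sketch, and polling a single child directly is only a stand-in for what an executor does with separately spawned tasks; it is not how embassy is implemented.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A future that never completes and counts how often it is polled.
struct CountingPending(Rc<Cell<u32>>);

impl Future for CountingPending {
    type Output = ();

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        self.0.set(self.0.get() + 1);
        Poll::Pending
    }
}

/// Join-style parent (an assumption for illustration): one poll of the
/// parent polls every child, whichever child caused the wakeup.
struct PollAll3 {
    children: [CountingPending; 3],
}

impl Future for PollAll3 {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let this = self.get_mut();
        let mut all_done = true;
        for child in this.children.iter_mut() {
            all_done &= Pin::new(child).poll(cx).is_ready();
        }
        if all_done { Poll::Ready(()) } else { Poll::Pending }
    }
}

/// Waker that does nothing; enough for polling by hand in this sketch.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Simulates `wakeups` wakeups that all originate from the first child and
/// returns how often each child was polled.
fn poll_counts(wakeups: u32, use_join: bool) -> [u32; 3] {
    let counters: [Rc<Cell<u32>>; 3] = Default::default();
    let mut join = PollAll3 {
        children: counters.clone().map(CountingPending),
    };
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    for _ in 0..wakeups {
        if use_join {
            // The parent is the only task, so it re-polls all three children.
            let _ = Pin::new(&mut join).poll(&mut cx);
        } else {
            // Separate tasks: only the task that was woken gets polled.
            let _ = Pin::new(&mut join.children[0]).poll(&mut cx);
        }
    }
    counters.map(|c| c.get())
}
```

in this model, five wakeups from the first child cost fifteen child polls through the join but only five when each future is its own task.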