Assume there is a spawned task with a loop that keeps invoking tokio::time::sleep(…).await and nothing else. The documentation says that dropping the join handle does not abort the task. Is it then guaranteed that the task will continue to be polled and make progress until the runtime is shut down, or is there some mechanism that disposes of or heavily deprioritizes detached, marginally active tasks? I have a background job that is supposed to stay alive, and it seems to die after some hours or days, but keeping the join handle around also seems to help.
As per the docs:
A JoinHandle detaches the associated task when it is dropped, which means that there is no longer any handle to the task, and no way to join on it. […]
If a JoinHandle is dropped, then the task continues running in the background and its return value is lost.
That is: a task continues to run to completion even after its JoinHandle has been dropped. Keeping the handle alive does not keep the task alive.
If you've got a task that is exiting unexpectedly, I'd look at what that task is doing; Tokio isn't terminating it for you.
Yeah, this is my understanding too. I never saw it exit; it just seems to stop making progress. In addition to the sleep(…) I mentioned, the loop also calls tracing::info!(…), and the logs just stop appearing after a day or so of running (and the actual work stops being done). I wanted to ask anyway to make sure there was nothing behind the scenes in the runtime. Maybe it was a coincidence that keeping the join handle seemed to help.
Tokio does not dispose of background tasks in any way.
If you have a task that sometimes appears to stop running, then you probably have a different task that is blocking the thread. You may be able to use tokio-console to debug this by looking for tasks whose "busy" number is increasing but the "poll" number is not increasing.
Thank you! Was not aware of that tool.
One other thought: in my scenario, the multi-thread runtime is used with at least two threads, and everything else is working fine. To elaborate, it is an Axum server, and it serves requests just fine at all times. Is your explanation about “a different task blocking the thread” still plausible then?