Async: Best way to download many files without overloading the client and/or the server? (native & web/wasm)

I am to write a tool whose purpose is to download an unspecified number of files over HTTP (let's say 10_000 files for the sake of argument).

I want to do it using async/await because one of my target environments is web/wasm, so I can't start threads.

I think I could have used a bounded spmc channel, but I don't know of any spmc implementations that work for web/wasm.

My current idea is to let the application have N "active" async tasks, so the client and/or server doesn't have to handle 10_000 connections at once.

I am wondering if there exists a crate/macro for this already?

Something like, if I wanted a maximum of 8 tasks simultaneously:

simultaneous_max(8, || {
    // spawn may be tokio::spawn, or wasm_bindgen_futures::spawn_local, or something else
    Facade::spawn(async { /**/ });
});

Or should I just slap a Mutex into a lazy_static, let the async tasks call a helper method when done, and call it a day?

=====

I also want to know what anyone with experience with similar scenarios has to say about this - is it a good idea to limit the number of simultaneous tasks? Should I be doing something else?

=====

Thanks.

Am I overthinking this? Should I just fire away all the requests at once?

buffer_unordered is the first thing that springs to mind. Outside of a browser you should probably reuse connections. I can't think of anything specific that is easier than a DIY solution, but others may have something in mind.
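
To make the idea behind buffer_unordered concrete, here is a minimal std-only sketch of bounded concurrency: keep at most `limit` futures in flight and start the next one whenever a slot frees up. Everything in it (`FakeDownload`, `run_bounded`, `demo`, the no-op waker) is a hypothetical stand-in written for illustration; it busy-polls instead of using a real executor and is not how the futures crate implements it.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Waker that does nothing; good enough for the busy-polling loop below.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical stand-in for one HTTP download: pending for a few polls,
// then ready with its id. It also records how many downloads are in flight.
struct FakeDownload {
    id: u32,
    polls_left: u32,
    started: bool,
    in_flight: Rc<Cell<usize>>,
    peak: Rc<Cell<usize>>,
}

impl Future for FakeDownload {
    type Output = u32;
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
        if !self.started {
            self.started = true;
            self.in_flight.set(self.in_flight.get() + 1);
            self.peak.set(self.peak.get().max(self.in_flight.get()));
        }
        if self.polls_left == 0 {
            self.in_flight.set(self.in_flight.get() - 1);
            Poll::Ready(self.id)
        } else {
            self.polls_left -= 1;
            Poll::Pending
        }
    }
}

// Drive every future from `jobs` to completion, never holding more than
// `limit` of them at once. Results arrive in completion order.
fn run_bounded<I, F>(jobs: I, limit: usize) -> Vec<F::Output>
where
    I: IntoIterator<Item = F>,
    F: Future,
{
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut pending = jobs.into_iter();
    let mut active: Vec<Pin<Box<F>>> = Vec::new();
    let mut done = Vec::new();
    loop {
        // Top up the active set until the limit is reached.
        while active.len() < limit {
            match pending.next() {
                Some(job) => active.push(Box::pin(job)),
                None => break,
            }
        }
        if active.is_empty() {
            break;
        }
        // Poll each active future once; finished ones free their slot.
        let mut i = 0;
        while i < active.len() {
            let polled = active[i].as_mut().poll(&mut cx);
            match polled {
                Poll::Ready(out) => {
                    done.push(out);
                    drop(active.swap_remove(i));
                }
                Poll::Pending => i += 1,
            }
        }
    }
    done
}

// Runs `total` fake downloads with the given limit and returns the sorted
// ids together with the highest number of downloads seen in flight.
fn demo(total: u32, limit: usize) -> (Vec<u32>, usize) {
    let in_flight = Rc::new(Cell::new(0usize));
    let peak = Rc::new(Cell::new(0usize));
    let jobs = (0..total).map(|id| FakeDownload {
        id,
        polls_left: id % 3,
        started: false,
        in_flight: Rc::clone(&in_flight),
        peak: Rc::clone(&peak),
    });
    let mut ids = run_bounded(jobs, limit);
    ids.sort();
    (ids, peak.get())
}
```

Calling `demo(20, 8)` completes all 20 fake downloads while never having more than 8 in flight. With the futures crate, the same shape would typically be written as `stream::iter(urls).map(fetch).buffer_unordered(8)`, where `urls` and `fetch` are placeholders for your own list and download function.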

Streams are a good way. Another option is to use a semaphore:
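
A rough std-only sketch of the semaphore approach, for illustration: every task is created up front, but each one must hold a permit while it "downloads", so only `limit` of them make progress at a time. The `Semaphore`, `Permit`, `YieldOnce`, `download`, `run_all`, and `demo` names are hypothetical and single-threaded; a real project would more likely use an existing async semaphore and a real executor.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type Task = Pin<Box<dyn Future<Output = u32>>>;

// Waker that does nothing; the executor below simply polls in a loop.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Pending on the first poll, ready on the second: a stand-in for I/O.
struct YieldOnce(bool);
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

// Single-threaded counting semaphore; the cell holds the free permits.
#[derive(Clone)]
struct Semaphore(Rc<Cell<usize>>);

// Holding a Permit occupies one slot; dropping it gives the slot back.
struct Permit(Rc<Cell<usize>>);
impl Drop for Permit {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

impl Semaphore {
    fn new(permits: usize) -> Self {
        Semaphore(Rc::new(Cell::new(permits)))
    }

    // Yield until a permit is free, then take it.
    async fn acquire(&self) -> Permit {
        while self.0.get() == 0 {
            YieldOnce(false).await;
        }
        self.0.set(self.0.get() - 1);
        Permit(Rc::clone(&self.0))
    }
}

// Fake download: the work only starts once a permit is held.
async fn download(
    id: u32,
    sem: Semaphore,
    in_flight: Rc<Cell<usize>>,
    peak: Rc<Cell<usize>>,
) -> u32 {
    let _permit = sem.acquire().await;
    in_flight.set(in_flight.get() + 1);
    peak.set(peak.get().max(in_flight.get()));
    YieldOnce(false).await; // pretend to wait for the response
    in_flight.set(in_flight.get() - 1);
    id // `_permit` is dropped here, freeing the slot
}

// Minimal round-robin executor: poll every task until all are finished.
fn run_all(mut tasks: Vec<Task>) -> Vec<u32> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut done = Vec::new();
    while !tasks.is_empty() {
        let mut i = 0;
        while i < tasks.len() {
            let polled = tasks[i].as_mut().poll(&mut cx);
            match polled {
                Poll::Ready(id) => {
                    done.push(id);
                    drop(tasks.swap_remove(i));
                }
                Poll::Pending => i += 1,
            }
        }
    }
    done
}

// Creates `total` tasks sharing one semaphore with `limit` permits
// (`limit` must be at least 1, otherwise no task can ever start).
fn demo(total: u32, limit: usize) -> (Vec<u32>, usize) {
    let sem = Semaphore::new(limit);
    let in_flight = Rc::new(Cell::new(0usize));
    let peak = Rc::new(Cell::new(0usize));
    let tasks: Vec<Task> = (0..total)
        .map(|id| {
            Box::pin(download(id, sem.clone(), in_flight.clone(), peak.clone())) as Task
        })
        .collect();
    let mut ids = run_all(tasks);
    ids.sort();
    (ids, peak.get())
}
```

Compared with a bounded stream, this creates all the task objects immediately and only limits how many are doing work, which may matter for memory if the file count is very large. If tokio is available, `tokio::sync::Semaphore` offers the same acquire-then-drop pattern; whether it suits a given wasm target is something to check.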

Depends on your environment. A browser will generally limit a page to a certain number of concurrent downloads from a single remote hostname anyway, so you might not need to worry.

Ended up doing

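use futures::stream::{FuturesUnordered, StreamExt};

async fn process_parallel() -> Result<(), MyError> {
    // Set of in-flight futures; results come back in completion order.
    let mut par_things = FuturesUnordered::new();
    // ...
    par_things.push(async move {
        // ...
    });
    // Drain the set, returning early on the first error.
    while let Some(data_result) = par_things.next().await {
        let _ = data_result?;
    }
    Ok(())
}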

These posts contained some further helpful tips that I will be looking at shortly:

Thanks for the input, everyone!
