Download multiple files in parallel

#1

Hi,

I need to fetch and process lots of URLs from a list (each response can be processed individually). Doing it sequentially would probably be too slow, so I want to parallelize it so that at most n downloads are running at any time. Conceptually, what's the best approach? Should I use threads where each one has a reqwest::Client? Is async the way to go?

If there are any examples out there, or you think I should check out a crate other than reqwest, I'd appreciate a link.

#2

I would spin up N threads and use reqwest::Client from each to launch a download. I’d use crossbeam-channel to communicate the results from each thread back to your main thread, if necessary.

You could use async for this, but it seems like overkill at this point in time.
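
A minimal sketch of the thread approach, using only the standard library so it runs without network access. Assumptions: `fetch` is a hypothetical stand-in for a real blocking download (for example through reqwest), `download_all` is an illustrative name, and `std::sync::mpsc` is swapped in for crossbeam-channel. The workers pull URLs from a shared queue, so at most `n` downloads run at once.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a real download (e.g. a blocking HTTP call).
/// It returns the URL length so the sketch needs no network access.
fn fetch(url: &str) -> Result<usize, String> {
    if url.is_empty() {
        Err("empty URL".to_string())
    } else {
        Ok(url.len())
    }
}

/// Runs `fetch` over `urls` with at most `n` worker threads.
/// Results come back in completion order, paired with their URL.
fn download_all(urls: Vec<String>, n: usize) -> Vec<(String, Result<usize, String>)> {
    // Shared work queue: each worker takes the next URL until none are left.
    let queue = Arc::new(Mutex::new(urls.into_iter()));
    let (tx, rx) = mpsc::channel();

    let workers: Vec<_> = (0..n.max(1))
        .map(|_| {
            let queue = Arc::clone(&queue);
            let tx = tx.clone();
            thread::spawn(move || loop {
                // The lock is held only while taking the next URL.
                let next = queue.lock().unwrap().next();
                match next {
                    Some(url) => {
                        let result = fetch(&url);
                        // The receiver stays alive until all workers finish.
                        tx.send((url, result)).unwrap();
                    }
                    None => break,
                }
            })
        })
        .collect();

    // Drop the original sender so the receiver ends once all workers are done.
    drop(tx);
    let results: Vec<_> = rx.iter().collect();
    for worker in workers {
        worker.join().unwrap();
    }
    results
}

fn main() {
    let urls = vec![
        "https://a.example/1".to_string(),
        "https://b.example/22".to_string(),
    ];
    for (url, result) in download_all(urls, 2) {
        println!("{} -> {:?}", url, result);
    }
}
```

In a real program each worker would create its own client once, before the loop, and reuse it for every URL it takes from the queue.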

#3

I prefer to use async in such situations, because it suits the task well. Fortunately, the Stream combinators do most of the work for us:

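// Assumes futures 0.1 and hyper 0.12; `urls_to_parse` yields `hyper::Uri` values.
// Create the client once so its connection pool is shared by all requests.
let client = hyper::Client::new();

stream::iter_ok(urls_to_parse)
    .map(move |url| {
        // `do_stuff` processes a single response and returns a future.
        client.get(url).and_then(|res| do_stuff(res))
    })
    .buffer_unordered(5) // at most 5 requests in flight at a time
    .collect()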

This runs fine on a single thread, keeping up to 5 requests in flight at a time.

#4

I agree with @BurntSushi

I could be wrong, but I feel like adding a runtime like Tokio comes with a steep learning curve and adds a lot of complexity for simply parallelizing downloads, although from a performance standpoint it would be ideal.

Spawning new threads is a lot simpler (in my opinion at least), and the overhead of spawning a thread is minimal compared to the time spent downloading the files.
