I have a program that calls an API and then caches the result in a database. The (simplified) code currently looks like this:
// This function calls the API. It is provided by an external crate and I can't change it.
async fn call_api(id: String) -> Result<ApiResponse>;
fn save_to_cache(data: ApiResponse);
fn get_from_cache(id: String) -> Option<ApiResponse>;
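// This function gets the data from the database and, if it is not cached, fetches it from the API.
async fn get_or_fetch(id: String) -> Result<ApiResponse> {
    // Clone the ID because get_from_cache takes ownership and call_api needs it too.
    match get_from_cache(id.clone()) {
        Some(val) => Ok(val),
        None => call_api(id).await,
    }
}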
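// Assuming Tokio as the async runtime; each ID is handled in its own task.
#[tokio::main]
async fn main() {
    let ids = vec!["1", "2", "5", "1", "5"];
    let mut tasks = vec![];
    for id in ids {
        tasks.push(tokio::spawn(async move {
            let data = get_or_fetch(id.to_string()).await.unwrap();
            do_stuff(data)
        }));
    }
    // Wait for every task to finish before main returns.
    for task in tasks {
        task.await.unwrap();
    }
}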
One problem is that if an ID already has an API request pending, the program may start another request for the same ID, when it could instead wait for the first one to finish and read its result from the database. Since the rate limiting on the API is quite strict and I may need to make thousands of requests, preventing duplicate requests would be well worth it.
Is there a way to block and make the other threads wait until the request is finished? I don't mind adding crates.
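To make the desired behavior concrete, here is a minimal sketch of it using only the standard library and blocking threads rather than async. Everything in it is hypothetical and for illustration only: `Dedup`, `fetch_all`, the `ApiResponse` struct, and the blocking `call_api` stand-in (the real one is async and can fail, which this sketch ignores). Each ID gets one shared slot, the first thread to reach the slot performs the request, and any other thread asking for the same ID blocks until the slot is filled.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, OnceLock};
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for the real response type.
#[derive(Clone, Debug, PartialEq)]
struct ApiResponse(String);

// Hypothetical blocking stand-in for the real (async, fallible) `call_api`.
fn call_api(id: &str) -> ApiResponse {
    thread::sleep(Duration::from_millis(50)); // simulate network latency
    ApiResponse(format!("response for {id}"))
}

// One slot per ID; each slot is filled at most once.
#[derive(Default)]
struct Dedup {
    slots: Mutex<HashMap<String, Arc<OnceLock<ApiResponse>>>>,
    api_calls: AtomicUsize, // counts how often the API is actually hit
}

impl Dedup {
    fn get_or_fetch(&self, id: &str) -> ApiResponse {
        // Hold the map lock only long enough to find or create the slot.
        let slot = {
            let mut slots = self.slots.lock().unwrap();
            slots.entry(id.to_string()).or_default().clone()
        };
        // The first caller runs the closure; concurrent callers for the
        // same ID block here until the value is available.
        slot.get_or_init(|| {
            self.api_calls.fetch_add(1, Ordering::SeqCst);
            call_api(id)
        })
        .clone()
    }
}

// Spawns one thread per ID and returns the responses (in input order)
// together with the number of API calls that were really made.
fn fetch_all(ids: &[&str]) -> (Vec<ApiResponse>, usize) {
    let dedup = Arc::new(Dedup::default());
    let handles: Vec<_> = ids
        .iter()
        .map(|id| {
            let dedup = Arc::clone(&dedup);
            let id = id.to_string();
            thread::spawn(move || dedup.get_or_fetch(&id))
        })
        .collect();
    let responses: Vec<ApiResponse> =
        handles.into_iter().map(|h| h.join().unwrap()).collect();
    (responses, dedup.api_calls.load(Ordering::SeqCst))
}
```

With the IDs from my example, `fetch_all(&["1", "2", "5", "1", "5"])` returns five responses but only makes three API calls, one per distinct ID. In the real program the slot would presumably need an async-aware equivalent (Tokio's `tokio::sync::OnceCell` looks similar), and a failed request should probably not stay stored in the slot, so I'm not sure this is the right approach there.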