How to use async for reading files in a loop

Hi All,

As you all know by now, I am reading around 10,000 files and looking for ways to optimise it. I came across async crates like futures, tokio, and async-std, and tried an example based on them. I am getting a lifetime error. Any suggestions? Also, I am not sure if this is correct and whether it will improve my performance. Any thoughts?

use async_std::task;
use std::collections::HashMap;
use futures::executor;
use futures::stream::StreamExt;
use futures::stream::futures_unordered::FuturesUnordered;

pub struct MyStruct {
    hash1: HashMap<String, f32>
}

fn main() {
    let mut my_struct = MyStruct {
        hash1: HashMap::new()
    };
    executor::block_on(read_files(&mut my_struct));
    println!("Done");
}

async fn read_file(i: usize, my_struct: &mut MyStruct) {
    // This function reads from a file using tokio::fs::File and gets v
    // For now I am writing a dummy value
    let v = (i * i) as f32;
    my_struct.hash1.insert(String::from("Sensor1"), v);
}

async fn read_files(my_struct: &mut MyStruct) {
    let file_names: Vec<usize> = (0..10000).collect();
    file_names
        .iter()
        .map(|file_name| read_file(*file_name, my_struct))
        .collect::<FuturesUnordered<_>>()
        .collect::<Vec<_>>()
        .await;
}

You should not use async/await to optimize file IO. Async/await is great for network IO, but it typically results in worse performance for file IO: most operating systems don't expose truly asynchronous file APIs, so async runtimes emulate them by handing the blocking reads off to a thread pool anyway.

The error is because a mutable reference requires exclusive access, but `FuturesUnordered` holds many futures at once, and each one would need its own `&mut MyStruct` simultaneously. To make it work, return `(String, f32)` from the futures, and insert the results into the hash map in one place after collecting them, so the `MyStruct` is never shared between futures.

Ah, yes, sorry. I am new to async, so I got confused. And if it is not good for optimising file IO, then I think there is no point proceeding further. Thanks :slight_smile:

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.