Asynchronous data generation

Hello world. My app needs a large amount of data that takes a long time to aggregate, process, and update. So my strategy is to generate the data ahead of time and store it, so that when any part of my code needs it, it only has to read from the data source.

Here is my issue: as far as I know, the only way to get data from another module is to call a function. But the function itself can only access data that has been passed to it or that it computed at call time. So how can my modules share a common data source without storing it on the hard drive?

Have you considered storing your data in a static item?

Oh! That's a good idea. I've always been intimidated by static items because they are delicate to manipulate. But I think it's time to use them :blush:

The other option is to store the data at a higher point in the call stack and explicitly pass it down to the functions that need to access it.
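To make that concrete, here is a minimal single-threaded sketch, assuming the data only needs to be read once it has been generated. `DataStore`, `generate`, `sum` and `largest` are hypothetical names invented for illustration, and the contents are placeholder values:

```rust
/// Hypothetical stand-in for the expensive, pre-computed data.
struct DataStore {
    values: Vec<u64>,
}

/// Placeholder for the slow aggregation step; meant to run once, ahead of time.
fn generate() -> DataStore {
    DataStore {
        values: (1..=5).collect(),
    }
}

/// A consumer that could live in any module: it only borrows the store.
fn sum(store: &DataStore) -> u64 {
    store.values.iter().sum()
}

/// Another consumer borrowing the same store.
fn largest(store: &DataStore) -> Option<u64> {
    store.values.iter().copied().max()
}

fn main() {
    // Built once, high in the call stack...
    let store = generate();
    // ...then lent to every function that needs it.
    println!("sum = {}", sum(&store));
    println!("largest = {:?}", largest(&store));
}
```

Nothing is global here: the store lives in `main` and each function receives a shared reference, so the borrow checker can verify every access.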


Please don't manipulate them! I.e. don't use static mut unless you can absolutely, most certainly, 100% not avoid it.


I'm confused :sweat_smile:. You mean that I should use an immutable static container with mutable items?

Because I do need to continuously update my data store.

I feel like these two statements somewhat contradict each other. If you want mutable access to a static variable, please use thread-safe interior mutability by wrapping the data in a Mutex or RwLock. See this answer for an example.
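As a rough sketch of what that could look like, assuming a recent toolchain (Rust 1.63 or later, where `RwLock::new` can be used directly in a `static`), something along these lines should work. `STORE`, `refresh` and `total` are hypothetical names for illustration only:

```rust
use std::sync::RwLock;

// Hypothetical shared store. The static itself is immutable;
// the RwLock provides thread-safe interior mutability.
static STORE: RwLock<Vec<u64>> = RwLock::new(Vec::new());

/// Replace the stored data with a freshly generated batch.
/// Takes the write lock, so readers wait until the swap is done.
fn refresh(batch: Vec<u64>) {
    *STORE.write().unwrap() = batch;
}

/// Read-only access; several readers may hold the read lock at once.
fn total() -> u64 {
    STORE.read().unwrap().iter().sum()
}

fn main() {
    refresh(vec![1, 2, 3]);
    println!("total = {}", total());
}
```

An RwLock may suit data that is written rarely and read often; a Mutex is the simpler choice if reads and writes are equally frequent.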


Thank you for the advice. I'll do that. There is no contradiction: the data is updated, but ahead of time, before it is requested.


Ah, I see, I misinterpreted your first statement, my bad.

Note that with that sort of dynamic design (i.e. one part of your program generating the data and some other part(s) referencing it), I think it would be cleaner to do what @DanielKeep suggested and pass the data container explicitly down to your producers and consumers, rather than adding the indirection of a static variable. Rather than something like this:

use std::sync::Mutex;
use std::thread;
use std::time::Duration;

static DATA: Mutex<Vec<u8>> = Mutex::new(Vec::new());

fn producer() {
    DATA.lock().unwrap().push(1);
}

fn consumer() {
    let value = DATA.lock().unwrap().pop();
    
    println!("{value:?}");
}

fn main() {
    thread::spawn(producer);
    
    thread::sleep(Duration::from_secs(1));
    
    consumer();
}

Playground.

I'd do something like this:

use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

fn producer(data: Arc<Mutex<Vec<u8>>>) {
    data.lock().unwrap().push(1);
}

fn consumer(data: Arc<Mutex<Vec<u8>>>) {
    let value = data.lock().unwrap().pop();
    
    println!("{value:?}");
}

fn main() {
    let data: Arc<Mutex<Vec<u8>>> = Arc::new(Mutex::new(Vec::new()));
    let data_for_producer = data.clone();
    
    thread::spawn(|| producer(data_for_producer));
    
    thread::sleep(Duration::from_secs(1));
    
    consumer(data);
}

Playground.
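One caveat about both snippets: the one-second sleep only stands in for real synchronization. If the consumer must not run before the producer has finished, a possible variation is to wait on the thread's JoinHandle instead. This is just a sketch; the `run` helper is a hypothetical name, and `consumer` returns the value here instead of printing it:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn producer(data: Arc<Mutex<Vec<u8>>>) {
    data.lock().unwrap().push(1);
}

fn consumer(data: Arc<Mutex<Vec<u8>>>) -> Option<u8> {
    data.lock().unwrap().pop()
}

/// Hypothetical helper: run the producer to completion, then consume.
fn run() -> Option<u8> {
    let data = Arc::new(Mutex::new(Vec::new()));
    let data_for_producer = Arc::clone(&data);

    let handle = thread::spawn(move || producer(data_for_producer));

    // `join` blocks until the producer thread has finished,
    // so no arbitrary sleep is needed.
    handle.join().expect("producer thread panicked");

    consumer(data)
}

fn main() {
    println!("{:?}", run());
}
```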

