How can I initialize fields of a struct asynchronously?

I have a struct A containing another struct B that takes a long time to initialize. So that B does not block the initialization of A, I'm trying to figure out a way to initialize B separately:

pub struct A{
    pub a: u16,
    // maybe some data structure needed!
    pub b: B,
}

pub struct B{}

impl A{
    pub fn new() -> Self{
        // launch async task to init b, for example tokio::spawn
        // let b = tokio::spawn(async{B::new()})
        Self{
            a: 0,
            b: b
        }
    }
}

impl B{
    pub async fn new() -> Self{
        // takes some time
        B{}
    }
}

When some method of A is called, it can check whether b has been initialized successfully and decide what to do (imagine B represents an optional network connection).

I found OnceCell, which is semantically similar, but it seems to work only for static (global) variables. I was wondering whether there is a similar data structure that would work for me:

pub struct A{
    pub a: u16,
    pub b: LazyInit<B>,
}

pub struct B{}

impl A{
    pub fn new() -> Self{
        Self{
            a: 0,
            // launch async task to init b, for example tokio::spawn
            b: LazyInit::new(|| B::new())
        }
    }
    
    pub fn test(&self){
        if self.b.has_init(){
            // ...
        }else{
            // ...
        }
    }
}

impl B{
    pub fn new() -> Self{
        B{}
    }
}

No, why would that be the case? Statics are not special like that; they are simply globals that live forever. If anything, they are harder, not easier, to initialize than function locals, because the initializing expression must be const. If you can construct a OnceCell to initialize a static, you can also use it to initialize a normal local variable.
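
To illustrate that point, here is a minimal sketch that uses std::cell::OnceCell (stable since Rust 1.70) as an ordinary struct field rather than a static. The names Holder and demo are made up for illustration, and this is the single-threaded cell, so on its own it says nothing about sharing the value with a spawned task.

```rust
use std::cell::OnceCell;

// Hypothetical struct: the cell is a plain field, not a static.
struct Holder {
    cell: OnceCell<u32>,
}

// Returns (value before set, value after set, whether a second set was rejected).
fn demo() -> (Option<u32>, u32, bool) {
    let holder = Holder { cell: OnceCell::new() };

    // Nothing has been stored yet.
    let before = holder.cell.get().copied();

    // The first set succeeds, even through a shared reference.
    holder.cell.set(7).unwrap();

    // A second set is rejected and leaves the stored value untouched.
    let second_set_failed = holder.cell.set(8).is_err();

    (before, *holder.cell.get().unwrap(), second_set_failed)
}
```

Calling demo() should show that the cell starts empty, holds the first value, and refuses the second one.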


However, I still don't get why you don't just make A::new async too.

Because making A::new async would actually make its initialization slow, since we would still need to .await it to get the instance. I want to get an A instance as early as possible (which also means the application can be ready as early as possible). The B instance, on the other hand, is not required, just a bonus, so I don't really have to wait for it to be ready.

I think the difference between constructing a static OnceCell variable and constructing a normal field is the lifetime issue. To construct a static variable, we just need:

static A_N: OnceCell<usize> = OnceCell::new();
// ...
A_N.set(0);
// ...
A_N.get();

However, to construct a OnceCell field as in the example above, we have to split the set and the get across different async tasks, which brings up the lifetime issue:

pub struct A{
    pub a: u16,
    // maybe some data structure needed!
    pub b: OnceCell<B>,
}

pub struct B{}

impl A{
    pub fn new() -> Self{
        let b = OnceCell::new();
        // this either moves b into the task (taking ownership of it),
        // or requires the borrow to be 'static, because b is used in a spawned task
        tokio::spawn(async{
            b.set(B::new());
        });
        Self{
            a: 0,
            b
        }
    }
}

impl B{
    pub fn new() -> Self{
        // takes a long time
        B{}
    }
}

This has nothing to do with either OnceCell or the field being local, and everything to do with tokio::spawn() requiring a 'static bound. You don't need OnceCell for this example at all; the problem still persists if you simply try to mutate the field through a regular mutable reference.

What you need for this code to compile is:

use std::sync::{Arc, Mutex};

pub struct A {
    pub a: u16,
    pub b: Arc<Mutex<Option<B>>>,
}

pub struct B {}

impl A {
    pub fn new() -> Self {
        let b = Arc::new(Mutex::new(None));
        let b_copy = b.clone();

        tokio::spawn(async move {
            *b_copy.lock().unwrap() = Some(B::new());
        });

        A { a: 0, b }
    }
}

(Playground.)

Note that if B::new() is itself async, you might want to make the spawned task a little differently:

tokio::spawn(async move {
    // Build B first, so the lock is only taken after the await has finished.
    let b_new = B::new().await;
    *b_copy.lock().unwrap() = Some(b_new);
});

...to avoid holding the lock across await.
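
For what it's worth, here is a self-contained sketch of the same shape that runs without any external crate. It swaps in std::thread::spawn for tokio::spawn (that substitution is mine, not part of the answer above), and B::connect, the id field, and b_id are hypothetical names added for illustration. The slow constructor runs before the lock is taken, so the Mutex is held only for the final store, and a method on A can check whether b is ready.

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::time::Duration;

pub struct B {
    pub id: u32, // hypothetical field, so there is something to observe
}

impl B {
    // Hypothetical stand-in for a slow constructor.
    pub fn connect() -> Self {
        thread::sleep(Duration::from_millis(20));
        B { id: 1 }
    }
}

pub struct A {
    pub a: u16,
    pub b: Arc<Mutex<Option<B>>>,
}

impl A {
    // Returns immediately; the handle is returned only so callers can wait if they wish.
    pub fn new() -> (Self, JoinHandle<()>) {
        let b = Arc::new(Mutex::new(None));
        let b_copy = Arc::clone(&b);

        let handle = thread::spawn(move || {
            // Do the slow work first, without holding the lock.
            let b_new = B::connect();
            // Lock only for the brief store.
            *b_copy.lock().unwrap() = Some(b_new);
        });

        (A { a: 0, b }, handle)
    }

    // Some(id) once b has been initialized, None before that.
    pub fn b_id(&self) -> Option<u32> {
        self.b.lock().unwrap().as_ref().map(|b| b.id)
    }
}
```

A caller can use the A value right away and treat a None from b_id() as "not connected yet"; joining the handle is only needed when something must wait for B.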

I think this works for me, except that RwLock would probably be better than Mutex, since writing happens only once and reading happens all the time.

If you're going to clone the struct and have multiple references to b, then RwLock may be worth it. If you only ever have one reference, stick with the Mutex.

This might actually be a case where you do want to hold the (write) lock across .awaits while initializing B; the methods on A that want to access B can then determine if it's been initialized (and/or wait for said initialization to complete) by attempting to take the (read) lock.
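
A rough sketch of that idea, using std::sync::RwLock and a plain thread so it runs without external crates (the function name demo and the u32 payload are made up for illustration). While the initializer holds the write guard, a reader's try_read fails instead of blocking; once the guard is dropped, readers see the value. With Tokio, holding a guard across .await would presumably call for tokio::sync::RwLock instead; that is an assumption on my part and is not exercised here.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Returns (was the reader turned away during init, value seen after init).
fn demo() -> (bool, Option<u32>) {
    let slot: Arc<RwLock<Option<u32>>> = Arc::new(RwLock::new(None));

    // The initializer takes the write lock before doing its work.
    let mut guard = slot.write().unwrap();

    // A reader on another thread probes without blocking.
    let probe = Arc::clone(&slot);
    let busy = thread::spawn(move || {
        let busy = probe.try_read().is_err();
        busy
    })
    .join()
    .unwrap();

    // Initialization finishes; releasing the guard lets readers in.
    *guard = Some(42);
    drop(guard);

    let value = *slot.try_read().unwrap();
    (busy, value)
}
```

The design trade-off is that "still initializing" and "lock briefly contended" look the same to try_read, so this fits best when writes really do happen only once.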

Actually, maybe you do want to use OnceCell:

pub struct A{
    pub a: u16,
    pub b: Arc<OnceCell<B>>,
}
pub struct B {}

impl A {
    pub fn new() -> Self {
        let b = Arc::new(OnceCell::new());
        let b_copy = b.clone();

        tokio::spawn(async move {
            // set returns Err if the cell was already filled; that case is ignored here
            let _ = b_copy.set(B::new());
        });

        A { a: 0, b }
    }
}

It's the Arc that resolves the lifetime issue, if I'm understanding correctly.

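Edit: if this is std::cell::OnceCell, you'll actually need std::sync::OnceLock instead, because tokio::spawn requires the task to be Send and Arc<OnceCell<B>> is not. The exception is a single-threaded setup where you spawn with tokio::task::spawn_local. (You would know if you were doing that.)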
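
A minimal sketch of the OnceLock variant, again with std::thread::spawn standing in for tokio::spawn so that it runs without external crates. The id field, the describe method, and the returned JoinHandle are hypothetical additions for illustration only.

```rust
use std::sync::{Arc, OnceLock};
use std::thread::{self, JoinHandle};

pub struct B {
    pub id: u32, // hypothetical field, so there is something to observe
}

pub struct A {
    pub a: u16,
    pub b: Arc<OnceLock<B>>,
}

impl A {
    // Returns immediately; b is filled in by the background thread.
    pub fn new() -> (Self, JoinHandle<()>) {
        let b = Arc::new(OnceLock::new());
        let b_copy = Arc::clone(&b);

        let handle = thread::spawn(move || {
            // set only fails if a value is already present; ignored here.
            let _ = b_copy.set(B { id: 7 });
        });

        (A { a: 0, b }, handle)
    }

    // Branches on whether b has been initialized yet.
    pub fn describe(&self) -> String {
        match self.b.get() {
            Some(b) => format!("b ready (id {})", b.id),
            None => String::from("b not ready"),
        }
    }
}
```

Compared with Arc<Mutex<Option<B>>>, readers here never take a lock explicitly: get() returns Option<&B>, which maps directly onto the "has b been initialized?" check from the question.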