Threads sharing heap memory that is constant after initialization

If an object does not use a lot of memory, and its initialization does not use a lot of CPU, is it faster to somehow share one object that is constant after initialization, or to have a separate object and initialization for each thread?

If you have heap allocated data that is only read, not written, then there is no reason not to share it.

If your threads can be scoped, then just have them borrow the data. If not, put the data in Arc and clone the Arc for each thread.
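
A minimal sketch of the scoped variant, assuming the threads can all finish before the data goes out of scope. The `OperatorInfo` fields and the `record_all` helper here are hypothetical stand-ins, not part of the original question:

```rust
use std::thread;

// Hypothetical stand-in for the read-only data (assumption for illustration).
struct OperatorInfo {
    names: Vec<&'static str>,
}

// Runs `n_threads` scoped threads that all read `op_info` through a plain
// shared reference. No Arc and no cloning of the data is involved.
fn record_all(op_info: &OperatorInfo, n_threads: usize) -> Vec<usize> {
    thread::scope(|s| {
        let handles: Vec<_> = (0..n_threads)
            .map(|i| {
                // `move` copies the reference and `i` into the closure.
                s.spawn(move || op_info.names.len() + i)
            })
            .collect();
        // Every scoped thread is joined before `scope` returns.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let op_info = OperatorInfo {
        names: vec!["add", "mul", "sin"],
    };
    let results = record_all(&op_info, 4);
    println!("{:?}", results);
}
```

If the threads have to outlive the function that created the data, this does not compile, and that is the case where the Arc is needed.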

There will be a huge amount of borrowing without mutating the data (after its initialization). The speed of this operation is crucial. Will the Arc slow these borrows down as opposed to using thread-local data?

Borrowing from an Arc is very cheap, and can be done once when the thread starts. The cost of doing something with the data should completely overshadow the borrowing.

My application is for algorithmic differentiation. I think that the implementation of each operator must do a borrow to get the global operator information. This is initialized and never changes. The actual tape that records the operations is planned to be a local variable, but the information about each operator is global. For example, which function do I use to evaluate this operator in forward or reverse mode? I want to make the recording of a function as fast as possible. Part of the plan is to allow for multiple threads, each recording a different function at the same time.

Sure. All I was saying is that you only need to "borrow" once per thread, since after you borrow you can use the borrowed operation info for the duration of the thread.

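use std::sync::Arc;
use std::thread;

struct OperatorInfo {
    op1_info: (), // details omitted
    op2_info: (), // details omitted
}

fn main() {
    let op_info = OperatorInfo {
        op1_info: (), // details omitted
        op2_info: (), // details omitted
    };

    let shared_op_info = Arc::new(op_info);

    let handle = thread::spawn({
        // Clone the Arc (not the data) so the new thread owns its own handle.
        let op_info = Arc::clone(&shared_op_info);
        move || {
            // Dereference the Arc once, when the thread starts.
            let op_info = &*op_info;

            // op_info is now a simple reference to the OperatorInfo struct,
            // no more "borrowing" is needed in this thread.
        }
    });

    // Wait for the thread, so main does not exit before it has run.
    handle.join().unwrap();
}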


Note that cloning the Arc just atomically increments its reference count, and borrowing from it is an ordinary pointer dereference. The per-thread overhead of cloning the Arc, and borrowing from it, is so minimal it might as well be non-existent.


Also note that an Arc, and the overhead of the Arc, is the equivalent of what would be called a plain old reference in garbage collected languages. It is common to think that an Arc has overhead when you first encounter it in Rust, because it has a visible and explicit API, but the truth is that this (minuscule) overhead also exists in most other languages, where it is hidden from sight.


Thanks !!


The following is controversial.
If you need read-only access in multiple threads, you can leak your data as well:

Box::leak(Box::new(data_var))

This often simplifies the API, because it only needs plain references, never an Arc.
But the memory allocated for the data is never freed.

My rule of thumb is:
Leaking is OK if you do it ONLY in main.
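
A small sketch of what that can look like, assuming the data is built once in main. `OperatorInfo` and `leak_op_info` are hypothetical names used only for illustration:

```rust
use std::thread;

// Hypothetical stand-in for the read-only data (assumption for illustration).
struct OperatorInfo {
    names: Vec<&'static str>,
}

// Moves `info` to the heap and leaks it. The returned reference is valid for
// the rest of the program, and the allocation is never freed.
fn leak_op_info(info: OperatorInfo) -> &'static OperatorInfo {
    Box::leak(Box::new(info))
}

fn main() {
    // Leak exactly once, in main.
    let op_info = leak_op_info(OperatorInfo {
        names: vec!["add", "mul"],
    });

    // A `&'static` reference is Copy and satisfies the `'static` bound of
    // `thread::spawn`, so each thread just gets a copy of the reference.
    let handles: Vec<_> = (0..2)
        .map(|i| thread::spawn(move || op_info.names[i]))
        .collect();

    for handle in handles {
        println!("{}", handle.join().unwrap());
    }
}
```

Compared with the Arc version there is no reference count at all, at the price that the memory can never be reclaimed.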