Avoiding global objects

I'm working on a GUI application where I've been using thread-local storage to store program state and also the UI elements. I want to write helper functions that use these same elements, but I can't call them from within a section where this global object has already been borrowed mutably. I thought this was a clean approach, but it doesn't make refactoring into smaller functions very easy. I've seen suggestions in topics from back in 2015 that discuss passing objects around. Given that I'm working on a GUI app with callbacks that sometimes don't take any arguments, I don't think this kind of solution will work for me. Does anyone have any suggestions, or can anyone point me to a project that solves this problem in a nice way?

You might look at using something like RefCell to help you pass the state around. In conjunction with Arc you can pass them around between threads. A TUI project I've worked on called Glitter used that pattern. It passes the application model around to the GUI elements using RefCell which can then borrow the model mutably and then update it. Hopefully that helps a little bit, good luck! :smile:

Are you really sure that you can put a RefCell inside of an Arc? I think Rust would be unhappy about this due to the RefCell not being Sync...

Thanks @dfockler for the suggestion. If it makes any difference, I don't actually need to access these variables from other threads. I was just using thread-local storage because the consensus online seemed to be that globals were bad.

Thread-local storage is really a hack to keep global variables working in multi-threaded programs :slight_smile:

If you don't need multi-threaded access, then RefCell should do just fine.

You are probably correct! Is there a way to do threadsafe interior mutability?

You can put the RefCell behind a Mutex, for example.

Why would you do so when Mutex<T> has interior mutability already?

Since mutex provides (temporary) &mut access, the RefCell here is totally unnecessary :slight_smile:

@dfockler So the answer for threadsafe mutability is basically Mutex, RwLock or atomic types from the std::sync::atomic module. You can also try searching crates for some concurrent data structures.
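
For anyone following along, here is a minimal sketch of what that can look like, using only the standard library. The function name run_workers is made up for illustration. It shares a Vec behind Arc<Mutex<...>> and a counter as an AtomicUsize, with no RefCell involved:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: spawns `n` threads that each record their id.
fn run_workers(n: usize) -> (Vec<usize>, usize) {
    // Mutex gives thread-safe interior mutability for arbitrary data.
    let results = Arc::new(Mutex::new(Vec::new()));
    // Atomics cover simple values such as counters without any lock.
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..n)
        .map(|id| {
            let results = Arc::clone(&results);
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                counter.fetch_add(1, Ordering::SeqCst);
                // lock() hands out temporary &mut access to the Vec.
                results.lock().unwrap().push(id);
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Threads finish in arbitrary order, so sort before returning.
    let mut ids = results.lock().unwrap().clone();
    ids.sort();
    (ids, counter.load(Ordering::SeqCst))
}
```

Calling run_workers(4) should give back the ids 0 through 3 and a count of 4, regardless of how the threads were scheduled.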

Argh, sorry - I meant to say put the object behind the Mutex, not the RefCell - thanks for pointing that out.

Yes, my hands and brain were out of sync when I wrote that.

Note that RwLock<T> is only Sync if the underlying value is both Send and Sync (Mutex<T> only needs T: Send for that), so it restricts its uses a bit.

I didn't know that. I guess it makes sense that if you are locking something through the Mutex then it is ensured that it won't be able to mutate in multiple places at once.

Yes. Another comment I'd make about RwLock is that it is less useful than many people think:

  • Acquiring an RwLock involves locking a mutex + extra stuff, so a Mutex is faster under low contention
  • If the reads are fast, you never amortize the overhead, so a Mutex is always faster
  • Even if the reads are slow, the initial mutex-locking can still end up being a bottleneck with many readers

My usual guidelines for concurrent data structures would be:

  • If you do not expect contention, a Mutex is likely to be the fastest and least memory-hungry option
  • When contention could be a problem, try to acquire the contended locks less (by sharing less data, or sharing it less frequently), and for shorter periods of time (through finer-grained locking)
  • Only after you have unsuccessfully gone through both of these design stages, should you start thinking about more clever concurrent data structures like reader-writer locks or lock-free/wait-free objects. Make sure you understand the underlying design tradeoffs, and pick wisely.

I don't want to sidetrack this thread too much, but I'll just add that a shared-nothing (or as close to that as possible) design (which you sort of touch upon) will generally scale (across core counts) better. Of course, being stuck with locks and (mutably) shared memory is a fact of life a lot of times too :slight_smile:.

The discipline I usually try to follow is to only share data which...

  • ...is immutable (&data) or extremely rarely mutated (RCU)
  • ...has to be mutated regularly for the program to work correctly (application-dependent data structure, either Mutex-protected or something more clever depending on contention)

Even in the latter case, one can often eliminate a great deal of contention by working on a thread-private cache instead of always modifying the shared object. This is how scalable memory allocators work, for example.
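
A small sketch of the thread-private idea, assuming a toy workload of summing numbers (parallel_sum is a made-up name, and this is far simpler than what an allocator does). Each thread accumulates into a local variable and only touches the shared Mutex once at the end:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical example: one thread per chunk, one lock acquisition per thread.
fn parallel_sum(chunks: Vec<Vec<u64>>) -> u64 {
    let total = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = chunks
        .into_iter()
        .map(|chunk| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                // Thread-private accumulator: no synchronization needed here.
                let local: u64 = chunk.iter().sum();
                // The shared object is only locked once, for a single addition.
                *total.lock().unwrap() += local;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let result = *total.lock().unwrap();
    result
}
```

The naive alternative would lock the Mutex once per element, which is where the contention would come from.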

The "shared-nothing" designs typically involve sharding the workload across threads (= # of cores, in general), with each thread handling a subset of the load and having its own thread-local resources. Communication between the threads is minimized, but when it needs to happen, is done via message passing using a mesh of something like spsc queues (which can be dirt cheap, i.e. no atomic ops, on some archs).
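
As a rough illustration only (this is not how Seastar does it, and std::sync::mpsc is a general-purpose channel rather than a tuned spsc queue), a sharded design in Rust could be sketched like this. The name sharded_word_count and the length-based routing are assumptions picked to keep the example short, and shards must be at least 1:

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

// Hypothetical example: each worker owns its own map; nothing is shared.
fn sharded_word_count(words: &[&str], shards: usize) -> HashMap<String, usize> {
    let mut senders = Vec::new();
    let mut handles = Vec::new();

    for _ in 0..shards {
        let (tx, rx) = mpsc::channel::<String>();
        senders.push(tx);
        handles.push(thread::spawn(move || {
            // Thread-local resource: only this worker ever touches it.
            let mut counts: HashMap<String, usize> = HashMap::new();
            for word in rx {
                *counts.entry(word).or_insert(0) += 1;
            }
            counts
        }));
    }

    for word in words {
        // Simplistic routing: the same word always lands on the same shard.
        let shard = word.len() % shards;
        senders[shard].send((*word).to_string()).unwrap();
    }
    // Dropping the senders closes the channels so the workers can finish.
    drop(senders);

    // Shards hold disjoint keys, so merging is a plain union.
    let mut merged = HashMap::new();
    for handle in handles {
        merged.extend(handle.join().unwrap());
    }
    merged
}
```

The only communication is the message passing in and the final join, so there are no locks to contend on.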

A good example of this (in OSS land) is http://www.seastar-project.org/ (and scylladb, which uses it underneath). This is a C++ project, but could conceivably be done in Rust.

So this conversation started to move towards multi-threaded programming, which isn't the exact problem I'm having even though my program is multithreaded. I'm writing a GUI program and I want to persist state (the UI elements, some specific state variables, references to other threads, etc.) and I'm uncertain how to do it in a way that doesn't involve passing around variables (which is impossible for my use case where I have callbacks triggered as part of the event loop).

I was originally using thread_local! but the scope of borrows was too large, so if I wrote a helper function and had it borrow a state variable I got a panic. I can't add an argument to my helper function because I want it usable in . Basically I need something equivalent to thread_local but where it's more strictly scoped.
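
To make the borrow-scope problem concrete, here is a minimal sketch of one way to keep thread_local! borrows narrow. State, with_state and the field names are all made up for illustration, and a real GUI state would hold widgets instead. The idea is that every access goes through a small accessor whose borrow ends when its closure returns:

```rust
use std::cell::RefCell;

// Hypothetical application state.
#[derive(Default)]
struct State {
    clicks: u32,
    log: Vec<String>,
}

thread_local! {
    // None until setup() has run.
    static STATE: RefCell<Option<State>> = RefCell::new(None);
}

// Runs `f` with mutable access to the state. The borrow is released
// as soon as `f` returns, so borrows never outlive a single call.
fn with_state<R>(f: impl FnOnce(&mut State) -> R) -> R {
    STATE.with(|cell| {
        let mut guard = cell.borrow_mut();
        let state = guard.as_mut().expect("setup() must run first");
        f(state)
    })
}

fn setup() {
    STATE.with(|cell| *cell.borrow_mut() = Some(State::default()));
    helper("setup");
}

fn helper(source: &str) {
    with_state(|s| s.log.push(format!("helper called from {}", source)));
}

fn callback() -> u32 {
    // First borrow: released at the end of this statement.
    let clicks = with_state(|s| {
        s.clicks += 1;
        s.clicks
    });
    // No borrow is active here, so helper() can borrow again.
    // Calling helper() inside the closure above would panic, because
    // the RefCell would still be mutably borrowed.
    helper("callback");
    clicks
}
```

After setup(), each callback() call bumps the click count and logs one line through helper(), without either function taking the state as an argument.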

What I'd like to do is the equivalent of this:

// This requires complex initialization that is done in main()
let state = ...;

fn main() {
    setup();
    gtk::main_loop();
}

fn setup() {
    // Initialize state
    helper();
}

fn callback() {
    // borrow state
    helper();
}

fn helper() {
    // borrow state
}

I think the "right" way to do this is to use static global variables using RefCell<Option<T>> to encapsulate my data implementing Sync for my types, and manipulate the values directly.

Thanks for the code example, it clarifies things a lot.

As you mentioned, we could try to use RefCell<Option<T>>:

static state: RefCell<Option<State>> = RefCell::new(None);

But rustc won't like this for two reasons:

  • RefCell is not Sync, and Rust wants statics to be Sync.
  • Initializing the RefCell requires a compile-time function call, and that's not in Stable Rust yet.

Now, we may decide to use a static mutable variable instead:

static mut state: Option<State> = None;

But that would make every code that reads or writes the state unsafe. What a bother!

Now, most libraries which use callbacks allow you to pass arguments to them. If this is the case for GTK, you'll probably want to use that instead. It would work as follows:

fn main() {
   let state: Rc<State> = ...;
   setup_callback(callback, state.clone());
}

If you can do this, you will get late initialization without the overhead of an Option and a RefCell!

A more idiomatic Rust alternative would be to make the callback a closure and move "state" into it, but it depends on whether the Rust GTK bindings can accept Rust closures as callbacks...
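
Here is a toolkit-independent sketch of the closure approach. EventLoop is a hypothetical stand-in for whatever callback registry the GUI library provides, not a GTK API. Since this version mutates the state from the callback, it wraps it in a RefCell, which the read-only Rc<State> variant would not need:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical application state.
struct State {
    counter: u32,
}

// Hypothetical stand-in for a GUI toolkit's callback registry.
struct EventLoop {
    callbacks: Vec<Box<dyn FnMut()>>,
}

impl EventLoop {
    fn new() -> Self {
        EventLoop { callbacks: Vec::new() }
    }

    // Callbacks take no arguments, like the ones discussed in this thread.
    fn connect(&mut self, f: impl FnMut() + 'static) {
        self.callbacks.push(Box::new(f));
    }

    fn fire_all(&mut self) {
        for cb in self.callbacks.iter_mut() {
            cb();
        }
    }
}

// Helpers take the state as a plain argument; they never borrow it themselves.
fn helper(state: &mut State) {
    state.counter += 1;
}

fn demo() -> u32 {
    let state = Rc::new(RefCell::new(State { counter: 0 }));
    let mut event_loop = EventLoop::new();

    // The closure owns its own Rc handle, so no global is needed.
    let for_callback = Rc::clone(&state);
    event_loop.connect(move || helper(&mut for_callback.borrow_mut()));

    event_loop.fire_all();
    event_loop.fire_all();

    let count = state.borrow().counter;
    count
}
```

Because the borrow happens once at the top of the callback and helpers receive &mut State, nested helper calls cannot trigger a double-borrow panic.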

Thanks HadrienG, that definitely summarizes what I've run into pursuing this. There isn't a way to pass arguments to the callback because it actually is queued up in another thread. The code looks roughly like:

thread::spawn(|| {
    glib::idle_add(|| { glib::idle_add(callback); });
});

So here I want to pass the state out-of-band somehow. Otherwise I could figure out a way to refactor out this callback. Right now this is done because I'm using mpsc to send data between the threads, so whenever a response is ready from the second thread it queues up the receiver on the main thread using glib::idle_add. Maybe there's a way to hook into a periodic event loop through GTK that still allows it to be performant. I think this would be necessary as I move towards a more async design anyways, but I haven't dug into that this far.

Wait, let's take a step back here.

  • Why do you need two nested calls to glib::idle_add?
  • Why does this code need to run in another thread?
  • If you can pass a closure to glib::idle_add, why do you bother with passing the state out of band?

Instead, you could try something along these lines:

let callback = move || callback_impl(callback_args);
glib::idle_add(callback);
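
For the mpsc case specifically, one possible shape is to move the Receiver into the closure and drain it whenever the closure runs. The sketch below uses only the standard library. make_idle_handler is a made-up name, and the bool return value ("keep me scheduled?") is an assumption about how the event loop wants to be told, so check what your version of glib::idle_add expects before adapting it:

```rust
use std::sync::mpsc::{Receiver, TryRecvError};

// Hypothetical helper: builds an argument-less handler that owns the
// receiving end of the channel plus whatever `on_message` captured.
fn make_idle_handler(
    rx: Receiver<String>,
    mut on_message: impl FnMut(String),
) -> impl FnMut() -> bool {
    move || loop {
        match rx.try_recv() {
            // Handle everything that is already queued, without blocking.
            Ok(msg) => on_message(msg),
            // Nothing left for now: ask to be called again later.
            Err(TryRecvError::Empty) => return true,
            // All senders are gone: the handler can be dropped.
            Err(TryRecvError::Disconnected) => return false,
        }
    }
}
```

The worker thread then only needs the Sender, and the state that on_message captures stays on the main thread.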