Global variable

Say I have a commandline app where I could specify an option --myopt which turns something on.
I want to have that option available in all my functions in main.rs.

The only way I found is to use a static mutable variable, with unsafe in two places.

#[derive(Debug)]
struct Global {
    myopt: bool
}

static mut GLOBAL: Global = Global{ myopt:  false };

impl Global {
    fn set(&mut self, value: bool) {
        self.myopt = value;
    }

    fn get(&self) -> bool {
        self.myopt
    }
}


macro_rules! set_myopt {
    ($fmt:expr) => (
        unsafe {
            GLOBAL.set($fmt)
        }
    );
}

macro_rules! get_myopt {
    () => (
        unsafe {
            GLOBAL.get()
        }
    );
}

fn main() {
    println!("Show default setting of myopt: {:?}", get_myopt!());
    let myopt = true;  // assume gotten from cmdline
    // activate what we got from cmdline
    set_myopt!(myopt);
    println!("Show active setting of myopt: {:?}", get_myopt!());
}

I'm not exactly a fan of unsafe. So, my question: how could I solve it in a better way?

Thanks.

--
Manfred


Why do you need mutable access to this variable everywhere in your program? A convenient way to solve this problem is to create the configuration struct (Global in your case, but I would rename it to something like Config) at the very beginning of the program, and then pass it wherever it's needed. Global mutable state, however achieved, is in general an evil idea.
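
A minimal sketch of that approach, assuming a single --myopt flag as in the question. Config and myopt follow the thread; from_args and report are hypothetical names used only for illustration.

```rust
/// Options parsed once at startup.
#[derive(Debug)]
struct Config {
    myopt: bool,
}

impl Config {
    /// Builds the configuration from an argument list (program name excluded).
    /// Hypothetical helper: a real app would likely use an argument parser.
    fn from_args<I: IntoIterator<Item = String>>(args: I) -> Config {
        let myopt = args.into_iter().any(|arg| arg == "--myopt");
        Config { myopt }
    }
}

/// Any function that needs the option takes a shared reference to Config.
fn report(config: &Config) -> String {
    if config.myopt {
        String::from("myopt is on")
    } else {
        String::from("myopt is off")
    }
}

fn main() {
    // Skip the first argument, which is the program name.
    let config = Config::from_args(std::env::args().skip(1));
    println!("{}", report(&config));
}
```

Because the struct is never mutated after construction, a plain &Config is enough and no synchronization is involved.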


Sure Config is better. Global was just a name given in the minimal example.

I just find it inconvenient to pass Config around, as I change it only a single time at program start (if the option was specified on the command line).

It may look inconvenient at the beginning, but later it saves you from real problems. To answer the initial question: you can achieve this using the thread_local crate, or lazy_static + Mutex.


The main problem with mutable statics is that they are only memory-safe in the right usage conditions. Set them and read them concurrently from two different threads and you get a data race. This dependency on correct usage patterns is the very defining characteristic of unsafe in Rust, so hiding the unsafety through encapsulation as you are doing here is not correct.

Some alternative solutions, in order of decreasing preference:

  • Pass configuration state explicitly. => Most idiomatic, avoids global variable trouble
  • Synchronize concurrent access to global mutable state (with Mutex, AtomicPtr...) => Has a performance and complexity cost
  • Use thread-local state => I don't think that is appropriate for your use case as you probably want to set the configuration in an application-wide fashion.
  • Expose and use carefully an unsafe interface => If you are exposing an interface which can blow up when used incorrectly, this is the right thing to do.
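
The second option, synchronized global state, could look roughly like the sketch below. It assumes a reasonably recent toolchain (Mutex::new has been usable in a static initializer since Rust 1.63); set_myopt and get_myopt are plain functions standing in for the macros from the question.

```rust
use std::sync::Mutex;

#[derive(Debug)]
struct Global {
    myopt: bool,
}

// A Mutex-protected static: no `static mut`, no `unsafe`.
static GLOBAL: Mutex<Global> = Mutex::new(Global { myopt: false });

fn set_myopt(value: bool) {
    // lock() only fails if another thread panicked while holding the lock.
    let mut guard = GLOBAL.lock().unwrap();
    guard.myopt = value;
}

fn get_myopt() -> bool {
    let guard = GLOBAL.lock().unwrap();
    guard.myopt
}

fn main() {
    println!("Show default setting of myopt: {:?}", get_myopt());
    set_myopt(true); // assume gotten from cmdline
    println!("Show active setting of myopt: {:?}", get_myopt());
}
```

Every access pays for a lock, which is the performance and complexity cost mentioned in the list.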

lazy_static is your best bet if you want to stick with a global: example

I think for me the only viable solutions are one of the following

  1. Use unsafe as in my example as I only set the variable once in the beginning of the program. So this is kind of a mitigation.
  2. Pass a Config struct around
  3. Use lazy_static

Thanks to all of you. Best, Manfred

But in the end I assume that lazy_static also has to use unsafe.

It does, but you can/should assume they got it right :). It uses std::sync::Once internally, IIRC.

Note that an abstraction that uses unsafe internally is not problematic in and of itself. Otherwise, you wouldn't even be able to use the std library.

What is unacceptable is exposing an interface which pretends to be safe, but can break type/memory/thread-safety if used in the wrong way. Or, in more advanced usage scenarios, exposing an unsafe interface without documenting under which assumptions this interface is safe.

Unsafe in code means "Dear compiler, please let me do some things which are potentially memory/type/thread-unsafe", whereas unsafe in interfaces means "Dear user, this interface is only safe if specific precautions are taken when using it, please be careful with that contract".
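
To illustrate the two meanings, here is a small sketch that keeps the static mut from the question but exposes it through unsafe functions with a documented contract. The names MYOPT, set_myopt and get_myopt are hypothetical, and the safety conditions are one possible contract, not the only one.

```rust
static mut MYOPT: bool = false;

/// Sets the option.
///
/// # Safety
///
/// "Dear user": call this only while no other thread can read or write
/// `MYOPT`, for example at the start of `main` before any thread is spawned.
unsafe fn set_myopt(value: bool) {
    // "Dear compiler": this block marks the operation you cannot check.
    unsafe {
        MYOPT = value;
    }
}

/// Reads the option.
///
/// # Safety
///
/// Must not run concurrently with `set_myopt`.
unsafe fn get_myopt() -> bool {
    unsafe { MYOPT }
}

fn main() {
    // SAFETY: the program is single-threaded at this point.
    unsafe { set_myopt(true) };
    // SAFETY: no writer can run concurrently with this read.
    let value = unsafe { get_myopt() };
    println!("myopt: {}", value);
}
```

The caller now has to write unsafe and can see the contract, instead of the unsafety being hidden behind a macro that looks safe.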


I would say this is actually a good thing. By having to explicitly pass in the information you need it's very easy to see which parts of an application depend on what. It also tends to make testing and maintaining things a lot easier in the long run (e.g. with dependency injection and all that).

I've spent quite a long time writing Rust code and then when I started working on a C# application at work I found that having globals is a great way to accidentally make a tightly coupled ball of spaghetti.

I originally started programming using Python and when I saw Rust forcing you to stop using globals my reaction was quite similar. My advice would be to give it a try and see how your application works out by populating your Config struct in main() then passing references to the bits of code which need it.


While it's valuable and important to understand why Rust makes globals a PITA, it's also important to acknowledge that sometimes they're useful, particularly when the application is small/immature. That's partly the reason crates like lazy_static exist in the first place. Once the application reaches a certain size, it's possible to refactor away from the global if it becomes a problem.

So we should be pragmatic, rather than dogmatic :).


It’s important to note that if all you need is a single Boolean, lazy_static is overkill. Use an atomic Boolean!
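
A sketch of the atomic variant, using std::sync::atomic::AtomicBool. The function names are hypothetical. Relaxed ordering is assumed to be enough here because the flag does not guard any other data.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// An atomic can be a plain (non-mut) static and still be modified safely.
static MYOPT: AtomicBool = AtomicBool::new(false);

fn set_myopt(value: bool) {
    // Relaxed: the flag stands alone and publishes no other memory.
    MYOPT.store(value, Ordering::Relaxed);
}

fn get_myopt() -> bool {
    MYOPT.load(Ordering::Relaxed)
}

fn main() {
    println!("Show default setting of myopt: {:?}", get_myopt());
    set_myopt(true); // assume gotten from cmdline
    println!("Show active setting of myopt: {:?}", get_myopt());
}
```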


In some ways, yes. But AtomicBool has different semantics: code can toggle it as many times as it wants, whereas lazy_static has init-once semantics.

In my case the single bool was the minimal example. In my real cmdline app I (currently) have 9 options.

8 of them I could easily pass around, as they were used only in 1 or 2 functions. The 9th was the beast, which I really need in many places: the option --no-colors tells the application not to use the colored crate when issuing messages.

A command line app with many options is the only situation where till now I sometimes felt a need to use the config struct globally.
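
For a set-once, read-everywhere option like --no-colors, one std-only possibility is std::sync::OnceLock (available since Rust 1.70). This is a swapped-in technique, not what lazy_static does internally. The names Config, no_colors, init_config, config and message are hypothetical, and the raw ANSI escape only stands in for the colored crate.

```rust
use std::sync::OnceLock;

#[derive(Debug)]
struct Config {
    no_colors: bool,
}

// Empty until init_config() runs; can be written exactly once.
static CONFIG: OnceLock<Config> = OnceLock::new();

/// Stores the configuration. Returns the rejected value if already set.
fn init_config(config: Config) -> Result<(), Config> {
    CONFIG.set(config)
}

/// Read access from anywhere in the program, without locking by the caller.
fn config() -> &'static Config {
    CONFIG.get().expect("config() called before init_config()")
}

fn message(text: &str) -> String {
    if config().no_colors {
        text.to_string()
    } else {
        // Plain ANSI red, standing in for the colored crate.
        format!("\x1b[31m{}\x1b[0m", text)
    }
}

fn main() {
    let no_colors = std::env::args().any(|arg| arg == "--no-colors");
    init_config(Config { no_colors }).expect("config initialized twice");
    println!("{}", message("something went wrong"));
}
```

Unlike an AtomicBool, the value cannot be toggled later: a second init_config call returns an error instead of overwriting.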

Ah, that’s fair.

I ran into this issue myself recently. I had a large struct (lots of Vecs of Vecs, etc) that was constructed from input data, a config file of sorts. It's created once and referenced by basically everything. At first I passed it around, then I refactored things so that a reference got passed by way of another struct. The first struct was the rules of a game, and the second struct was game state at some point in time. I was very happy and a little proud of myself until...

... Well Rust is very performant (great for game AI) and threading in Rust is supposed to be extremely easy. I wanted to use all available CPUs and my problem was trivial to parallelize. I could clone and pass each thread its own copy of an initial game state struct. -- Wait, not so fast! You can't just use a borrow in a closure-thread, because Rust couldn't be sure when it's safe to free the game-rules struct.

I didn't want to have to copy the rules for every thread because it was such a large struct. That meant I either needed some sort of Arc-based wrapper, which would require extensive code refactoring at the very least, or had to somehow make the game rules static. But not even lazy_static helped me, since loading the rules required a command-line argument, which was not itself static.

I ended up using unsafe blocks and a static mutable Option. I'm still a Rust newbie and I'm sure there's a better way, but this seemed like a fair compromise that required the least amount of refactoring.
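
For comparison, the Arc-based route mentioned above might look like this sketch. Rules, board and run_workers are hypothetical stand-ins for the real game-rules struct and its per-thread work.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for the large, read-only rules struct.
struct Rules {
    board: Vec<Vec<u32>>,
}

/// Sums the board on `workers` threads; each thread holds its own Arc handle.
fn run_workers(rules: Arc<Rules>, workers: usize) -> Vec<u32> {
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Cloning the Arc bumps a reference count; the Vecs are not copied.
            let rules = Arc::clone(&rules);
            thread::spawn(move || rules.board.iter().flatten().sum::<u32>())
        })
        .collect();
    // Wait for every worker and gather its result.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    // The rules can be built at runtime, e.g. from a command-line argument.
    let rules = Arc::new(Rules { board: vec![vec![1, 2], vec![3, 4]] });
    let totals = run_workers(rules, 4);
    println!("{:?}", totals);
}
```

The rules are freed when the last handle is dropped, which is exactly the lifetime question a plain borrow could not answer for spawned threads. On Rust 1.63 and later, std::thread::scope is another option that lets threads borrow the struct directly.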

You mentioned Python and sure enough, Python's modules can be thought of as singletons with a kind of "Once" thread-safe initialization (first import). That does work great for configuration data, and finding a way to do a similar kind of thing with Rust has been one of my toughest struggles -- I mean one of the best learning opportunities that Rust has graced me with so far.

You probably don't want to do that. Naively using unsafe to mutate a static variable which is accessed by multiple threads will probably result in data races and is asking for trouble.

It's safer (and probably requires less code) to use lazy_static!() and a Mutex or RwLock.

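use lazy_static::lazy_static;
use std::sync::Mutex;

lazy_static! {
    // Initialized on first access; lock the Mutex to read or modify the value.
    static ref MY_FOO: Mutex<Foo> = Mutex::new(Foo { ... });
}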

Before you write and publish code using unsafe {}, you should ask yourself a few questions:

  1. Can I implement the core functionality of this crate without this unsafe block? If yes, just get rid of the unsafe block and use a safe alternative from popular crates.

  2. Is this unsafe {} block directly related to the core functionality of this crate? If not, find a safe alternative from popular crates. If you can't find any, make a new crate that provides a safe abstraction, and also consider question 3 for it.

  3. Is this unsafe {} block really needed? In most cases, if not always, there is a 100% safe alternative. Sometimes those safe alternatives are not as performant as you need, for example when the code is executed 10M times per second, or runs in a process signal handler where you can't allocate. In that case, you should pedantically test and prove that the code is safe no matter what input it is fed. You should spend more than half of the time you put into the code on checking its unsafe blocks, as they are the only source of memory bugs that are your responsibility.
