How come global variables create safety issues?

I know that Rust does not let you create global variables (unless they are const) for safety reasons, but what is the reason behind it?

If it were safe to mutate a static mut, then you could do the following:

static mut A: u8 = 0;

fn main() {
    std::thread::spawn(|| {
        loop { A += 1; }
    });

    loop { A += 2; }
}

This would be a data race, as you are mutating the same memory location from multiple threads without synchronization. Data races are UB (undefined behavior), which means that safe code must never be able to cause them.
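
For contrast, here is a minimal sketch of how the same two-writer program could be written in safe Rust, assuming an atomic integer is an acceptable replacement for the plain u8. The function names add_repeatedly and run_two_writers are made up for this illustration, and the loops are bounded so the program terminates.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// A global that safe code may mutate, because every access is atomic.
static A: AtomicU32 = AtomicU32::new(0);

/// Adds `step` to the global `times` times (hypothetical helper).
fn add_repeatedly(step: u32, times: u32) {
    for _ in 0..times {
        A.fetch_add(step, Ordering::Relaxed);
    }
}

/// Runs two writers concurrently and returns the final value.
fn run_two_writers() -> u32 {
    A.store(0, Ordering::Relaxed);
    let handle = thread::spawn(|| add_repeatedly(1, 1_000));
    add_repeatedly(2, 1_000);
    // Joining makes all of the spawned thread's writes visible here.
    handle.join().unwrap();
    A.load(Ordering::Relaxed)
}

fn main() {
    println!("final value: {}", run_two_writers());
}
```

No update is lost here, because each fetch_add is a single indivisible read-modify-write.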

It's not about safety but about how global variables are implemented. Executable file formats have sections dedicated to initializing global variables, so they are initialized at compile time, packed into the binary, and loaded by the operating system without executing any user code.

And yes, mutating initialized global variables is a totally different story. Rust is made for multithreaded environments, and mutating a shared value without proper synchronization tends to lead to a data race, which is UB.

If you haven't heard of UB before, check out this Wikipedia article. Note that the term UB has a strictly defined meaning.

While what you say about executable file formats is true, I do not buy that explanation as to why globals may cause safety issues.

Global variables created like that do exist, so there is no issue with null references and similar UB.

Such global variables are initialized before any of one's code runs, so there is no possibility of using uninitialized data.

So far everything is perfectly safe as far as I can see. One can well imagine defining a language where use of such globals is allowed and a perfectly normal thing to do. Indeed, many languages have such globals.

The issue comes when one's language definition supports the concept of threads, as pointed out above.

Mind you, I can imagine designing a language that allows for multi-threaded code and global variables. Those globals would all be guarded by mutual exclusion mechanisms.

After all, even Rust reads and writes to standard I/O from multiple threads. They are global resources. No problem.

A big issue with good old-fashioned global variables is that there is an ever-increasing chance that whatever globals you define will end up having the same name as some global in some crate you start using. Then you are scratching your head wondering why your program no longer compiles. Or, if it does compile, why it no longer works, since you and some crate might now be scribbling on the same thing.

So at the very least, things need to be limited to some local namespace.

This is not about threads necessarily. Mutable aliasing is problematic in general. With a mutable static Vec you could hold a reference to an element as the vector resizes and moves all its elements to a new memory location.
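
A small sketch of the resizing hazard, using a local Vec rather than a mutable static so that it stays in safe Rust. The helper grow_past_capacity is a name invented for this example. The commented-out lines show the borrow that the compiler rejects.

```rust
/// Pushes until the vector outgrows its current allocation.
/// Returns (capacity before, capacity after).
fn grow_past_capacity(v: &mut Vec<u32>) -> (usize, usize) {
    let before = v.capacity();
    while v.len() <= before {
        v.push(0);
    }
    (before, v.capacity())
}

fn main() {
    let mut v = vec![1, 2, 3];

    // let first = &v[0]; // shared borrow of an element

    // The buffer has to grow here, so the elements may move to a new
    // allocation and any old element reference would dangle.
    let (before, after) = grow_past_capacity(&mut v);

    // println!("{first}"); // with the borrow above: error[E0502]

    assert!(after > before);
    println!("capacity grew from {before} to {after}");
}
```

With a mutable static there is no borrow for the compiler to track, so nothing would stop the reference from being used after the buffer grows.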

Indeed. But things like the Vec resizing issue are not limited to items in the global name space. One can have that problem with local Vecs and references.

Is it true to say, then, that the problem with globals in Rust is that the borrow checker would have no idea who owned the global in the first place, and so can't even start to track borrows and moves?

That’s exactly my intuition, too. The problem is that there’s no possible owner, neither for global statics since they can be accessed from different functions (the point of being global) nor for local statics, since a function can be called multiple times. Multiple times in parallel, or even on a single thread with (possibly mutual) recursion.
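
To illustrate the point about local statics, here is a minimal sketch with a function-local static that every invocation shares, including recursive ones. The function visit and the counter CALLS are hypothetical names. An atomic is used so that the example stays in safe Rust.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Recurses `depth` levels and returns how many times `visit` has been
/// entered in total when the innermost call is reached.
fn visit(depth: u32) -> usize {
    // One instance for the whole program, not one per call: every level
    // of the recursion (and every thread) sees the same counter.
    static CALLS: AtomicUsize = AtomicUsize::new(0);

    let seen = CALLS.fetch_add(1, Ordering::Relaxed) + 1;
    if depth == 0 {
        seen
    } else {
        visit(depth - 1)
    }
}

fn main() {
    let first = visit(3); // enters `visit` four times
    let second = visit(0); // one more entry, same counter
    println!("{first} then {second}");
}
```

No single stack frame can be said to own CALLS, which is why a plain mutable version of it cannot be checked by the borrow checker.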

By the way, happy cake day!

Just FYI, if you're sure you need it, you can use once_cell's Lazy with a Mutex (you could also use the lazy_static! macro instead of once_cell):

use once_cell::sync::Lazy;
use std::sync::Mutex; // or another Mutex type, such as tokio's

// Initialized on first access, then shared by the whole program.
static GLOBAL: Lazy<Mutex<String>> =
    Lazy::new(|| Mutex::new("Hello".to_string()));

Now you can safely access GLOBAL by locking the Mutex.

Also the parking_lot crate has a different Mutex type that offers a const constructor, so you can just directly initialize a static variable with that one.
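
As a sketch of what direct initialization looks like, the example below uses std::sync::Mutex instead of parking_lot, which is a swap made only to keep the example dependency-free. It assumes Rust 1.63 or later, where std's Mutex::new is a const fn. The helper append is a made-up name.

```rust
use std::sync::Mutex;

// Both `Mutex::new` and `String::new` are const fns, so the static can be
// initialized directly, without Lazy or lazy_static!.
static GLOBAL: Mutex<String> = Mutex::new(String::new());

/// Appends `text` to the global string and returns its new length.
fn append(text: &str) -> usize {
    let mut guard = GLOBAL.lock().unwrap();
    guard.push_str(text);
    guard.len()
}

fn main() {
    append("Hello");
    append(", world");
    println!("{}", GLOBAL.lock().unwrap());
}
```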

Would it be possible to use this Mutex when using tokio, instead of tokio's own Mutex?

Only if you don't block on the lock call (only using try_lock for example) and if you don't keep it locked across an await.

Yes, from the tokio::sync::Mutex documentation:

Which kind of mutex should you use?

Contrary to popular belief, it is ok and often preferred to use the ordinary Mutex from the standard library in asynchronous code. This section will help you decide on which kind of mutex you should use.

The primary use case of the async mutex is to provide shared mutable access to IO resources such as a database connection. If the data stored behind the mutex is just data, it is often better to use a blocking mutex such as the one in the standard library or parking_lot. This is because the feature that the async mutex offers over the blocking mutex is that it is possible to keep the mutex locked across an .await point, which is rarely necessary for data.

A common pattern is to wrap the Arc<Mutex<...>> in a struct that provides non-async methods for performing operations on the data within, and only lock the mutex inside these methods. The mini-redis example provides an illustration of this pattern.

Additionally, when you do want shared access to an IO resource, it is often better to spawn a task to manage the IO resource, and to use message passing to communicate with that task.

tokio::sync::Mutex

This is covered in more detail in the chapter on shared state from the official Tokio tutorial.
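
The wrapper pattern described in the quoted documentation might look roughly like the sketch below. SharedMap and its methods are hypothetical names and are not taken from mini-redis. The example uses only the standard library.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Cheaply cloneable handle to shared state. The mutex is an
/// implementation detail that callers never see.
#[derive(Clone, Default)]
struct SharedMap {
    inner: Arc<Mutex<HashMap<String, String>>>,
}

impl SharedMap {
    /// Non-async method: the lock is taken and released inside the call,
    /// so it can never be held across an `.await` in the caller.
    fn set(&self, key: &str, value: &str) {
        let mut map = self.inner.lock().unwrap();
        map.insert(key.to_string(), value.to_string());
    }

    /// Returns an owned copy so that no guard or reference escapes.
    fn get(&self, key: &str) -> Option<String> {
        let map = self.inner.lock().unwrap();
        map.get(key).cloned()
    }
}

fn main() {
    let db = SharedMap::default();
    let handle = db.clone(); // e.g. handed to another task or thread
    handle.set("greeting", "hello");
    println!("{:?}", db.get("greeting"));
}
```

Because get returns an owned String, the guard is always dropped before the method returns.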

How are you supposed to lock the std Mutex in an async context? Would you just accept that it might block your executor?

If the lock is only ever held for a very short time, it isn't a problem. You will typically get Send-related errors if you try to keep it across an .await.
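
A rough sketch of the "hold it only briefly" approach: the guard lives in its own block and is dropped before the .await. Everything here is an assumption made for illustration. do_io stands in for real async work, and run_ready is a toy polling loop used only so the example runs without an async runtime; it is not how Tokio drives futures.

```rust
use std::future::Future;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for real asynchronous work.
async fn do_io() {}

/// Increments the counter, then awaits without holding the lock.
async fn bump_then_wait(counter: &Mutex<u32>) -> u32 {
    let value = {
        let mut guard = counter.lock().unwrap();
        *guard += 1;
        *guard
    }; // guard is dropped here, before the await below

    // If the guard were still alive at this point, the future would not be
    // Send, which is where the Send-related errors typically come from.
    do_io().await;
    value
}

/// Waker that does nothing; enough for futures that never really suspend.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor for this example only: polls in a loop until ready.
fn run_ready<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    let counter = Mutex::new(0);
    let value = run_ready(bump_then_wait(&counter));
    println!("counter is now {value}");
}
```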

From looking at the implementation, it seems like the reason why tokio::sync::Semaphore and hence tokio::sync::Mutex cannot have const constructors is that they contain a std::sync::Mutex<Waitlist> internally. Quite ironic that tokio’s mutex not having const fn new() boils down to the ordinary Mutex not being const-constructible. If tokio just switched to using parking_lot... but I guess that won’t happen.

Tokio actually has a parking_lot feature you can enable, but it isn't on by default.

Oh, right, I stopped one file before getting there, because it already spelled std in its name. Perhaps one should open an issue asking to make a few constructors const when parking_lot is enabled?

I think that kind of conditional compilation is unlikely to be accepted, but you could propose having it unconditionally enabled for Tokio v0.3, which is in the works.

I also noticed the parking_lot feature yesterday. But yeah, having const constructors with one optional feature but not with the default would probably not be accepted.