Static mutables in tests

How do (unsafe) static mutables work with tests? Are all tests separate, or can a data race happen when two tests access it at the same time?

E.g.

static mut SOMETHING ...;

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn do_something() {
        init_something();
        ...
    }

    #[test]
    fn do_something_else() {
        init_something();
        ...
    }
}

Does it ensure that the tests do not interfere?

Tests run in parallel by default. You can protect your static by making your tests (or your init function) use some form of locking, or you can run cargo test -- --test-threads=1 to run the tests serially.
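As a rough sketch of the locking option, using only std: each test takes a shared lock before touching the global. The names SOMETHING, TEST_LOCK, serialize_tests, init_something and read_something are made up for illustration, and the global is a plain u32 rather than whatever the real code stores.

```rust
use std::sync::{Mutex, MutexGuard};

// Hypothetical global that every test wants to initialize.
static SOMETHING: Mutex<u32> = Mutex::new(0);

// Hypothetical lock whose only job is to serialize tests that touch SOMETHING.
static TEST_LOCK: Mutex<()> = Mutex::new(());

/// Take the test lock, recovering it if an earlier test panicked while holding it.
fn serialize_tests() -> MutexGuard<'static, ()> {
    TEST_LOCK.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}

fn init_something(value: u32) {
    *SOMETHING.lock().unwrap() = value;
}

fn read_something() -> u32 {
    *SOMETHING.lock().unwrap()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn do_something() {
        let _guard = serialize_tests(); // held until the end of this test
        init_something(1);
        assert_eq!(read_something(), 1);
    }

    #[test]
    fn do_something_else() {
        let _guard = serialize_tests();
        init_something(2);
        assert_eq!(read_something(), 2);
    }
}
```

Without the _guard lines, the two tests could interleave and one of them could observe the other's value.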

Crates like once_cell or lazy_static provide safe and simple ways of initializing static variables on first use only.
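For illustration, here is the same "initialize on first use" idea with std::sync::OnceLock, which is the standard library counterpart (stable since Rust 1.70) rather than either crate named above. Config, CONFIG and config are hypothetical names.

```rust
use std::sync::OnceLock;

// Hypothetical configuration type standing in for the real global.
#[derive(Debug, PartialEq)]
struct Config {
    name: String,
}

static CONFIG: OnceLock<Config> = OnceLock::new();

/// Returns the global, running the initializer only on the first call.
/// Later calls (from any thread) get a reference to the same value.
fn config() -> &'static Config {
    CONFIG.get_or_init(|| Config {
        name: String::from("default"),
    })
}
```

Since every caller ends up with the same &'static Config, there is nothing left for two tests to race on, as long as they all agree on what the value should be.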

I know they run in parallel. However, do they share all global variables? Because it wouldn't be very good if tests used (thread-safe, eg Mutex) global variables that they didn't expect to be randomly changed by other tests. The book isn't really clear about it. It says:

make sure your tests don’t depend on each other or on any shared state, including a shared environment, such as the current working directory or environment variables.

Then it gives an example about files on disk. Do global variables (unsafe or using interior mutability) count as "shared state"?

Unit tests are run on multiple threads within a single process, so they share the same global variables.

Integration tests, on the other hand, are compiled into one binary per source file, and each binary is run in a separate process, so you can use these if you want separate address spaces and global variables.
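A small sketch of what "same process, same globals" means in practice: two threads (which is how the test harness runs unit tests) both see and modify one static. HITS, bump and run_two_workers are illustrative names only.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// One counter for the whole process, shared by every thread in it.
static HITS: AtomicUsize = AtomicUsize::new(0);

/// Increment the shared counter and return the new value.
fn bump() -> usize {
    HITS.fetch_add(1, Ordering::SeqCst) + 1
}

/// Spawn two threads that each bump the counter, then report its value.
fn run_two_workers() -> usize {
    let a = thread::spawn(bump);
    let b = thread::spawn(bump);
    a.join().unwrap();
    b.join().unwrap();
    HITS.load(Ordering::SeqCst)
}
```

Two separate processes running this code would each start from their own HITS at zero, which is the isolation you get between integration test binaries.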

The latter :warning: as @mbrubeck explained.

Important mention in that regard, the ::serial_test crate:


More generally, static mut is a very dangerous construct, so indeed try and replace them with the safer alternatives out there:

static mut isn't so dangerous in this case. It is written to once in the first line of main and only read after that.

In that case, please use OnceCell for guaranteed safety. static mut generally has nonlocal effects, and humans are not smart enough to handle nonlocal unsafety.

It's only used in one module, so there is no nonlocal effect.

Then it has at least some locality, which is much better than nothing. But still, if you or someone else makes a mistake within that module, it's UB and everything can get messed up. Why not choose the guaranteed-safe option over the safe-only-if-I-haven't-made-a-mistake one?

You may ask what happens if once_cell itself has a bug. That's a fair concern, but I can cite Linus's law here: once_cell is a popular library, so it has more eyeballs on it to catch bugs.

I see you have posted a new thread where you "hide" the XY-problem. That also means that you haven't been able to solve your problem, so let's try to help you with that.

You haven't shared any of your code / layout, so all I can do is guess, but I'd say that your situation is the following:

static mut GLOBAL: Option<Thing> = None;

unsafe // Safety: Cannot be called in parallel
fn init (...)
{
    let _ = GLOBAL.replace(Thing::new(...));
    ...
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn first ()
    {
        unsafe { /* Safety: none whatsoever! */ init(...); }
        stuff_that_uses_global();
    }

    #[test]
    fn snd ()
    {
        unsafe { /* Safety: none whatsoever! */ init(...); }
        stuff_that_uses_global();
    }
}

Which indeed makes your tests suffer from UB since a data race is possible.

You can solve the UB using once_cell (or lazy_static!, but for this pattern the once_cell API, which is the only difference between those two crates, seems more appropriate):

use ::once_cell::sync::OnceCell;

static GLOBAL: OnceCell<Thing> = OnceCell::new();

// Note: Only the first call gets to init the global.
fn init (...)
{
    let _ = GLOBAL.set(Thing::new(...));
    ...
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn first ()
    {
        init(...);
        stuff_that_uses_global(); // <-+ may
    }                             //   | use
                                  //   | the
    #[test]                       //   | `GLOBAL`
    fn snd ()                     //   | value
    {                             //   | initialized
        init(...); // -----------------+ here
        stuff_that_uses_global();
    }
}

So we have solved the UB, but the code still has a logic bug: it can and will suffer from race conditions, whereby if the different test runs initialize the value differently, then all the tests but one (the one that wins the race to initialize GLOBAL) will be using the wrong global.

  • This can be showcased by not ignoring the result obtained when calling set:

      fn init (...)
      {
    -     let _ = GLOBAL.set(Thing::new(...));
    +     GLOBAL.set(Thing::new(...)).expect("GLOBAL already init");
      }
    

If that is your case, this indeed showcases that @Hyeonu was right: you have been using a global variable, and thus non-local state, in a very limiting way: just imagine how easy it would have been to do let state = init(...); and then pass &state around in the test functions!

  • This is, by the way, one solution: you could design your API to take a &'_ State parameter instead of relying on GLOBAL to be around, and then, if you really want to, you can define a GLOBAL: OnceCell<State> that lets you offer a top-level API where that state parameter is hidden. But within your own internal tests, you wouldn't need to use the global.
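A minimal sketch of that design, assuming a made-up State with a single factor field (the real type is not shown in this thread) and using std::sync::OnceLock in place of once_cell's OnceCell. The names scale_with and scale are hypothetical.

```rust
use std::sync::OnceLock;

// Hypothetical state type; the real one is not shown in the thread.
struct State {
    factor: u32,
}

impl State {
    fn new(factor: u32) -> Self {
        State { factor }
    }
}

// Core logic takes the state explicitly, so each test can build its own.
fn scale_with(state: &State, x: u32) -> u32 {
    state.factor * x
}

// Process-wide instance, only used by the convenience wrapper below.
static GLOBAL: OnceLock<State> = OnceLock::new();

// Top-level API that hides the state parameter from callers.
fn scale(x: u32) -> u32 {
    scale_with(GLOBAL.get_or_init(|| State::new(2)), x)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn uses_local_state() {
        // No global involved, so this cannot race with any other test.
        let state = State::new(3);
        assert_eq!(scale_with(&state, 4), 12);
    }
}
```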

Otherwise, the solution here would be to have the tests run in a serialized manner:

use ::once_cell::sync::OnceCell;

static GLOBAL: OnceCell<Thing> = OnceCell::new();

// Note: Only the first call gets to init the global.
fn init (...)
{
    GLOBAL.set(Thing::new(...)).expect("GLOBAL already init");
    ...
}

#[cfg(test)]
mod tests {
    use super::*;

    use ::serial_test::serial;

    #[test]
    #[serial]
    fn first ()
    {
        init(...);
        ::scopeguard::defer!({ drop(GLOBAL.take()); }); // Automatic clean-up

        stuff_that_uses_global();
    }

    #[test]
    #[serial]
    fn snd ()
    {
        init(...);
        ::scopeguard::defer!({ drop(GLOBAL.take()); });

        stuff_that_uses_global();
    }
}

And there you have it: zero-unsafe code that does the right thing, even while testing.
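One caveat on the snippet above: as far as I can tell, once_cell's OnceCell::take takes &mut self, so calling GLOBAL.take() on a non-mut static would not compile as written. Here is a rough std-only alternative for a resettable global, assuming a placeholder Thing(u32). The helpers lock_global, reset, current and ResetOnDrop are made up, with ResetOnDrop playing the role of the scopeguard::defer! line.

```rust
use std::sync::{Mutex, MutexGuard};

// Placeholder for the real type stored in the global.
#[derive(Debug, Clone, PartialEq)]
struct Thing(u32);

// A Mutex<Option<_>> can be emptied again through a shared reference.
static GLOBAL: Mutex<Option<Thing>> = Mutex::new(None);

/// Lock the global, recovering the guard if a previous holder panicked.
fn lock_global() -> MutexGuard<'static, Option<Thing>> {
    GLOBAL.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Initialize the global; panics if it is already initialized.
fn init(value: u32) {
    let mut slot = lock_global();
    assert!(slot.is_none(), "GLOBAL already init");
    *slot = Some(Thing(value));
}

/// Empty the global and hand back whatever was stored.
fn reset() -> Option<Thing> {
    lock_global().take()
}

/// Clone out the current value, if any.
fn current() -> Option<Thing> {
    lock_global().clone()
}

/// Guard that empties the global when it goes out of scope.
struct ResetOnDrop;

impl Drop for ResetOnDrop {
    fn drop(&mut self) {
        reset();
    }
}
```

In a test you would write init(...); let _cleanup = ResetOnDrop; and still mark the test #[serial] (or take a shared lock), since two tests running at once would trip the "already init" assertion.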

The other thread was about a different issue, not really related to this one (which is about tests with a global static mut initialised at startup).

I don't really want to add an extra dependency on once_cell. I'll probably just use an AtomicBool to make sure it is only initialised once, and it will be rather obvious if I forget to initialise it. In this case it is initialised once, only once, and always to the same value. Only in tests does it potentially have an issue.

Taking a state parameter doesn't really make sense, because there's only one value to initialise it to. Everything should use the same values. It would be like having a hash table with one specific lookup table calculated on startup, then creating a new instance and passing around everywhere.

Also, for the other thread, it doesn't make sense to pass state parameters either.

If you want to avoid dependencies outside std, you can use the pattern shown in the std::sync::Once docs to safely initialize your static mut (though I'd personally move the static declarations inside the function).
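Roughly, that pattern looks like the sketch below, with the statics moved inside the function as suggested. expensive_computation and the CALLS counter are stand-ins for illustration, not part of the std docs example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;

// Counts how often the initializer actually runs (illustration only).
static CALLS: AtomicUsize = AtomicUsize::new(0);

// Stand-in for whatever computes the real value.
fn expensive_computation() -> usize {
    CALLS.fetch_add(1, Ordering::SeqCst);
    21 * 2
}

fn get_cached_val() -> usize {
    // Scoped to this function, so nothing else can touch VAL.
    static mut VAL: usize = 0;
    static INIT: Once = Once::new();

    // SAFETY: the only write to VAL happens inside call_once, which runs the
    // closure exactly once and makes every other caller wait until it is done,
    // so the read below never overlaps with the write.
    unsafe {
        INIT.call_once(|| {
            VAL = expensive_computation();
        });
        VAL
    }
}
```

Calling get_cached_val from several tests at once is fine: whichever gets there first runs the initializer, and the rest block briefly and then read the same value.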

Try searching for "once_cell" in your Cargo.lock file. If it is there, adding it to your Cargo.toml doesn't add any extra dependency; it just makes an already existing one usable.

And a single AtomicBool flag would not be enough. You need at least two AtomicBools: one to signal that initialization has started and one to signal that it has finished.
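To make the two-flag point concrete, here is a sketch under assumed names (STARTED, FINISHED, VALUE, init, get) with a u32 payload. With a single flag you have to set it either before the write, in which case readers can see the flag while the write is still in progress, or after the write, in which case two initializers can both pass the check and write at once.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

static STARTED: AtomicBool = AtomicBool::new(false);
static FINISHED: AtomicBool = AtomicBool::new(false);
static mut VALUE: u32 = 0;

/// Try to initialize the global. Returns false if another call got there first.
fn init(value: u32) -> bool {
    // swap returns the previous value, so only one caller ever sees `false`.
    if STARTED.swap(true, Ordering::AcqRel) {
        return false;
    }
    // SAFETY: only the single winner of the swap above reaches this write,
    // and no reader touches VALUE until FINISHED is set below.
    unsafe {
        VALUE = value;
    }
    // Release pairs with the Acquire load in `get`, publishing the write.
    FINISHED.store(true, Ordering::Release);
    true
}

/// Read the global, or None if initialization has not completed yet.
fn get() -> Option<u32> {
    if FINISHED.load(Ordering::Acquire) {
        // SAFETY: FINISHED is only set after the one and only write to VALUE.
        Some(unsafe { VALUE })
    } else {
        None
    }
}
```

This is essentially a hand-rolled, non-blocking version of what Once and OnceCell already do, which is part of the argument for using them instead.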

And here I'd like to share my own question from a little before, where this was already discussed:
