Mutate lazy_static for tests

#1

I have a static variable initialized by lazy_static. For tests only, I want to set or modify the content of the static. Is there a way to do this without also making it changeable (safely) from non-test code?

I have (approximately):

lazy_static!{ static ref FOO: Vec<_> = Vec::new() }
...
#[test]
fn with_other_foo() {
    // mutate FOO
    // run test-code
    // change FOO back
}

Is this somehow possible?

#2

Since tests run concurrently (a thread per test within one single process), using global mutable data is a really bad idea (you will have tests that sometimes pass and other times fail, because of race conditions).

However, I can see the point of having globals for tests. The solution? The ::std::thread_local! macro. It is very similar to lazy_static!, except you don’t need the ref keyword, and those globals do not implement Deref but instead provide a .with() method, to which you pass a closure that receives a reference to the global’s value:

type T = Vec<i32>;

thread_local!{
    static FOO: T = vec![42];
}
// ...
#[test]
fn with_other_foo() { FOO.with(|slf| {
    let _: &T = slf; // we only get access to a **shared** reference of the global

    assert_eq!(
        slf,
        &vec![42],
    );
})}

Now, to get mutation, since you only get a shared reference, you need interior mutability, and since we are in a single-threaded context (unless your test spawns threads), the obvious candidate is RefCell (or Cell for Copy types):

use ::std::cell::RefCell;

type T = Vec<i32>;
thread_local!{
    static FOO: RefCell<T> = RefCell::new( vec![42] );
}
// ...
#[test]
fn with_other_foo() { FOO.with(|refcell| {
    assert_eq!(
        &*refcell.borrow(), // read the vec
        &vec![42],
    );
    {  // mutate the vec within a scope
        let v: &mut T // Victory!
            = &mut *refcell.borrow_mut()
        ;
        v.push(69);
    }
    assert_eq!(
        &*refcell.borrow(), // read the vec again
        &vec![42, 69],
    );
})}
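For the Cell variant mentioned above, a minimal sketch could look like the following. The names COUNT, count and set_count are made up for illustration; the point is only that each thread (and therefore each test) sees its own copy of the value.

```rust
use std::cell::Cell;

thread_local! {
    // Hypothetical per-thread value; `usize` is `Copy`, so `Cell` is enough.
    static COUNT: Cell<usize> = Cell::new(42);
}

/// Reads the value belonging to the current thread.
fn count() -> usize {
    COUNT.with(|c| c.get())
}

/// Overwrites the current thread's value and returns the previous one,
/// which makes it easy to restore at the end of a test.
fn set_count(new: usize) -> usize {
    COUNT.with(|c| c.replace(new))
}
```

A test would call set_count at the start and, if it cares, restore the returned old value at the end; other threads keep seeing their own initial 42.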
#3

I did not think of tests running concurrently. Thank you for pointing that out.

I also thought of the approach using RefCell, but that would require the non-test code to deal with the RefCell as well, which I do not want.

#4

The other solution, if you really need a global variable and you may multithread or don’t like the poor ergonomics of thread_local!, is to declare static variables inside the function’s scope, to prevent issues with other tests (you will thus need to define one static per test):

use ::std::sync::RwLock;

use ::lazy_static::lazy_static;

type T = Vec<i32>;
// ...
#[test]
fn with_other_foo() {
    lazy_static!{
        static ref FOO: RwLock<T> = RwLock::new( vec![42] );
    }

    assert_eq!(
        &*FOO.read().unwrap(), // read the vec
        &vec![42],
    );
    {  // mutate the vec within a scope
        let v: &mut T // Victory!
            = &mut *FOO.write().unwrap()
        ;
        v.push(69);
    }
    assert_eq!(
        &*FOO.read().unwrap(), // read the vec again
        &vec![42, 69],
    );
}
#5

I already have the global variable defined elsewhere; the non-test code requires it to be global. The global variable is constructed by querying the environment variables. I want to add tests that check the behavior of my code under different environment variables.

Maybe this is what I want: mock lazy_static for the test in case the static is not pure.

#6

Well now I’ll ask why you need the global variable. In my case I only use a global variable with tests to express constraints with assertions (example), without having to deal with passing mutable references down the chain.

For the rest, you can just declare [mutable] local variables and then feed them (by [mutable] reference) to the code you are testing.
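As a rough illustration of that last point, assuming a made-up Settings struct and two made-up functions under test, the code simply receives the value by (mutable) reference instead of reaching for a global:

```rust
// Hypothetical settings type; the fields are illustrative only.
#[derive(Debug, Default)]
struct Settings {
    verbose: bool,
    retries: u32,
}

/// Code under test that only needs to read the settings.
fn describe(settings: &Settings) -> String {
    format!("verbose={} retries={}", settings.verbose, settings.retries)
}

/// Code under test that needs to modify the settings.
fn bump_retries(settings: &mut Settings) {
    settings.retries += 1;
}
```

Each test then owns its own local Settings value, so nothing is shared between concurrently running tests.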

#7

I have

lazy_static! {
  static ref GLOBAL_CONFIG: MyConfig = ... // read config from std::env
}

The library code uses this config. I want to run tests for different configs.

I would like to know if changing the library code is my only option if I want to write tests for different configs.

#8

If your library code does not contain a mutex or other lock on the static variable, then yes. To change the variable from tests, the library needs to expose at least some way of changing it.

That change could be something as simple as having a function

fn with_config<O, F: FnOnce(&Config) -> O>(f: F) -> O {
    #[cfg(not(test))]
    {
        lazy_static!{ static ref FOO: Vec<.. };
        // use FOO
    }
    #[cfg(test)]
    {
        lazy_static!{ static ref FOO: Mutex<Vec<... };
        // lock and use FOO
    }
}

But you’ll need at minimum that. If you do go this route, you’ll also want an additional mutex lock in test code which you lock for the duration of the test - so no other test which involves modifying the config happens in the middle of the test.


That should work.
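A minimal sketch of the extra test-only lock, assuming a Vec<i32> stands in for the real config and using plain std statics rather than lazy_static to keep it self-contained. The names CONFIG, TEST_LOCK, serialize_test and set_config are hypothetical:

```rust
use std::sync::{Mutex, MutexGuard};

// Hypothetical stand-in for the mutex-guarded config used in test builds.
static CONFIG: Mutex<Vec<i32>> = Mutex::new(Vec::new());

// Separate lock that a test holds for its whole duration, so tests that
// modify CONFIG cannot interleave with each other.
static TEST_LOCK: Mutex<()> = Mutex::new(());

/// Call at the top of every test that touches CONFIG and keep the guard alive.
fn serialize_test() -> MutexGuard<'static, ()> {
    // A panicking test poisons the lock; recover the guard so that the
    // remaining tests can still run.
    TEST_LOCK.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Runs `f` with shared access to the current config.
fn with_config<O, F: FnOnce(&Vec<i32>) -> O>(f: F) -> O {
    let guard = CONFIG.lock().unwrap();
    f(&*guard)
}

/// Replaces the config; intended for tests only.
fn set_config(new: Vec<i32>) {
    *CONFIG.lock().unwrap() = new;
}
```

A test would start with let _serial = serialize_test();, then call set_config and the code under test, and the guard is released when the test function returns.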

But regardless, I would strongly recommend considering redesigning this. By having a single global you’re preventing users from configuring the library how they want to. If it’s a public library, you’re preventing users from using their own non-env configuration. Even if it’s a private library, you’re still locking yourself into always using the same configuration for all places you’re using the library. Having the same code instantiated with different configurations is often quite useful, and using a static like this prevents that.

#9

That’s what I suspected. In that case you will only get shared references to GLOBAL_CONFIG (lazy_static's API), so unless you have fields with interior mutability (atomics, RwLock or Mutex) in MyConfig you cannot do what you want.
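For illustration, here is a sketch of what such fields could look like. The layout of MyConfig and the helper names are assumptions, and a plain static is used instead of lazy_static! so the snippet has no external dependency; with lazy_static you would get the same shared reference to work with.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

// Hypothetical config whose fields can be changed through a shared reference.
struct MyConfig {
    debug: AtomicBool,
    count: AtomicUsize,
}

static GLOBAL_CONFIG: MyConfig = MyConfig {
    debug: AtomicBool::new(false),
    count: AtomicUsize::new(42),
};

/// Reads the current count.
fn count() -> usize {
    GLOBAL_CONFIG.count.load(Ordering::SeqCst)
}

/// Stores a new count and returns the old one, so a test can restore it.
fn set_count(new: usize) -> usize {
    GLOBAL_CONFIG.count.swap(new, Ordering::SeqCst)
}
```

Note that this also lets non-test code change the value, which is what the original question wanted to avoid, and that concurrent tests would still race on it.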

#10

Exactly. A very common pattern is to create the Config struct not inside a static, but simply at the beginning of the main function. If you have default values for most settings, then you can set those by implementing Default for that struct.

Then, the rest of your library / application structs all contain shared references to the Config (or Rc/Arc if you don’t mind paying a runtime cost in exchange for getting rid of noisy lifetime annotations). Now Rust will make sure everything is used correctly.

And since a test function is like a main function, you can then create and mutate the Config structure, and then follow the classic code pattern.

Example: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=cffc76febe3c4a7f060f956cbaed8f75
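Independently of the linked playground, a minimal sketch of the pattern described above might look like this. Config's fields and the Worker type are invented for illustration, and Arc is chosen over plain references only to avoid lifetime annotations:

```rust
use std::sync::Arc;

// Hypothetical configuration with sensible defaults.
#[derive(Clone, Debug, PartialEq)]
struct Config {
    debug: bool,
    count: usize,
}

impl Default for Config {
    fn default() -> Self {
        Config { debug: false, count: 42 }
    }
}

// Hypothetical component that keeps a shared handle to the config
// instead of reading a global.
struct Worker {
    config: Arc<Config>,
}

impl Worker {
    fn new(config: Arc<Config>) -> Self {
        Worker { config }
    }

    fn batch_size(&self) -> usize {
        self.config.count
    }

    fn is_verbose(&self) -> bool {
        self.config.debug
    }
}
```

main would build one Config (for example from the environment) and hand Arc clones to each component, while a test builds its own, e.g. Config { count: 3, ..Config::default() }.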

#11

I know. Unfortunately, the pattern of “passing down” and sharing the config within the program is not possible in my (very special) use-case.
I am currently exploring different options.

#12

Not even with “dynamic lifetimes” (Rc / Arc)?

#13

I guess I can’t recommend much without knowing your particular restrictions, but in the past I’ve used both thread_local!() and scoped_thread_local!() to avoid passing parameters to things. Both of those require that you control the runtime, though!

If it’s a situation like exposing C bindings where you don’t control what threads your code is called from since you aren’t calling it yourself, I’d probably recommend splitting out the code which uses the config from the wrapper code. Then the config-using code can be unit tested, and the wrapper code can just call the config-using code passing in the static config?
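A sketch of that split, under the assumption that the config is a single greeting string read from a made-up GREETING environment variable. std's OnceLock stands in for lazy_static here, and all names are hypothetical:

```rust
use std::sync::OnceLock;

// Hypothetical configuration.
struct Config {
    greeting: String,
}

/// Config-using code: takes the config as a parameter, so it can be
/// unit tested with any value.
fn greet_with(config: &Config, name: &str) -> String {
    format!("{}, {}!", config.greeting, name)
}

/// The process-wide config, built from the environment on first use.
fn global_config() -> &'static Config {
    static CONFIG: OnceLock<Config> = OnceLock::new();
    CONFIG.get_or_init(|| Config {
        // GREETING is an assumed variable name, with a fallback default.
        greeting: std::env::var("GREETING").unwrap_or_else(|_| "Hello".to_string()),
    })
}

/// Wrapper code: the only place that touches the global.
fn greet(name: &str) -> String {
    greet_with(global_config(), name)
}
```

Tests target greet_with with hand-built Config values; the thin greet wrapper stays trivial enough that it hardly needs per-config tests.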

#14

In cases like this, it might be better to consider the entire environment as a test case, at a higher level of granularity: a separate CI setup and invocation for your program, that runs all tests with each config. You’ll probably find soon enough that there are more “global” things, like the contents of files or directories, that need to be varied.

#15

This could be done with trait objects (dynamic typing) after having defined all the API of your Config with a trait. But it does require changing the initial code to accommodate the runtime “cast” (vtable) required to call a trait object’s method, which looks excessive if it is just to enable “mock-testing”. If you are nevertheless interested, I recommend you take a look at Rust’s logging facade (the log crate), designed with precisely that in mind: that downstream crates override the logging implementation.

#16

This seems very interesting and is quite close to what I actually need. They have:

static mut LOGGER: &'static Log = &NopLogger;
static STATE: AtomicUsize = ATOMIC_USIZE_INIT;
fn set_logger(..) { 
  // unsafely modify static vars using compare-and-swap
}

I will check if this approach is also valid in my case. Also, I will try to redesign the library so that it does not require this hack :sweat_smile:
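For comparison, a safe set-once global in the same spirit can be sketched with std's OnceLock. This is not the log crate's code; the Settings trait, DefaultSettings and the function names are made up, and unlike a thread_local it can only be installed once per process, so it cannot be changed back between tests:

```rust
use std::sync::OnceLock;

// Hypothetical API that the rest of the code calls through the global.
trait Settings: Sync {
    fn count(&self) -> usize;
}

// Fallback used until something is installed.
struct DefaultSettings;

impl Settings for DefaultSettings {
    fn count(&self) -> usize {
        42
    }
}

static SETTINGS: OnceLock<&'static dyn Settings> = OnceLock::new();

/// Installs an implementation; fails if one was already installed.
fn set_settings(settings: &'static dyn Settings) -> Result<(), &'static str> {
    SETTINGS
        .set(settings)
        .map_err(|_| "settings were already installed")
}

/// Returns the installed implementation, or the fallback if none was set.
fn settings() -> &'static dyn Settings {
    static FALLBACK: DefaultSettings = DefaultSettings;
    match SETTINGS.get() {
        Some(installed) => *installed,
        None => &FALLBACK,
    }
}
```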

#17

Lucky you, I wanted to tinker a bit with this, so you can borrow from this example (compare running it vs. testing it):

The example supposes you start from something like this:

#[derive(Copy, Clone)] // Clone will be useful for test mutation
struct Config {
    debug: bool,
    count: usize,
}

lazy_static!{
    static ref CONFIG: Config = Config {
        count: 42,
        debug: false,
    };
}

and that you then use CONFIG.count and CONFIG.debug around your code.

Then you need to do the following:

  • abstract the API using a trait, and make the code use the trait (i.e. call the getters instead of direct field access):
// We need to abstract the behavior with a trait
trait IsConfig
{
   fn debug (&self) -> bool;
   fn count (&self) -> usize;
}

impl IsConfig for Config {
   #[inline] fn debug (&self) -> bool { self.debug }
   #[inline] fn count (&self) -> usize { self.count }
}

this way we get the following property:

// Code invariant:
// the identifier CONFIG is global
// and dereferences to something implementing
// the `IsConfig` API;
// e.g.
//  `CONFIG.debug()`
// and
//  `CONFIG.count()`
// can always be called.
  • the trick comes here:
    1. rename CONFIG in the lazy_static! definition to INITIAL_CONFIG (we will “undo” this with a clever [pub] use self::INITIAL_CONFIG as CONFIG)

    2. using conditional compilation, we apply that re-export only under #[cfg(not(test))], so nothing changes when not testing

    3. but now let’s do the magic for cfg(test). We want to use &'static dyn IsConfig. The layer of indirection allows us to repoint it at our mock IsConfig implementations. But to change it we need interior mutability. Luckily &_ is Copy, so we use Cell and thread_local! to get what we want. We call that static OVERRIDEN_CONFIG.

    4. what remains is the problem of ergonomics: instead of CONFIG.count() we need to use OVERRIDEN_CONFIG.with(Cell::get).count(). Such ergonomics can be fixed with a little Deref magic (called ConfigProxy):

lazy_static!{
    static ref INITIAL_CONFIG: Config = Config {
        count: 42,
        debug: false,
    };
}
cfg_if!(
if #[cfg(not(test))]
{
    // pub /* if needed */
    use self::INITIAL_CONFIG as CONFIG;
}
else
{
    struct ConfigUninit;
    impl IsConfig for ConfigUninit {
        fn debug (&self) -> bool { panic!("uninit") }
        fn count (&self) -> usize { panic!("uninit") }
    }

    use ::std::cell::Cell;
    thread_local!{
        static OVERRIDEN_CONFIG
            : Cell<&'static dyn IsConfig>
            = Cell::new(&ConfigUninit)
        ;
    }

    struct ConfigProxy;
    impl ::std::ops::Deref for ConfigProxy {
        type Target = dyn IsConfig;

        #[inline]
        fn deref (&self) -> &Self::Target {
            OVERRIDEN_CONFIG.with(Cell::get)
        }
    }

     // pub /* if needed */
    static CONFIG: ConfigProxy = ConfigProxy;
});

Et voilà!

fn main ()
{
    // [src/main.rs:82] CONFIG.count() = 42
    dbg!(CONFIG.count());
}
    #[test]
    fn with_debug_and_count_3 ()
    {
        let mut config = INITIAL_CONFIG.clone();
        config.debug = true;
        config.count = 3;
        OVERRIDEN_CONFIG.with(|slf|
            slf.set(Box::leak(Box::new(config)))
        );
        
        assert_eq!(
            CONFIG.count(),
            3,
        );
        // We leak mem::size_of::<Config>() bytes for each test;
        // We could use Box::from_raw + Cell::replace to fix that
    }
running 1 test
test tests::with_debug_and_count_3 ... ok

test result: ok. 1 passed; 0 failed;

EDIT: using this pattern only to modify attributes from a Cloneable Config struct is a little overkill. It does not actually require trait objects: we could have replaced every dyn IsConfig occurrence with Config (except for the initial value of the global pointer, which would have required some effort).

The good thing here, on the other hand, is that, by using trait objects / dynamic typing, we are really able to “override any method”: we just have to define our own MockConfig and then impl IsConfig for MockConfig { as we see fit.
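As a sketch of that last point, here is a stripped-down, self-contained variant of the test-side machinery with a mock installed. IsConfig, ConfigUninit and OVERRIDEN_CONFIG follow the example above, while MockConfig's behavior and the install / current helpers are made up for illustration:

```rust
use std::cell::Cell;

trait IsConfig {
    fn debug(&self) -> bool;
    fn count(&self) -> usize;
}

// Placeholder that panics if a test forgets to install a config.
struct ConfigUninit;

impl IsConfig for ConfigUninit {
    fn debug(&self) -> bool { panic!("uninit") }
    fn count(&self) -> usize { panic!("uninit") }
}

// Hypothetical mock: overrides every method however the test sees fit.
struct MockConfig;

impl IsConfig for MockConfig {
    fn debug(&self) -> bool { true }
    fn count(&self) -> usize { 3 }
}

thread_local! {
    // Per-thread pointer to whichever implementation is currently active.
    static OVERRIDEN_CONFIG: Cell<&'static dyn IsConfig> = Cell::new(&ConfigUninit);
}

/// Points the current thread's config at `config`.
fn install(config: &'static dyn IsConfig) {
    OVERRIDEN_CONFIG.with(|slot| slot.set(config));
}

/// Returns the config currently active on this thread.
fn current() -> &'static dyn IsConfig {
    OVERRIDEN_CONFIG.with(Cell::get)
}
```

Because MockConfig is a unit struct, install(&MockConfig) needs no Box::leak; a mock carrying data would still need a 'static reference, for example via a static or a leaked Box.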

#18

Thanks, this looks really good!
I will try to implement something like this.