[Kernel] `static` written once and read without `unsafe`

I'm developing a hobby operating system in Rust and stumbled upon a problem regarding statics (global variables). I have this static:

pub static mut VA_SIZE: usize = 0;

It contains the size of the virtual address space; this size cannot be known at compile-time and can only be figured out at boot time depending on the amount of physical memory available.

This static is set once in the early boot process through an unsafe:

unsafe {
    VA_SIZE = VA_BASE + min(PHYS_MEM_SIZE, LOWMEM_SIZE as u64) as usize;
}

In this execution context, SMP is disabled so only one logic CPU is running thus we have exclusive access to the static.

The problem is that trying to read the static afterwards is considered unsafe (which I didn't know and surprised me, I thought only writing was unsafe); while it is perfectly fine since no once mutates it. I don't want to wrap all readings by unsafe.

I could write a function such as:

pub fn get_va_size() -> usize {
    unsafe {
        VA_SIZE
    }
}

But would prefer to avoid it and access directly the variable.

I thought of creating a non-mutable static, getting its address, casting it as a mutable pointer and use it to set the value. But I don't known the consequences of doing that (certainly UB, but in practice, what could realistically happen?).

Try to avoid static mut; there's a chance it will be deprecated in the future. Instead, I would suggest using a wrapper like this:

pub struct VaSize(UnsafeCell<MaybeUninit<usize>>);

unsafe impl Sync for VaSize {}

impl Deref for VaSize {
    type Target = usize;
    fn deref(&self) -> &Self::Target {
        unsafe { &*self.0.get().cast::<usize>() }
    }
}

pub static VA_SIZE: VaSize = VaSize(UnsafeCell::new(MaybeUninit::uninit()));

// Set the value like so:
unsafe { *VA_SIZE.0.get() = MaybeUninit::new(va_size) };

And due to the Deref impl it can be accessed as a usize like any other most of the time.

Don't ever use this logic; if something's UB, you mustn't use it. Even if it works in your current situation, there is absolutely no guarantee it will work in any other (e.g. different platform, different rustc version, different invocation of rustc, different code).

5 Likes

This is an interesting alternative to static mut indeed, and also ergonomic for the reader thanks to Deref.

This is quite verbose though, but I think I still prefer it to the get_va_size() function!

Is the MaybeUninit really necessary?

The MaybeUninit isn't actually necessary, but there are a couple benefits to using it:

  • It more accurately represents the intention of your code - there's no magic number or anything.
  • It might allow for better optimizations by the compiler.
  • Tools like Miri will be able to detect access of an uninitialized value and throw an error, whereas Miri will be totally fine with you reading a zero value (which you don't want to do).
2 Likes

Why? What's wrong with the function?

Anyway, you could also just put it behind an atomic, which obviates the need for mut, unsafe, MaybeUninit, etc. and gives you a safe and easy way to access your global (playground):

use std::sync::atomic::{ Ordering, AtomicUsize };

static VA_SIZE: AtomicUsize = AtomicUsize::new(0);

fn set_va_size(new_size: usize) {
    VA_SIZE.store(new_size, Ordering::SeqCst);
}

fn get_va_size() -> usize {
    VA_SIZE.load(Ordering::SeqCst)
}
6 Likes

It's only a matter of elegance, I don't have strong opinion on that; using a function would satisfy me, I was more curious about the alternatives (if any).

About the atomic, I don't think this is an optimal solution since there is no need for atomics: one writer with exclusive access, and, after that, no writers and only readers. Also, SeqCst seems needlessly restrictive in this execution context. Or am I overlooking something?

You didn't ask for an optimal solution, but I highly doubt using a single atomic would have a noticeable performance impact anyway. Using an atomic has no additional cost on some common architectures, most notably on x86/x64. If you look at the generated assembly, you can see that the sequentially consistent load generates a mov, just like a plain old unsynchronized read would.

As a principle, I always start with SeqCst unless I surely know that something less conservative is appropriate. It's the easiest way to be sure. And, again, since you would only write this atomic once, there shouldn't be a problem with speed. It's much easier to start optimizing correct code when (or if) it matters than it is to try to debug a hard-to-reproduce race condition, especially inside a kernel.

4 Likes

Relaxed should be fine here. Sequential consistency is not needed since there's only one atomic variable, and Acquire/Release are only needed when there is shared mutability present. Using Relaxed also allows for loads and stores to be identical to their unsynchronized counterparts on both x86 and ARM.

7 Likes

Note that the way VaSize is written right now is unsound, it allows safe code to read from an unitialized value. You should either make the creation of a VaSize unsafe or not use MaybeUninit and force to give the usize a default value.

4 Likes

Yeah, use an atomic. Never use static mut if you can help it. (Or alternatively, you could use Once. Something like this:

use spin::Once;

static VA_SIZE: Once<usize> = Once::new();

// Initialize
VA_SIZE.call_once(|| VA_BASE + min(PHYS_MEM_SIZE, LOWMEM_SIZE as u64) as usize);

Then if you implement threads later, and they run at the same time this is computed, call .wait() on it and they'll block until the once has been initialized.

1 Like