Raw pointer to static item

Hi all,

I'm writing an emulator, and I'm wondering if there's any way to get a raw pointer to a static item at compile time. The code below errors:

pub static mut registers: [u32; 16] = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0];
pub const reg32: *mut u32 = &mut registers as *mut _ as *mut u32;

I've tried const fns, and that doesn't work either.

And before you ask, yes, this does have a noticeable impact on performance. Having a pointer that evaluates to a constant value at runtime makes things 20-30% faster, at least in my benchmarks. Using a raw pointer also avoids the overhead of bounds checking -- all indexes into this array are masked (e.g. reg & 7) at some point.

You don't need raw pointers for this, you can use get_unchecked.

I could be wrong, but I don't think this is possible. Constant evaluation wraps up well before the part of compilation where the address of a static will become known. You might have to use a lazily-initialized static instead of a constant.

Will that affect performance? The whole purpose of having a constant is to inline the value into the expression, eliminating the need for a lookup (the compiler knows exactly where the reg32 array is going to be). But if Rust inlines the address of the static somehow (?), then I'm happy with that too.

Does this support unions? I have three overlapping register files (8-bit, 16-bit, 32-bit), and it's a lot easier (and faster) for me to do reg8[1] instead of reg32[0] >> 8 & 255.

You can call [T]::get_unchecked anywhere you can call <[T] as Index>::index.
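As a minimal sketch of what that swap looks like with a masked index: the helper names read_reg_checked and read_reg_unchecked are made up for illustration, and the array is passed by reference here instead of living in the static from the original post.

```rust
/// Hypothetical helper: ordinary indexing. This goes through the
/// bounds-checked Index impl, although with a mask like this the
/// optimizer can often remove the check on its own.
pub fn read_reg_checked(regs: &[u32; 16], reg: usize) -> u32 {
    regs[reg & 15]
}

/// Hypothetical helper: the same read, with get_unchecked in place of indexing.
pub fn read_reg_unchecked(regs: &[u32; 16], reg: usize) -> u32 {
    // SAFETY: `reg & 15` is always in 0..=15, which is in bounds
    // for a 16-element array.
    unsafe { *regs.get_unchecked(reg & 15) }
}
```

Both functions return the same value for any reg; the only difference is that the second one promises the compiler the index is in bounds instead of letting it check.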

I think the simplest thing to do, which should have no overhead, is to just write unsafe { &mut registers as *mut _ as *mut u32 } whenever you were planning to write reg32, or assign that expression to a local variable and pass it to any functions that need it. The casting has no run-time cost.
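A rough sketch of the "take the pointer once and pass it around" idea. It swaps in std::ptr::addr_of_mut! for the `&mut registers as *mut _` cast so that no reference to the static is created; the upper-case REGISTERS name and the helpers reg32_ptr, write_reg and read_reg are assumptions made for this example, not something from the original code.

```rust
use std::ptr::addr_of_mut;

// Assumption: same shape as the static in the original post, renamed to
// upper case to follow the usual naming lint.
pub static mut REGISTERS: [u32; 16] = [0; 16];

/// Hypothetical helper: base pointer to the register file as *mut u32.
pub fn reg32_ptr() -> *mut u32 {
    // addr_of_mut! takes the address without creating a reference.
    // The cast from *mut [u32; 16] to *mut u32 has no run-time cost.
    // Older compilers require the unsafe block here; newer ones may
    // report it as unnecessary.
    unsafe { addr_of_mut!(REGISTERS).cast::<u32>() }
}

/// Hypothetical helper: masked write through the base pointer.
///
/// # Safety
/// `base` must point to at least 16 valid, writable u32 values.
pub unsafe fn write_reg(base: *mut u32, reg: usize, value: u32) {
    // The mask keeps the offset inside the 16-element register file.
    unsafe { *base.add(reg & 15) = value }
}

/// Hypothetical helper: masked read through the base pointer.
///
/// # Safety
/// `base` must point to at least 16 valid u32 values.
pub unsafe fn read_reg(base: *mut u32, reg: usize) -> u32 {
    unsafe { *base.add(reg & 15) }
}
```

A caller would do `let base = reg32_ptr();` once and then hand base to write_reg and read_reg inside an unsafe block.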

Is there any way to let the compiler know that the register array is going to be at a constant location? I tried the following:

pub union Registers {
    reg32: [u32; 16],
    reg16: [u16; 32],
    reg8: [u8; 64],
}
pub static mut registers: Registers = Registers {
    reg32: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
};

unsafe { *(&mut registers as *mut _ as *mut u32).offset(6) }
// mov     rax, qword ptr [rip + example::registers@GOTPCREL]
// mov     eax, dword ptr [rax + 24]

In this case, the first load is unnecessary, since the address of the registers array never changes (for C, Clang generates a single move, mov eax, dword ptr [rip + registers + 24]).

The one below does exactly what I need, except I don't want to hardcode all the addresses.

pub const reg32: *mut u32 = 65536 as *mut u32;

unsafe { *reg32.offset(6) }
// mov eax, dword ptr [65560]

By the way, thanks for all the help!

I got error[E0013]: constants cannot refer to statics and error[E0764]: mutable references are not allowed in constants, when I tried this.

This doesn't compile: playground.

I've been playing around with this for a while, but there's one more wrinkle:

pub union Registers {
    reg32: [u32; 16],
    reg16: [u16; 32],
    reg8: [u8; 64],
}
pub static mut registers: Registers = Registers {
    reg32: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
};

fn test() -> u32 {
    unsafe { *(&mut registers as *mut _ as *mut u32).offset(6) }
}

It produces two memory accesses (one for the registers static item, the other for the actual read itself):

        mov     rax, qword ptr [rip + example::registers@GOTPCREL]
        mov     eax, dword ptr [rax + 24]
        ret

Is there any way to tell the compiler that the address of registers is fixed and isn't going to change? That would reduce this to one read operation. Getting rid of mut makes the entire union read-only, unfortunately.

As @jethrogb said, you can use (ordinary union access and) get_unchecked:

fn test() -> u32 {
    unsafe { *registers.reg32.get_unchecked(6) }
}

I can't imagine that does more than one memory access.

Edit: hmm, looking at the generated assembly I see it does the same thing as you saw; I don't know why. I'd still recommend this version over the one with casts.

Is there a way to set values? I don't see a set_unchecked method.

The double indirection happens because you can reassign the registers variable:

    unsafe {
        registers = Registers {
            reg32: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        };
    }

I'm looking for a way to prevent this, making the registers global itself immutable but the values that it points to mutable. That'd let the Rust compiler optimize out one of the two loads. Is this possible?

Yes, call get_unchecked_mut to get a mutable reference, then use an ordinary assignment.
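For example, a small sketch of a write through one view of the union and a read through another. The helpers set_reg8 and get_reg32 are hypothetical, the fields are made pub for the example, and #[repr(C)] is an addition here (the original union doesn't have it) so that all fields are guaranteed to start at offset 0.

```rust
// Assumption: #[repr(C)] is added so every field starts at offset 0.
#[repr(C)]
pub union Registers {
    pub reg32: [u32; 16],
    pub reg16: [u16; 32],
    pub reg8: [u8; 64],
}

/// Hypothetical helper: write one 8-bit register.
pub fn set_reg8(regs: &mut Registers, reg: usize, value: u8) {
    // SAFETY: `reg & 63` is always in 0..=63, in bounds for reg8, and
    // every bit pattern is valid for the integer arrays in this union.
    unsafe {
        // get_unchecked_mut returns &mut u8; assign through it as usual.
        *regs.reg8.get_unchecked_mut(reg & 63) = value;
    }
}

/// Hypothetical helper: read one 32-bit register.
pub fn get_reg32(regs: &Registers, reg: usize) -> u32 {
    // SAFETY: `reg & 15` is always in 0..=15, in bounds for reg32.
    unsafe { *regs.reg32.get_unchecked(reg & 15) }
}
```

On a little-endian target, writing reg8[1] changes bits 8..16 of reg32[0], which is the overlap the emulator relies on.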

That doesn't change the address of the variable, though. I think the real problem is that the address isn't a compile-time constant. It can't change within a single execution of your program, but it may vary from one execution to the next.

Is there a way to make it a compile-time constant?

I don't think the compiler can know in advance where in memory the OS will load your program.

So I'm guessing that there's no way for Rust to get a constant reference to anything on the heap, at least statically. That's a bummer.

Thanks for all the help, though!

Static variables aren't heap-allocated, but I think the point stands.

Even though the address may be determined at launch time, the offset is a link-time constant and this should all be fixable using relative addressing and/or relocations. Nonetheless, there's no reason this would work better/faster with raw pointers vs. references.

pub static mut registers: [u32; 16] = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0];

pub unsafe fn set_to_one(index: usize) {
    *registers.get_unchecked_mut(index) = 1;
}

This just compiles to a mov to load the relative address and another to set the value. Pretty sure that's optimal.
