I'd like a way to mark some memory accesses as both volatile and atomic (with relaxed ordering), but apparently I can only pick one of the two. Is there a way to achieve this?
Some context: I have a rather large struct (~2 MiB) of `u64`-like fields representing registers of an FPGA. My software interacts with the FPGA firmware by reading and writing these fields. As far as this project goes, I'm only interested in the target `aarch64-unknown-linux-gnu`, so it's OK if a solution/workaround is only valid/sound on that platform.
The register reads and writes need to be volatile, as the compiler should not be allowed to e.g. optimize away a write that is never read, or merge two writes of the same value. So right now I perform the accesses using `ptr::read_volatile()` and `ptr::write_volatile()`. I actually use something heavily inspired by the crate `volatile-register` which, with some caveats (e.g. here and here), works just great.
For those unfamiliar, here's an excerpt with a read-only and a write-only register:
```rust
use core::cell::UnsafeCell;
use core::ptr;

/// adapted from crate vcell
#[repr(transparent)]
pub struct VolatileCell {
    value: UnsafeCell<u64>,
}

impl VolatileCell {
    pub const fn new(value: u64) -> Self {
        VolatileCell {
            value: UnsafeCell::new(value),
        }
    }

    #[inline(always)]
    pub fn get(&self) -> u64 {
        unsafe { ptr::read_volatile(self.value.get()) }
    }

    #[inline(always)]
    pub fn set(&self, value: u64) {
        unsafe { ptr::write_volatile(self.value.get(), value) }
    }
}

/// adapted from crate volatile-register
/// Read-Only register
#[repr(transparent)]
pub struct RO {
    register: VolatileCell,
}

impl RO {
    #[inline(always)]
    pub fn read(&self) -> u64 {
        self.register.get()
    }
}

/// Write-Only register
#[repr(transparent)]
pub struct WO {
    register: VolatileCell,
}

impl WO {
    #[inline(always)]
    pub fn write(&self, value: u64) {
        self.register.set(value)
    }
}
```
Now, I'd like multiple threads to be able to interact with these registers concurrently. The docs for `ptr::write_volatile()` have something to say about it:

> Just like in C, whether an operation is volatile has no bearing whatsoever on questions involving concurrent access from multiple threads. Volatile accesses behave exactly like non-atomic accesses in that regard. In particular, a race between a `write_volatile` and any other operation (reading or writing) on the same location is undefined behavior.
And indeed Miri shows an error if I write to the same register from different threads. (To be honest, I don't really see how `ptr::write_volatile()` can be any different from a relaxed atomic store... But I accept there are things I don't yet understand!)
I could make the accesses atomic by basically replacing the `VolatileCell` with an `AtomicCell`, and then Miri accepts the code. (Note that I only need `Relaxed` memory ordering; I'm not interested in any stronger ordering.) But then the writes are non-volatile, and the compiler starts optimizing away reads and writes that it really shouldn't.
Here's the `AtomicCell`:
```rust
use core::sync::atomic::{AtomicU64, Ordering};

#[repr(transparent)]
pub struct AtomicCell {
    cell: AtomicU64,
}

impl AtomicCell {
    #[inline(always)]
    pub const fn new(value: u64) -> Self {
        AtomicCell {
            cell: AtomicU64::new(value),
        }
    }

    #[inline(always)]
    pub fn get(&self) -> u64 {
        self.cell.load(Ordering::Relaxed)
    }

    #[inline(always)]
    pub fn set(&self, value: u64) {
        self.cell.store(value, Ordering::Relaxed)
    }
}

// Note: these impls are redundant; `AtomicU64` is already `Send + Sync`,
// so `AtomicCell` gets both automatically.
unsafe impl Send for AtomicCell {}
unsafe impl Sync for AtomicCell {}
```
So the question is: is there anything I can do to tell the compiler I want these reads/writes to be both volatile and atomic with relaxed ordering?
For what it's worth, volatile and relaxed atomic accesses both compile down to the same asm, at least on the platform I'm interested in. So it should really be "just" a matter of telling the compiler how it's allowed to optimize my code.
(As a historical note, all atomic loads and stores used to be volatile too.)