Volatile + relaxed atomic load/store

I'd like a way to mark some memory accesses as both volatile and atomic (with relaxed ordering), but apparently I can only do either of the two. Is there a way to achieve this?


Some context. I have a rather large struct (~2 MiB) of u64-like fields representing registers of an FPGA. My software interacts with the FPGA firmware by reading and writing these fields. As far as this project goes, I'm only interested in the target aarch64-unknown-linux-gnu, so it's OK if a solution/workaround is only valid/sound on that platform.


The register reads and writes need to be volatile, as the compiler should not be allowed to e.g. optimize away a write that is never read, or merge two writes with the same value. So right now I perform the accesses using ptr::read_volatile() and ptr::write_volatile(). I actually use something heavily inspired by the crate volatile-register which, with some caveats (e.g. here and here), works just great.

For those unfamiliar, here is an excerpt of the code for a read-only and a write-only register:
use core::cell::UnsafeCell;
use core::ptr;

/// adapted from crate vcell
#[repr(transparent)]
pub struct VolatileCell {
    value: UnsafeCell<u64>,
}
impl VolatileCell {
    pub const fn new(value: u64) -> Self {
        VolatileCell {
            value: UnsafeCell::new(value),
        }
    }
    #[inline(always)]
    pub fn get(&self) -> u64 {
        unsafe { ptr::read_volatile(self.value.get()) }
    }
    #[inline(always)]
    pub fn set(&self, value: u64) {
        unsafe { ptr::write_volatile(self.value.get(), value) }
    }
}

/// Read-Only register (adapted from crate volatile-register)
#[repr(transparent)]
pub struct RO {
    register: VolatileCell,
}
impl RO {
    #[inline(always)]
    pub fn read(&self) -> u64 {
        self.register.get()
    }
}

/// Write-Only register
#[repr(transparent)]
pub struct WO {
    register: VolatileCell,
}
impl WO {
    #[inline(always)]
    pub fn write(&self, value: u64) {
        self.register.set(value)
    }
}


Now, I'd like multiple threads to be able to interact with these registers concurrently. The docs for ptr::write_volatile() have something to say about it:

Just like in C, whether an operation is volatile has no bearing whatsoever on questions involving concurrent access from multiple threads. Volatile accesses behave exactly like non-atomic accesses in that regard. In particular, a race between a write_volatile and any other operation (reading or writing) on the same location is undefined behavior.

And indeed Miri shows an error if I write to the same register from different threads. (To be honest, I don't really see how ptr::write_volatile() can be any different from a relaxed atomic store... But I accept there are things I don't yet understand!)

I could make the accesses atomic by basically replacing the VolatileCell with an AtomicCell, and then Miri accepts the code. (Note that I only need Relaxed memory ordering, I'm not interested in any stronger ordering.) But then the writes are non-volatile, and the compiler starts optimizing away reads and writes that it really shouldn't.

Here's the AtomicCell:
use core::sync::atomic::{AtomicU64, Ordering};

#[repr(transparent)]
pub struct AtomicCell {
    cell: AtomicU64,
}
impl AtomicCell {
    #[inline(always)]
    pub const fn new(value: u64) -> Self {
        AtomicCell {
            cell: AtomicU64::new(value),
        }
    }
    #[inline(always)]
    pub fn get(&self) -> u64 {
        self.cell.load(Ordering::Relaxed)
    }
    #[inline(always)]
    pub fn set(&self, value: u64) {
        self.cell.store(value, Ordering::Relaxed)
    }
}

// No manual Send/Sync impls are needed: AtomicU64 is already Send + Sync,
// so AtomicCell gets both automatically.

So the question is: is there anything I can do to tell the compiler I want these reads/writes to be both volatile and atomic with relaxed ordering?

For what it's worth, volatile and relaxed atomic accesses both compile down to the same asm, at least on the platform I'm interested in. So it should really be "just" a matter of telling the compiler how it's allowed to optimize my code.
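A minimal pair of functions that could be pasted into Compiler Explorer to reproduce this comparison. The function names are made up for illustration, and the snippet only makes the two forms easy to compare; it does not by itself prove that every target emits identical code:

```rust
use core::sync::atomic::{AtomicU64, Ordering};

/// Volatile store through a raw pointer (hypothetical helper name).
///
/// # Safety
/// `reg` must be valid for writes and properly aligned.
#[inline(never)]
pub unsafe fn store_volatile(reg: *mut u64, value: u64) {
    // The compiler must emit exactly this store; it may not drop or merge it.
    unsafe { reg.write_volatile(value) }
}

/// Relaxed atomic store (hypothetical helper name).
#[inline(never)]
pub fn store_relaxed(reg: &AtomicU64, value: u64) {
    // Race-free, but the compiler is allowed to treat it as an ordinary
    // atomic store on ordinary memory when optimizing.
    reg.store(value, Ordering::Relaxed)
}
```

The `#[inline(never)]` attributes are there only so that each function shows up as its own symbol in the assembly listing.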

(As a historical note, all atomic loads and stores used to be volatile too.)

I recommend that you use volatile operations for this. It's true that you can't use them in parallel on ordinary memory, but you are not working with ordinary memory: reads and writes to your registers are IO operations with side effects.

Is this UB? Well, I don't think the UB-rules about this kind of volatile access are set in stone. If you're not comfortable with that, then use inline assembly instead.

Honestly, UB-wise, I would be more worried about those caveats you mentioned that you are ignoring. You shouldn't create references to your FPGA registers, because references assume that they point at ordinary memory.

I recommend an approach like what I suggested in post 4 of your second link.

Do you mean one possible way is to use volatile-only operations (not atomic), put an unsafe impl Sync for VolatileCell {}, and assume this is not UB because I'm not using ordinary memory?

That could be fine for me. I've tested this, and it certainly seems to work fine (apart from Miri complaining about it). But "seems to work fine" is not always enough, hence my OP above.

I have thought about that, also due to this thread on IRLO, but I feel like I lack some basic understanding and don't see how that is supposed to help. As I showed in the Compiler Explorer link above, there is no difference in the assembly between a volatile write and a relaxed atomic store. Why does writing the assembly explicitly help? Also, how does inline asm affect inlining?

Thanks, I'll reconsider your approach from that thread. However, spurious reads are really not a big concern in my specific application, so that's lower priority (I did include those links in my OP as a reference for future readers). Also, I think that issue is somewhat orthogonal to the one in this topic: regardless of how I "structure" my struct (with or without references to the fields), at some point I'll obtain a pointer and I'd like to make volatile reads/writes concurrently.

The thing you're worried about is how the compiler will optimize code containing your writes. Inline assembly is as close to unoptimizable as you can get.
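As a rough sketch of what that could look like on aarch64 (the function names `reg_write` and `reg_read` are hypothetical, and the non-aarch64 fallback exists only so the snippet also builds on other targets):

```rust
/// Store `value` to `addr` with a single `str` instruction.
///
/// # Safety
/// `addr` must be a valid, 8-byte-aligned address that may be written.
#[cfg(target_arch = "aarch64")]
#[inline(always)]
pub unsafe fn reg_write(addr: *mut u64, value: u64) {
    // No `pure`, `nomem` or `readonly` option: the compiler has to assume
    // the block has side effects and touches memory, so it is kept as written.
    unsafe {
        core::arch::asm!(
            "str {val}, [{addr}]",
            addr = in(reg) addr,
            val = in(reg) value,
            options(nostack, preserves_flags),
        );
    }
}

/// Load a `u64` from `addr` with a single `ldr` instruction.
///
/// # Safety
/// `addr` must be a valid, 8-byte-aligned address that may be read.
#[cfg(target_arch = "aarch64")]
#[inline(always)]
pub unsafe fn reg_read(addr: *const u64) -> u64 {
    let value: u64;
    unsafe {
        core::arch::asm!(
            "ldr {val}, [{addr}]",
            addr = in(reg) addr,
            val = out(reg) value,
            options(nostack, preserves_flags),
        );
    }
    value
}

// Fallback for other targets (an assumption made for portability of this
// sketch only): plain volatile accesses.
#[cfg(not(target_arch = "aarch64"))]
#[inline(always)]
pub unsafe fn reg_write(addr: *mut u64, value: u64) {
    unsafe { addr.write_volatile(value) }
}

#[cfg(not(target_arch = "aarch64"))]
#[inline(always)]
pub unsafe fn reg_read(addr: *const u64) -> u64 {
    unsafe { addr.read_volatile() }
}
```

Both functions take raw pointers rather than references, so nothing here tells the compiler that the address is ordinary memory.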

I don't know. Try it.

I don't think it's orthogonal. A special hardware memory location that's only ever accessed in a volatile manner is, in my book, quite different from a memory location that also has non-volatile accesses. (And creating a reference to it is essentially that.)
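To illustrate the distinction, here is a minimal sketch of a register block that is only ever touched through raw pointers and volatile accesses. The two-field layout and the names `RegisterBlock`, `Fpga`, `status` and `control` are invented for the example and are not taken from the linked thread:

```rust
use core::ptr::{addr_of, addr_of_mut};

/// Hypothetical register layout; the real one would mirror the FPGA map.
#[repr(C)]
pub struct RegisterBlock {
    pub status: u64,
    pub control: u64,
}

/// Handle that stores only the base address, never a `&RegisterBlock`.
#[derive(Clone, Copy)]
pub struct Fpga {
    base: *mut RegisterBlock,
}

impl Fpga {
    /// # Safety
    /// `base` must point to a mapped, properly aligned register block that
    /// stays valid for as long as the handle (or any copy of it) is used.
    pub unsafe fn new(base: *mut RegisterBlock) -> Self {
        Fpga { base }
    }

    /// Volatile read of the `status` field.
    pub fn read_status(&self) -> u64 {
        // `addr_of!` computes the field address without creating a reference.
        unsafe { addr_of!((*self.base).status).read_volatile() }
    }

    /// Volatile write of the `control` field.
    pub fn write_control(&self, value: u64) {
        unsafe { addr_of_mut!((*self.base).control).write_volatile(value) }
    }
}
```

Because no `&RegisterBlock` or `&u64` is ever formed, the compiler is never told that the location is dereferenceable ordinary memory, so it has no licence to insert extra (spurious) reads.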

Yes, I understand and in general I agree. My point is that even if I structure my code like you suggested in the other thread thus avoiding ever creating a reference (which I'll try to do at some point!), the question of this thread still applies: how to get atomic semantics on top of volatile semantics?

OK, makes sense. I'll try that, thanks!

Sorry for pressing, let me ask this again to make sure I understood you correctly:

Rust does not have volatile atomics. My suggestion is that you use the volatile operations as-if they are atomic, and do nothing else beyond that.

Yes, but I would not do so by writing unsafe impl Sync for VolatileCell {}. I don't recommend using VolatileCell in the first place, because it creates references to the memory.

The point is that you don't want to use &VolatileCell because that would imply you're using regular memory. Instead you want something like VolatileCellRef which stores a raw pointer. Then that could implement Sync or Send and Copy.
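A minimal sketch of such a type, assuming a single 64-bit register per handle. Only the name `VolatileCellRef` comes from the post above; the methods and the `Send`/`Sync` impls are one possible shape, and their soundness rests on the assumption made in this thread that concurrent volatile accesses to the device are acceptable:

```rust
/// Handle to one 64-bit register. Holds a raw pointer, never a reference.
#[derive(Clone, Copy)]
pub struct VolatileCellRef {
    ptr: *mut u64,
}

// Raw pointers are neither Send nor Sync by default. Opting in here encodes
// the assumption that the target is device memory for which concurrent
// volatile accesses are acceptable; it is NOT justified for ordinary memory.
unsafe impl Send for VolatileCellRef {}
unsafe impl Sync for VolatileCellRef {}

impl VolatileCellRef {
    /// # Safety
    /// `ptr` must be valid and 8-byte aligned for as long as the handle
    /// (or any copy of it) is used.
    pub const unsafe fn new(ptr: *mut u64) -> Self {
        VolatileCellRef { ptr }
    }

    /// Volatile read of the register.
    #[inline(always)]
    pub fn get(self) -> u64 {
        unsafe { self.ptr.read_volatile() }
    }

    /// Volatile write to the register.
    #[inline(always)]
    pub fn set(self, value: u64) {
        unsafe { self.ptr.write_volatile(value) }
    }
}
```

Since the handle is `Copy`, each thread can simply keep its own copy; no `&VolatileCellRef` pointing into the register space is needed.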

Alright, thank you both @alice and @SkiFire13!

I think I finally understand, thank you for your patience. :slight_smile: It finally clicked that avoiding the reference and this whole volatile/atomic question are closely related.
