Wrapping peripherals into a singleton

I've read the Embedded Rust Book and am trying to apply what's written, but I don't understand how to define peripherals as a singleton.

Following Chapter 3.1, I came up with the following solution for defining peripherals on the SoC:

use crate::mmio::{R,RW};

const AUX_BASE_ADDR: usize = 0xfe000000;

#[repr(u32)]
pub enum AuxPeripheral {
    MiniUART = 1,
    SPI1 = 2,
    SPI2 = 4
}

#[repr(C)]
struct Registers {
    pub aux_irq: R<u32>,    // read-only memory-mapped register
    pub aux_enb: RW<u32>    // read-write memory-mapped register
}

pub struct Aux {
    p: &'static mut Registers
}

impl Aux {
    pub fn new() -> Aux {
        Aux {
            p: unsafe {
                &mut *(AUX_BASE_ADDR as *mut Registers)
            }
        }
    }

    pub fn enable(&mut self, p: AuxPeripheral) {
        self.p.aux_enb.write(self.p.aux_enb.read() | p as u32)
    }
}

I call it from my kernel::main():

pub fn main() {
    let mut aux = Aux::new();
    aux.enable(AuxPeripheral::MiniUART);
}

But how can I wrap all of it in a singleton? The code snippet from Chapter 3.3 doesn't show where the initialization itself is done:

struct Peripherals {
    serial: Option<SerialPort>,
}
impl Peripherals {
    fn take_serial(&mut self) -> SerialPort {
        let p = replace(&mut self.serial, None);
        p.unwrap()
    }
}
static mut PERIPHERALS: Peripherals = Peripherals {
    serial: Some(SerialPort),
};

It's probably something obvious, but maybe it's the unexpected switch from using Timer in one chapter to SerialPort in another that confuses me.

I understand what the singleton does: it replaces the reference to the peripheral with None. But how do I obtain the reference in the first place? Where do I call Aux::new()?

static mut PERIPHERALS: Peripherals = Peripherals {
    serial: Some(SerialPort),
};

This is that singleton; SerialPort is just a placeholder for anything you should only have one of.

So, do I just need to do something like

static mut PERIPHERALS: Peripherals = Peripherals {
    serial: Some(Aux::new()),
};

?

Yes, and never use Aux::new anywhere else. Just use that singleton.

Unfortunately, it doesn't work that way.
With what I have in my module (the first code block in the first post), I tried to do:

use core::ptr::replace;
use crate::soc::auxiliaries::{Aux, AuxPeripheral};


struct Peripherals {
    aux: Option<Aux>
}

impl Peripherals {
    fn take_aux(&mut self) -> Aux {
        unsafe {
            let p = replace(&mut self.aux, None);
            p.unwrap()
        }
    }
}

static mut PERIPHERALS: Peripherals = Peripherals {
    aux: Some(Aux::new())
};

pub fn main() {
    // let mut aux = Aux::new();
    // aux.enable(AuxPeripheral::MiniUART);

    let mut aux = unsafe { PERIPHERALS.take_aux() };
    aux.enable(AuxPeripheral::MiniUART);
}

I was getting an error, error[E0015]: calls in statics are limited to constant functions, tuple structs and tuple variants, so I tried to make my Aux::new() function const. That way I get even more errors:

error[E0017]: references in constant functions may only refer to immutable values
  --> src/soc/auxiliaries.rs:26:17
   |
26 |                 &mut *(AUX_BASE_ADDR as *mut Registers)
   |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ constant functions require immutable values

error[E0019]: constant function contains unimplemented expression type
  --> src/soc/auxiliaries.rs:26:17
   |
26 |                 &mut *(AUX_BASE_ADDR as *mut Registers)
   |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This takes me further away from the original example.

:thinking: the documentation at https://rust-embedded.github.io/book/peripherals/a-first-attempt.html shows unsound code: get_systick() or SystemTimer::new need to be marked unsafe fn!

A sound pattern applied to your example would be something like this:

/// SAFETY BOUNDARY
/// It must not be possible to trigger UB with safe code outside this
/// module
mod library {
    #[repr(u32)]
    pub
    enum AuxPeripheral {
        MiniUART = 1,
        SPI1 = 2,
        SPI2 = 4
    }
    
    mmio_struct! {
        #[repr(C)]
        pub
        struct Registers {
            /// read-only memory-mapped register
            pub aux_irq: u32,
            
            /// read-write memory-mapped register
            pub aux_enb: u32,
        }
    }

    const AUX_BASE_ADDR: *mut Registers =
        0xFE000000_usize
    as _;

    pub
    struct Aux {
        registers: &'static mut Registers,
    }
    
    use ::core::sync::atomic::{
        AtomicBool,
        Ordering,
    };

    static AUX_EXISTS: AtomicBool = AtomicBool::new(false);
    
    impl Aux {
        pub
        fn try_new () -> Option<Self>
        {
            if AUX_EXISTS.swap(true, Ordering::AcqRel) {
                None
            } else {
                let registers = unsafe {
                    &mut *AUX_BASE_ADDR
                };
                Some(Aux { registers })
            }
        }
    
        pub
        fn enable (self: &'_ mut Self, p: AuxPeripheral)
        {
            let aux_enb = self.registers.aux_enb();
            aux_enb.write(aux_enb.read() | p as u32);
        }
    }
    
    impl Drop for Aux {
        fn drop (self: &'_ mut Self)
        {
            AUX_EXISTS.store(false, Ordering::Release);
        }
    }
}

Thank you very much, @Yandros, your snippet looks pretty clear to me (although I haven't gone far with atomics and the Drop trait yet), except for the AUX_BASE_ADDR declaration:

const AUX_BASE_ADDR: *mut Registers =
    0xFE000000_usize
as _;

I haven't seen such constructs before. If I understand it correctly, as _ means the type doesn't matter. But why? When does this come into play? Can't we just use as usize, since it's a memory address and nothing else? Also, I haven't seen numbers written in number_type notation before. Is that syntax for defining a literal of a particular type?

as casts things around according to some rules; in this case you can cast a usize to a thin pointer. You can also cast between integer types and, in some cases, enums.

You can postfix any numeric literal with the numeric primitive type that you want it to be. In this case, we want it to be a usize. Underscores are valid in numeric literals, and don't change the value of the literal. They only exist for readability.


@Yandros, in the example you provided you suggest using an atomic to store the peripheral state. But you also use volatile reads/writes to access the memory-mapped registers, which are not thread safe. The more I read about volatile and atomic accesses, the more confused I get.

Let's take an example: the program starts on four cores and initializes four GPIO pins 0 to 3, each pin on its own core. As I understand it, I should use atomic access to the memory-mapped register for pin initialization. Or am I wrong? Why don't people normally use atomics to access MMIO?

Good question! Volatile and atomic have different purposes:

Atomics

  • Make read-write operations on some integer-sized part of memory happen atomically: e.g., in the case of an atomic write, the memory goes from holding the value before the write straight to holding the value that has been written, with no intermediate state whatsoever.

    This is what prevents multiple CPUs that interact with the same part of memory in parallel from observing weird intermediate states (which is what can happen when the read-writes are not atomic, resulting in a data race and undefined behavior).

    Atomic operations provide even stronger tools, since you can usually even perform read-and-update operations atomically, such as incrementing a value.

    • Counter-example: If we didn't have these read-and-update atomic primitives, and just had plain old atomic read and plain old atomic write, we would not be able to increment a value without suffering from a race condition:

      use ::crossbeam::atomic::AtomicCell;
      
      fn bad_increment (x: &'_ AtomicCell<u32>)
      {
          x.store(x.load().wrapping_add(1));
      }
      

      Indeed, this code, despite not being racy at the hardware level (it is not UB), still suffers in practice from a race condition: by the time x.load().wrapping_add(1) is stored, another thread may have updated x, making the value read by x.load() obsolete.

  • If this were all there was to atomics, they would still be a poor primitive, since we would not be able to build higher-level constructs such as locks / mutexes on top of them. For atomic operations to be used to synchronize non-atomic mutation elsewhere in the code (which is exactly the intended use of a spin-lock), they must not be reordered nor elided by the compiler.

    For instance, the following function is unsound:

    fn bad_increment (x: &'_ NonUniqueMut<u32>)
    {
        static IS_LOCKED: AtomicBool = AtomicBool::new(false);
    
        while IS_LOCKED.swap(true, Ordering::Relaxed) {
            ::std::thread::yield_now();
        }
        let at_x: &mut u32 = unsafe {
            x.assume_unique_unchecked()
        };
        *at_x = at_x.wrapping_add(1);
        IS_LOCKED.store(false, Ordering::Relaxed);
    }
    
    • Playground (Note: given that the playground runs on a x86 architecture, the atomics in practice perform SeqCst operations instead of being truly Relaxed, so there is no apparent issue in the playground).

    Indeed, if this function ran on a single thread, then even if the mutation of x were reordered outside the IS_LOCKED scoping logic, the results / semantics of the code would be the same; the change would not be observable. And this is the kind of reasoning a compiler may use to reorder the code into a more efficient form (note: on a single thread, the logic involving IS_LOCKED could thus maybe go as far as to be elided / removed from the binary).

    So one of the main strengths of atomics is that they offer non-Relaxed memory Orderings, which make it possible to synchronize non-atomic memory accesses.

TL;DR

Atomics are used to avoid data races on memory accessed in parallel, and to enable higher-level synchronization primitives.

Volatile memory / accesses

The purpose of volatile accesses is to express to the compiler that memory marked volatile is not "regular" memory and that:

  • the contents of the memory may spuriously change at any moment (hence the name volatile). Thus the compiler is not allowed to assume / anticipate the value it is going to read (since if it did, it could elide the read).

    • Example: the compiler will not optimize away *x == *x into true when x points to volatile memory, since by the time the value *x is queried a second time it may have changed.
  • writing to the memory may have side effects, so the compiler is not allowed to elide / coalesce writes to that memory.

    • Example: the compiler will not optimize away *x = 42; *x = 0; into *x = 0;

MMIO

I don't know much about MMIO, but it seems to me that it is a "hack" to perform IO operations under the illusion of manipulating memory: bytes are read from input by reading a special address in memory, and bytes are output by writing to a special address in memory. These are exactly the semantics that volatile memory offers. So do use volatile accesses for MMIO.

  • Warning: Rust currently has no way to wrap memory as volatile, due to how the transitivity of normal references works. This means that any MMIO crate offering a VolatileCell<T> wrapper with a read_volatile(&self) -> T API is unsound w.r.t. volatile semantics.

    Only raw pointers can be "tagged" / wrapped in such a way that they perform volatile read/write on the memory they "raw-point" to; hence my example in the previous code.

What about data races on MMIO?

was your question. Indeed, the documentation for write_volatile() clearly states that whether an operation is volatile has no bearing on questions involving concurrent access from multiple threads: volatile accesses behave like non-atomic accesses in that regard.

Since Rust does not offer an atomic_volatile_write as of yet, and unless
atomic ⟹ volatile (I have not found anything in Rust suggesting it, despite how convenient that would be),
you have no choice but to protect / synchronize the volatile accesses with a lock:

use ::core::marker::PhantomData;
use ::std::sync::RwLock;

pub
struct VolatileSyncRef<'a, T : Copy + 'a> {
    ptr: RwLock<*mut T>, // this could be RwLock<NonNull<T>> as an optimization
    _lifetime: PhantomData<&'a ()>,
}

impl<'a, T : Copy + 'a> VolatileSyncRef<'a, T> {
    pub
    fn new (ptr: &'a mut T)
      -> Self
    {
        Self { ptr: RwLock::new(ptr), _lifetime: PhantomData }
    }

    pub
    fn read (self: &'_ Self)
      -> T
    {
        let lock_guard = self.ptr.read().expect("Poisoned");
        let ptr: *mut T = *lock_guard;
        unsafe { ::core::ptr::read_volatile(ptr) }
    }

    // Do not use the lock if the value is not shared
    pub
    fn read_unique (self: &'_ mut Self)
      -> T
    {
        let ptr: *mut T = *self.ptr.get_mut().expect("Poisoned");
        unsafe { ::core::ptr::read_volatile(ptr) }
    }

    pub
    fn write (self: &'_ Self, value: T)
    {
        let lock_guard = self.ptr.write().expect("Poisoned");
        let ptr: *mut T = *lock_guard;
        unsafe { ::core::ptr::write_volatile(ptr, value) }
    }

    // Do not use the lock if the value is not shared
    pub
    fn write_unique (self: &'_ mut Self, value: T)
    {
        let ptr: *mut T = *self.ptr.get_mut().expect("Poisoned");
        unsafe { ::core::ptr::write_volatile(ptr, value) }
    }
}

/// # Safety
///
///   - shared read-writes are synchronized thanks to `RwLock`
unsafe impl<T : Copy> Sync for VolatileSyncRef<'_, T> {}

I would not call MMIO a "hack". Many processor architectures have been doing this ever since there were computers.

The alternative is for the instruction set architecture to have special input/output instructions and a separate "name space" for the IO devices. This does not sit well with the modern idea of RISC processor design; it complicates things unnecessarily.

Think of it like this: as far as the CPU is concerned, the memory address space is input/output. Be it flash ROM, dynamic or static RAM, or whatever memory. Or input/output devices. It's a beautifully regular and elegant interface to everything outside the CPU.

The processor address space is polymorphic and it's all done with hardware traits! :slight_smile:


I should have said "a clever hack" :slight_smile: (based indeed on using memory address space as an abstraction)

As a former designer of both processor and memory architectures, I have to agree with both @ZiCog and @Yandros: MMIO really is a clever hack that permits a processor to see an apparently unified address space. It makes the instruction set simpler and more uniform by shifting some of the burden to the memory subsystem.

Modern memory architectures with multi-level caches have to make special provisions for volatile "memory" (e.g., I/O) so that accesses reach the I/O hardware at the time of data fetch/store, rather than getting caught up in cache lines via prefetch and deferred store.
