Separating Concerns and Lifetimes

I'm attempting to write a Gameboy emulator, and I'm trying to keep hardware components as isolated as possible. In other words, the CPU, PPU (aka GPU), and the MMU (mem-map unit) are each their own concepts without knowing about each other. If any of these components needs something from the outside world, it invokes a callback/closure that the outside world can subscribe to and fulfill. Then, I have the DMG, which is the motherboard component: it acts as a higher-order component by owning instances of each of these smaller components, just like a physical motherboard does. The motherboard acts as the glue between these smaller components by defining closures for its child components' callbacks.

Right now, my Cpu struct needs to be able to fetch an instruction from memory via the program counter. The Cpu struct owns a closure to act as a callback like this:

use std::ops::Fn;

pub type FetchOp = Box<dyn Fn(u16) -> u8>;

pub struct Cpu {
    // ...
    pc: u16,
    on_fetch_op: FetchOp,
}

impl Cpu {
    pub fn new(on_fetch_op: FetchOp) -> Cpu {
        Cpu {
            // ...
            pc: 0,
            on_fetch_op,
        }
    }

    pub fn tick(&mut self) {
        let opcode = self.fetch_op();

        if opcode == 0xCB {
            let cb_opcode = self.fetch_op();

            self.exec_cb_op(opcode, cb_opcode);
        } else {
            self.exec_op(opcode);
        }
    }

    fn fetch_op(&mut self) -> u8 {
        let opcode = (self.on_fetch_op)(self.pc);

        self.pc += 1;

        opcode
    }

    fn exec_op(&self, code: u8) {
        // unimplemented
    }

    fn exec_cb_op(&self, code: u8, cb_code: u8) {
        // unimplemented
    }
}

The Cpu also requires that closure to be defined upon construction since it doesn't really make sense to ever not have it.

Here's my code for the motherboard:

use crate::cpu::{Cpu, FetchOp};
use crate::mmu::Mmu;
use crate::ppu::Ppu;

pub struct Dmg {
    cpu: Cpu,
    mmu: Mmu,
    ppu: Ppu,
}

impl Dmg {
    pub fn new() -> Dmg {
        let mmu = Mmu::new();
        let fetch_op_handler: FetchOp = Box::new(|pc_addr| mmu.read(pc_addr).unwrap());

        let cpu = Cpu::new(fetch_op_handler);

        let result = Dmg {
            cpu,
            mmu,
            ppu: Ppu {},
        };

        result
    }

    pub fn process(&mut self) {
        self.cpu.tick();
    }
}

After trying this out, it looks like I'm running into multiple borrow errors, which makes sense now that I see it. Am I on the right track, or should I go about this in a different way? Callbacks are typically how I've solved circular dependencies in other languages, but are they a practical design choice in Rust?

As you recognized, writing Rust code with circular dependencies is hard. A nice tutorial about this is Learn Rust With Entirely Too Many Linked Lists.

The problem is that you borrow the mmu immutably in the closure, which keeps the whole dmg immutably borrowed, which in turn makes changing the CPU's state impossible. A way to work around this would be reference counting: give each closure an Rc<Mmu> instead of a &Mmu and everything should be fine. If you need the Mmu to be mutable, you may want to go for Rc<RefCell<Mmu>>. Interior mutability is sometimes the right thing, but personally I try to avoid it when possible.
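For illustration, here is a sketch of what your Dmg::new could look like with that approach (assuming the mmu field of Dmg also becomes an Rc<RefCell<Mmu>> and Mmu::read still takes &self, as in your code):

use std::cell::RefCell;
use std::rc::Rc;

pub fn new() -> Dmg {
    let mmu = Rc::new(RefCell::new(Mmu::new()));

    // the closure owns its own clone of the Rc, so it no longer borrows a local
    let mmu_for_cpu = Rc::clone(&mmu);
    let fetch_op_handler: FetchOp =
        Box::new(move |pc_addr| mmu_for_cpu.borrow().read(pc_addr).unwrap());

    Dmg {
        cpu: Cpu::new(fetch_op_handler),
        mmu, // the field type changes to Rc<RefCell<Mmu>>
        ppu: Ppu {},
    }
}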

If you really want your components to be decoupled, I'd go with message passing, i.e. channels. However, this probably ends up running every component in its own thread. Since you're simulating hardware, that is probably what you end up with anyway, as hardware is inherently concurrent. If you feel like it, you can also try async/.await with tokio, or use an actor system (but that seems to me a little bit too much for a (toy?) project).
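As a rough illustration of the channel idea (the message type and the thread layout here are made up for this sketch, not taken from your code):

use std::sync::mpsc;
use std::thread;

// The CPU sends the address it wants to read; the MMU thread replies with the byte.
enum BusMsg {
    Read { addr: u16, reply: mpsc::Sender<u8> },
}

fn main() {
    let (bus_tx, bus_rx) = mpsc::channel::<BusMsg>();

    // MMU runs in its own thread and serves read requests until the channel closes.
    let mmu = thread::spawn(move || {
        let memory = vec![0u8; 0x10000];
        for msg in bus_rx {
            match msg {
                BusMsg::Read { addr, reply } => {
                    let _ = reply.send(memory[addr as usize]);
                }
            }
        }
    });

    // CPU side: fetch one opcode at pc = 0 and block on the reply.
    let (reply_tx, reply_rx) = mpsc::channel();
    bus_tx.send(BusMsg::Read { addr: 0, reply: reply_tx }).unwrap();
    let opcode = reply_rx.recv().unwrap();
    println!("fetched opcode {opcode:#04x}");

    drop(bus_tx); // closing the sender lets the MMU thread finish
    let _ = mmu.join();
}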


The problem with your current approach is that, e.g., Cpu is supposed to have access to the whole Dmg, yet it is also just a component of that struct. Because you can turn a &mut Dmg into three references &mut Cpu, &mut Mmu, &mut Ppu, any access to e.g. the Mmu that goes through the Cpu would have to alias that &mut Mmu.
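To make the aliasing point concrete, splitting borrows on your Dmg definition is fine on its own; the trouble only starts if Cpu additionally holds something that points back into the Mmu (just an illustration):

fn split(dmg: &mut Dmg) -> (&mut Cpu, &mut Mmu, &mut Ppu) {
    // disjoint fields can be mutably borrowed at the same time
    let Dmg { cpu, mmu, ppu } = dmg;
    (cpu, mmu, ppu)
}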

Besides solutions that could be quite flexible but also potentially more complex at runtime, like the ones @farnbams discussed above, I think there is the option of having Cpu just be an abstract view of the whole Dmg. I think I have a way this could be done (I'm writing this post as I implement it):

First separate out the cpu state into its own struct

pub struct CpuState {
    pc: u16,
}

then we can make Cpu itself just a generic wrapper type

pub struct Cpu<C> {
    context: C,
}

where C will later be instantiated as Dmg.

The Dmg struct will be defined like this

mod cpu;
mod mmu;

use cpu::CpuState;
use mmu::MmuState;

pub struct Dmg {
    cpu_state: CpuState,
    mmu_state: MmuState,
}

Now, back in the cpu module, we can start specifying what we need to know about the context:

pub trait CpuContext {
    fn on_fetch_op(&self) -> u8;
}

I’ve decided that instead of passing pc to on_fetch_op directly, I’ll allow public read access to pc. (Of course, passing pc would’ve been possible too, with fn on_fetch_op(&self, pc: u16) -> u8.)

impl CpuState {
    pub fn pc(&self) -> u16 {
        self.pc
    }
}

We also need access to the CpuState from Cpu<C>:

pub trait CpuContext {
    fn cpu_state(&self) -> &CpuState;
    fn cpu_state_mut(&mut self) -> &mut CpuState;
    fn on_fetch_op(&self) -> u8;
}

For convenience, let’s implement Deref and DerefMut:

use std::ops::{Deref, DerefMut};

impl<C: CpuContext> Deref for Cpu<C> {
    type Target = CpuState;
    fn deref(&self) -> &CpuState {
        self.context.cpu_state()
    }
}
impl<C: CpuContext> DerefMut for Cpu<C> {
    fn deref_mut(&mut self) -> &mut CpuState {
        self.context.cpu_state_mut()
    }
}

Now we can start implementing methods:

impl CpuState {
    pub fn new() -> CpuState {
        CpuState { pc: 0 }
    }
}
impl<C: CpuContext> Cpu<C> {
    pub fn tick(&mut self) {
        let opcode = self.fetch_op();

        if opcode == 0xCB {
            let cb_opcode = self.fetch_op();

            self.exec_cb_op(opcode, cb_opcode);
        } else {
            self.exec_op(opcode);
        }
    }

    fn fetch_op(&mut self) -> u8 {
        let opcode = self.context.on_fetch_op();

        self.pc += 1;

        opcode
    }

    fn exec_op(&self, code: u8) {
        // unimplemented
    }

    fn exec_cb_op(&self, code: u8, cb_code: u8) {
        // unimplemented
    }
}

Here’s a stub Mmu for now:

pub struct MmuState;
impl MmuState {
    pub fn new() -> Self {
        MmuState
    }
}

pub trait MmuContext {
    fn mmu_state(&self) -> &MmuState;
    fn mmu_state_mut(&mut self) -> &mut MmuState;
}

pub struct Mmu<C> {
    context: C,
}

use std::ops::{Deref, DerefMut};

impl<C: MmuContext> Deref for Mmu<C> {
    type Target = MmuState;
    fn deref(&self) -> &MmuState {
        self.context.mmu_state()
    }
}
impl<C: MmuContext> DerefMut for Mmu<C> {
    fn deref_mut(&mut self) -> &mut MmuState {
        self.context.mmu_state_mut()
    }
}

impl<C: MmuContext> Mmu<C> {
    // guessing the type here
    pub fn read(&self, addr: u16) -> Option<u8> {
        Some(42) // stub
    }
}

In case Mmu<C>::read needs mutable access to self, you’d need to change the type of CpuContext::on_fetch_op to require &mut self as well, but everything can still be made to work.
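For concreteness, that variant of the trait only changes the one signature (a sketch):

pub trait CpuContext {
    fn cpu_state(&self) -> &CpuState;
    fn cpu_state_mut(&mut self) -> &mut CpuState;
    // now &mut self, so the implementation is allowed to call a &mut self Mmu::read
    fn on_fetch_op(&mut self) -> u8;
}

The call site in fetch_op (self.context.on_fetch_op()) already happens behind &mut self, so it keeps compiling.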

Finally, let’s pull everything together back in the top level module with Dmg:

use cpu::{CpuState, CpuContext};
use mmu::{MmuState, MmuContext};

pub struct Dmg {
    cpu_state: CpuState,
    mmu_state: MmuState,
}
impl Dmg {
    pub fn new() -> Self {
        Self {
            cpu_state: CpuState::new(),
            mmu_state: MmuState::new(),
        }
    }
}

impl CpuContext for Dmg {
    fn cpu_state(&self) -> &CpuState {
        &self.cpu_state
    }
    fn cpu_state_mut(&mut self) -> &mut CpuState {
        &mut self.cpu_state
    }
    fn on_fetch_op(&self) -> u8 {
        // TODO
        // how to access the method `Mmu<C>::read` here???
        42
    }
}
impl MmuContext for Dmg {
    fn mmu_state(&self) -> &MmuState {
        &self.mmu_state
    }
    fn mmu_state_mut(&mut self) -> &mut MmuState {
        &mut self.mmu_state
    }
}

The remaining problem is: How do I get a &Mmu<Dmg> from a &Dmg to implement on_fetch_op?

We can use the ref-cast crate here.

// in the cpu module:
use ref_cast::RefCast;

#[derive(RefCast)]
#[repr(transparent)]
pub struct Cpu<C> {
    context: C,
}

// and likewise in the mmu module:
use ref_cast::RefCast;

#[derive(RefCast)]
#[repr(transparent)]
pub struct Mmu<C> {
    context: C,
}

Now we can do:

use ref_cast::RefCast;
use cpu::{Cpu, CpuState, CpuContext};
use mmu::{Mmu, MmuState, MmuContext};


impl Dmg {
    fn cpu(&self) -> &Cpu<Self> {
        Cpu::ref_cast(self)
    }
    fn cpu_mut(&mut self) -> &mut Cpu<Self> {
        Cpu::ref_cast_mut(self)
    }
    fn mmu(&self) -> &Mmu<Self> {
        Mmu::ref_cast(self)
    }
    fn mmu_mut(&mut self) -> &mut Mmu<Self> {
        Mmu::ref_cast_mut(self)
    }
}

And use it:

fn on_fetch_op(&self) -> u8 {
    self.mmu().read(self.cpu().pc()).unwrap()
}

Here’s the full code. (Including the simplification described below.)

Edit: One further simplification: Using AsRef and AsMut supertraits instead of the cpu_state and cpu_state_mut trait methods:

pub trait CpuContext: AsRef<CpuState> + AsMut<CpuState> {
    fn on_fetch_op(&mut self) -> u8;
}

impl<C: CpuContext> Deref for Cpu<C> {
    type Target = CpuState;
    fn deref(&self) -> &CpuState {
        self.context.as_ref()
    }
}
impl<C: CpuContext> DerefMut for Cpu<C> {
    fn deref_mut(&mut self) -> &mut CpuState {
        self.context.as_mut()
    }
}

Then these can be implemented using the derive_more crate, saving some boilerplate code:

use derive_more::*;

#[derive(AsRef, AsMut)]
pub struct Dmg {
    cpu_state: CpuState,
    mmu_state: MmuState,
}
impl CpuContext for Dmg {
    fn on_fetch_op(&mut self) -> u8 {
        let pc = self.cpu().pc();
        self.mmu_mut().read(pc).unwrap()
    }
}
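For reference, if you’d rather not pull in derive_more, the impls being derived here are roughly just the following hand-written versions (derive_more’s exact expansion and attribute requirements may differ between versions):

impl AsRef<CpuState> for Dmg {
    fn as_ref(&self) -> &CpuState {
        &self.cpu_state
    }
}
impl AsMut<CpuState> for Dmg {
    fn as_mut(&mut self) -> &mut CpuState {
        &mut self.cpu_state
    }
}
// ...and the analogous pair for MmuState.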

@farnbams Thanks for the insight. So it does sound like reference counting might be a good approach in this case. I was wondering about that, but I wasn't sure since I haven't used reference-counted pointers in Rust yet. I hear you on the interior mutability side of things. I read up on that to try to get a feel for when it's useful, and it sounds like interior mutability is mainly useful for "scratchpad"-like scenarios where you want the interface of your struct to appear immutable, yet you might want some deeper implementation details to be mutable as a means to temporarily cache small bits of data for optimizing calculations.

I almost went the channels route myself lol. That way, I can think in terms of sending messages across components, and it kind of aligns with how the physical hardware works, as everything typically runs at different speeds/bandwidths/etc. That said, I don't know whether it adds to or detracts from emulation accuracy, and I've never built an emulator before.

@steffahn So, it sounds like your approach involves taking things a step further by separating data and behaviors into their own structs and traits. That certainly seems to fit Rust's philosophy, especially with traits, since they're similar to interfaces/abstract classes in other languages, except only the behaviors are reusable, not the data/fields/props/etc. The intentional lack of reusable state within traits is a seemingly subtle change when first learning about it, but it has some pretty big design implications in practice when compared to traditional inheritance from other languages.

I still struggle with traits and lifetimes, and I'm still trying to understand the mindset behind traits. For example, "reusable behaviors" is a simple, descriptive way to think about them, but I'm used to inheritance (when composition falls short), so when I think of reusable behaviors, I'm also thinking of the reusable data that comes with them.

Why does the Cpu struct need a generic parameter, though? Is this strictly to fulfill the Deref and DerefMut traits? I haven't implemented those traits for any of my structs yet, so I want to make sure I understand.

Furthermore, if we didn't need Deref, would we still need that generic type param, or would it be more proper to use Box<dyn CpuContext> directly for the field declaration in the struct? That would incur a heap penalty, though... Then again, if we use a generic param that's constrained by a trait like you're doing, then we can sidestep the heap in favor of the stack, because generics expand code at compile time, not at runtime. Is that right? If so, is it a good practice to do this for traits in general?

Dynamic dispatch incurs a number of penalties compared to generic code, and generics let users who need dynamic dispatch opt into those costs without requiring them everywhere, so it’s usually more idiomatic to use a generic parameter instead of an explicit dyn object whenever you can. @steffahn’s code, for example, can also use dynamic dispatch via Box<dyn CpuContext> if you have a block like this somewhere:

impl<T: CpuContext + ?Sized> CpuContext for Box<T> {
    /* ... */
}
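Filled in for the earlier three-method version of CpuContext (just a sketch; the AsRef/AsMut supertrait variant would additionally need forwarding AsRef/AsMut impls for Box), that blanket impl would look like:

impl<T: CpuContext + ?Sized> CpuContext for Box<T> {
    fn cpu_state(&self) -> &CpuState {
        (**self).cpu_state()
    }
    fn cpu_state_mut(&mut self) -> &mut CpuState {
        (**self).cpu_state_mut()
    }
    fn on_fetch_op(&mut self) -> u8 {
        (**self).on_fetch_op()
    }
}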

In fact, there’s very little (if anything) that would break by adding ?Sized annotations to allow a Cpu<dyn CpuContext> to exist without an inner Box, but that gets into more advanced techniques.

(Small note to avoid confusion: I just noticed while writing this that I forgot to change the &mut back into & for on_fetch_op and read at some point while playing around with the code. My previous post was already inconsistent about that; the GitHub gist contains the &mut version, too. If &mut isn’t needed for Mmu::read, changing back to & would be a good idea.)


I guess the abstract class comparison is fair. One could think of Dmg as inheriting from Cpu and Mmu, the way my code is structured. Cpu provides some fields it needs (bundled as a struct) and provides some behavior that can be applied to anything that

  • contains the correct data fields (by giving access to a CpuState), and
  • implements the necessary methods, like on_fetch_op, which are more-or-less like abstract methods in OOP, if you will

One difference is that in the cpu module its behavior is not just applied to the Dmg directly. One could’ve done something like this, too, where Cpu becomes a trait, e.g.

trait Cpu: CpuContext {
    fn tick(&mut self) {
        let opcode = self.fetch_op();

        if opcode == 0xCB {
            let cb_opcode = self.fetch_op();

            self.exec_cb_op(opcode, cb_opcode);
        } else {
            self.exec_op(opcode);
        }
    }
    // etc...
}
impl<T> Cpu for T where T: CpuContext {}

but that would

  • associate the methods logically belonging to the Cpu directly to the Dmg instead, and
  • not leave such a straightforward way to also have private methods

Good question. First of all: no, it isn’t strictly for that reason, but we can answer the question about Deref first.

Let me, for the sake of demonstrating more easily that something compiles, use links to a playground where I removed the dependency on derive_more (by using the cpu_state() etc. methods again) and the dependency on ref-cast (by inserting some transmute calls).

Here’s the playground.
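For reference, the hand-written stand-ins for ref-cast presumably look something like this (a sketch relying on the #[repr(transparent)] layout guarantee, not copied from the playground):

impl Dmg {
    fn cpu(&self) -> &Cpu<Self> {
        // SAFETY: Cpu<Dmg> is #[repr(transparent)] around a single Dmg field
        unsafe { std::mem::transmute::<&Dmg, &Cpu<Dmg>>(self) }
    }
    fn cpu_mut(&mut self) -> &mut Cpu<Self> {
        // SAFETY: same layout argument as above
        unsafe { std::mem::transmute::<&mut Dmg, &mut Cpu<Dmg>>(self) }
    }
    // ...and the same pair for Mmu<Self>.
}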

Now we can see what Deref and DerefMut are good for. As noted, they are "for convenience", so let’s see what happens when they’re removed:

   Compiling playground v0.0.1 (/playground)
error[E0599]: no method named `pc` found for reference `&Cpu<Dmg>` in the current scope
  --> src/lib.rs:41:29
   |
41 |         let pc = self.cpu().pc();
   |                             ^^ method not found in `&Cpu<Dmg>`

error[E0609]: no field `pc` on type `&mut Cpu<C>`
  --> src/lib.rs:96:18
   |
96 |             self.pc += 1;
   |                  ^^ unknown field
   |
   = note: available fields are: `context`

error: aborting due to 2 previous errors

Some errors have detailed explanations: E0599, E0609.
For more information about an error, try `rustc --explain E0599`.
error: could not compile `playground`

To learn more, run the command again with --verbose.

In the first error, self.cpu() is a &Cpu<Dmg>; in the second error, self is a &mut Cpu<C> with C: CpuContext. The field pc and the method pc() are defined on the struct CpuState, though, so we can access the CpuState directly:

let pc = self.cpu().pc();
// becomes
let pc = self.cpu_state.pc();

and

self.pc += 1;
// becomes
self.context.cpu_state_mut().pc += 1;

now everything compiles again.

What the Deref and DerefMut trait implementations do is allow access to the methods and fields of CpuState on a Cpu<C> (C: CpuContext), and in particular on Cpu<Dmg>.

In the second example, you can see I manually added a call to .context.cpu_state_mut(), something the DerefMut implementation previously provided. The compiler had been inserting an implicit .deref() or .deref_mut() call in both cases:

let pc = self.cpu().deref().pc();
// and
self.deref_mut().pc += 1;

(see how inserting them explicitly still works)

The choice between deref and deref_mut comes down to the fact that .pc() takes &self while += needs mutable access to .pc. The compiler always tries inserting implicit deref/deref_mut calls if a method or field isn’t available on a type directly.

Now on the generic parameter: Actually it is very easy to get rid of the generic parameter entirely:

Starting with the original playground, if I change Cpu to

#[repr(transparent)]
pub struct Cpu {
    context: super::Dmg,
}

and then let the compiler tell me where I still have parameters C so I can remove them, everything still works. The same could be done to Mmu. Then Cpu and Mmu are not much more than type synonyms for Dmg. We could go even further and remove the CpuContext trait [using methods on Dmg directly], and finally we could make Cpu literally be a type synonym. The problem is: now we have a dependency from the Cpu type to the Dmg type. Previously the cpu module was entirely self-contained. The entire purpose of the generic type argument is to ensure proper separation of concerns. The cpu module with a generic Cpu<C> type could just as well live in a different crate from Dmg.
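Taken to that logical end, the “type synonym” version would literally just be

// in the cpu module, after dropping the generic parameter and the CpuContext trait
pub type Cpu = super::Dmg;

which makes the dependency on Dmg explicit rather than abstracting over it.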

This wouldn’t work in a straightforward manner. The way that repr(transparent) and the RefCast trait are used is what ensures that it is possible to go back and forth between &Cpu<Dmg> and &Dmg, or between &mut Cpu<Dmg> and &mut Dmg. Both conversions are already happening in the code, and they make sure that the interaction with lifetimes stays pain-free. Let’s trace a call to Cpu<Dmg>::tick in the original code (the GitHub gist):

  • tick is called on a &mut Cpu<Dmg>
  • it uses fetch_op. The fetch_op method gets passed a (re-)borrow of the &mut Cpu<Dmg> with a shorter lifetime
  • fetch_op first calls self.context.on_fetch_op() and later modifies self.pc
    • the .context access turns the &mut Cpu<Dmg> into a &mut Dmg, the on_fetch_op call gets passed a &mut Dmg.
      • on_fetch_op creates a &Dmg from the &mut Dmg for a call to .cpu(), which turns the &Dmg back into a &Cpu<Dmg>. Then the call to .pc() happens after a Deref step going from &Cpu<Dmg> to &CpuState. After the &Dmg is done being used, the &mut Dmg can be used again to call mmu_mut(), creating a &mut Mmu<Dmg> which is passed to read()
    • Modifying self.pc needs to turn self: &mut Cpu<Dmg> into &mut CpuState first (via deref_mut) and then modifies the pc field of that.

Converting from e.g. a &Dmg to a &Cpu would not be possible if Cpu contained a Box<dyn CpuContext>. The reason is that, technically, as kind-of already shown in the previous paragraphs, the types Cpu<Dmg> and Dmg are actually the same type at runtime. The &Dmg <-> &Cpu conversion is trivial/free in that case. With a Box, the types would be distinct, and Cpu would contain a pointer to the corresponding Dmg, but the Dmg would have no pointer back.

The only approach using dyn that could work would be to use, instead of &'a Cpu<Dmg> and &'a mut Cpu<Dmg>, two new types Cpu<'a> and CpuMut<'a>, defined as

pub struct Cpu<'a> {
    context: &'a dyn CpuContext
}

and

pub struct CpuMut<'a> {
    context: &'a mut dyn CpuContext
}

Then conversions &'a Dmg <-> Cpu<'a> and &'a mut Dmg <-> CpuMut<'a> would still be technically possible. The disadvantages of this approach would be

  • that dynamic dispatch is involved, and
  • that two new types would need to be defined, with the methods split between them, plus a way to go from CpuMut to Cpu (see the sketch below)
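To make that last point concrete, the conversions (and the CpuMut → Cpu step) could look roughly like this, assuming Dmg: CpuContext as in the rest of the thread:

impl<'a> Cpu<'a> {
    fn new(dmg: &'a Dmg) -> Cpu<'a> {
        // &'a Dmg coerces to &'a dyn CpuContext
        Cpu { context: dmg }
    }
}

impl<'a> CpuMut<'a> {
    fn new(dmg: &'a mut Dmg) -> CpuMut<'a> {
        CpuMut { context: dmg }
    }

    // reborrow the exclusive reference as a shared one for read-only use
    fn as_cpu(&self) -> Cpu<'_> {
        Cpu { context: &*self.context }
    }
}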

Yet another approach could probably be to just say

pub type Cpu = dyn CpuContext;

which avoids the problem of defining new structs; here’s an example in the playground (but it still introduces dynamic dispatch).

Sounds about right. Generics are monomorphized at compile time, which means that function calls can be inlined and properly optimized and, I guess, indirection through the heap can often be avoided.

I don’t really understand what exactly you’re referring to with "this" in that sentence. Or do you just mean writing generic code with parameters that have trait constraints? That is pretty much exactly what traits are primarily for. Trait objects are just a secondary feature that’s available for certain forms of traits; the main application of traits is in generic code. And, well, I probably shouldn’t forget their other primary purpose: they are good for function "overloading", I guess. (And probably another secondary purpose is "adding methods to existing types".)
