Discussion: Context struct pattern

I find myself using a "context pattern" often when developing in Rust. I am probably not the first person to think of something like this, but I haven't seen anything like it on the web, at least for Rust specifically. So I thought I should share.

The pattern looks like (working definition):

  1. Define a Context struct with data used for the duration of the application
  2. Define a Context::init() -> Result<Self> method to initialize the context. This may Err, for example, on invalid program arguments.
  3. Define a Context::run(&self) method to run the program after context is initialized.
  4. Define further implementation methods inside impl Context with &self to access context.
  5. Finally, define main as fn main() -> Result<()> { Context::init()?.run() }

And here is how it looks in code:

struct Context {
    // program configuration, immutable
    config: Config,

    // other objects used for the duration of the application
    // there could be anything here
    client: HttpClient,
}

// example program config
struct Config {
    username: String,
    password: String,
    url: String,
    debug_mode: bool,
}

fn main() -> Result<()> {
    Context::init()?.run()
}

impl Context {
    fn init() -> Result<Self> {
        // parse env::args or some other source to build Config, Err on invalid input
        let config = build_config()?;
        // initialize other global objects...
        let client = build_client()?;
        Ok(Self { config, client })
    }
}

// helper methods for Context::init() can go here

// main program logic is in this impl Context
impl Context {
    fn run(&self) -> Result<()> {
        self.login()?;
        self.fetch_data()?;
        // etc...
        Ok(())
    }

    fn login(&self) -> Result<()> {
        // here I can use context like `self.config.username`
        Ok(())
    }

    // etc...
}

I like this pattern because 1) function signatures are clean and 2) refactoring is made easy since I don't have to pass the same thing around to different functions as often.

An alternative is to use once_cell or lazy_static! for the global variables. But these don't work very well if you want to initialize with some error handling (using Result).

If I need mutability for a field in Context, I would probably use RefCell or some interior mutability on that field.
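As a minimal sketch of that idea, here is a context whose methods all take &self while one field is still updated through a RefCell. The request_count field and the fetch_data body are hypothetical and only there for illustration.

```rust
use std::cell::RefCell;

// Hypothetical context: `request_count` is an illustrative field,
// not something from the original example.
struct Context {
    username: String,
    request_count: RefCell<u32>,
}

impl Context {
    fn new(username: &str) -> Self {
        Self {
            username: username.to_string(),
            request_count: RefCell::new(0),
        }
    }

    // Takes `&self`, yet still bumps the counter through the RefCell.
    fn fetch_data(&self) -> String {
        *self.request_count.borrow_mut() += 1;
        format!("data for {}", self.username)
    }

    fn requests_made(&self) -> u32 {
        *self.request_count.borrow()
    }
}

fn main() {
    let ctx = Context::new("alice");
    ctx.fetch_data();
    ctx.fetch_data();
    println!("requests made: {}", ctx.requests_made());
}
```

Keep in mind that RefCell checks borrows at runtime, so overlapping borrow_mut() calls panic instead of failing to compile. For a plain counter like this one, Cell<u32> would probably do as well.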

There are some obvious pitfalls. You need to be careful not to over-extend the scope of an object by throwing it into the Context for convenience. And, you wouldn't want this pattern to get out of hand with a complex program. The impl Context should only hold the "top layer" of application logic. A context struct should be private to a single module.

You could repeat the pattern for a module/feature.

mod feature {
    pub fn do_feature(inputs...) {
        FeatureContext::init(inputs).run()
    }
    // private
    struct FeatureContext { ... }
}
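
To make the outline above concrete, here is one possible filled-in version. The word-counting feature, the word_count name and its parameters are made up for the example; only the shape (public function, private context with init and run) comes from the outline.

```rust
mod feature {
    // Public entry point: callers never see the context type.
    pub fn word_count(text: &str, min_len: usize) -> usize {
        FeatureContext::init(text, min_len).run()
    }

    // Private to this module.
    struct FeatureContext<'a> {
        text: &'a str,
        min_len: usize,
    }

    impl<'a> FeatureContext<'a> {
        fn init(text: &'a str, min_len: usize) -> Self {
            Self { text, min_len }
        }

        // Top-level logic of the feature.
        fn run(&self) -> usize {
            self.text
                .split_whitespace()
                .filter(|w| self.is_counted(w))
                .count()
        }

        // Helper that reads shared state from the context.
        fn is_counted(&self, word: &str) -> bool {
            word.len() >= self.min_len
        }
    }
}

fn main() {
    let n = feature::word_count("a context struct pattern", 7);
    println!("{n} long words");
}
```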

What are your thoughts? Would you adopt this pattern in your code? Why or why not? What would you do differently?

This pattern can be pretty useful when you've got a lot of state or dependencies to keep track of and I've used it in the past.

It also makes managing ownership easy when you just need to hack something together, which is probably why you'll see this pattern used in a lot of C libraries (assuming they don't use global variables directly, of course).

That said, some of the big reasons I wouldn't recommend it in my code are:

  1. It makes testing harder - now you need to create half the world just to test a helper function
  2. It's like global variables in that your code becomes tightly coupled and makes it really easy to share too much between components
  3. The Context ends up as one big bucket with your entire application state, prohibiting you from separating your application into layers and independent systems
  4. People often use it as a workaround instead of understanding borrowing/ownership and accepting our lord and savior, The Borrow Checker
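
One way to soften the testing problem from point 1, sketched under the assumption that the helper only needs a couple of fields: keep the logic in a free function that takes exactly what it uses, and let the Context method be a thin wrapper. The login_url helper and the trimmed-down Config are hypothetical.

```rust
// Hypothetical Config, trimmed to the fields this example uses.
struct Config {
    username: String,
    url: String,
}

struct Context {
    config: Config,
}

// Pure helper: depends only on its arguments, so a test can call it
// directly without building a Context.
fn login_url(base_url: &str, username: &str) -> String {
    format!("{}/login?user={}", base_url.trim_end_matches('/'), username)
}

impl Context {
    // Thin wrapper: pulls the fields out of the context and delegates
    // to the free function above.
    fn login_url(&self) -> String {
        login_url(&self.config.url, &self.config.username)
    }
}

fn main() {
    let ctx = Context {
        config: Config {
            username: "alice".into(),
            url: "https://example.com/".into(),
        },
    };
    println!("{}", ctx.login_url());
}
```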

You've put a nice spin on it with the "A context struct should be private to a single module" bit. I think that resolves a lot of issues I have with the pattern... Of course Context is too vague a name so I'd name it after the system/module/component it's relevant to... But hang on, isn't that like the pattern where you'd wrap an entire system up into a single class/object/type that represents that system?

I think we've just reinvented OO using Rust syntax :stuck_out_tongue:

(please note that I'm not trying to poke fun at you here! It's just interesting that you get this sort of convergent evolution where people develop different, but almost identical ways to solve the problem of managing state)

I use &Context parameters a lot, but as @Michael-F-Bryan said, usually limited to the current task. Sometimes I also chain them.

struct Ctx<'a> {
    parent: Option<&'a Ctx<'a>>,
    // ... other fields
}
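
A rough sketch of how such a chain could be used, assuming a child falls back to its parent for anything it doesn't set itself. The name and verbose fields and the root/child constructors are invented for the example.

```rust
struct Ctx<'a> {
    parent: Option<&'a Ctx<'a>>,
    // Hypothetical fields, for illustration only.
    name: &'a str,
    verbose: Option<bool>,
}

impl<'a> Ctx<'a> {
    fn root(name: &'a str, verbose: bool) -> Self {
        Ctx { parent: None, name, verbose: Some(verbose) }
    }

    // A child borrows its parent, so it cannot outlive it.
    fn child(&'a self, name: &'a str) -> Ctx<'a> {
        Ctx { parent: Some(self), name, verbose: None }
    }

    // Walk up the chain until some context has the setting.
    fn verbose(&self) -> bool {
        match (self.verbose, self.parent) {
            (Some(v), _) => v,
            (None, Some(p)) => p.verbose(),
            (None, None) => false,
        }
    }

    // Names from the root down to this context, joined with '/'.
    fn path(&self) -> String {
        match self.parent {
            Some(p) => format!("{}/{}", p.path(), self.name),
            None => self.name.to_string(),
        }
    }
}

fn main() {
    let root = Ctx::root("app", true);
    let task = root.child("login");
    let retry = task.child("retry");
    println!("{} verbose={}", retry.path(), retry.verbose());
}
```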

I definitely wouldn't call it "context struct pattern". The way you use your context is very specific to your needs, with run called only after initialization is complete.

When I'm thinking about contexts, I view them as a bundle of variables, that are always passed around together and the content may or may not be opaque to the user.

If the context is transparent, you probably have a context struct because you had to box the variables due to memory-related concerns, but would otherwise pass them individually as function arguments. What you really wanted to pass in this case was a boxed anonymous struct, but Rust doesn't have those yet.

If the context is opaque, your module accepts a user function and expects that function to use your other module function, which may be called multiple times, which requires the context to be passed as an argument.
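
Roughly what the opaque case might look like in code. The visitor module, for_each_item and emit are hypothetical names. The point is only that the module owns the context, hands it to the user's callback, and the callback passes it back into the module's own functions without seeing its fields.

```rust
mod visitor {
    // Opaque to callers: the fields are private, only methods are exposed.
    pub struct Ctx {
        index: usize,
        output: Vec<String>,
    }

    impl Ctx {
        // The module function the user callback is expected to call.
        pub fn emit(&mut self, text: &str) {
            self.output.push(format!("{}: {}", self.index, text));
        }
    }

    // Calls `callback` once per item, passing the same context each time.
    pub fn for_each_item<F>(items: &[&str], mut callback: F) -> Vec<String>
    where
        F: FnMut(&mut Ctx, &str),
    {
        let mut ctx = Ctx { index: 0, output: Vec::new() };
        for (i, item) in items.iter().enumerate() {
            ctx.index = i;
            callback(&mut ctx, *item);
        }
        ctx.output
    }
}

fn main() {
    let lines = visitor::for_each_item(&["a", "b"], |ctx, item| ctx.emit(item));
    println!("{lines:?}");
}
```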

There may be a good reason for semi-opaque context structs, but I can't think of one, right now.

In all other cases, I wouldn't call it a context struct.


You make some great points.

Oh yeah, this is a big one and I'm slightly embarrassed that I didn't think of it when posting originally. I have to ask myself, how much junk am I allowing into this function which will complicate testing? And I don't think it's over-ambitious to shoot for none.

Oh no! It was never my intention to persuade anyone into an inferior gospel. :laughing:

To your other points, I feel like I should add more constraints on this. For one thing, I definitely wouldn't try to use the pattern everywhere in a project. I would only use it for a large group of closely related functions that have nearly identical requirements in terms of what state variables they access. And I would only use the pattern to represent one layer of application logic or one specific piece of functionality (again looking to the private to a module rule).

Oh dear...I'll see my way out.

Not at all! This is exactly the sort of response I was looking for.

I've done this before too but then I generally decided against it because it allows too many things to be tied together. I might just copy any overlapping fields from parent to child context.

What would you call it then? You seem to be alluding to a specific definition of "context struct/object" that I am not aware of. I simply use the word "context" because it's the best word I can think of to represent what the thing represents semantically.

I personally tend to prefer the function body pattern.

I'd probably call the struct Prog(ram)/App(lication)Config(uration). Is the struct part of a lib or a bin project? It looks a bit weird to have a run method on a struct like that.

I'd usually name it after the system/component I'm implementing.

For example, imagine we're estimating how long a 3D printer would take to print a drawing.

pub fn estimate_print_time(
  path: &[Segment], 
  motion_parameters: &MotionParameters,
) -> Duration {
  ...
}

struct MotionParameters {
  max_acceleration: f64,
  ...
}

struct Segment {
  delta_x: f64,
  delta_y: f64,
  desired_speed: f64,
}

This process involves a lot of temporary state (you usually need to do a couple passes, the desired speed for one segment will affect the one before/after, etc.), which is a perfect place to use your Context Struct Pattern.

struct Simulator<'a> {
  motion_parameters: &'a MotionParameters,
  segment_desired_speeds: Vec<f64>,
  cornering_speeds: Vec<f64>,
  ...
}

pub fn estimate_print_time(
  path: &[Segment], 
  motion_parameters: &MotionParameters,
) -> Duration {
  let mut sim = Simulator::new(motion_parameters);
  sim.simulate(path)
}

This act of wrapping internal state up in a type with internal methods for doing part of your computation and exposing a couple well-defined methods (Simulator::simulate() in this case) is sometimes referred to as Encapsulation in the OO world.
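
For anyone who wants something that compiles, here is a deliberately naive version of that shape. It is not a real motion planner: the max_speed field, the two passes (clamp speeds, then sum length / speed) and the segment_speeds buffer are all assumptions made up for the sketch, and acceleration and cornering are ignored entirely. Speeds are assumed to be positive.

```rust
use std::time::Duration;

pub struct MotionParameters {
    pub max_speed: f64, // hypothetical field, units per second
}

pub struct Segment {
    pub delta_x: f64,
    pub delta_y: f64,
    pub desired_speed: f64,
}

// Private type holding the temporary state shared between passes.
struct Simulator<'a> {
    motion_parameters: &'a MotionParameters,
    segment_speeds: Vec<f64>,
}

impl<'a> Simulator<'a> {
    fn new(motion_parameters: &'a MotionParameters) -> Self {
        Self { motion_parameters, segment_speeds: Vec::new() }
    }

    // The one well-defined entry point; the passes stay internal.
    fn simulate(&mut self, path: &[Segment]) -> Duration {
        self.clamp_speeds(path);
        self.total_time(path)
    }

    // Pass 1: limit each desired speed to the machine's maximum.
    fn clamp_speeds(&mut self, path: &[Segment]) {
        let max = self.motion_parameters.max_speed;
        self.segment_speeds = path.iter().map(|s| s.desired_speed.min(max)).collect();
    }

    // Pass 2: sum length / speed over all segments.
    fn total_time(&self, path: &[Segment]) -> Duration {
        let seconds: f64 = path
            .iter()
            .zip(&self.segment_speeds)
            .map(|(s, speed)| s.delta_x.hypot(s.delta_y) / speed)
            .sum();
        Duration::from_secs_f64(seconds)
    }
}

pub fn estimate_print_time(path: &[Segment], motion_parameters: &MotionParameters) -> Duration {
    let mut sim = Simulator::new(motion_parameters);
    sim.simulate(path)
}

fn main() {
    let params = MotionParameters { max_speed: 5.0 };
    let path = [Segment { delta_x: 3.0, delta_y: 4.0, desired_speed: 10.0 }];
    println!("{:?}", estimate_print_time(&path, &params));
}
```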

That said, I think the way you've written it is more like the procedural code I would write in C. If you squint, it looks a bit like the session_state in src/session_state.h from the signalapp/libsignal-protocol-c project.

Suppose I have a bytecode interpreter:

struct Interpreter {
	code: Vec<Instruction>,
	stack: Vec<Value>,
	ip: usize,
}

impl Interpreter {
	fn run_instruction(&mut self, i: &Instruction) {
		...
	}
	
	fn step(&mut self) {
		let instruction = &self.code[self.ip];
		self.ip += 1;
		self.run_instruction(instruction); // uh oh
		// 'instruction' already borrows self
	}
}

This pattern, where some parts of a struct are static immutable data (code) and some are dynamic (stack, ip), shows up in my code pretty often. Because the borrow checker can see which parts of a struct are borrowed within a function (for example self.ip += 1 is fine, but fn advance(&mut self) { self.ip += 1; } wouldn't be), the easiest solution is simply to inline the offending function call, which leads to gigantic and complex functions. How should I deal with this pattern? One solution would be to split the struct into two parts and pass one by immutable reference and the other by mutable reference. However, having two parameters on every function just for passing context feels a bit cumbersome.

In the example posted you shouldn't have a problem because an Instruction is typically about the same size as, if not smaller than, a pointer. So it's more efficient to pass the Instruction by value and not get tied up with lifetimes.
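
A small sketch of the by-value version, assuming Instruction can derive Copy. The two-variant Instruction enum and the i32 stack (standing in for Value) are invented for the example.

```rust
// Hypothetical instruction set; small enough to copy cheaply.
#[derive(Clone, Copy)]
enum Instruction {
    Push(i32),
    Add,
}

struct Interpreter {
    code: Vec<Instruction>,
    stack: Vec<i32>, // stands in for Vec<Value>
    ip: usize,
}

impl Interpreter {
    // Takes the instruction by value, so nothing borrowed from `self`
    // is alive while `self` is mutated.
    fn run_instruction(&mut self, i: Instruction) {
        match i {
            Instruction::Push(v) => self.stack.push(v),
            Instruction::Add => {
                // Missing operands are treated as 0 to keep the sketch short.
                let b = self.stack.pop().unwrap_or(0);
                let a = self.stack.pop().unwrap_or(0);
                self.stack.push(a + b);
            }
        }
    }

    // Like the original, this panics if `ip` is past the end of `code`.
    fn step(&mut self) {
        let instruction = self.code[self.ip]; // copied out, borrow ends here
        self.ip += 1;
        self.run_instruction(instruction);
    }
}

fn main() {
    let code = vec![Instruction::Push(2), Instruction::Push(3), Instruction::Add];
    let mut interp = Interpreter { code, stack: Vec::new(), ip: 0 };
    for _ in 0..3 {
        interp.step();
    }
    println!("{:?}", interp.stack);
}
```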

If you find you are running into these sorts of lifetime issues more frequently, that's often an indicator that the way you have structured your data (the Interpreter) doesn't align with the way it is being used. You may want to have a think about the way data is accessed and break out components or restructure things accordingly.
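
As one possible restructuring (a sketch, not the only way to do it), the Interpreter can keep a single outer type but group the immutable and mutable fields into two inner structs. Methods that mutate live on the inner State, so step can borrow the program immutably and the state mutably at the same time. The Program and State names and the tiny instruction set are assumptions for the example.

```rust
// Hypothetical instruction set, intentionally not Copy so the
// instruction really is passed by reference.
enum Instruction {
    Push(i32),
    Add,
}

// Static part: never changes while the program runs.
struct Program {
    code: Vec<Instruction>,
}

// Dynamic part: everything that mutates lives here.
struct State {
    stack: Vec<i32>,
    ip: usize,
}

struct Interpreter {
    program: Program,
    state: State,
}

impl State {
    fn run_instruction(&mut self, i: &Instruction) {
        match i {
            Instruction::Push(v) => self.stack.push(*v),
            Instruction::Add => {
                // Missing operands are treated as 0 to keep the sketch short.
                let b = self.stack.pop().unwrap_or(0);
                let a = self.stack.pop().unwrap_or(0);
                self.stack.push(a + b);
            }
        }
    }
}

impl Interpreter {
    fn new(code: Vec<Instruction>) -> Self {
        Self {
            program: Program { code },
            state: State { stack: Vec::new(), ip: 0 },
        }
    }

    // Returns false once the instruction pointer runs past the end.
    fn step(&mut self) -> bool {
        let Some(instruction) = self.program.code.get(self.state.ip) else {
            return false;
        };
        self.state.ip += 1;
        // `self.program` is borrowed immutably and `self.state` mutably;
        // the borrow checker accepts this because the fields are disjoint.
        self.state.run_instruction(instruction);
        true
    }
}

fn main() {
    let mut interp = Interpreter::new(vec![
        Instruction::Push(2),
        Instruction::Push(3),
        Instruction::Add,
    ]);
    while interp.step() {}
    println!("{:?}", interp.state.stack);
}
```

Only the outermost method has to know about both halves. If a State method ever needs to read the code, passing &Program to that one method is usually enough.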