Best practices for multiple interconnected components?

nepx · February 20, 2021, 5:00pm

Hi all,

I'm working on a project where one component needs to call functions on another (with the appropriate &self, of course). There are a large number of components, and the call chain can get a bit nasty: A::foo() -> B::bar() -> C::baz() -> A::something(). So far, I've been using raw pointers:

struct A {
    b_self: *mut B
    // more *_self structs not listed here
}
impl A {
    fn test(&mut self) {
        let b_self = unsafe { self.b_self.as_mut().unwrap() };
        B::foo(b_self, 16);
    }
}

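struct B { /* ?? */ }
impl B {
    fn new() -> *mut Self {
        // Box::into_raw hands ownership to the caller as a raw pointer;
        // a plain `as` cast from Box<Self> to *mut Self does not compile.
        Box::into_raw(Box::new(Self { /* ?? */ }))
    }
    fn foo(&mut self, arg: u32) { /* ?? */ }
    fn bar(&mut self, arg: u32, arg2: usize) { /* another type of message */ }
}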

struct C {
    b_self: *mut B
}

// so on and so forth, with more references to B.

There are also a number of indirect calls, where an index into a table chooses between A::test(), B::test(), C::test(), and so on. I haven't written that part yet, but I plan to hold them as *mut _ pointers.

My questions are:

Is this the best way to do inter-component calls?
When creating structs A/B/C, do I have to use Pin in conjunction with Box?
Am I missing anything?

Thanks in advance.

benkay86 · February 20, 2021, 5:42pm

This looks complicated. The lifetimes of A, B, and C are entangled with one another. I am not an expert, but raw pointers are probably not the way to go here because they typically require unsafe code. Furthermore, as you have correctly surmised, creating such self-referential structs using raw pointers will require you to Pin your structures so that the pointers are not invalidated by moving the pointed-to memory.

You could use smart pointers such as std::rc::Rc for non-threaded code or std::sync::Arc if you think your ensemble of structures might ever need to be sent between threads. Just mind that you may need to intersperse some Weak references to avoid creating reference cycles between your structures, which would leak memory.
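As a minimal sketch of the Rc plus Weak idea, assuming A owns B and B only needs a way back to A (the field names here are made up for illustration):

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical components: A holds B strongly, B points back at A weakly.
struct A {
    b: Rc<RefCell<B>>,
}

struct B {
    // Weak back-reference: it does not keep A alive, so there is no cycle.
    a: Weak<RefCell<A>>,
}

fn build() -> Rc<RefCell<A>> {
    // B is created first with an empty Weak, because A does not exist yet.
    let b = Rc::new(RefCell::new(B { a: Weak::new() }));
    let a = Rc::new(RefCell::new(A { b: Rc::clone(&b) }));
    // Wire up the back-reference once both components exist.
    b.borrow_mut().a = Rc::downgrade(&a);
    a
}
```

If B held an Rc to A instead of a Weak, neither strong count could ever reach zero and both allocations would leak.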

Depending on your underlying use case you might place A, B, and C inside some larger structure ABC since they all seem to have the same lifetime anyway. For example, instead of having a Student keep an internal reference to its Courses, create a School structure that can look up the Courses for each Student.
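A rough sketch of the "one larger structure" approach, assuming the components can take the other components they need as arguments instead of storing pointers to them (all names and fields below are hypothetical):

```rust
// Hypothetical components; the fields exist only to make the calls observable.
#[derive(Default)]
struct A { calls: u32 }
#[derive(Default)]
struct B { last_arg: u32 }
#[derive(Default)]
struct C { total: u32 }

impl B {
    fn foo(&mut self, arg: u32) { self.last_arg = arg; }
}

impl C {
    fn baz(&mut self, arg: u32) { self.total += arg; }
}

impl A {
    // A stores no pointers; it is handed the components it needs per call.
    fn test(&mut self, b: &mut B, c: &mut C) {
        self.calls += 1;
        b.foo(16);
        c.baz(b.last_arg);
    }
}

// The owner of all three components.
#[derive(Default)]
struct Abc { a: A, b: B, c: C }

impl Abc {
    fn run(&mut self) {
        // Mutably borrowing distinct fields at the same time is allowed.
        self.a.test(&mut self.b, &mut self.c);
    }
}
```

This works as long as a call never needs to re-enter the component that is already running; a chain that loops back to A would need a different design.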

nepx · February 21, 2021, 1:03am

Will converting them to raw pointers allow them to retain that Pinned attribute?

An idea of mine was to keep A, B, and C all in ABC, just to make things easier to keep track of. I'm simulating a number of chips on a motherboard, and I have these virtual "chips" signal one another by performing function calls -- that appears to be the simplest and fastest way of communicating between them.

The chips are simple enough that everything can run on a single thread. A few things are best done asynchronously, but they require no access to the simulator's internals (like filesystem reads and such).

What specific benefits would smart pointers have over raw pointers -- aside from the elimination of unsafe code?

droundy · February 21, 2021, 2:58am

One of the huge advantages of programming with safe code in rust is that it forces you to make explicit the relationships between components. (The other being memory safety.) Using raw pointers removes that relationship indication. It sounds like maybe you have in essence what should be a self referential struct, which is a challenge in rust. Or maybe you have a set of objects that aren't freed, so you don't want to bother with lifetimes. There are safe idioms for most such patterns, but I can't tell what pattern you have from your question.

benkay86 · February 21, 2021, 3:29am

Well, eliminating (or at least minimizing) unsafe code is a huge benefit because then you get memory safety. But elaborating on this further, let's look at the example of A holding a reference to B:

struct A {
    b: *const B
}

The unsafe part happens when you want to do something with the value of B and must dereference your pointer. Assuming you've made sure the pointer can never be null, how do you know the pointer points to a valid location in memory?

impl A {
    fn do_something(&self) {
        let val_of_b = unsafe { &*self.b };
        println!("{:?}", val_of_b);
    }
}

Think about how you would actually do this in a memory-unsafe language like C. Where will the memory be allocated? If B is a relatively short-lived object on the stack then in Rust it would make sense to have A and B be part of the same AB object such that they have the same lifetime. If B is long-lived then you are probably going to allocate it on the heap (with malloc() in C). In Rust, for the same cost as a malloc() and free(), you can have perfectly safe code:

struct A {
    b: Box<B>
}
impl A {
    fn do_something(&self) {
        println!("{:?}", self.b);
    }
}

Let the compiler and the standard library do the hard work for you! Rust provides you with plenty of smart pointer types to work with. All are very performant. Choose the one that provides the level of flexibility you need.

Box if just one object will have a reference to B.
Rc if multiple objects will hold references to B.
Arc if multiple objects will hold references to B across multiple threads.

And these can be combined with RefCell (or Mutex, if threading) when you need to mutate the pointed-to object.
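For the single-threaded case, a minimal sketch of two components sharing one mutable B through Rc and RefCell might look like this (the `last_arg` field and `make_pair` helper are invented for the example):

```rust
use std::cell::RefCell;
use std::rc::Rc;

#[derive(Debug, Default)]
struct B { last_arg: u32 }

impl B {
    fn foo(&mut self, arg: u32) { self.last_arg = arg; }
}

// Both A and C hold a shared handle to the same B.
struct A { b: Rc<RefCell<B>> }
struct C { b: Rc<RefCell<B>> }

impl A {
    // borrow_mut() checks at run time that nobody else is borrowing B.
    fn test(&self) { self.b.borrow_mut().foo(16); }
}

impl C {
    fn test(&self) { self.b.borrow_mut().foo(32); }
}

fn make_pair() -> (A, C) {
    let b = Rc::new(RefCell::new(B::default()));
    (A { b: Rc::clone(&b) }, C { b })
}
```

B is freed automatically when the last Rc handle is dropped.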

nepx · February 21, 2021, 5:44am

I don't like using unsafe any more than I have to, but in this case it's not that much of a problem since the pointers are set early into initialization, and once they're in place, they don't change (that's the reason behind using Pin, I suppose). But I agree with you -- it's better to avoid unsafe. This is actually a port of some other code that I've written in C.

Would RefCell work for calls back to the same component? If I go from C::foo(&mut self) to A::foo(&mut self) to B::foo(&mut self) and then to A::bar(&mut self), would I trigger some kind of error when borrowing &mut self on that final call to A?
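To make the scenario concrete, here is a small sketch of what RefCell does when a call chain loops back to a component that is already mutably borrowed. It assumes A lives in a RefCell; the `depth` field and function names are invented for the example:

```rust
use std::cell::RefCell;

struct A { depth: u32 }

// Simulates C -> A -> B -> A: the outer call still holds a mutable borrow
// of A when something further down the chain asks for a second one.
fn reentrant_borrow_fails(a: &RefCell<A>) -> bool {
    let mut outer = a.borrow_mut(); // A::foo(&mut self) is running
    outer.depth += 1;
    // A::bar(&mut self) is requested while `outer` is still alive.
    // borrow_mut() would panic here; try_borrow_mut() reports an error instead.
    a.try_borrow_mut().is_err()
}

// If the first borrow has ended, a second one is fine.
fn sequential_borrow_ok(a: &RefCell<A>) -> bool {
    {
        let mut first = a.borrow_mut();
        first.depth += 1;
    } // first borrow released here
    a.try_borrow_mut().is_ok()
}
```

So the check happens at run time rather than compile time: the re-entrant call compiles, but borrow_mut() panics on it.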

droundy · February 21, 2021, 3:05pm

I'll add another option: use &'static if you have no plans to free the memory. You can create this by using Box::leak. It's perfectly safe, and you can combine it with interior mutability to make changes.
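A minimal sketch of the leaked-Box idea, assuming B's state fits in a Cell (the `last_arg` field and `make` helper are hypothetical):

```rust
use std::cell::Cell;

struct B { last_arg: Cell<u32> }

impl B {
    // Takes &self, not &mut self: the mutation goes through Cell.
    fn foo(&self, arg: u32) { self.last_arg.set(arg); }
}

// Shared references are Copy, so any number of components can hold one.
struct A { b: &'static B }
struct C { b: &'static B }

fn make() -> (A, C) {
    // Box::leak returns a reference valid for the rest of the program;
    // the allocation is intentionally never freed.
    let b: &'static B = Box::leak(Box::new(B { last_arg: Cell::new(0) }));
    (A { b }, C { b })
}
```

Because every method takes &self, a call chain that loops back into the same component does not conflict with an outstanding borrow the way &mut self would.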

The other option that comes to mind is that it looks like you're creating a graph, in which case you could hold all the data in a set of Vecs and replace the pointers with indexes into those Vecs. There are tricks to make this beautiful.
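One way the index-based layout could look, as a sketch only (ChipId, Chip, and Board are names made up for this example, and a single Vec stands in for "a set of Vecs"):

```rust
// A typed index, so a chip ID cannot be confused with an arbitrary usize.
#[derive(Clone, Copy, PartialEq, Debug)]
struct ChipId(usize);

struct Chip {
    name: &'static str,
    irq_count: u32,
    // Index of the chip this one signals, stored instead of a pointer.
    target: Option<ChipId>,
}

struct Board { chips: Vec<Chip> }

impl Board {
    fn add(&mut self, name: &'static str) -> ChipId {
        self.chips.push(Chip { name, irq_count: 0, target: None });
        ChipId(self.chips.len() - 1)
    }

    fn connect(&mut self, from: ChipId, to: ChipId) {
        self.chips[from.0].target = Some(to);
    }

    // Follow the chain of targets, raising an IRQ on each hop, for at most
    // `max_hops` hops (so a cycle such as A -> B -> C -> A terminates).
    fn signal(&mut self, from: ChipId, max_hops: usize) {
        let mut current = self.chips[from.0].target;
        for _ in 0..max_hops {
            match current {
                Some(id) => {
                    self.chips[id.0].irq_count += 1;
                    current = self.chips[id.0].target;
                }
                None => break,
            }
        }
    }
}
```

Cycles are harmless here because an index is just a number: nothing is borrowed until the Board looks the chip up.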

s3bk · February 21, 2021, 4:49pm

Have you considered an Actor/Message passing approach?

Each component sends the message to a central dispatcher, that then calls the appropriate method.
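A rough single-threaded sketch of that idea, with no external crates. The Pic and Cpu components, the Msg and Target enums, and the message flow are all invented for illustration:

```rust
use std::collections::VecDeque;

// Hypothetical message and destination types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Msg { RaiseIrq(u8), Ack }

#[derive(Clone, Copy, Debug, PartialEq)]
enum Target { Pic, Cpu }

#[derive(Default)]
struct Pic { pending: Vec<u8>, acks: u32 }

#[derive(Default)]
struct Cpu { irqs_seen: u32 }

impl Pic {
    // Components never call each other; they only push onto the queue.
    fn handle(&mut self, msg: Msg, out: &mut VecDeque<(Target, Msg)>) {
        match msg {
            Msg::RaiseIrq(line) => {
                self.pending.push(line);
                out.push_back((Target::Cpu, msg));
            }
            Msg::Ack => self.acks += 1,
        }
    }
}

impl Cpu {
    fn handle(&mut self, msg: Msg, out: &mut VecDeque<(Target, Msg)>) {
        if let Msg::RaiseIrq(_) = msg {
            self.irqs_seen += 1;
            out.push_back((Target::Pic, Msg::Ack));
        }
    }
}

#[derive(Default)]
struct Dispatcher {
    pic: Pic,
    cpu: Cpu,
    queue: VecDeque<(Target, Msg)>,
}

impl Dispatcher {
    fn send(&mut self, to: Target, msg: Msg) {
        self.queue.push_back((to, msg));
    }

    // Deliver messages until the queue is empty.
    fn run(&mut self) {
        while let Some((to, msg)) = self.queue.pop_front() {
            match to {
                Target::Pic => self.pic.handle(msg, &mut self.queue),
                Target::Cpu => self.cpu.handle(msg, &mut self.queue),
            }
        }
    }
}
```

Only the dispatcher ever holds a mutable reference to a component, and only for the duration of one handle() call, so a logical loop like Pic -> Cpu -> Pic never produces two live &mut borrows of the same component.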

nepx · February 22, 2021, 9:34pm

This is exactly what I was looking for! But I'm at a bit of a loss on how to build it without upsetting the borrow checker. In C and Java, I could pass around pointers and references to various classes, but Rust requires a different approach.

My specific environment is:

Single threaded and synchronous
A fixed number of actors created on initialization that only get dropped on de-initialization/cleanup. Actors are not created or destroyed throughout the duration of the program
A very small set of messages in a fixed format. The source and destination are hardwired, and so is the type of message (i.e. IRQ, DREQ, etc.), such that I can create individual functions for each message (i.e. Component1::raise_irq_line or Component2::receive_ack_signal)
All components are opaque and completely segregated from one another, aside from the occasional message passing. Components don't need access to the internals of another

The simplest and most obvious way of doing this would be to use static muts, and I used the C equivalent of this while writing the original version of my program. But I'd like the Rust rewrite to use a more elegant strategy. How are actor models typically implemented in Rust?

alanhkarp · February 23, 2021, 1:04am

I have a somewhat more complex problem that I've implemented in the actor style. Each actor runs in its own thread (I started before async/await was available), and is initialized with crossbeam channels to the actors it talks to. I have no shared state across actors. I'm not passing references on the channels, and the only statics I have are immutable. The borrow checker hasn't been a problem. I'm quite happy with the design.

s3bk · February 23, 2021, 5:22am

It is possible to implement this without unsafe code, although unsafe may be beneficial for performance.
I wrote my own experiment on this, and while it is possible to avoid unsafe code, in practice it is still needed: https://github.com/s3bk/emp/blob/master/examples/echo.rs

However, for serious work, I would suggest looking around for actor frameworks on crates.io or lib.rs.

Actix looks usable and they have a book!
https://actix.rs/book/

So does the much smaller xtra.

nepx · February 23, 2021, 5:38am

These crates all look extremely useful and nice, but I was wondering if there was a way to do this without using dependencies. As I mentioned in a previous post, everything's singlethreaded and synchronous, so using a library for something that's conceptually very simple feels like overkill.

Basically, what I want to do is work around the borrow checker. While we all know that the borrow checker knows best, it's occasionally very nice to have multiple mutable references scattered across a program, and to Drop them upon program deinitialization, since the lifetimes of these structs are known in advance. It's a very simple use case for a very simple actor model system -- anything that uses more advanced features plays by the rules of the borrow checker.

I was thinking about using UnsafeCell (which I learned about from poking around the RefCell source code) or raw pointers. Does UnsafeCell allow me to avoid using std::pin?

s3bk · February 23, 2021, 6:14am

Sure it is possible without external dependencies.
You probably want generators (or structs that implement a common trait, so the dispatcher can call them).

You will also need a dispatcher that calls them and sorts messages.
For that you could take dispatch.rs and remove all the things you don't need.
The key trick is in run_once where two message queues are swapped to avoid a fight with the borrow checker.
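The following is not taken from dispatch.rs; it is a generic sketch of the queue-swapping idea under my own assumptions, with a hypothetical Actor trait and a toy Countdown actor. Messages sent during a round go into `next`, and each round starts by swapping `next` into `current`:

```rust
use std::mem;

// Hypothetical actor trait: handle one message, optionally queue replies.
trait Actor {
    fn handle(&mut self, msg: u32, outbox: &mut Vec<(usize, u32)>);
}

// Toy actor: records what it receives and forwards msg - 1 to its peer.
struct Countdown { received: Vec<u32>, peer: usize }

impl Actor for Countdown {
    fn handle(&mut self, msg: u32, outbox: &mut Vec<(usize, u32)>) {
        self.received.push(msg);
        if msg > 0 {
            outbox.push((self.peer, msg - 1));
        }
    }
}

struct Dispatcher {
    actors: Vec<Box<dyn Actor>>,
    current: Vec<(usize, u32)>, // messages being delivered this round
    next: Vec<(usize, u32)>,    // messages queued for the following round
}

impl Dispatcher {
    fn send(&mut self, to: usize, msg: u32) {
        self.next.push((to, msg));
    }

    // Deliver one round of messages; returns how many were delivered.
    fn run_once(&mut self) -> usize {
        // After the swap, `current` holds last round's output and `next`
        // is empty, ready to collect whatever the actors send now.
        mem::swap(&mut self.current, &mut self.next);
        let delivered = self.current.len();
        for (to, msg) in self.current.drain(..) {
            // `current`, `actors`, and `next` are distinct fields, so the
            // borrow checker accepts using all three at once.
            self.actors[to].handle(msg, &mut self.next);
        }
        delivered
    }
}
```

Because actors only ever write into `next` while the dispatcher iterates over `current`, no queue is read and modified at the same time, and both Vecs keep their allocations across rounds.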

system · May 24, 2021, 6:15am

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.
