How to pare down references in an object graph?

I'm porting a Java program to Rust. It simulates balls bouncing around, some of them bonded in pairs. In the Java version, the simulated world has a list of balls and also a list of bonds, where each bond references the balls it connects. The world tells the bonds to add spring forces to the balls, then tells the balls to update their positions based on the forces on them. (Other forces come from the balls hitting walls and each other.)

I can't port this structure as-is to Rust because it can have multiple, long-lived, mutable references to each ball. The best alternative I've thought of so far is for the world to keep a table of balls indexed by some sort of id. Bonds would then refer to balls by id instead of by reference. Seems like this should work, though I haven't tried it, but basically all I've done is create a sort of poor man's reference so I can create a "sea of objects", which the O'Reilly book recommends against. Is there a more idiomatic way to create this sort of object graph?

Referring by ids is a fairly common way to do this kind of thing:

http://smallcultfollowing.com/babysteps/blog/2015/04/06/modeling-graphs-in-rust-using-vector-indices/

1 Like

Let me throw out a couple of other ideas.

Define an enum with two variants: first has a single ball and the second a pair of the bonded balls. The world then holds a list of this enum. If balls get bonded/unbonded, you’d move the balls in and out of the bonded variant, as needed. Depending on how dynamic the bonding aspect is, it may be workable. It also encodes the fact that a ball is in one of these two states at any given time.

A slight twist on the above is skip the enum and use two different lists, one for unbonded balls and another for bonded (stored in a struct). You’d still move balls between the two places.

A Java variant of this would be to have Rc<Ball> that’s shared between the different lists. To apply mutation, the Ball would have interior mutability, preferably using a Cell over some Copy types (eg 2D position?). This gives you the familiar sea of nodes model and mirrors Java’s memory layout of this.

The enum is an interesting idea, but a problem is that a ball can have more than one bond.

Ah ok, for some reason my brain decided that it’s just 2 :slight_smile:

I think you can generalize that by storing a vec of balls that form a bond.

There's a word I like to use when describing how the OO and functional programming paradigms differ: agency. Agency is the ability of something to act; to interact with it's environment, and to cause things to happen.

(alternatively, one could say that it is the likelihood that somebody will ascribe it human qualities when describing how it works :stuck_out_tongue:)

In the Java program you described, the world, the bonds, and the balls all have agency. This is why they require lots of mutable references, and why it does not translate well to rust.

There are very few structs in my rust code that I would describe as having agency. Only a few functions containing high-level application code ("orchestration" code); perhaps implementors of traits used in that code; maybe a Visitor here and there; and things designed like coroutines.

For a problem like this I would most likely represent bonds as nothing more than a Vec<(usize, usize)>.

I don't have the book so I'm not sure what this refers to. Can you summarize what the author is referring to and why it is bad?

3 Likes

True, but that loses the information of which balls are directly bonded and which are indirectly bonded, which affects the physics.

Couldn’t the order within the Vec define that when you create the bond? Or assign the order manually if need be via an integer.

Is this a 2D space?

Yes, it's a 2D space. A collection of bonded balls can have a complex, graph-like structure. Think of atoms in a molecule. Order isn't enough.

Agency is a good word for that. Since my program is a simulation, it's natural to organize it using agency. I'd have to think hard about how to do it otherwise, although that's one reason I'm asking here. It's also natural to have a lot of mutability in a simulation, if only for efficiency. I could generate a whole new collection of values for each new simulation step, but that sounds expensive.

I'm not sure I entirely understand why the book dislikes the "sea of objects" design, but it asserts that Rust prefers tree-like object structures over free-form graphs. I suppose that just falls out of the discipline that Rust imposes in return for its safety guarantees.

Ah, I see. Then yeah, the enum approach doesn’t buy you much beyond having a way to segregate bonded vs unbonded balls; you still need to describe the graph in the bond.

So you could Rc your way out of it, if internal mutability is fine. That would be the closest Java analog as mentioned.

Using indices may prove easier however.

Another option might be to allocate the balls out of an arena that outlives your simulation and store immutable references (backed by the same interior mutability). This is basically the Rc approach just without the Rc.

I believe it’s that and also conceptually hard to track the interactions when everything has a reference to everything else, particularly with mutability involved.

I was kind of hoping to avoid interior mutability. Seems like there should be an elegant, idiomatic solution somewhere.

It's usually much easier to use an index array of persistent references, storing indices everywhere that the Java code has references. This is similar to operating systems that use handles to objects, so that the object may move but the reference stays constant. In this case the object referent is only in one place, with one owner, which makes it relatively easy to achieve a successful compilation.

The performance hit is one level of indirection, which might be more than mitigated if objects that are bonded together are clustered together in memory so that they tend to be in the same or adjacent cache lines. Even having most indices of a cluster in the same cache line would help a lot. Whether either of those are even reasonable depends on the relative persistence of the objects themselves and of their bonds.

If your simulation is single-threaded then this approach does not require interior mutability; you simply have a sea of objects whose fields can be updated at will since none of those fields contain references but only indices.

1 Like

An arena (references) would enforce the persistent aspect here. If the balls only contain immutable data, and mutable data is stored elsewhere, then interior mutability would also not be needed.

I’m not against the id approach myself, and I know it’s a common approach, but it’s not without its own shortfalls. Just exploring the design space/options here.

Discourse ate my post. I hate writing from mobile.

I'll expand on this when I get home, but: what you've written is just a simple time integrator, and I myself would deliberately write it as little more than a single loop (that accepts callbacks for the force and output) that works directly with a few Vecs.

(to clarify, I mean all of the agency is contained within one loop; not the whole of the code)

This is in fact advantageous, as then I can more easily swap it out with better time integrators (yours is the simple Euler method, but there are others much more accurate for certain applications). Critically, in some of these methods, the position and velocity are never simultaneously known, meaning they cannot be described with balls that "know" their velocity.

Okay, you've got me eagerly awaiting your expanded post. I've had problems with my integration, namely spurious oscillation when one of the balls in a pair is very small, that I'd love a good way to overcome.

I'm afraid the balls are nothing but mutable data. Position, velocity, size, mass - all change over time.

I'm not entirely sure I understand your arena idea. It sounds to me like it's very similar to having a vector of balls and referring to them by index, but instead I refer to them using references and then use interior mutability to get around the references' immutability. Is that right?

Size and mass change? Are these like galaxies accreting mass or something?

This might make things interesting as I'm not sure how it impacts the integration methods. Years back I learned how these formulas are derived, and I remember the goal (to find "conjugate variables" position and momentum that obey 1st order PDEs) but don't recall how to get there. :stuck_out_tongue:. That said, I get the impression that having mass change over time means momentum != mass * velocity.

....or...I'm reading too far into what you said, probably. Hopefully. :angel: