Learning backlink reference

Learning Rust I rewrite 3 programs. My chess-engine is working.
So now my old Delphi project: The clone of the DOS Game Lemmings.

The question has already been asked probably a 1000 times...

I have a Game struct with all functions for information (lookups, images, animations, resources etc.).
It contains a Vec<Lemming>. They spawn during the gameplay. The Lemming is the moving character in the Game.

How can I achieve a Lemming containing a mutable backlink to Game?
Without losing speed, without using an unsafe pointer (which is working).
What is the rusty way?

struct Lemming
{
    game: *const Game // unsafe
}

impl Lemming
{
    /// Spawn a lemming...
    fn new(game: ....) -> Self
    {
        Self
        {
             game: ....
        }
    }
}
  • Single threaded or threads/async?
  • Mutable or immutable access to the game from the lemming?

single thread. simple framebased.

Lemming is mutable, Game is mutable, because Game contains (for example) the World bitmap which is changed during gameplay.

    // game function
    pub fn update_lemmings(&mut self)
    {
        self.check_spawn_lemming();
        
        for lem in self.lemmings.iter_mut()
        {
            lem.update();
        }        
    }

    // lemming function
    fn dig(&mut self)
    {
        // ....
        if self.game.world.has_terrain_at(self.x, self.y)
        {
            self.game.world.remove_terrain(self.x, self.y);
        }
    }

It's quite possible your code is working by accident and is actually erroneous.

Well, short answer is "We DoN't Do ThAt HeRe". Mutable backlink is generally a bad idea in Rust.

Game contains Lemming so a mutable access of Game naturally gives mutable access to Lemming and this will quickly turns into an aliased mutable reference to the same Lemming which is UB (undefined behavior).

The most common solution is restructure you code (instead of translating from other language directly). For example:

struct Game{
   lemmings: Vec<Lemming>,
   world: World,
}

struct World{
   //...
}

impl Game {
    pub fn update_lemmings(&mut self)
    {
        self.check_spawn_lemming();
        
        for lem in &mut self.lemmings
        {
            lem.update(&mut self.world);
        }        
    }
}

Yes, you need to pass &mut World everywhere. Sometimes helper struct helps:

struct LemmingCommands<'a> {
   world: &'a mut World,
   lemming: &'a mut Lemming,
}

impl World {
   fn command<'a>(&'a mut self, lemming: &'a mut Lemming) -> LemmingCommands<'a> {
      LemmingCommands { world: self, lemming }
   }
}

impl<'a> LemmingCommands<'a> {
   fn dig(&mut self) {
       // ....
       if self.world.has_terrain_at(self.lemming.x, self.lemming.y) {
            self.world.remove_terrain(self.lemming.x, self.lemming.y);
       }
   }
}

This will only work if your lemming does not try to modify other lemmings, as World does not contain Lemming. If you need that, you will either need:

  1. Have some kind of interior mutability. (e.g. use RefCell)
  2. Implement the methods on Game directly and passing lemmings' indexes around.
5 Likes

For single-threaded mutable access, you can use Rc<RefCell>.

Rc does use ref counting, but the performance impact is almost non-existent because there is no atomic operation or synchronization. And the ref counting is only when you clone it (to add it to a new Lemming, for example) and when the clone is dropped (when the Lemming is dropped). The clone call does not copy the Game, it just increments the ref count.

RefCell is also doing some checking to enforce exclusive access to mutable data, but again the checking is extremely cheap.

Note that Rc will allocate the contained item (the Game) on the heap.

Here is a very short example:

(fixed to clone the Game before creating the Lemming):

        use std::cell::RefCell;
        use std::rc::Rc;

        struct Game {
            field: usize,
        }

        struct Lemming {
            game: Rc<RefCell<Game>>,
        }

        impl Lemming {
            fn new(game: Rc<RefCell<Game>>) -> Self {
                Self { game }
            }

            fn f(&self) {
                let mut game = self.game.borrow_mut();
                game.field += 1;
            }
        }

        let game = Rc::new(RefCell::new(Game { field: 1 }));
        let lemming = Lemming::new(game.clone());
        lemming.f();

If you can't live with any runtime checks (of course, benchmarking is the only way to know the impact), or you don't want to allocate the Game or explicitly borrow it like this, then the solution is to pass the Game parameter whenever it is needed, rather than storing it in the Lemming. This is what I do, but not for performance reasons, it's just my personal preference -- I much prefer the static compiler checking of the borrowing rules.

Note that you can use Weak to avoid cycles, if you end up also using Rc for the Lemmings.

1 Like

If I understand the sample code, UB happens even before then.

fn update_lemmings(self: &mut Game) {
    for lem in &mut self.lemmings {
        lem.update()
    }
}

// presumably
fn update(self: &mut Lemming) {
    self.dig();
}

fn dig(self: &mut Lemming) {
    self.game.get_mut_ref_somehow().remove_terrain(self.x, self.y);
    //   ^^^^^^^^^^^^^^^^^^^^^^^^^^
}

The underlying portion would create a &mut Game that overlaps with the one active in update_lemmings (and thus any sound way of getting a &mut _ must deadlock or panic or otherwise be fallible and not succeed here).

(E.g. Rc<RefCell<Game>> can't help with the sample; it will correctly panic.)


If the OP wonders why: What would stop you from calling self.game.lemmings.clear(), thus making the self dangle?

5 Likes

Thank you very much.
I saw comparable solutions (and i will re-read them and try to implement and try to understand).
On the other hand, I encounter so many problems, that I think I need to resign on Rust :frowning:
The simplest things cannot be done 'normally'. And after a few weeks I still don't understand the crazy rules.
Or take a break and try again later...

That's typical and normal reaction. But think about what you call done 'normally', for a moment.

If you'll think about it then you would realize that you are talking about “soup of pointers” design. Where objects refer to each others [semi]-randomly when you feel there should be link between them. And no one knows who is owner and is ownee.

Perfect analogue to the infamous spaghetti code, only this time with data.

Rust doesn't like it. Ideally it wants you to follow the same discipline with data that you follow, in modern languages, to code: nice, easy-to-understand-clear-ownership.

Where everything have hierarchy and ownership is always clear.

Of course, like structured programming had to add escape hatches (exceptions, coroutines) so Rust offers ways to implement non-strict higherarchy of ownership if that's needed. Shared ownership via Rc and Arc are the most common tools, there are others.

But yes, in a Rust land what you call done 'normally' is very much an exception, not the rule.

I suspect that acceptance would go the same way it did with structured programming, via Planck's principle: An important scientific innovation rarely makes its way by gradually winning over and converting its opponents: it rarely happens that Saul becomes Paul. What does happen is that its opponents gradually die out, and that the growing generation is familiarized with the ideas from the beginning.

Given that fact you may be 100% content for now: Rust wouldn't replace other languages (like C or C++) for decades. Thus skipping it's adoption and continuing with doing things 'normally' is perfectly acceptable decision.

3 Likes

I know exactly what you mean about "simplest things cannot be done 'normally'". Rust is really different and it is easy to get discouraged (happens to me regularly and I have spent MUCH more than a few weeks). Learning to think about programming differently is a real challenge. I would say it is a better way of thinking, because you must think and design explicitly for shared vs exclusive data access, which makes your program much better overall.

Taking a break isn't a bad idea sometimes, but I hope you continue. The benefit of Rust is very large: no data races or memory safety issues, with C/C++ level performance. No other language I know of can do this currently. It is worth the trouble.

The key for me is to not try to make Rust conform to how I have programmed in the past. In this case I suggest to just try passing the Game parameter to the Lemming functions when you need it, or vice versa. It doesn't feel right at first and you often have to organize your program differently than you might normally do, but in most cases it works out fine.

Here is an example of what I'm talking about:

struct Game {
    data: usize,
    lemmings: Vec<Lemming>,
}

impl Game {
    fn f(&mut self, lemming_id: usize) {
        self.data += 1;
        self.lemmings[0].g(&mut self.data);
    }

    fn add_lemming(&mut self) -> usize {
        let id = self.lemmings.len();
        self.lemmings.push(Lemming::new());
        id
    }
}

struct Lemming {
    data: usize,
}

impl Lemming {
    fn new() -> Self {
        Self { data: 1 }
    }

    fn g(&mut self, game_data: &mut usize) {
        *game_data += self.data;
        self.data += 1;
    }
}

let mut game = Game { data: 1, lemmings: vec![] };
let lemming_id = game.add_lemming();
game.f(lemming_id);

If you tried to do this differently, for example, by passing a &mut for both the Game and Lemming to a function, you would get a compile error:

impl Lemming {
    fn h(&mut self, game: &mut Game) {
        // ...    
    }
}
game.lemmings[lemming_id].h(&mut game);

error[E0499]: cannot borrow `game` as mutable more than once at a time
  --> src/lib.rs:58:37
   |
58 |         game.lemmings[lemming_id].h(&mut game);
   |         -------------             - ^^^^^^^^^ second mutable borrow occurs here
   |         |                         |
   |         |                         first borrow later used by call
   |         first mutable borrow occurs here

The &mut Game and &mut Lemming are exclusive borrows and they overlap, so they're not allowed. You get this error even if h actually uses non-overlapping fields, since the compiler doesn't look at the implementation of the function, only at the signature, when type checking calls to that function.

To solve this we can pass &mut parameters for different fields of Game, and that way the compiler knows there is no overlap.

        impl Game {
            fn f(&mut self, lemming_id: usize) {
                self.data += 1;
                self.lemmings[0].g(&mut self.data);
            }

Obviously, doing this sort of thing will probably require you to restructure your old Game code. This is the price we pay for the benefits of Rust.

EDIT: This is really the same thing that @zirconium-n posted above, I missed that somehow as I was replying earlier. Their reply is a better description of this approach.

2 Likes

There's not always a straightforward way to get a good (idiomatic) port of an existing program from one language to another. Sometimes it's required to re-architect everything to do things the TargetLanguage way. Attempting one-to-one translations when going to Rust is tantamount to writing C in Rust or Java in Rust or what have you, not Rust in Rust.

Getting near-limitless control (such as &mut) over a containing structure from a contained structure is not normal in Rust, because the ramifications are far from simple, to a compiler which has to prove the absense of memory safety and data race errors.


I recognize none of that is actionable advice for replacing your backlink with some other design.

One way is to wrap approximately everything in Rc<_> or Rc<RefCell<_>> (or Arc<_>/Arc<Mutex<_>>), not just Game, sort-of emulating what some languages do with all values. And then only getting ahold of &mut for very short-lived "critical" sections; in particular trying not to hold them across calls that don't correspond to some hierarchical access.

It's not really idiomatic and it's easy to set up "iterator invalidation" type problems where you clone some Rc to avoid some borrow checker problem, and then the original/"real" version gets mutated. Like maybe you have a cloned Rc<RefCell<Vec<Lemming>>> you keep around for awhile, but in the meanwhile something goes in and clears the Vec reachable from the one Game, so now they're out of sync which may lead to some logic error.

Another way has also been outlined by @zirconium-n and others: figure out what parts need shared or exclusive access to what other parts, and find a way to design that which fits with some tree-like access patterns. If Game owns everything and a Lemming needs to modify the World, Game can hand each Lemming access to the World, and so on. (This is the "purest" in that it doesn't require runtime checks or other Rc-like overhead.[1])

Two articles I can recommend about common challenges with this approach are

And of course there are ways which are a combination of other approaches, such as this toy example that has Rc<RefCell<_>> at the top and uses the sentinel pattern below that to avoid holding a &mut _ across the call to Lemming::update. (The sentinel pattern has similar iterator-invalidation type considerations as was mentioned before; downstream code has to be aware that the lemmings field is not an accurate depiction of the game state for the time being.)[2]


There's no way to say what would work best for porting your program without knowing all the particulars of the program. Even once the particulars are known, a lot of it is going to come down to experience -- being able to "think in Rust".


  1. That said, the costs may be cheaper in practice than you think. ↩︎

  2. This isn't the same motivation as in the article, as the only reason we want to remove the field is to stop borrowing ourself, but the solution is effectively the same. A variation here would be to use a VecDeque to remove one Lemming at a time temporarily, instead of all at once. Though that's even more prone to iterator invalidation... ↩︎

3 Likes

In the process of mastering the rust language, I have improved my skills in creating a structured and efficient code architecture.

Not all programs can be rewritten in Rust in an idiomatic and elegant style, however, any code sculpture born in the world of Rust will stand out with clarity and lucidity even when ported to other programming languages.

Managing the access rights of entities in accordance with the concepts of ownership and borrowing reveals a previously unseen harmony between the components of the program and expands the horizon of understanding the problem.

For me, Rust is not just a programming language, but a development philosophy.

1 Like

Ok, again thank you all very much for the answers. I will take a short break, learn some more, rethink your remarks, experiment etc.
Indeed I must say that - when I browse through some C# libraries on Github - I see an incredible mess.
When seeing a crates on crates.io I see often understandable clean code. Also when I see my own Rust code, which I made: it is much 'cleaner' than what I did in C# or Delphi.

The thing which triggers me with this mutability is that it is often hard to 'catch' the real world in data-structures and algorithms. If moss grows on a tree the tree AND the moss change :slight_smile:

Anyway: old habits die hard. I will come back to this...

6 Likes

If you still want to actually learn Rust then my advice would be: don't try to jump straight from “what is done normally” to “ideomatic Rust”.

Learning ideomatic Rust is hard and takes years. But the catch is: non-ideomatic Rust with tons of Rc<RefCell> (or Arc<Mutex> if you need concurrency) still works! Writing “non-ideomatic Rust” is not a crime, really!

Indeed, that famous and much-lauded ability of Rust to help you with refactorings is the key to the ability to learn the whole thing. Start with “ugly Rust” with tons if useless Arc/Rc wrappers, it wouldn't be as clean as “ideomatic Rust”, but it would work, you would get something usable… and you may refactor it later (if there would be enough interest).

There are good article that lists things that may hamper adoption of Rust and one of the most frustrating mistakes is probably mistake #11 and that comes from misunderstanding about what role “ideomatic Rust” plays.

While in many other languages it's incredibly important to employ certain idioms from the beginning because otherwise you would just produce unholy mess that couldn't be refactored and would need a total rewrite… Rust is not like that: even if you wouldn't try to write “ideomatic Rust” the end result would still be “passable” and, more importantly, refactorable[1]. Reach that stage and you may start writing “real”, “production” code in Rust. And you would [slowly] learn how to make your code look better, more “ideomatic”, over time.

learn to crawl before you may run!


  1. It's possible to make a mess even in Rust, the determined Real Programmer can write FORTRAN programs in any language, after all… [ab]use of unsafe is an easy way to achieve that, but it's still non-trivial and, most importantly, visible. ↩︎

3 Likes