How can I visually work out the lifetime organization of my system before writing it?

I'm currently working on a set of crates that try to solve a relatively convoluted problem, and I'm finding that once I get a few days into doing so with some lifetime strategy it turns out that said strategy has some subtle error where I need some data in some part of the program that isn't available there.

This reminds me of a similar problem I had a few years ago when I was writing an OOP program, and I found UML diagrams to be EXTREMELY helpful here, as I was able to visualize, work with, and most importantly iterate on general designs in a matter of minutes rather than days.

I can try to shoehorn Rust's model into UML diagrams, but I suspect that that won't be so helpful for lifetimes. Are there other sorts of systems that I can use that are perhaps more oriented towards Rust's model? Even if nothing exists for Rust in particular, I'm sure that something reasonable exists to visualize memory management in C, C++, etc. that might help.

The first question to ask is whether you really need to use lifetimes anywhere. Most people don't, especially if they are writing an application rather than a library. The main exception is parsers.

One thing you could try drawing is the ownership diagram. Most things should have a single unique owner, as you otherwise need special constructs such as Rc or Arc, so the diagram should preferably be a tree, and if you must use non-tree constructs, make it acyclic (a DAG). Of course, ownership is separate from lifetimes.

If you post one of your failed strategies, I can probably also provide some insight into where you ran into trouble and how to avoid that issue in the future.

I will second that. In nearly two years of using Rust I have almost never had to write any lifetime tick marks. If I find myself thinking about doing so I take it as a sign my approach is not very good.

When it comes to "shoehorning" things into Rust's model, that should be pretty easy. If things need to live long enough to be shunted around different parts of one's program at random, or at least in ways one had not thought of when coming up with the design, just wrap them in a smart pointer such as Rc or Arc. That is what C++ programmers do, and it is what the likes of Java do for you automatically.

What you need is to focus on ownership.

Lifetime annotations apply only to temporary borrows. For the most part this should be a small, local concern. Lifetimes can become a whole-program-paralyzing problem if you overuse them, e.g. if you put references in structs that aren't temporary views, and should have been storing the data instead.
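
To make the distinction concrete, here is a minimal sketch. The `Config` and `ConfigView` types are hypothetical, invented only for illustration: the long-lived struct stores its data, and the borrowing struct exists only as a short-lived view inside one function.

```rust
/// Long-lived data: owns its contents, so it needs no lifetime parameter.
struct Config {
    name: String,
    retries: u32,
}

/// Temporary view: borrows from a `Config` and is meant to be short-lived.
struct ConfigView<'a> {
    name: &'a str,
}

impl Config {
    /// Hands out a view that cannot outlive `self`.
    fn view(&self) -> ConfigView<'_> {
        ConfigView { name: &self.name }
    }
}

fn describe(cfg: &Config) -> String {
    // The borrow starts here and ends when `view` goes out of scope,
    // so the lifetime stays a small, local concern.
    let view = cfg.view();
    format!("{} ({} retries)", view.name, cfg.retries)
}
```

If `ConfigView` were stored in other long-lived structs, its lifetime parameter would spread to each of them, which is the overuse described above.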

Your program's data and all objects ideally should form a tree, since that's the most Rust-friendly shape of data. Avoid globals. Avoid parent-child relationships (where necessary pass the parent as an extra arg instead of storing it in the child). If you end up with graphs with cycles, you'll probably need Arc and Mutex.
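
As a rough sketch of "pass the parent as an extra arg", assuming a hypothetical `Node` type whose children need to know their parent's path:

```rust
/// A node owns its children; children hold no pointer back to the parent.
struct Node {
    name: String,
    children: Vec<Node>,
}

impl Node {
    /// Collects the full path of every node in pre-order. The parent's
    /// path is passed down as an argument instead of being stored in
    /// the child.
    fn collect_paths(&self, parent_path: &str, out: &mut Vec<String>) {
        let path = format!("{}/{}", parent_path, self.name);
        out.push(path.clone());
        for child in &self.children {
            child.collect_paths(&path, out);
        }
    }
}
```

Calling `root.collect_paths("", &mut out)` walks the whole tree without any node ever needing a reference to its parent.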

When I say that I'm working out "lifetime organization" I don't mean lifetime tick marks, but rather the general structure of the crate and its properties that make it work or not work in safe Rust, somewhat as @kornel described.

The proposal to use a tree-like structure seems reasonable, and it is indeed what my main structure is at present, but there's a lot of information/state/data that I need to have accessible anywhere in the tree. I've tried passing these as arguments when recursing over the tree, but as there's a lot of this information and what exactly is needed or even possible in each place varies significantly it gets really messy really quickly, even when I try to encapsulate several things being passed into a struct to pass them as one object. If this were a GC'd language, or even something like C++, I'd probably just store a pointer/reference to this state in each node in the tree, but this isn't feasible in safe Rust without hacks like reference counting, and I'd really rather avoid that if I at all can.

Whenever you deal with lists and you want to reference items in one list and keep those references in another list, you basically deal with a database and at that point, you should just treat your lists like common database tables, i.e. using indices instead of references. That does effectively disable the borrow checker, because you aren't borrowing anything, but it tends to cause fewer headaches and it performs well.

If you want to ensure that indices don't point to non-existent or wrong items, you can keep all your lists (tables) in a database struct and handle removal via methods on that parent struct rather than manipulating the lists directly everywhere.
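
A minimal sketch of that idea follows. The `Database`, `Person`, and `Pet` names are made up for illustration, and leaving a `None` tombstone on removal is just one possible way to keep the remaining indices stable:

```rust
struct Person {
    name: String,
}

struct Pet {
    name: String,
    /// Index into `Database::people` instead of a reference.
    owner: usize,
}

#[derive(Default)]
struct Database {
    /// Removed people leave a `None` behind so other indices stay valid.
    people: Vec<Option<Person>>,
    pets: Vec<Pet>,
}

impl Database {
    fn add_person(&mut self, name: &str) -> usize {
        self.people.push(Some(Person { name: name.to_string() }));
        self.people.len() - 1
    }

    fn add_pet(&mut self, name: &str, owner: usize) {
        self.pets.push(Pet { name: name.to_string(), owner });
    }

    /// Removal goes through the database, so rows that point at the
    /// removed person are cleaned up in the same place.
    fn remove_person(&mut self, id: usize) {
        if let Some(slot) = self.people.get_mut(id) {
            *slot = None;
        }
        self.pets.retain(|pet| pet.owner != id);
    }

    fn owner_name(&self, pet: &Pet) -> Option<&str> {
        self.people.get(pet.owner)?.as_ref().map(|p| p.name.as_str())
    }
}
```

Because every lookup returns an `Option`, a stale index shows up as `None` rather than as a dangling reference.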

I would not be so fast in referring to reference counting as a "hack".

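As far as I can tell, reference counting is exactly what goes on with the shared pointer type in C++ and what happens automatically in Python. Languages like Java and JavaScript also manage shared ownership for you automatically, although they do it with a tracing garbage collector rather than with counts.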

In C++ storing regular pointers/references into every node of a tree will lead to disaster unless you, the programmer, are carefully keeping the lifetime of everything in mind at all times.

And so it is in Rust. Except the compiler demands that you think about this up front.

If you don't want to think about the valid lifetime of such references, as when using Java and the like, then there is a bit of hassle in Rust that one has to wrap things in a smart pointer.

But that is the concession we have to make in order to not need a garbage collector.
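
For the original question of tree nodes that all need access to some common state, a single-threaded sketch with `Rc<RefCell<...>>` could look like this. `Shared` and `Node` are hypothetical names, and a multi-threaded program would need `Arc` plus a lock instead:

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// State that every node needs to reach.
struct Shared {
    visits: u32,
}

struct Node {
    /// Each node holds its own counted handle to the shared state.
    shared: Rc<RefCell<Shared>>,
    children: Vec<Node>,
}

impl Node {
    fn new(shared: &Rc<RefCell<Shared>>) -> Node {
        Node { shared: Rc::clone(shared), children: Vec::new() }
    }

    fn visit(&self) {
        // The mutable borrow is released at the end of this statement,
        // before recursing, so the children can borrow it again.
        self.shared.borrow_mut().visits += 1;
        for child in &self.children {
            child.visit();
        }
    }
}
```

The shared state is freed when the last node holding a handle is dropped, which is the bookkeeping a garbage collector would otherwise do.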

I'd start with trying to split out whatever can remain immutable during whichever outermost scope is the "runtime". Then you can give out as many references to that data as you want for the whole duration.

I'd then take the mutable parts and see if they can be made immutable for portions of the run. If there are times where portions can be made immutable, great! If not, one option is to have the owned mutable instances of variables able to emit immutable snapshots of their state to share. Though that is semi-dependent on data being fairly portable (hopefully Copy).
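
One way the snapshot idea might look, assuming a hypothetical `Simulation` whose readable state is small enough to be `Copy`:

```rust
/// Mutable state with a single owner.
struct Simulation {
    tick: u64,
    energy: f64,
}

/// Cheap, immutable copy of the parts other code needs to read.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Snapshot {
    tick: u64,
    energy: f64,
}

impl Simulation {
    fn step(&mut self) {
        self.tick += 1;
        self.energy *= 0.5;
    }

    /// Emits a snapshot that can be handed out freely; it does not
    /// borrow from `self`, so the owner stays free to mutate.
    fn snapshot(&self) -> Snapshot {
        Snapshot { tick: self.tick, energy: self.energy }
    }
}
```

Readers work from a `Snapshot` while the owner keeps mutating, so no borrow of the `Simulation` has to be kept alive between steps.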

The im crate might be useful to help keep state immutable. Then you just need to have all calculations completed and all references dropped before shadowing your "mutable" state variables into their next iteration.

If you start with these assumptions and push things to their logical conclusions before opting to add Arc/RwLock/Mutex/etc. you'll hopefully need far fewer concurrency primitives (and their overhead).

Could you expand more on why you dislike reference counting? My go to data structures in Rust are:

std::vec::Vec, std::rc::Rc, std::sync::Arc, im::Vector (which internally ref counts), im::HashMap (which also internally ref counts)

refcounting + RAII is one of my favorite aspects of Rust

This is a great pattern for graph-like data structures as I think of them. I'll add that you can get a huge improvement by wrapping indices in newtype structs to give you back some type safety by ensuring that a given index can only be used in the Vec it is intended for. You can then think of each of these indices as if they were references that are harder to use.

This also lets you put methods on the indices.
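
A small sketch of newtype indices with methods, using a hypothetical `Graph` with separate `NodeId` and `EdgeId` types:

```rust
/// Index into `Graph::nodes`; cannot be mixed up with an `EdgeId`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct NodeId(usize);

/// Index into `Graph::edges`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct EdgeId(usize);

struct Node {
    label: String,
}

struct Edge {
    from: NodeId,
    to: NodeId,
}

#[derive(Default)]
struct Graph {
    nodes: Vec<Node>,
    edges: Vec<Edge>,
}

impl Graph {
    fn add_node(&mut self, label: &str) -> NodeId {
        self.nodes.push(Node { label: label.to_string() });
        NodeId(self.nodes.len() - 1)
    }

    fn add_edge(&mut self, from: NodeId, to: NodeId) -> EdgeId {
        self.edges.push(Edge { from, to });
        EdgeId(self.edges.len() - 1)
    }
}

impl NodeId {
    /// Methods on the index make it read almost like a reference.
    fn label(self, graph: &Graph) -> &str {
        &graph.nodes[self.0].label
    }
}

impl EdgeId {
    fn endpoints(self, graph: &Graph) -> (NodeId, NodeId) {
        let edge = &graph.edges[self.0];
        (edge.from, edge.to)
    }
}
```

Passing an `EdgeId` where a `NodeId` is expected is now a compile error, although an id taken from a different `Graph` instance would still not be caught.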

I've come across a similar problem for a project.

For my use case I found that singletons got me pretty far in Rust. It is probably somewhat dirty, but given that the scope of my project grew over time, I found the pattern pretty useful. If you know most of the architecture ahead of time, designing it carefully up front is probably best, but if the scope can change quickly while you already have a lot of code, redoing the entire architecture along with the program might be too cumbersome.

For my use case, I basically created a file/mod for certain global resources. Each contained a main struct with a private constructor function that is usually only called once. The resource was then wrapped in a RwLock (like a Mutex, but allowing many concurrent readers) so it could be accessed from anywhere. The instance itself is stored in a private lazy_static and is accessible through the pub functions instance() and instance_mut(). So it is kind of a singleton.
It is pretty dirty, but has the advantage of being pretty similar to what you would do in Java and the like. And if you get some new fringe case that needs to access this data from somewhere, you needn't worry about having it passed to you. It would basically be a category::some_mod::instance().fringe_data() from anywhere.
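
A rough sketch of such a module is below. Note one substitution: it uses `std::sync::OnceLock` from the standard library rather than the `lazy_static` crate mentioned above, so that it has no dependencies. The `Resource` struct, its contents, and `set_fringe_data` are placeholders.

```rust
use std::sync::{OnceLock, RwLock, RwLockReadGuard, RwLockWriteGuard};

pub struct Resource {
    fringe_data: String,
}

impl Resource {
    /// Private constructor, only called once by `cell()`.
    fn new() -> Resource {
        Resource { fringe_data: String::from("default") }
    }

    pub fn fringe_data(&self) -> &str {
        &self.fringe_data
    }

    pub fn set_fringe_data(&mut self, value: &str) {
        self.fringe_data = value.to_string();
    }
}

/// The single instance, created lazily on first access.
static INSTANCE: OnceLock<RwLock<Resource>> = OnceLock::new();

fn cell() -> &'static RwLock<Resource> {
    INSTANCE.get_or_init(|| RwLock::new(Resource::new()))
}

/// Shared read access from anywhere in the program.
pub fn instance() -> RwLockReadGuard<'static, Resource> {
    cell().read().expect("resource lock poisoned")
}

/// Exclusive write access from anywhere in the program.
pub fn instance_mut() -> RwLockWriteGuard<'static, Resource> {
    cell().write().expect("resource lock poisoned")
}
```

One thing to watch for: holding a guard from `instance()` while calling `instance_mut()` on the same thread can deadlock or panic, so guards should be kept as short-lived as possible.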

There is also the concept of "interior mutability" in Rust. It basically means that you put the changeable attributes of a struct in a Mutex and thus allow mutation of data through an immutable borrow of the struct's instance. That way you can share the struct through a plain Arc instead of wrapping the whole thing in a Mutex or RwLock, and access might be more efficient if lots of mutable accesses are happening to only parts of your struct. Though at that point, the struct is probably doing too much anyway and should get split up.
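
As an illustration, here is a sketch with a hypothetical `Stats` struct in which only the counter sits behind a `Mutex`, while the rest of the struct is read without any locking:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

struct Stats {
    /// Never changes after construction, so it needs no lock.
    name: String,
    /// Only the changeable part is wrapped in a Mutex.
    hits: Mutex<u64>,
}

impl Stats {
    /// Takes `&self`, not `&mut self`: mutation happens through the lock.
    fn record_hit(&self) {
        *self.hits.lock().unwrap() += 1;
    }

    fn hits(&self) -> u64 {
        *self.hits.lock().unwrap()
    }
}

/// Shares one `Stats` between several threads through a plain `Arc`.
fn hit_from_threads(stats: &Arc<Stats>, threads: usize) {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let stats = Arc::clone(stats);
            thread::spawn(move || stats.record_hit())
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}
```

For a plain counter like this, an atomic integer would likely be a lighter choice. The `Mutex` is used here only because it is the mechanism described above.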

I haven't considered indices/databases, but those could be a more idiomatic alternative in Rust as well.

You may also want to consider whether to use async or not. If you're not used to async, probably try not to use it for the time being. But since I used actix-web for the REST API, it proved valuable for me to make most of my architecture async (tokio has a lot of drop-in replacements for Mutex/RwLock and the like).

If you need to wait for events, you may also consider a Bus or Broadcast channel to allow other components to subscribe to certain changes/events of your singleton. Though going this route will probably make the relations even messier.
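
A very small synchronous sketch of such a bus, built only on `std::sync::mpsc`. The `Bus` and `Event` types are invented for illustration. An async program would more likely reach for a ready-made broadcast channel.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

#[derive(Clone, Debug, PartialEq)]
enum Event {
    Changed(String),
}

/// Keeps one sender per subscriber and clones each event to all of them.
#[derive(Default)]
struct Bus {
    subscribers: Vec<Sender<Event>>,
}

impl Bus {
    /// Registers a new subscriber and returns its receiving end.
    fn subscribe(&mut self) -> Receiver<Event> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    /// Sends the event to every subscriber and forgets subscribers
    /// whose receiver has been dropped.
    fn publish(&mut self, event: Event) {
        self.subscribers.retain(|tx| tx.send(event.clone()).is_ok());
    }
}
```

A singleton could own a `Bus` and call `publish` whenever its state changes, while other components hold only a `Receiver` and never touch the singleton's internals.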

So please don't hate me for this proposal. I know it is pretty dirty and probably not much better than what C++ devs or some not-so-thought-out Java programs use, but especially if your software is changing while it's being developed, I found this pattern (singleton and channels, optionally with a focus on async/tokio) pretty useful and easy. I think the message passing is a bit similar to actix's actor model (or maybe to what Go channels do, though I have no experience there).