Proposal: an approach to back references

Back references in Rust usually require using a reference-counted object in some form. It's hard for a single-owner struct to find its owner. This is a problem for doubly-linked lists, trees with backlinks, and anything where you just need access to the owning object. The lack of a way to do this safely under single ownership has been a long-standing problem in Rust.

Here's an approach that might work for fixing that: make it possible to derive a function that tells a struct who its owner is. A structure S which can be owned by a structure T could have a derived function. S cannot make T stay around, so this is a form of weak pointer.

self.owner()

with signature

fn owner(&self) -> Option<&T>

This is an Option because the owner might not be a struct of type T. The owner can be a local variable, for example, something that can't be referenced from elsewhere. So at times it must be None.

To generate such a function, Derive might be used:

#[derive (Owner(T))]

That would be convenient. How hard would it be to implement?

This creates a requirement that the owning struct T have some extra machinery to update the hidden backpointer in S whenever the relevant field is altered. So, in struct S, the owning pointer of T needs to be identified. Perhaps like this:

 struct T {
     link_to_t: OwnerOf<S>
}

How to implement this? Can this be done entirely with macros? I suspect not, but if anyone can see a way, that would be the easiest approach. Comments?

I'm not completely sure what the suggestion is, but structs generally do not know who their owner is.

Of course. The whole point is to automate the tracking of a struct's owner for data structures which need that information.

Do you have a particular example in mind of a data structure which would benefit from that?

1 Like

It's not possible to soundly convert a reference of a field to a reference of it's parent, even with unsafe code because the reference doesn't have provenance over the parent. It's only allowed to access its fields.

You can use unsafe code and raw pointers, but that's really tricky to get right. I'm currently working in a crate with @ckaran to make this easier for unsafe code: GitHub - RustyYato/generic-field-projection: This crate was created to implement the ideas in this RFC #rust-lang/rfcs/2708.

This would require code to run whenever the owner is moved, which would break existing code that assumes Rust values can be moved just by doing a memcpy or equivalent.

3 Likes

Right. That's like the copyable/cloneable distinction.

That's the problem where the struct is an actual member of the owning object, not merely referenced by it. That case is entirely resolvable at compile time but may lead to a borrow clash.

No, it's not. Implementing Copy for a type allows more code to compile, but it does not alter the behavior of code that does not rely on Copy.

Drop glue is kind of like that, except that's built in to the language, not a trait (types can have drop glue without actually implementing the Drop trait). And drop glue only runs when the thing is dropped, so the ways it can interfere with borrowing and generics are limited (although even so, there is code that's hard to write because there's no way to make a "has-no-drop-glue" bound). I am concerned that adding a trait that runs code when the implementor is moved would invalidate generic functions that compile today.

To move this discussion forward I think it would be a good idea to flesh out the idea more. Be aware that there are existing crates that make it easier to work with self-referential structs in safe code: ouroboros and owning-ref are two of them. How exactly would your proposal differ from these? Can you give an example of a data structure that would be safe, but is disallowed today, and show how your proposal would deal with it?

2 Likes

That would be the Copy trait :slight_smile:

Yes it would, that's why it needs to be an opt-out trait like Sized, but adding opt out traits is a big thing because everyone needs to know about them. So its not worthwhile to add another opt-out trait unless it buys us a lot. Sized is extremely fundamental, so it makes sense that it's special and opt-out.

This proposal for a OnMove trait has come up so many times and has been shown to be largely unnecessary that any new proposal has its work cut out to prove its worth is in the same ballpark as Sized. I've only seen one proposal for an opt-out trait that even approaches this: DynSized (or whatever it's called now). DynSized would allow custom fat pointers, which is a monumental change for Rust, and even it has trouble getting enough support to justify being opt-out.

2 Likes

May I suggest moving this discussion to the internals forum? It seems to be more about extending the language than how to work with the current version.

4 Likes

Those make it clear that there's a demand for the ability to find the owner from the child.

Orouboros is a complicated way to do this using unsafe code and macros. "The #[self_referencing] struct will replace your definition with an unsafe self-referencing struct with a safe public interface." This has the right goal, but it's ugly and hard to make bulletproof. They yanked several old versions, presumably due to problems.

owning-ref-rs seems to address a different problem. "This can sometimes be useful because Rust borrowing rules normally prevent moving a type that has been borrowed from."

The lack of backlinks is a problem in graphics and game work. You have an object hierarchy, and there's a relative transform (position and rotation) at each node. To get the absolute position of something, you have to go from the leaf node to the root, multiplying transforms.

Without backlinks, this requires much fooling around to get to the parent node. I've tried approaches with hashes. That's slow. Going back to the root and outward to the desired item is worse. Putting all the items in a vector and linking them by index is possible, but in this system the hierarchy is constantly being modified, and doing that loses the advantages of Rust's ownership system.

Hence the need for safe backlinks.

I though that is what Rust's Weak pointer pointers were for.

This article discusses a binary tree where nodes point back to their parents with weak pointers: Working with Shared pointers in Rust: Challenges and Solutions [Tutorial] | Packt Hub which sounds like what you are asking for.

3 Likes

Those only work for multiple ownership, with Rc nodes wrapped around everything that needs a back pointer. How much overhead does that add?

I have no idea.

In my naivety it seems that if one wants a memory safe tree with parent pointers there has to be some overhead somewhere.

I'd be inclined to lean toward the "do it in a vector" approach. Perhaps some adopting some existing Entity Component System. Not that I have much experience of doing such things apart from my experiment to generate graphs for the tunnels of the Hunt The Wumpus Game: When is a linked list not a linked list?

Arguably it does not. Rust's ownership system is designed to prevent memory use problems like use after free, double free, use of uninitialized data, race conditions etc.

When one puts everything in a Vec the ownership system is still preventing those memory use errors. It's just that now ones errors in using pointers, memory use errors, have now been reduced to program logic bugs that may produce incorrect results, like any other logic error. The thing is still thing is still memory safe though, so one is in a far better position.

2 Likes

Then I need an allocator for vector slots, because objects are added and removed as the viewpoint changes.

I'm thinking Rc for hierarchy nodes now, with weak pointers for the backlinks. I know that will work, even if it adds some overhead.

It's kind of frustrating, because it's common in C++ to have a hard back reference in every object.

Rc has any overhead over Box only when cloning and dropping. If it is just used as a pointer, it is, well, just a pointer.

6 Likes

An allocator for that is probably overkill, if the objects have the same type you can probably get away with a slab

C++ does it with unsafe. You can already do exactly what you'd do in C++ in your Rust code.

While that's true, it's easier to get UB by violating rust's references invariants than C++'s references.

1 Like