How can we teach people about self-referential types?

Thanks for starting this topic. I've been wishing for a long time for a definitive “What about self-referential types?” FAQ, but it’s such a complicated topic that I haven’t gathered the energy to write it myself. I’d love to help put something together, though.

For referencing elements or slices of a collection, alongside the collection itself:

  • Separate the type into two different types, one that owns the collection and one that references it. Require callers to keep the owning type alive while using the referencing type.
  • Store indices or ranges of indices, rather than references. (Works best for immutable or append-only sequences.)
  • Use a library like owning-ref that provides a safe API for references to heap storage that is known not to change address when its owner moves.
  • Use Pin along with a library like moveit to ensure that values are updated when their addresses change.

For parent/child relationships where the parent owns the child, and operating on the child needs a reference back to the parent:

  • Avoid storing parent references in the child. Instead, always accessing the child via its parent, so the caller already has a parent reference available and can pass it to functions that need it.
  • Use shared ownership with Arc/Rc, and use Weak for back-links to prevent cycles.

For tree, linked list, or acyclic graph structures, where child nodes need back-references to their parents:

  • Use one of the strategies for parent/child relationships, above.
  • Or, use one of the strategies for arbitrary graphs, below...

For arbitrary graphs that may or may not contain cycles:

  • Use a slab allocator that can give out shared references with the lifetime of the slab.
  • Store nodes in a vector or arena and use indices (or generational indices) as links, instead of references.
  • Use Arc/Rc, plus code to detect and break cycles when nodes are removed (if necessary).
  • Use a specialized crate like petgraph or an entity-component-system (ECS) library.

When you are an expert in unsafe Rust and implementing a performance-critical data structure and you are absolutely sure that none of the above strategies is viable:

  • Write custom unsafe code using raw pointers and make sure you manually verify that it cannot ever violate any of Rust's memory-safety rules.
33 Likes