Best practices for lifetimes within hierarchies of structs

What is the best practice in Rust for handling lifetimes within a hierarchy of structs? For example, should I take this hierarchy:

struct Content {
    type_: String,
    value: String,
}

struct Contact {
    email: String,
    name: String,
}

struct Personalization {
    to: Vec<Contact>
}

struct Body {
    personalizations: Vec<Personalization>,
    from: Contact,
    subject: String,
    content: Vec<Content>
}

and rewrite it as:

struct Content<'t, 'v> {
    type_: &'t str,
    value: &'v str,
}

struct Contact<'e, 'n> {
    email: &'e str,
    name: &'n str,
}

struct Personalization<'t, 'n, 'e> {
    to: &'t [Contact<'n, 'e>] // ?
}

struct Body<'p, 't, 'n, 'e, 't2, 'v, 's, 'c, 't3, 'v2> {
    personalizations: &'p [Personalization<'t, 'n, 'e>], // ?
    from: Contact<'t2, 'v>,
    subject: &'s str,
    content: &'c [Content<'t3, 'v2>] // ?
}

I feel like the answer is no, but also yes if I want to avoid unnecessary allocations.

  1. For the slices, is each element forced to have the same lifetime?
  2. What are the best practices for handling lifetimes in structs in these situations?

The second version forces no allocations, but it also rules out many uses. For example, it is impossible to write a function that computes and returns a new Body<'????> together with its lists of Personalizations and Content. The definition of Body says that those lists must be borrowed from something else (with lifetime '????), and a function cannot return a value along with a reference to that same value stored in another structure: a return is a move, and you cannot move something while it is borrowed.
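
To make that restriction concrete, here is a minimal sketch using the owned Contact and Personalization from the question. The helper name build_personalization is made up for illustration. The commented-out part shows the shape of the borrowed attempt that the borrow checker rejects.

```rust
// Owned versions of two structs from the question.
#[derive(Debug, PartialEq)]
struct Contact {
    email: String,
    name: String,
}

#[derive(Debug, PartialEq)]
struct Personalization {
    to: Vec<Contact>,
}

// Works: the function creates the data and moves ownership to the caller.
// `build_personalization` is a hypothetical helper, not part of the thread.
fn build_personalization(addresses: &[(&str, &str)]) -> Personalization {
    let to = addresses
        .iter()
        .map(|&(email, name)| Contact {
            email: email.to_string(),
            name: name.to_string(),
        })
        .collect();
    Personalization { to }
}

// The all-borrowed equivalent cannot be written. Sketch of the attempt:
//
// fn build<'t>() -> BorrowedPersonalization<'t> {
//     let contacts = vec![/* ... */];            // owned by this function
//     BorrowedPersonalization { to: &contacts }  // rejected (E0515): `contacts`
// }                                              // is dropped when `build` returns

fn main() {
    let p = build_personalization(&[("ann@example.com", "Ann"), ("bob@example.com", "Bob")]);
    assert_eq!(p.to.len(), 2);
    println!("{:?}", p);
}
```

The owned version can be produced anywhere and handed to anyone. The borrowed version can only be assembled in a scope where all the underlying data already lives.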

If you want to create structures that admit the possibility of not allocating, use the std::borrow::Cow type to allow either borrowed or owned data.

Furthermore, in a data structure of immutable references, it is generally adequate to use one lifetime for the whole bunch. (The lifetimes only matter if you want to copy the references out of the structure and use them with their original lifetimes as opposed to one overall data lifetime.)
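
As a rough illustration of the single-lifetime point, the sketch below builds a Contact<'a> from two strings with different lifetimes. The functions make_contact and describe are hypothetical names chosen for this example.

```rust
// One lifetime parameter covers every shared reference in the struct.
struct Contact<'a> {
    email: &'a str,
    name: &'a str,
}

// Hypothetical helper: the two arguments may have different lifetimes at the
// call site; the compiler shortens both to a common `'a`.
fn make_contact<'a>(email: &'a str, name: &'a str) -> Contact<'a> {
    Contact { email, name }
}

// Hypothetical helper that only reads through the references.
fn describe(c: &Contact<'_>) -> String {
    format!("{} <{}>", c.name, c.email)
}

fn main() {
    let email: &'static str = "ann@example.com"; // valid for the whole program
    let name = String::from("Ann");              // valid only until `main` ends
    let contact = make_contact(email, &name);
    println!("{}", describe(&contact));
}
```

The cost is the one described above: a reference copied out of contact.email is now limited to the shorter lifetime of name, even though the original string was 'static.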

Applying both of those principles, you would get:

use std::borrow::Cow;

#[derive(Clone)]
struct Content<'a> {
    type_: Cow<'a, str>,
    value: Cow<'a, str>,
}

#[derive(Clone)]
struct Contact<'a> {
    email: Cow<'a, str>,
    name: Cow<'a, str>,
}

#[derive(Clone)]
struct Personalization<'a> {
    to: Cow<'a, [Contact<'a>]>
}

#[derive(Clone)]
struct Body<'a> {
    personalizations: Cow<'a, [Personalization<'a>]>,
    from: Contact<'a>,
    subject: Cow<'a, str>,
    content: Cow<'a, [Content<'a>]>,
}

The Clone implementations are necessary for use of Cow with slices: Cow<'a, [T]> requires [T]: ToOwned, which is only implemented when T: Clone. One of the premises of Cow is that you can always convert a borrowed Cow into an owned one by cloning, to allow mutation.
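
Here is a small usage sketch for the Cow-based structs, trimmed to Contact and Personalization to keep it short. The functions borrowed and owned are hypothetical helpers, and the extra Debug and PartialEq derives are only there for the assertions.

```rust
use std::borrow::Cow;

#[derive(Clone, Debug, PartialEq)]
struct Contact<'a> {
    email: Cow<'a, str>,
    name: Cow<'a, str>,
}

#[derive(Clone, Debug, PartialEq)]
struct Personalization<'a> {
    to: Cow<'a, [Contact<'a>]>,
}

// Hypothetical helper: borrows an existing slice, allocates nothing.
fn borrowed<'a>(contacts: &'a [Contact<'a>]) -> Personalization<'a> {
    Personalization { to: Cow::Borrowed(contacts) }
}

// Hypothetical helper: creates the data inside the function and returns it
// owned, which the all-reference version of the struct could not do.
fn owned(email: &str, name: &str) -> Personalization<'static> {
    let contact = Contact {
        email: Cow::Owned(email.to_owned()),
        name: Cow::Owned(name.to_owned()),
    };
    Personalization { to: Cow::Owned(vec![contact]) }
}

fn main() {
    let contacts = vec![Contact {
        email: Cow::Borrowed("ann@example.com"),
        name: Cow::Borrowed("Ann"),
    }];

    let mut p = borrowed(&contacts);
    assert!(matches!(p.to, Cow::Borrowed(_)));

    // The first mutation clones the slice into a Vec; `contacts` is untouched.
    p.to.to_mut().push(Contact {
        email: Cow::Borrowed("bob@example.com"),
        name: Cow::Borrowed("Bob"),
    });
    assert!(matches!(p.to, Cow::Owned(_)));
    assert_eq!(p.to.len(), 2);
    assert_eq!(contacts.len(), 1);

    let q = owned("cy@example.com", "Cy");
    assert_eq!(q.to[0].name, "Cy");
}
```

The call to to_mut is where Clone comes in: it clones the borrowed slice into an owned Vec before handing out a mutable reference.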

All that said, it is likely that you should not bother with this and just use the owned (String, Vec) version. All-borrowed data is only really worthwhile when you have a situation where you definitely can use it (all the data already exists for you to borrow at the time you create the structure) and you need the performance benefits of minimizing allocations (e.g. you want to load and process a huge data file as fast as possible).

The best practice is to never do this.

Rust's temporary references are not a general-purpose way for storing data "by reference". They won't let you avoid allocating storage for data. They only force these allocations to be done "elsewhere", and require a lot of syntax and usage restrictions to track where the "elsewhere" is.

The first option already has a reasonably small number of allocations, and is the correct way to store data.

There are some rare cases where you may need to have both, and then the best practice would be to have pairs of Content and ContentView<'a>, where ContentView<'a> is used as a temporary scope-bound view into data stored by Content. If you do this, you can usually unify all lifetimes of shared loans (&), because the compiler can shorten the lifetimes automatically. Keep separate lifetimes for all exclusive loans (&mut), because they're maximally inflexible by design ("invariant").
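
One possible shape for such a pair, as a sketch only: the method names view and is_plain_text are made up for this example, and the fields mirror the Content struct from the question.

```rust
// Owning type: this is where the data actually lives.
struct Content {
    type_: String,
    value: String,
}

// Borrowing counterpart: a short-lived view into a `Content`.
// A single lifetime is enough because both fields are shared references.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ContentView<'a> {
    type_: &'a str,
    value: &'a str,
}

impl Content {
    // Hypothetical method: hands out a view that cannot outlive `self`.
    fn view(&self) -> ContentView<'_> {
        ContentView {
            type_: &self.type_,
            value: &self.value,
        }
    }
}

impl ContentView<'_> {
    // Hypothetical helper: reads through the view without copying any strings.
    fn is_plain_text(&self) -> bool {
        self.type_ == "text/plain"
    }
}

fn main() {
    let content = Content {
        type_: "text/plain".to_string(),
        value: "Hello".to_string(),
    };
    let view = content.view();
    assert!(view.is_plain_text());
    println!("{}: {}", view.type_, view.value);
}
```

The view is cheap to copy and pass around inside a scope, while Content remains the thing you store, return, and move.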

But having just Content<'a> without a counterpart it borrows from is likely to be unusably paralyzing for any real-world use, and is most likely a misunderstanding of what Rust's references are for.

If you need to make these structs more efficient, there's non-growable Box<str> which is a bit smaller than String. If you use often-repeating strings for keys/ids, you could use string interning (ustr) or inlineable strings. If the same data is referenced from multiple places, you could consider Arc<Contact>.
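
A brief sketch of two of those options using only the standard library: Box<str> fields and a Contact shared through Arc. The reply_to field and the share_sender function are assumptions added for illustration. String interning is left out because it needs a third-party crate.

```rust
use std::mem::size_of;
use std::sync::Arc;

struct Contact {
    email: Box<str>, // pointer + length, no spare capacity
    name: Box<str>,
}

struct Body {
    from: Arc<Contact>,
    reply_to: Arc<Contact>, // hypothetical second place that refers to the same contact
}

// Hypothetical helper: both fields point at one shared allocation.
fn share_sender(sender: &Arc<Contact>) -> Body {
    Body {
        from: Arc::clone(sender),
        reply_to: Arc::clone(sender),
    }
}

fn main() {
    // String stores pointer, length and capacity; Box<str> omits the capacity.
    assert!(size_of::<Box<str>>() < size_of::<String>());

    let sender = Arc::new(Contact {
        email: "ann@example.com".into(),
        name: String::from("Ann").into_boxed_str(),
    });
    let body = share_sender(&sender);

    // Three handles (sender, from, reply_to) share a single Contact.
    assert_eq!(Arc::strong_count(&sender), 3);
    assert!(Arc::ptr_eq(&body.from, &body.reply_to));
    println!("{} <{}>", body.from.name, body.reply_to.email);
}
```

Both types still own their data, so the structs keep no lifetime parameters and can be returned or stored freely.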

But declaring that all your structs are never allowed to store any data (this is what references mean!) is a red flag.
