Lifetime of borrowed nested value in struct

Dear community,

I'm struggling a bit with ownership & lifetimes. The code below is an excerpt of the problem. To summarize: I have a Person type, which contains another type that borrows a value for its entire lifetime. This InternalPerson is autogenerated. How can I best "attach" buffer to Person such that it can be borrowed for the lifetime of Person?

#[derive(Debug)]
struct InternalPerson<'a> {
  data: &'a [u8]
}

fn read_internal_person<'a>(data: &'a [u8]) -> InternalPerson<'a> {
  InternalPerson {
    data
  }
}

#[derive(Debug)]
struct Person<'a> {
  internal: InternalPerson<'a>,
}

fn read_person<'a>(buffer: Vec<u8>) -> Person<'a> {
  Person {
    //this will not work, solely to illustrate intention
    internal: read_internal_person(&buffer), 
  }
}

fn main() {
  let buffer = vec![0,0,0,0];
  let person = read_person(buffer);

  println!("person is {:?}", person);
}

To fix this, set up read_person like read_internal_person, so that the buffer passed in is already borrowed. As written, ownership of the buffer moves into read_person, but the function only borrows from it, and the buffer is dropped when the function returns. Once nothing owns the buffer, its memory is freed. The compiler steps in and gives you an error because the returned Person would borrow memory that no longer has an owner.

This solves the problem by keeping ownership of the buffer tied to the buffer variable in main.

fn read_person<'a>(buffer: &'a [u8]) -> Person<'a> {
  Person {
    // buffer is already a borrowed slice, so pass it straight through
    internal: read_internal_person(buffer),
  }
}

fn main() {
    let buffer = vec![0,0,0,0];
    let person = read_person(&buffer);

    println!("person is {:?}", person);
}

If this isn't what you want, and you want the buffer to be owned by Person and just borrowed by InternalPerson, then you will run into what is called a self-referential struct. Without going into details for now, that gets complicated.

Another thing to point out is that lifetimes/borrows in structs are generally advised against, particularly when starting out in Rust. They have many implications that are hard to fully appreciate when learning Rust, and they often get used where other forms of reference (e.g. Box, Rc, Arc) would be better.

That said if you give a little bit more detail about what you're trying to accomplish, it would be easier to guide you to a good data structure for the problem at hand.


Thanks @drewkett, that makes sense.

To explain further: in reality, InternalPerson is a type generated by the flatbuffers compiler. With read_internal_person I aimed to illustrate this function from the flatbuffers package (source).

pub fn get_root<'a, T: Follow<'a> + 'a>(data: &'a [u8]) -> T::Inner {
...

A generated flatbuffers type consists of a Table type. Inspecting that type, I found this definition:

pub struct Table<'a> {
    pub buf: &'a [u8],
    pub loc: usize,
}

Here the lifetime of the borrowed buffer equals that of Table itself, and thus(?) that of the parent flatbuffers type (and so forth).

My Person type acts as a wrapper, exposing various convenient methods. A Person is constructed by a function that queries a database, which returns a byte vector. This byte vector is transformed to an InternalPerson by calling flatbuffers::get_root::<InternalPerson>(&buffer).

The buffer is just a means of getting an InternalPerson. It doesn't have to be accessible during the lifetime of Person.

The problem is that you're using temporary borrows in structs. This is almost always a big mistake, since that <'a> makes structs themselves temporary and permanently locked to the scope where that reference came from.

The borrow checker is trying to tell you that Rust references can't store data (references are not for storing data "by reference", that's what Box is for).

You should have struct InternalPerson { data: Vec<u8> } and all your problems will go away.

A more direct equivalent is struct InternalPerson { data: Box<[u8]> } which gives you a slice that won't be anchored to some variable in some function somewhere else.
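As a minimal sketch of that suggestion, assuming you control the struct definition (which is not the case for generated code): the struct owns its bytes, so it needs no lifetime parameter and can be returned from the function that received the buffer. The names mirror the example in the first post.

```rust
#[derive(Debug)]
struct InternalPerson {
    // Owned bytes, so no lifetime parameter is needed.
    data: Box<[u8]>,
}

#[derive(Debug)]
struct Person {
    internal: InternalPerson,
}

// Takes ownership of the buffer and moves it into the returned value.
fn read_person(buffer: Vec<u8>) -> Person {
    Person {
        internal: InternalPerson {
            // Converts the Vec into a boxed slice, dropping any excess capacity.
            data: buffer.into_boxed_slice(),
        },
    }
}

fn main() {
    let buffer = vec![0, 0, 0, 0];
    let person = read_person(buffer);
    println!("person is {:?}", person);
}
```

Because Person now owns the bytes, it can outlive the scope that created the buffer.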


If InternalPerson holds a reference to the data in buffer, then the buffer has to remain accessible. Otherwise, this reference would dangle.

So it’s flatbuffers that’s causing the issue. And just looking at some sample code, there’s probably nothing you can do about that. Are you using this for both reading and writing flatbuffer data? How performance critical is this?

Along the lines of what @kornel said about avoiding lifetimes, I think what I would do is just use the generated flatbuffers code to copy the fields that I want into a struct without lifetimes, and similarly do the reverse operation if you need to write the struct back out. This might be a bit of a pain if there are many structs/fields (though a macro could probably be written to simplify it). But once you have that in place, it will be much easier to work with the structs in your code, since you won’t need to worry about lifetimes at all for them.

Otherwise, you pretty much can’t avoid making sure that the buffer being read from stays valid for as long as the code reading it runs, which matters if this is a long-running process.
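The copy-out approach could look roughly like the sketch below. It does not use the flatbuffers crate: InternalPerson here is a hand-written stand-in for a generated type, and its byte layout (first byte is an age, the rest is a UTF-8 name) and the age/name accessors are invented purely for illustration.

```rust
// Stand-in for a generated, borrowing type. The byte layout (first byte = age,
// remaining bytes = UTF-8 name) is an assumption made up for this example.
struct InternalPerson<'a> {
    data: &'a [u8],
}

impl<'a> InternalPerson<'a> {
    fn age(&self) -> u8 {
        self.data.first().copied().unwrap_or(0)
    }

    // Returns a string slice that borrows from the underlying buffer.
    fn name(&self) -> &'a str {
        std::str::from_utf8(self.data.get(1..).unwrap_or(&[])).unwrap_or("")
    }
}

// Owned struct without lifetimes: every field is copied out of the view.
#[derive(Debug, Clone, PartialEq)]
struct Person {
    age: u8,
    name: String,
}

impl From<InternalPerson<'_>> for Person {
    fn from(view: InternalPerson<'_>) -> Self {
        Person {
            age: view.age(),
            name: view.name().to_owned(),
        }
    }
}

// Borrows the buffer only for the duration of the call.
fn read_person(buffer: &[u8]) -> Person {
    Person::from(InternalPerson { data: buffer })
}

fn main() {
    let person = {
        let buffer = vec![42, b'A', b'n', b'n'];
        read_person(&buffer)
        // `buffer` is dropped here; `person` holds its own copies.
    };
    println!("person is {:?}", person);
}
```

The cost is one copy per field at read time; in exchange, Person can be stored and passed around without any lifetime parameter.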

Yes, one of the reasons for choosing Flatbuffers is for its performance and efficiency. I'm using it for both reading and writing, the application is read heavy. My current workaround is by having a "reader" function, like:

fn read_person(buffer: &[u8]) -> InternalPerson {
  flatbuffers::get_root::<InternalPerson>(buffer)
}

This way, Person owns buffer and there are no lifetime issues. The downside is that for each field I wish to extract, it needs to "construct" InternalPerson over and over again. This seems less than ideal.
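For readers following along, here is a rough sketch of that workaround without the flatbuffers dependency. InternalPerson and read_internal_person are stand-ins for the generated type and for get_root, and the byte layout (first byte is an age, the rest is a UTF-8 name) is an assumption made up for the example.

```rust
// Stand-in for a generated, borrowing type. The byte layout (first byte = age,
// remaining bytes = UTF-8 name) is invented for this example.
struct InternalPerson<'a> {
    data: &'a [u8],
}

impl<'a> InternalPerson<'a> {
    fn age(&self) -> u8 {
        self.data.first().copied().unwrap_or(0)
    }

    // Tied to the buffer's lifetime 'a, not to this temporary view.
    fn name(&self) -> &'a str {
        std::str::from_utf8(self.data.get(1..).unwrap_or(&[])).unwrap_or("")
    }
}

// Plays the role of the "reader" function from the post.
fn read_internal_person(data: &[u8]) -> InternalPerson<'_> {
    InternalPerson { data }
}

// Person owns the buffer and has no lifetime parameter.
struct Person {
    buffer: Vec<u8>,
}

impl Person {
    fn new(buffer: Vec<u8>) -> Self {
        Person { buffer }
    }

    // Rebuilds the borrowing view on demand; it lives only as long as `&self`.
    fn internal(&self) -> InternalPerson<'_> {
        read_internal_person(&self.buffer)
    }

    fn age(&self) -> u8 {
        self.internal().age()
    }

    fn name(&self) -> &str {
        self.internal().name()
    }
}

fn main() {
    let person = Person::new(vec![42, b'A', b'n', b'n']);
    println!("{} is {}", person.name(), person.age());
}
```

In this stand-in, rebuilding the view only copies a slice reference; what the real get_root does per call is something to check against the flatbuffers source.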

I believe Flatbuffers works with &[u8] by design, especially since bincode and capnproto work quite similarly.

Are there ways of eliminating the use of read_person?

Honestly I don't see a great solution here because of the way flatbuffers is structured.

The options seem to be: copy the data, do what you're doing, or pass InternalPerson around and deal with the lifetimes. Your solution doesn't necessarily seem like a bad one. I think you're mostly just creating a struct on the stack with the buffer reference and an offset on each access, which should be pretty quick. If you're doing a lot of reads on the same data, I would probably try copying the data once, which would then be a bit faster for future lookups, but that would need to be benchmarked to see if it's worthwhile.
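A very rough way to compare the two strategies is sketched below, using only the standard library. InternalPerson is again a made-up stand-in rather than generated flatbuffers code, and hand-rolled timing like this is easily distorted by the optimizer, so treat any numbers as a hint and measure the real generated code with a dedicated benchmarking harness before drawing conclusions.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Stand-in for a generated, borrowing type; the layout (first byte = age)
// is invented for this example.
struct InternalPerson<'a> {
    data: &'a [u8],
}

impl InternalPerson<'_> {
    fn age(&self) -> u8 {
        self.data.first().copied().unwrap_or(0)
    }
}

// Strategy 1: rebuild the view on every access.
fn sum_via_view(buffer: &[u8], reads: u32) -> u64 {
    let mut total = 0u64;
    for _ in 0..reads {
        let view = InternalPerson { data: black_box(buffer) };
        total += u64::from(view.age());
    }
    total
}

// Strategy 2: copy the field out once, then read the copy.
fn sum_via_copy(buffer: &[u8], reads: u32) -> u64 {
    let age = InternalPerson { data: buffer }.age();
    let mut total = 0u64;
    for _ in 0..reads {
        total += u64::from(black_box(age));
    }
    total
}

// Runs the closure once and reports its result and elapsed wall-clock time.
fn time<F: FnOnce() -> u64>(f: F) -> (u64, Duration) {
    let start = Instant::now();
    let result = f();
    (result, start.elapsed())
}

fn main() {
    let buffer = vec![42u8, 0, 0, 0];
    let reads = 10_000_000;
    let (a, view_time) = time(|| sum_via_view(&buffer, reads));
    let (b, copy_time) = time(|| sum_via_copy(&buffer, reads));
    assert_eq!(a, b); // both strategies must read the same values
    println!("view per access: {:?}, copy once: {:?}", view_time, copy_time);
}
```

Build with optimizations (for example `cargo run --release`) when timing, since debug builds say little about real performance.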

Thanks everyone for your advice! I think I might benchmark some different setups and see how they actually perform.