Designing with Borrows for long-term usage

Hi All,

I'm working on an app that uses the elf crate. The idea is that you can load a bunch of elf files and you can use them to do some stuff, maybe load more elf files, etc. The crate exposes an elf::ElfBytes<'a, elf::endian::LittleEndian> which borrows the binary data for the elf file.

I'm stuck on creating an object that can hold these ElfBytes so they can be used later on. For example, something like this fails to compile:

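struct Foo<'a> {
    // Owned bytes of the ELF file.
    data: Vec<u8>,
    // Borrows from `data` above, which makes Foo self-referential.
    elf_bytes: elf::ElfBytes<'a, elf::endian::LittleEndian>,
}

impl<'a> Foo<'a> {
    fn new() -> Self {
        let data = Vec::new();
        let elf_bytes = elf::ElfBytes::<elf::endian::LittleEndian>::minimal_parse(&data).unwrap();
        // Rejected by the compiler: `data` is moved while `elf_bytes` still borrows it.
        Foo { data, elf_bytes }
    }
}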

Note that I'm not attached to this design. I don't really care how it's structured, but I want to understand what a good design for this use case would look like.

In the above example, Foo only held one single elf file, but the idea is that you would load one elf and all its dependencies, and in order to understand the dependencies you might want to do something with the bytes/elf file.

As I understand it, the available options are:

  • Pre-process everything and don't store the ElfBytes in my struct.
  • Somehow don't use a struct at all.
  • Use a crate like ouroboros to get a self-referencing struct (discouraged).
  • Somehow figure out a way to allocate the bytes "outside" my struct and pass the byte slices to my struct. (Can't think of a non-convoluted way of doing this)
  • Some unsafe dark magic.

I'm writing this to ask for advice on how to model this use case. Ideally I would want a general solution (like a design pattern) for this kind of use case. I've found similar problems on this forum, and it seems like each of them has its own ad-hoc solution that is a variation of what I mentioned above.

If a file and its dependencies are loaded into memory and freed as a group later on, you could consider using an arena allocator like bumpalo. This allows the group of files to share references freely without lifetime problems, since all allocations have the lifetime of the arena itself. When finished with the group of files, the entire arena is dropped which frees all the allocated memory for the group.

This doesn't work well if you need to free memory for some files or other data structures in a group of files before you're finished working with the group. With an arena allocator you can only free the entire arena.

The above approach didn't work in the end. There is no way I can find, short of using unsafe code, to use an arena to solve the self-referencing problem. I thought I had done that earlier, but I must have been mistaken. I apologize for the distraction.

Thanks for the reply.
Everything will be loaded in memory (not at the same time), and there's no need to deallocate anything until the end.

How would you use the arena in this situation? Would you pass the arena to the Foo object so that it allocates the files there and also creates the vector of ElfBytes? And would that vector be stored in the arena or in the struct?

Wow, thanks for this answer!

I'm still new to Rust, so I am not sure how Box::leak works or why it is needed. I will try to read more about Rust and come back to this question.

I'm very sorry, I'm going to have to delete the code example I added earlier. I have done a more realistic test and the lifetime constraints are not working out as I described. I will try to post a better answer later.

In short, if you're using an API which borrows from a buffer, your two choices are

  • making the buffer outlive the borrowing usage in a well-scoped manner (which could be by leaking it; a reasonable choice for short-lived processes), or
  • using unsafe black magic, hopefully behind an abstraction like ouroboros or yoke.
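
A minimal sketch of the first choice, using only the standard library. Parsed, Session and parse_leaked are hypothetical names; Parsed is a stand-in for a borrowing parser such as ElfBytes<'a, E>, not the elf crate's API. It shows both variants: the caller owns the buffers and creates the borrowing struct afterwards, or the buffer is leaked so that the borrow becomes 'static.

```rust
// Hypothetical stand-in for a borrowing parser such as ElfBytes<'a, E>.
struct Parsed<'a> {
    magic: &'a [u8],
}

impl<'a> Parsed<'a> {
    fn parse(data: &'a [u8]) -> Option<Self> {
        // Borrow the first four bytes instead of copying them.
        data.get(..4).map(|magic| Parsed { magic })
    }
}

// Variant A: the buffers live outside the struct and must outlive it.
struct Session<'a> {
    files: Vec<Parsed<'a>>,
}

impl<'a> Session<'a> {
    fn new(buffers: &'a [Vec<u8>]) -> Self {
        // Buffers that are too short to parse are skipped.
        let files = buffers.iter().filter_map(|b| Parsed::parse(b)).collect();
        Session { files }
    }
}

// Variant B: leak the buffer so the borrow is 'static.
// The memory is never freed, which is only acceptable if that is intended.
fn parse_leaked(data: Vec<u8>) -> Option<Parsed<'static>> {
    let leaked: &'static [u8] = Vec::leak(data);
    Parsed::parse(leaked)
}
```

With variant A, the owner of the buffers (for example a Vec<Vec<u8>> declared in main) has to be created before the Session and dropped after it. Variant B removes the lifetime parameter from the caller's point of view at the cost of never reclaiming the memory.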

It looks like the ElfStream<E, std::io::Cursor<Vec<u8>>> type should be an owning alternative to ElfBytes<'a, E>. (Cursor is an in-memory buffer, so no actual file reads/seeks are involved, despite going through the io::Read+Seek traits.) So no need to self-borrow in this case, though there might be some overhead (i.e. any internal caching/buffering ElfStream does that ElfBytes can skip over).
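
To illustrate why an owning reader avoids the self-borrow, here is a rough sketch using only the standard library. OwnedFile is a hypothetical stand-in for a reader-based parser like ElfStream<E, S>; its read_at method is made up for the example and is not part of the elf crate. The point is that Cursor<Vec<u8>> owns its bytes, so the struct holding it needs no lifetime parameter.

```rust
use std::io::{Cursor, Read, Seek, SeekFrom};

// Hypothetical stand-in for an owning, reader-based parser such as ElfStream<E, S>.
struct OwnedFile<S: Read + Seek> {
    reader: S,
}

impl<S: Read + Seek> OwnedFile<S> {
    // Copy `len` bytes starting at `offset` out of the underlying reader.
    fn read_at(&mut self, offset: u64, len: usize) -> std::io::Result<Vec<u8>> {
        self.reader.seek(SeekFrom::Start(offset))?;
        let mut buf = vec![0u8; len];
        self.reader.read_exact(&mut buf)?;
        Ok(buf)
    }
}

// No lifetime parameter: Foo owns its bytes through the Cursor.
struct Foo {
    file: OwnedFile<Cursor<Vec<u8>>>,
}

impl Foo {
    fn new(data: Vec<u8>) -> Self {
        Foo { file: OwnedFile { reader: Cursor::new(data) } }
    }
}
```

The trade-off in this sketch is that every access copies bytes out of the buffer and needs &mut self, whereas a borrowing parser can hand out slices directly.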

Also, the crate docs call out that

Depending on the use-case, it can be more efficient to restructure the raw ELF into different layouts for more efficient interpretation, say, by re-indexing a flat table into a HashMap. ParsingIterators make that easy and rustily-intuitive.

so you might want to look into using ParsingIterator and extracting your desired information up front anyway.
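
As a sketch of the "extract up front" idea, the following uses a made-up text format instead of real ELF tables. Entry, parse_entries, Summary and load are all hypothetical names, and the name=value format is only there to keep the example self-contained. The borrowed entries exist only inside load; what gets stored is an owned HashMap.

```rust
use std::collections::HashMap;

// Hypothetical borrowed table entry, standing in for what a parsing iterator yields.
struct Entry<'a> {
    name: &'a str,
    value: u64,
}

// Hypothetical flat-table parser: one "name=value" entry per line.
// Lines that do not match are skipped.
fn parse_entries(data: &str) -> impl Iterator<Item = Entry<'_>> {
    data.lines().filter_map(|line| {
        let (name, value) = line.split_once('=')?;
        Some(Entry { name, value: value.trim().parse().ok()? })
    })
}

// Owned summary: no lifetime parameter, so it can be stored anywhere.
struct Summary {
    by_name: HashMap<String, u64>,
}

fn load(data: String) -> Summary {
    // Re-index the flat table into a map, copying what is needed.
    let by_name = parse_entries(&data)
        .map(|e| (e.name.to_owned(), e.value))
        .collect();
    // `data` is dropped when this function returns; the summary keeps owned copies.
    Summary { by_name }
}
```

This corresponds to the "pre-process everything" option from the original question: the raw buffer only needs to live for the duration of the load call.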

Thank you very much! I've made a mess of this thread and I'm glad to see there are other ways to use the elf crate that don't require borrowing from a buffer.