Nom usage question: How to manage source and parsed struct together?

I'm using the x509-parser crate, which is built on nom. That's useful context, but the question is really a broader Rust lifetime question.

I want to build up a list of parsed structs, in this case a Vec<X509Certificate>, and manage the lifetime of that list as a whole. An X509Certificate is actually X509Certificate<'a>, where the lifetime must match that of the &'a [u8] source material.

I haven't found a way to manage the life of the source material and the parsed X509Certificate struct jointly.

I'm totally willing to clone the source material before parsing; the overhead is acceptable in this particular use case, but I can't find a way to package the two in such a way that makes the borrow checker happy.

Is there a secret sauce that I'm missing? Or is this impossible?

If I'm understanding correctly, you want to store the original data in owned form, like a Vec<u8>, in a struct alongside a Vec<X509Certificate<'_>>, where the X509Certificates hold references into the original data. This kind of setup -- a struct holding both owned data and references into that data -- is called a "self-referential struct" (you can search this forum for that phrase to see some previous cases where people have run into trouble with the same thing), and it's not achievable in safe Rust currently. So you will have to arrange your code to avoid the need for such a struct.
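To make the problem concrete, here is a minimal sketch. `Parsed` and `parse` are hypothetical stand-ins for X509Certificate<'a> and its parser, not the x509-parser API. The rejected shapes are left as comments so that the rest still compiles.

```rust
/// Hypothetical stand-in for a borrowed parse result like X509Certificate<'a>.
struct Parsed<'a> {
    body: &'a [u8],
}

/// Hypothetical stand-in for the parser: the result borrows from `data`.
fn parse(data: &[u8]) -> Parsed<'_> {
    Parsed { body: data }
}

// The shape the question is after. There is no lifetime you can write here
// that means "borrowed from the `data` field of this same struct":
//
// struct Bundle {
//     data: Vec<u8>,
//     certs: Vec<Parsed<'???>>,
// }

// Returning the parsed value while the buffer is a local is rejected too,
// because the buffer is dropped when the function returns:
//
// fn bad() -> Parsed<'static> {
//     let data = vec![1u8, 2, 3];
//     parse(&data) // error: cannot return value referencing local variable
// }

/// Fine: the buffer outlives every use of the parsed value.
fn body_len_of_local() -> usize {
    let data = vec![1u8, 2, 3];
    let parsed = parse(&data);
    parsed.body.len()
}
```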

As @cole-miller has pointed out, you've run into something called a self-referential struct, which can be a bit of a pain to work with in Rust.

The typical way this is handled is by keeping the source object higher up in the call stack. Alternatively, you might be able to use the owning_ref or rental crate (the latter is unmaintained, but still quite usable) to create a struct which owns both the parsed data and the source.

Here's how I might approach this without reaching for a self-referential struct (may not work in your detailed case but hopefully it at least points the way):

  • have a struct RawCerts { data: Vec<u8> } (possibly with other fields) that represents unparsed certificate data

  • implement constructors on RawCerts as needed to read the certificate data from a file, etc.

  • have a method on RawCerts to do the parsing:

    impl RawCerts {
        fn parse_certs(&self) -> Result<Vec<X509Certificate<'_>>, WhateverError> {
            /* ... */
        }
    }
    

    (notice how this returns a vector of certificates that reference self)

  • in some function you'll create a RawCerts with one of the constructors, store it in a variable, then call parse_certs and pass the Vec<X509Certificate<'a>> to other functions as needed (this is the part that @Michael-F-Bryan expressed as "keeping the source object higher up the call stack"; the key point is that something needs to own the raw bytes and be alive/in scope long enough to hand out references into them)
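A fuller sketch of the steps above, using a toy format so it compiles without x509-parser. `FakeCert`, `ParseError`, `from_bytes`, `total_body_len` and the one-length-byte record format are all assumptions made for illustration. In real code the loop body would call into x509-parser instead.

```rust
/// Hypothetical stand-in for X509Certificate<'a>: borrows from the source bytes.
#[derive(Debug, PartialEq)]
struct FakeCert<'a> {
    body: &'a [u8],
}

/// Hypothetical error type (the "WhateverError" of the outline above).
#[derive(Debug, PartialEq)]
struct ParseError;

/// Owns the unparsed certificate data.
struct RawCerts {
    data: Vec<u8>,
}

impl RawCerts {
    /// One possible constructor; reading from a file would look similar.
    fn from_bytes(data: Vec<u8>) -> Self {
        RawCerts { data }
    }

    /// Toy format (an assumption): each record is one length byte followed
    /// by that many body bytes. The returned certs borrow from `self`.
    fn parse_certs(&self) -> Result<Vec<FakeCert<'_>>, ParseError> {
        let mut rest: &[u8] = &self.data;
        let mut certs = Vec::new();
        while let Some((&len, tail)) = rest.split_first() {
            let len = len as usize;
            if tail.len() < len {
                return Err(ParseError);
            }
            let (body, remaining) = tail.split_at(len);
            certs.push(FakeCert { body });
            rest = remaining;
        }
        Ok(certs)
    }
}

/// Downstream code only needs the borrowed view, not the owner.
fn total_body_len(certs: &[FakeCert<'_>]) -> usize {
    certs.iter().map(|c| c.body.len()).sum()
}
```

The caller keeps the RawCerts value in a local variable, calls parse_certs on it, and passes the resulting slice down to functions like total_body_len. The borrow checker is satisfied as long as the RawCerts outlives the parsed vector.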

Thanks to both of you for your replies.

I've since determined that I don't need to retain the entire parsed X509Certificate struct in this particular use case, so the ongoing cost of the planned container just got a lot smaller.
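For anyone landing here later, a rough sketch of that idea: parse, copy out only the fields you need into an owned struct, and let the source go. `BorrowedCert`, `CertSummary`, `parse_all`, `load_summaries` and the "subject=serial" line format are hypothetical stand-ins, not anything from x509-parser.

```rust
/// Hypothetical stand-in for a borrowed parse result such as X509Certificate<'a>.
struct BorrowedCert<'a> {
    subject: &'a str,
    serial: &'a str,
}

/// Owned copy of just the fields worth keeping; no lifetime parameter.
#[derive(Debug, PartialEq)]
struct CertSummary {
    subject: String,
    serial: String,
}

impl From<&BorrowedCert<'_>> for CertSummary {
    fn from(cert: &BorrowedCert<'_>) -> Self {
        CertSummary {
            subject: cert.subject.to_owned(),
            serial: cert.serial.to_owned(),
        }
    }
}

/// Toy parser (an assumption): one "subject=serial" record per line.
/// Lines without '=' are skipped.
fn parse_all(source: &str) -> Vec<BorrowedCert<'_>> {
    source
        .lines()
        .filter_map(|line| {
            let (subject, serial) = line.split_once('=')?;
            Some(BorrowedCert { subject, serial })
        })
        .collect()
}

/// Takes ownership of the source, parses it, and returns only owned data.
/// The source buffer is dropped when this function returns.
fn load_summaries(source: String) -> Vec<CertSummary> {
    let parsed = parse_all(&source);
    let summaries: Vec<CertSummary> = parsed.iter().map(CertSummary::from).collect();
    summaries
}
```

Since CertSummary holds no references, a Vec<CertSummary> can be stored in any struct or returned from any function without a lifetime parameter.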

So, yay Rust, for making me think harder. :)

Nice!

I like that the compiler will nudge you towards doing the simpler thing, it's kinda like a senior dev standing over your shoulder and saying, "your ownership story is getting a bit complex there buddy, are you sure you want to do that?" And 9 times out of 10 there is a simpler way that you've just overlooked.

Of course, when you actually need to do things like this, it can be a massive pain. Thankfully we've got third-party crates that can take care of the ugliness so end users don't have to.

Well put.

People ask me about using Rust and my standard reply is that I spend a lot of time fighting with the compiler. But, unlike some other languages where that is also true, with Rust I almost always wind up with better code as a result.

A worthy opponent, shall we say?