Parsing OpenPGP key server dumps


#1

I’m trying to parse OpenPGP key server dumps, which are just concatenations of many, many OpenPGP packets. There is already an openpgp crate, but it is a bit incomplete. Before I go about enhancing it, I have the following questions:

  • Is there a better option for parsing OpenPGP data? I’m not interested in cryptography beyond computing key fingerprints.
  • Is the way the parser is structured the usual way such things are written in Rust? I would have expected it to return an enum of the implemented packet types (among them an unparsed generic packet fallback), but instead you have to implement a trait with various callback functions, some of which are totally unrelated to packet parsing.
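
To make the second question concrete, here is a minimal sketch of the enum-shaped result I had in mind. The names `Packet` and `classify` are hypothetical and not part of the openpgp crate; the tag numbers (6 for public key, 13 for user ID) are the ones from RFC 4880, and the packet header is assumed to have been decoded already.

```rust
/// Hypothetical packet representation; NOT the openpgp crate's actual API.
#[derive(Debug, PartialEq)]
pub enum Packet<'a> {
    /// Tag 6: public key packet (body left unparsed in this sketch).
    PublicKey(&'a [u8]),
    /// Tag 13: user ID packet, UTF-8 text by convention.
    UserId(&'a str),
    /// Generic fallback for anything this sketch does not understand.
    Unknown { tag: u8, body: &'a [u8] },
}

/// Classify a packet whose header has already been decoded into
/// a tag and a body slice (hypothetical helper).
pub fn classify(tag: u8, body: &[u8]) -> Packet<'_> {
    match tag {
        6 => Packet::PublicKey(body),
        13 => match std::str::from_utf8(body) {
            Ok(text) => Packet::UserId(text),
            // A user ID that is not valid UTF-8 falls back to the generic variant.
            Err(_) => Packet::Unknown { tag, body },
        },
        _ => Packet::Unknown { tag, body },
    }
}
```

A caller would then just `match` on the result and ignore the variants it does not care about.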

#2

I’m not especially familiar with OpenPGP in particular, but I’ll at least comment on the trait with callback functions. In my experience, a trait-based approach like this is heavyweight but potentially beneficial because it gives the caller complete control over the in-memory representation of the parsed data. The html5ever crate does something similar.

With that said, I would absolutely expect to see a “default” implementation of PGP that gives you back the enum you were expecting. Forcing all callers to implement the trait seems a bit heavy-handed. (The html5ever crate provides an implementation that callers can use out of the box.)
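
As a rough sketch of what I mean by a trait plus a “default” implementation: every name below (`PacketSink`, `CollectPackets`, `CountUserIds`, `drive`) is made up for illustration and is not taken from the openpgp or html5ever crates. The `drive` function only stands in for a real parser and is fed packets that have already been split.

```rust
/// Hypothetical callback trait: the parser calls this once per packet.
pub trait PacketSink {
    fn packet(&mut self, tag: u8, body: &[u8]);
}

/// Owned enum that the out-of-the-box sink hands back to the caller.
#[derive(Debug, PartialEq)]
pub enum OwnedPacket {
    UserId(String),
    Other { tag: u8, body: Vec<u8> },
}

/// "Default" sink: collects every packet into a Vec of enums.
#[derive(Default)]
pub struct CollectPackets {
    pub packets: Vec<OwnedPacket>,
}

impl PacketSink for CollectPackets {
    fn packet(&mut self, tag: u8, body: &[u8]) {
        let parsed = match (tag, std::str::from_utf8(body)) {
            // Tag 13 is the user ID packet in RFC 4880.
            (13, Ok(text)) => OwnedPacket::UserId(text.to_owned()),
            _ => OwnedPacket::Other { tag, body: body.to_vec() },
        };
        self.packets.push(parsed);
    }
}

/// Custom sink: keeps only a counter, so nothing is copied or stored.
#[derive(Default)]
pub struct CountUserIds(pub usize);

impl PacketSink for CountUserIds {
    fn packet(&mut self, tag: u8, _body: &[u8]) {
        if tag == 13 {
            self.0 += 1;
        }
    }
}

/// Stand-in for the real parser: feeds pre-split packets to any sink.
pub fn drive<S: PacketSink>(packets: &[(u8, &[u8])], sink: &mut S) {
    for &(tag, body) in packets {
        sink.packet(tag, body);
    }
}
```

Callers who just want the enum use `CollectPackets`; callers with unusual memory requirements implement the trait themselves, as `CountUserIds` does.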


#3

I don’t see how html5ever gives lots of control over the data structure being employed. The callbacks are fairly high-level, and seem to imply a certain strategy for state bookkeeping in the implementation. If you take any shortcuts in your implementation, you can only do so based on under-documented assumptions regarding the order in which the callbacks are invoked.

That being said, I was expecting a StAX-style event API, and not something approaching DOM (which might require parsing the entire key server dump into RAM as a keyring structure, which is clearly not appropriate here).
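
For what it’s worth, this is roughly the pull-style shape I mean, written as an iterator that hands out one packet at a time without building a keyring. It is only a sketch: `RawPacket` and `PacketReader` are hypothetical names, the header decoding follows the old- and new-format length rules of RFC 4880 as I understand them, partial and indeterminate lengths are not handled, and malformed input simply ends iteration where a real parser would report an error. It also borrows from an in-memory slice; a real version for large dumps would read from a stream.

```rust
/// One raw packet borrowed from the input buffer (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct RawPacket<'a> {
    pub tag: u8,
    pub body: &'a [u8],
}

/// Pull-style reader (hypothetical): each `next()` call yields one packet.
pub struct PacketReader<'a> {
    rest: &'a [u8],
}

impl<'a> PacketReader<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        PacketReader { rest: data }
    }

    /// Decode one header; returns (tag, header length, body length).
    fn header(&self) -> Option<(u8, usize, usize)> {
        let d = self.rest;
        let first = *d.first()?;
        if first & 0x80 == 0 {
            return None; // bit 7 must always be set
        }
        if first & 0x40 != 0 {
            // New format: tag in bits 5-0, variable-length length field.
            let tag = first & 0x3f;
            match *d.get(1)? {
                l @ 0..=191 => Some((tag, 2, l as usize)),
                l @ 192..=223 => {
                    let second = *d.get(2)? as usize;
                    Some((tag, 3, ((l as usize - 192) << 8) + second + 192))
                }
                255 => {
                    let b = d.get(2..6)?;
                    let len = u32::from_be_bytes([b[0], b[1], b[2], b[3]]);
                    Some((tag, 6, len as usize))
                }
                _ => None, // partial body lengths: not handled in this sketch
            }
        } else {
            // Old format: tag in bits 5-2, length type in bits 1-0.
            let tag = (first >> 2) & 0x0f;
            match first & 0x03 {
                0 => Some((tag, 2, *d.get(1)? as usize)),
                1 => {
                    let b = d.get(1..3)?;
                    Some((tag, 3, u16::from_be_bytes([b[0], b[1]]) as usize))
                }
                2 => {
                    let b = d.get(1..5)?;
                    let len = u32::from_be_bytes([b[0], b[1], b[2], b[3]]);
                    Some((tag, 5, len as usize))
                }
                _ => None, // indeterminate length: not handled in this sketch
            }
        }
    }
}

impl<'a> Iterator for PacketReader<'a> {
    type Item = RawPacket<'a>;

    fn next(&mut self) -> Option<Self::Item> {
        let (tag, header_len, body_len) = self.header()?;
        let end = header_len.checked_add(body_len)?;
        // A truncated body ends iteration instead of panicking.
        let body = self.rest.get(header_len..end)?;
        self.rest = &self.rest[end..];
        Some(RawPacket { tag, body })
    }
}
```

With this shape, the caller decides what to keep: `for packet in PacketReader::new(&bytes) { ... }` visits each packet in order, and anything the loop body does not store is never copied.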