Parsing a PEM file into raw bytes

I generated an X25519 public/private key pair using openssl:

$ openssl genpkey -algorithm x25519 -out x25519-priv.pem

$ openssl pkey -in x25519-priv.pem -pubout -out x25519-pub.pem

$ cat x25519-priv.pem
-----BEGIN PRIVATE KEY-----
MC4CAQAwBQYDK2VuBCIEIGh9XkOCHvmorNZVGtVWXasvEkP4JKE5knI3NGd45q9K
-----END PRIVATE KEY-----

$ cat x25519-pub.pem
-----BEGIN PUBLIC KEY-----
MCowBQYDK2VuAyEAe/MXw6QdzEwpHQwlCUVZFf6XJk0ZCWsXKWk6o3bkQl8=
-----END PUBLIC KEY-----

I read that PEM is just base64-encoded DER, so I set out to parse it to DER and then further to raw bytes.

Using the following Rust code, I got the contents of the keys, which I believe are DER-encoded, since the private key is 48 bytes long and the public key is 44 (X25519 keys are supposed to be 32 bytes each):
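
To make the "PEM is base64-encoded DER" step concrete, here is a minimal std-only sketch that drops the armor lines and decodes the body by hand. The helper names b64_value and pem_to_der are made up for illustration, and the sketch assumes the standard base64 alphabet with no validation beyond that; a real PEM crate checks considerably more.

```rust
/// Map one base64 character to its 6-bit value (standard alphabet).
fn b64_value(c: u8) -> Option<u32> {
    match c {
        b'A'..=b'Z' => Some((c - b'A') as u32),
        b'a'..=b'z' => Some((c - b'a') as u32 + 26),
        b'0'..=b'9' => Some((c - b'0') as u32 + 52),
        b'+' => Some(62),
        b'/' => Some(63),
        _ => None,
    }
}

/// Hypothetical helper: strip the PEM armor and base64-decode the body.
/// Returns None if the body contains a character outside the alphabet.
fn pem_to_der(pem: &str) -> Option<Vec<u8>> {
    // Keep only the body lines; drop the BEGIN/END lines and '=' padding.
    let body: Vec<u8> = pem
        .lines()
        .filter(|line| !line.starts_with("-----"))
        .flat_map(|line| line.trim().bytes())
        .filter(|&b| b != b'=')
        .collect();

    let mut der = Vec::with_capacity(body.len() * 3 / 4);
    let mut acc: u32 = 0; // bit accumulator
    let mut bits: u32 = 0; // number of valid bits currently in acc
    for c in body {
        acc = (acc << 6) | b64_value(c)?;
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            der.push((acc >> bits) as u8);
            acc &= (1u32 << bits) - 1; // keep only the leftover bits
        }
    }
    Some(der)
}
```

For the two files above, pem_to_der should return 48 bytes for the private key and 44 bytes for the public key, each starting with 0x30 (the DER tag for a SEQUENCE).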

use std::fs;
use pem::parse;

fn main() {
    let priv_key = parse(fs::read("x25519-priv.pem").unwrap()).unwrap().contents;
    let pub_key = parse(fs::read("x25519-pub.pem").unwrap()).unwrap().contents;
    
    println!("priv_key: {}", priv_key.len()); // 48
    println!("pub_key: {}", pub_key.len());  // 44
}

Next, I decided to use der_parser to decode the DER:

use std::fs;
use pem::parse;
use der_parser::parse_der;

fn main() {
    let priv_key = parse(fs::read("x25519-priv.pem").unwrap()).unwrap().contents;
    let pub_key = parse(fs::read("x25519-pub.pem").unwrap()).unwrap().contents;
    let (_, parsed_der_priv) = parse_der(&priv_key).unwrap();
    let (_, parsed_der_pub) = parse_der(&pub_key).unwrap();
    
    println!("parsed_der_priv: {:?}", parsed_der_priv);
    println!("parsed_der_pub: {:?}", parsed_der_pub);
}

But I don't quite get how to interpret parsed_der_priv and parsed_der_pub to obtain the raw bytes. I've never worked with these encodings before, so I'm pretty confused, as it yields a really long recursive object:

parsed_der_priv: BerObject { header: BerObjectHeader { class: Universal, structured: 1, tag: Sequence, len: 46, raw_tag: Some([48]) }, content: Sequence([BerObject { header: BerObjectHeader { class: Universal, structured: 0, tag: Integer, len: 1, raw_tag: Some([2]) }, content: Integer([0]) }, BerObject { header: BerObjectHeader { class: Universal, structured: 1, tag: Sequence, len: 5, raw_tag: Some([48]) }, content: Sequence([BerObject { header: BerObjectHeader { class: Universal, structured: 0, tag: Oid, len: 3, raw_tag: Some([6]) }, content: OID(OID(1.3.101.110)) }]) }, BerObject { header: BerObjectHeader { class: Universal, structured: 0, tag: OctetString, len: 34, raw_tag: Some([4]) }, content: OctetString([4, 32, 112, 169, 75, 190, 91, 40, 192, 25, 124, 209, 78, 88, 224, 69, 123, 108, 8, 164, 217, 156, 222, 157, 18, 135, 231, 144, 228, 231, 221, 53, 83, 83]) }]) }

parsed_der_pub: BerObject { header: BerObjectHeader { class: Universal, structured: 1, tag: Sequence, len: 42, raw_tag: Some([48]) }, content: Sequence([BerObject { header: BerObjectHeader { class: Universal, structured: 1, tag: Sequence, len: 5, raw_tag: Some([48]) }, content: Sequence([BerObject { header: BerObjectHeader { class: Universal, structured: 0, tag: Oid, len: 3, raw_tag: Some([6]) }, content: OID(OID(1.3.101.110)) }]) }, BerObject { header: BerObjectHeader { class: Universal, structured: 0, tag: BitString, len: 33, raw_tag: Some([3]) }, content: BitString(0, BitStringObject { data: [134, 84, 193, 92, 72, 159, 157, 107, 155, 248, 67, 246, 224, 180, 37, 87, 92, 128, 157, 131, 165, 203, 69, 169, 92, 138, 138, 183, 72, 244, 203, 3] }) }]) }

Even the last long list of numbers isn't the right length: for the private key it is 34 when it should have been 32. For the public key, however, it is 32, as expected.

So I'm not sure what to make of this.

Just to be clear, I'm looking for something like vec![134u8, 84, 193, 92, 72, 159, 157, 107, 155, 248, 67, 246, 224, 180, 37, 87, 92, 128, 157, 131, 165, 203, 69, 169, 92, 138, 138, 183, 72, 244, 203, 3], where both public key and private key are 32 bytes long.

I think openssl x509 -in x25519-pub.pem -outform pem -out x25519-pub-decoded.pem should work. What do you need it for, by the way?

Eh, welcome to the wonderful world of ASN.1 and its various encodings :wink: Seriously, you can't make sense of key structures, nor extract their components, without understanding how those structures are specified. For the concrete case of your key pair, the private key is encoded as PKCS#8 (I'll use the simpler definition from RFC 5208):

PrivateKeyInfo ::= SEQUENCE {
        version                   Version,
        privateKeyAlgorithm       PrivateKeyAlgorithmIdentifier,
        privateKey                PrivateKey,
        attributes           [0]  IMPLICIT Attributes OPTIONAL }

PrivateKeyAlgorithmIdentifier is another SEQUENCE, and attributes are missing. This should help you recognize the components of your key.

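Now, PrivateKey is defined as an OCTET STRING, with the format dictated by the private key algorithm. In this case, it's an encapsulated DER-formatted OCTET STRING: the first two bytes are 4 (the tag for an OCTET STRING) and 32 (the length of the string), which is why the list has 34 entries rather than 32. Therefore, for the PEM file shown in the question, the bytes of the private key are 104, 125, ..., 74 (the debug output, which starts with 112, 169, appears to come from a different key).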
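
The unwrapping described above can be sketched without any crate, as a tiny tag-length-value reader. This is only an illustration under assumptions: read_tlv and raw_private_key are hypothetical names, only short-form DER lengths (below 128) are handled, and nothing checks that the algorithm OID really is X25519.

```rust
/// Read one DER TLV from the front of `input`.
/// Returns (tag, content, rest). Only short-form lengths (< 128) are
/// supported, which is enough for these small key structures.
fn read_tlv(input: &[u8]) -> Option<(u8, &[u8], &[u8])> {
    let (&tag, after_tag) = input.split_first()?;
    let (&len, after_len) = after_tag.split_first()?;
    if len >= 0x80 {
        return None; // long-form length: out of scope for this sketch
    }
    let len = len as usize;
    if after_len.len() < len {
        return None; // truncated input
    }
    let (content, rest) = after_len.split_at(len);
    Some((tag, content, rest))
}

/// Hypothetical helper: pull the raw key out of a PKCS#8 PrivateKeyInfo.
fn raw_private_key(der: &[u8]) -> Option<&[u8]> {
    let (tag, body, _) = read_tlv(der)?; // outer SEQUENCE
    if tag != 0x30 {
        return None;
    }
    let (tag, _version, body) = read_tlv(body)?; // INTEGER version
    if tag != 0x02 {
        return None;
    }
    let (tag, _algorithm, body) = read_tlv(body)?; // SEQUENCE { OID }
    if tag != 0x30 {
        return None;
    }
    let (tag, wrapped, _) = read_tlv(body)?; // OCTET STRING privateKey
    if tag != 0x04 {
        return None;
    }
    let (tag, key, _) = read_tlv(wrapped)?; // inner OCTET STRING
    if tag != 0x04 {
        return None;
    }
    Some(key)
}
```

Called on the 48 DER bytes behind the debug output in the question (whose OCTET STRING content starts with 4, 32, 112, 169), raw_private_key should return the 32 bytes 112, 169, ..., 83. Checking the tag at every step means malformed input yields None instead of silently returning the wrong slice.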

Similarly, the public key uses the SubjectPublicKeyInfo ASN.1 structure from RFC 5912.
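
For X25519 the SubjectPublicKeyInfo header is always the same 12 bytes, so one way to sketch the public-key side is to check that fixed prefix and keep the 32 bytes that follow. The names below are hypothetical, and the shortcut only holds for this one algorithm; a general parser would walk the structure instead.

```rust
use std::convert::TryFrom;

/// Fixed DER header of an X25519 SubjectPublicKeyInfo.
const X25519_SPKI_PREFIX: [u8; 12] = [
    0x30, 0x2a, // SEQUENCE, 42 bytes
    0x30, 0x05, //   SEQUENCE, 5 bytes (AlgorithmIdentifier)
    0x06, 0x03, //     OID, 3 bytes
    0x2b, 0x65, 0x6e, //   1.3.101.110
    0x03, 0x21, //   BIT STRING, 33 bytes
    0x00, //         number of unused bits in the final byte
];

/// Hypothetical helper: return the 32 raw key bytes, or None if the
/// input does not have exactly the expected header and length.
fn raw_public_key(der: &[u8]) -> Option<[u8; 32]> {
    let key = der.strip_prefix(&X25519_SPKI_PREFIX)?;
    <[u8; 32]>::try_from(key).ok()
}
```

The trailing 0x00 in the prefix is the BIT STRING's unused-bits count, which is why the header in the debug output reports len: 33 while the data holds only 32 bytes.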

To get at the raw key values, you can start from the top-level BerObject and, knowing the structure, extract successive sub-objects until you reach the needed ones.

This isn't a certificate, so that command doesn't work:

$ openssl x509 -in x25519-pub.pem -outform pem -out x25519-pub-decoded.pem
unable to load certificate
4593841600:error:0909006C:PEM routines:get_name:no start line:crypto/pem/pem_lib.c:745:Expecting: TRUSTED CERTIFICATE