Extensible enumeration-like types

Many binary protocols use a tag-length-value construct: a tag field identifies the type of protocol element and is followed by the length of the encoded value and then the encoded value itself. The advantage of this design is that some limited processing (skipping the element, for instance) is possible even if the tag is unknown.
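To make the "limited processing" point concrete, here is a minimal sketch of a reader that skips elements whose tag it does not know. The wire layout (one tag byte, one length byte) and the names `Element`, `next_element` and `values_with_tag` are assumptions made up for illustration; this is not the DNS wire format.

```rust
/// One decoded element: the raw tag and a borrowed slice of the value bytes.
/// Hypothetical layout: 1 tag byte, 1 length byte, then `length` value bytes.
#[derive(Debug, PartialEq)]
struct Element<'a> {
    tag: u8,
    value: &'a [u8],
}

/// Split `input` into its first element and the remaining bytes.
/// Returns `None` if the input is empty or truncated.
fn next_element(input: &[u8]) -> Option<(Element<'_>, &[u8])> {
    let (&tag, rest) = input.split_first()?;
    let (&len, rest) = rest.split_first()?;
    let len = len as usize;
    if rest.len() < len {
        return None;
    }
    let (value, rest) = rest.split_at(len);
    Some((Element { tag, value }, rest))
}

/// Collect the values of all elements carrying `wanted`.
/// Elements with any other tag are skipped using only their length field.
fn values_with_tag(mut input: &[u8], wanted: u8) -> Vec<&[u8]> {
    let mut out = Vec::new();
    while let Some((elem, rest)) = next_element(input) {
        if elem.tag == wanted {
            out.push(elem.value);
        }
        input = rest;
    }
    out
}
```

The reader never needs to understand the unknown element; the length field alone tells it how far to advance.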

Is there an established way to represent such tags with a Rust type?

I'm using DNS resource record types as an example, and I think I need the following properties:

  • The underlying type must be specified. For example, for DNS resource record types, it should be u16.
  • The constants for known tag values (A, NS, …) must be constants usable in match expressions.
  • It must be possible to represent values which do not have associated constants (yet), construct tags with such values, and access the underlying value of these tags. (If the code does not know about DNAME resource record types, it should just ignore records of that type, for example.)
  • It should be possible to add further constants without impacting source code compatibility. (A future version of the DNS library might provide a constant for the DNAME resource record type.)

As far as I understand it, C-style enums are not specified to be able to represent values which are not listed as members, so enums are not a solution. In C, it is customary to add the largest possible value as an enumeration member, but I'm not sure whether this applies to Rust. The C trick only works for values in the range from INT_MIN to INT_MAX anyway.

It is possible to create your own type which somewhat resembles enums using code like:

#[derive(Copy, Clone, PartialEq, Eq)]
pub struct RRType(u16);

pub const A: RRType = RRType(1);
pub const NS: RRType = RRType(2);
// ...

impl RRType {
    fn name(self) -> Option<&'static str> {
        match self {
            A => { Some("A") }
            NS => { Some("NS") }
            _ => None
        }
    }
}

impl std::fmt::Display for RRType {
    fn fmt(&self, f: &mut std::fmt::Formatter)
           -> std::fmt::Result {
        if let Some(name) = self.name() {
            write!(f, "{}", name)
        } else {
            write!(f, "{}", self.0)
        }
    }
}

Is there a language construct to support this more directly? Or is there at least some well-established macro to implement all this?

Why not use something like

enum RRType {
    A,
    NS,
    Other(u16),
}

Have a look at how trust-dns does it.
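For illustration, a minimal sketch of how an enum with an Other variant could be converted to and from the underlying u16. This is not taken from trust-dns; the `From` impls are just one plausible shape, and only the values 1 (A) and 2 (NS) come from the question.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RRType {
    A,
    NS,
    Other(u16),
}

impl From<u16> for RRType {
    /// Map a wire value to a tag. Known values become named variants,
    /// everything else is preserved in `Other`.
    fn from(value: u16) -> Self {
        match value {
            1 => RRType::A,
            2 => RRType::NS,
            other => RRType::Other(other),
        }
    }
}

impl From<RRType> for u16 {
    /// Recover the underlying wire value of a tag.
    fn from(t: RRType) -> u16 {
        match t {
            RRType::A => 1,
            RRType::NS => 2,
            RRType::Other(v) => v,
        }
    }
}
```

One caveat with this shape: since Other is public, a caller can build `RRType::Other(1)` by hand, and with the derived `PartialEq` that value does not compare equal to `RRType::A`. Always going through `RRType::from` avoids the mismatch.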

I am not sure if I understand you correctly, but what about:

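#[derive(Clone, Copy)]
enum Tag {
    A,
    NS,
    Custom(u32),
}

impl Tag {
    // Numeric tag value; A = 1 and NS = 2 as in the question.
    fn get_value(self) -> u32 {
        match self { Tag::A => 1, Tag::NS => 2, Tag::Custom(value) => value }
    }
}

enum DnsRecord {
    RRType1 { length: u32, value: Vec<u8> },
    RRType2 { length: u32, value: Vec<u8> },
    CustomType { tag: Tag, length: u32, value: Vec<u8> },
}

impl DnsRecord {
    // Assumes RRType1 and RRType2 stand for A and NS records.
    fn get_tag(&self) -> Tag {
        match self {
            DnsRecord::RRType1 { .. } => Tag::A,
            DnsRecord::RRType2 { .. } => Tag::NS,
            DnsRecord::CustomType { tag, .. } => *tag,
        }
    }
}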

Alternatively, define an interface (a trait) that a record should conform to and let each record define its own data structure.

Is there a way to hide the Other constructor or somehow prevent pattern matching on it? Maybe one could use an opaque type for its argument. I'm concerned that adding an MX constructor would break code which matches on Other(15).

You can make the other value opaque like:

pub struct RROtherType(u16);
pub enum RRType {
    A,
    NS,
    Other(RROtherType),
}

Then users can't look inside the Other variant or match on a specific value such as Other(15), because the field of RROtherType is private.

There's still the problem that adding new variants is a breaking change for exhaustiveness, if someone matched Other(_) instead of a broader _ to catch the remainder. There's not a good solution for this yet, though most people put #[doc(hidden)] on such variants as a friendly agreement that it shouldn't be used directly.
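For readers on newer compilers: Rust 1.40 and later provide the `#[non_exhaustive]` attribute, which makes code in other crates include a wildcard arm when matching on the enum, so adding a variant later no longer breaks their exhaustiveness checks. A minimal sketch combining it with the opaque payload from above follows; the helper names `from_u16`, `to_u16` and `describe` are made up for illustration. Note that the attribute has no effect inside the defining crate, so this single-file version only shows the shape.

```rust
/// Opaque payload: the field is private, so other crates cannot
/// construct it or match on a specific value inside `Other`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RROtherType(u16);

/// Other crates must add a `_` arm when matching on this enum.
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RRType {
    A,
    NS,
    Other(RROtherType),
}

impl RRType {
    /// Build a tag from its wire value (hypothetical helper).
    pub fn from_u16(value: u16) -> RRType {
        match value {
            1 => RRType::A,
            2 => RRType::NS,
            other => RRType::Other(RROtherType(other)),
        }
    }

    /// Recover the wire value of a tag (hypothetical helper).
    pub fn to_u16(self) -> u16 {
        match self {
            RRType::A => 1,
            RRType::NS => 2,
            RRType::Other(RROtherType(v)) => v,
        }
    }
}

/// How downstream code would have to match: named variants it
/// knows about, plus a wildcard for everything else.
pub fn describe(t: RRType) -> &'static str {
    match t {
        RRType::A => "A",
        RRType::NS => "NS",
        _ => "unknown",
    }
}
```

If a later version adds an MX variant, `describe` keeps compiling and keeps returning "unknown" for MX until it is updated, which is the behavior the question asks for.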