[philosophical] Are `struct`s a redundant language feature?

Quick word up front: I'm in no way advocating any breaking changes to Rust.
Rather, consider it a mental exercise for a future programming language that might be a bit terser with its core features.

I'm currently working on a derive macro, and that's changed my perspective on the relationship between structs and enums.

From the macrologist's perspective, enum looks strictly like a generalization of struct. Specifically, any struct can be modeled precisely by an enum with only 1 variant, e.g.


struct FooStruct {
    field: u8
}

struct FooTupleStruct(u8);

struct FooUnitStruct;

enum FooEnum {
    Struct { field: u8 },
    Tuple(u8),
    Unit,
}

So given the above, how exactly is struct not just syntactic sugar for enums?
Is there anything a struct can do that an enum cannot in Rust?

4 Likes

Technically it can be done, but using that enum is rather cumbersome because you can't access the fields without a match. There's also the whole #[repr(...)] thing, e.g. this

#[repr(transparent)]
enum A {
    Case(u8)
}

doesn't work currently (although there's an RFC to change that).

4 Likes

Yes, using an enum with 1 variant is slightly more cumbersome than using an equivalent struct; that's why I call structs syntactic sugar for enums. It's similar to if let vs match in that way.
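To make the ergonomics gap concrete, here is a minimal sketch. The types `PointStruct` and `PointEnum` and both helper functions are made up for illustration. Because a single-variant enum pattern is irrefutable, a plain `let` is enough and a full `match` isn't required, but there is still no `.field` access on the enum.

```rust
// Hypothetical types, for illustration only.
struct PointStruct {
    x: u8,
    y: u8,
}

enum PointEnum {
    Point { x: u8, y: u8 },
}

fn sum_struct(p: &PointStruct) -> u16 {
    // Structs allow direct field access.
    u16::from(p.x) + u16::from(p.y)
}

fn sum_enum(p: &PointEnum) -> u16 {
    // With only one variant the pattern cannot fail, so `let` destructures it.
    // `x` and `y` are bound as `&u8` here because `p` is a reference.
    let PointEnum::Point { x, y } = p;
    u16::from(*x) + u16::from(*y)
}

fn main() {
    let s = PointStruct { x: 1, y: 2 };
    let e = PointEnum::Point { x: 1, y: 2 };
    println!("{} {}", sum_struct(&s), sum_enum(&e));
}
```

So the enum version costs one extra destructuring line per access site, which is roughly the "sugar" being discussed.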

As for the repr issue, ok, that might not be currently implemented, but that's nothing fundamental to enums - I can easily see this working by putting the #[repr(...)] on the variant rather than the enum / struct.

I mean, you can translate any program using structs into one using enums :woman_shrugging:

Maybe there are some details with FFI

1 Like

Structs are only redundant due to the fact that Rust allows variants to mimic structs without having to actually use the struct keyword. In a smaller model with less syntactic sugar, you'd have tuples (unlabeled structs) and enums where each variant has precisely 1 piece of data.

3 Likes

Not at all. If you look at it from a theoretical perspective, it's actually the other way around. Well, sort of.

structs with named or ordered fields are purely product types: their possible values are the Cartesian product of the sets of possible values of each field.

enum variants in most languages with an algebraic type system are usually allowed to have no associated values or exactly one. An enum of which all variants carry exactly one associated value is a pure sum type.

In Rust, however, enums have this convenience feature whereby you can directly specify a variant's associated value to be of an ad-hoc product type by adding ordered or named fields to the variant ("tuple" or "struct" variants). This wouldn't be strictly necessary for expressing a sum type, and it breaks the purity of the sum type. However, it's — arguably — useful and convenient.
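As a rough sketch of the distinction, the same data can be written both ways. All names here (`Circle`, `Rect`, `PureShape`, `InlineShape`, and the area helpers) are hypothetical and only meant to illustrate the point: the "pure" form declares the product types separately and gives each variant exactly one associated value, while the inline form uses Rust's struct variants.

```rust
// Hypothetical shape types, for illustration only.

// "Pure" style: products are declared on their own, and every variant
// carries exactly one associated value.
struct Circle {
    radius: u32,
}

struct Rect {
    width: u32,
    height: u32,
}

enum PureShape {
    Circle(Circle),
    Rect(Rect),
}

// Rust's convenience: the ad-hoc product is declared inline in the variant.
enum InlineShape {
    Circle { radius: u32 },
    Rect { width: u32, height: u32 },
}

// Area of the axis-aligned bounding box, computed for both encodings.
fn pure_bounding_area(s: &PureShape) -> u32 {
    match s {
        PureShape::Circle(c) => (2 * c.radius) * (2 * c.radius),
        PureShape::Rect(r) => r.width * r.height,
    }
}

fn inline_bounding_area(s: &InlineShape) -> u32 {
    match s {
        InlineShape::Circle { radius } => (2 * radius) * (2 * radius),
        InlineShape::Rect { width, height } => width * height,
    }
}

fn main() {
    let a = PureShape::Rect(Rect { width: 3, height: 4 });
    let b = InlineShape::Rect { width: 3, height: 4 };
    println!("{} {}", pure_bounding_area(&a), inline_bounding_area(&b));
}
```

Both encodings carry the same information; the inline form just saves the separate struct declarations.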

15 Likes

I guess the point is that you have noticed that enums are not just sum types - they're sums of anonymous products, so you can also use them for creating products.

9 Likes

I see, I wasn't aware that a pure sum type definition requires exactly one associated value.

Are there any negative implications of this at the theoretical level? Because practically speaking, my experience has indeed been that being able to express ad-hoc product types is really nice to have.
And if there are no negative implications, what's the problem with breaking purity there?
Because from my current perspective it looks like the definition of sum types can simply be "upgraded" to allow an ad-hoc product type for each variant.

I don't think breaking the purity has any impact besides making enums more convenient.

1 Like

I wasn't implying there was a problem.

Why would we need that at the theoretical level? Seems like an unnecessary complication. If you need to e.g. compute the cardinality of an enum with struct variants, just represent it as a sum of products and call it a day.
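As a small worked example of that sum-of-products computation, assume a hypothetical enum `E { A(bool, bool), B(u8), C }` (not from this thread). Each variant contributes the product of its fields' cardinalities, and a unit variant contributes the empty product, 1:

```latex
% Hypothetical enum, for illustration only:
%   enum E { A(bool, bool), B(u8), C }
\[
|E| = \underbrace{|\texttt{bool}| \cdot |\texttt{bool}|}_{A}
    + \underbrace{|\texttt{u8}|}_{B}
    + \underbrace{1}_{C}
    = 2 \cdot 2 + 256 + 1 = 261
\]
```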

The appropriate theoretical foundations for working with types are not the same as the concrete features in a programming language that make coding convenient in practice. We shouldn't force the theory to fit the syntax of a particular language 100%, and vice versa, we sometimes need to come up with "theoretically impure" language features for reasons of convenience and ease of use.

2 Likes

From which it appears that a union is an anonymous sum-of-products type.

1 Like

Because in practice it's been true for more than 5 years, at least in Rust.
But because you mentioned breaking purity, I figured that since it hasn't happened (yet) surely there'd be a reason for that. Apparently there isn't, other than orthogonality coupled with the fact that redefinitions are tricky.

Interestingly, in the part of rustc that computes data structure layouts, structs (and tuples, for the record) are treated in exactly this way: as an enum with only one variant.

See src/librustc/ty/layout.rs. This is the only part of rustc I have any familiarity with; I don't know if it is true of other parts of the compiler, but it seems likely.
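One way to observe this from the outside is to compare sizes and alignments. This is only a sketch: the types `FooStruct` and `FooSingle` are illustrative, and Rust makes no layout guarantee for types with the default representation, so the equality below is what rustc does in practice as far as I know, not a promise.

```rust
use std::mem::{align_of, size_of};

// Hypothetical types, for illustration only.
#[allow(dead_code)]
struct FooStruct {
    field: u8,
}

#[allow(dead_code)]
enum FooSingle {
    Struct { field: u8 },
}

// Returns (size, alignment) for the struct and for the single-variant enum.
fn layouts() -> ((usize, usize), (usize, usize)) {
    (
        (size_of::<FooStruct>(), align_of::<FooStruct>()),
        (size_of::<FooSingle>(), align_of::<FooSingle>()),
    )
}

fn main() {
    let (s, e) = layouts();
    // A single-variant enum needs no stored discriminant, so in practice
    // both pairs come out the same.
    println!("struct: {:?}, enum: {:?}", s, e);
}
```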

I don't really have any comment on the "philosophical" aspect of things.

3 Likes

I wasn't aware of this, but in retrospect it makes sense: I use the syn crate in my derive macro, and IIRC syn traces its origin back to rustc.

  • Structs can be unsized, enums not.
  • Record structs occupy only a type namespace, while record-like enum variants occupy both type and value namespaces.
  • Record struct names can be used both as a type name and as a constructor.

6 Likes
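A short sketch of the first two points in the list above, with made-up names (`Tagged`, `describe`, `Config`) used purely for illustration. The first half shows a struct whose last field may be unsized; the second shows that a record struct leaves the value namespace free, so a function of the same name can coexist with it.

```rust
// All names below are hypothetical, for illustration only.

// A struct may be unsized if its last field is unsized.
struct Tagged<T: ?Sized> {
    tag: u8,
    data: T,
}

fn describe(t: &Tagged<[u8]>) -> (u8, usize) {
    (t.tag, t.data.len())
}

// A record struct lives only in the type namespace, so a function with the
// same name can be declared alongside it in the value namespace.
struct Config {
    verbose: bool,
}

#[allow(non_snake_case)]
fn Config() -> Config {
    Config { verbose: false }
}

fn main() {
    let sized = Tagged { tag: 7, data: [1u8, 2, 3] };
    // Unsizing coercion from &Tagged<[u8; 3]> to &Tagged<[u8]>.
    let unsized_ref: &Tagged<[u8]> = &sized;
    println!("{:?}", describe(unsized_ref));
    println!("{}", Config().verbose);
}
```

Neither trick is available for enums: an enum cannot have an unsized variant, and its variants claim a name in the value namespace as well.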

Niko Matsakis's Virtual Structs blog series from 2015 is an interesting glimpse at what Rust might have looked like if structs and enums were unified into a single concept.

7 Likes

Little hint of irony in the post:

Basically, once we implement specialization,

A bit off topic, but 5 years later, specialization is still not stable. Some features really do languish in limbo for way too long I guess.

2 Likes

Hehe, which language are you talking about? An abstract programming language, or a specific one (C++, Rust, Java, etc.)? From the first perspective, a very small set of programming primitives is enough to solve ANY problem. Just look at assembler: no data types, no loops, no switches... and you can still implement everything (in principle). But when you look at a programming language from a pragmatic viewpoint, you have to take many other things into account, for instance the application domains it is designed for. From that viewpoint, product types are just facilities designed to ease the complexity of modeling real objects. So if you remove them from the language, the language itself becomes unsuitable for such applications...

1 Like

I've often thought that there is a case for unifying structs and enums in documentation, since when you are using them (especially through associated methods) you do not much care what class* of type they are. I don't think you'd want to unify types since they are aliases that work differently, or unions since they are niche and unsafe.

*It's really hard to come up with a word to use here, type, kind, category are all out since they have meanings in type theory.

2 Likes

Agreed on unifying in documentation. Consider that sometimes the only reason an enum is wrapped in a struct is for the outer item to be public while the variants are private. Currently it's easy to forget whether a type is an enum or a struct and then fail to find it because you're looking in the wrong section.

Personally I'd go farther and include type definitions (aliases) in the same rustdoc section. The reason being that they are frequently used and thought of in the same way. They can still be color coded differently and the detail pages should continue to make the actual kind of item clear.

1 Like