I've come across quite a few libraries using marker types (unit structs or empty enums) and would like to clarify their usage. Take petgraph, for example:
/// Marker type for a directed graph.
#[derive(Clone, Copy, Debug)]
pub enum Directed {}
/// Marker type for an undirected graph.
#[derive(Clone, Copy, Debug)]
pub enum Undirected {}
When specifying the type of graph, e.g. Graph::<u32, u32, Directed>::new(), we use one of these types (or just use an alias or method that specifies the type). I have a few questions about this design:
Couldn't the same functionality be gained by passing and storing an enum in the type? E.g. GraphType? My assumption is that this avoids match statements everywhere, avoids storing an enum, and ensures traits are implemented only for the specific type of graph. Is that correct?
What are some cases where this would be preferred over passing an enum (trait implementations, as mentioned above)?
Is there any reason they chose an enum rather than a struct?
It makes me a little uneasy creating unit structs which, in some way, provide functionality similar to enums, but perhaps make it harder to see which variants of a type are available. E.g., is there another type of graph that the documentation failed to mention? I'm sure there isn't, but to be certain, wouldn't I need to dig through trait implementations? An enum, by contrast, is very explicit about its variants. I'm aware that a GraphType trait could be implemented to constrain the exact unit types that can be passed in, but I find it still makes things much more complicated.
The big difference between this and using an enum to specify the graph type is that this is statically-known at compile time. Among other things, this means that the library can issue a compile error instead of a runtime error if you try to perform an operation that doesn't make sense for the graph type.
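For contrast, here is a rough sketch of the runtime-enum design the question proposes. All names here (`GraphType`, `Graph`, `in_degree`) are made up for illustration and are not petgraph's API. The point is only that a method which makes no sense for one kind of graph has to report that at runtime, through a `Result` or a panic, because the compiler cannot see which variant is stored.

```rust
/// Runtime tag, as suggested in the question (hypothetical, not petgraph's API).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum GraphType {
    Directed,
    Undirected,
}

/// A toy graph that stores its kind as a value instead of a type parameter.
pub struct Graph {
    ty: GraphType,
    edges: Vec<(usize, usize)>,
}

impl Graph {
    pub fn new(ty: GraphType) -> Self {
        Graph { ty, edges: Vec::new() }
    }

    pub fn add_edge(&mut self, from: usize, to: usize) {
        self.edges.push((from, to));
    }

    /// In-degree is only meaningful for directed graphs. With a runtime tag,
    /// misuse can only be detected when the call actually happens.
    pub fn in_degree(&self, node: usize) -> Result<usize, String> {
        match self.ty {
            GraphType::Directed => {
                Ok(self.edges.iter().filter(|&&(_, to)| to == node).count())
            }
            GraphType::Undirected => {
                Err("in_degree is undefined for an undirected graph".to_string())
            }
        }
    }
}
```

Every caller of `in_degree` now has to handle an error that, with marker types, would have been rejected before the program ever ran.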
As for enum versus struct: there is no way to construct a value of an enum that has no variants, so there's no question of whether you should ever instantiate one or not.
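A small sketch of that difference, assuming a toy `Graph<Ty>` that only carries the marker in a `PhantomData`. `UnitMarker`, `Graph` and `marker_sizes` are hypothetical names for illustration. Both kinds of marker are zero-sized and cost nothing to carry around; the only difference is that a value of the unit struct can be created, while a value of the empty enum cannot.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Uninhabited marker, as in the petgraph snippet: it has no variants,
/// so no value of this type can ever exist.
#[derive(Clone, Copy, Debug)]
pub enum Directed {}

/// Unit-struct marker (hypothetical): exactly one value, `UnitMarker`, exists.
#[derive(Clone, Copy, Debug)]
pub struct UnitMarker;

/// Toy graph that uses the marker only at the type level.
pub struct Graph<Ty> {
    edges: Vec<(u32, u32)>,
    ty: PhantomData<Ty>,
}

impl<Ty> Graph<Ty> {
    pub fn new() -> Self {
        // `PhantomData` needs no value of `Ty`, so this works even when
        // `Ty` is an empty enum that cannot be instantiated.
        Graph { edges: Vec::new(), ty: PhantomData }
    }

    pub fn edge_count(&self) -> usize {
        self.edges.len()
    }
}

/// Sizes of the two markers and of the graph's overhead beyond its `Vec`.
pub fn marker_sizes() -> (usize, usize, usize) {
    (
        size_of::<Directed>(),
        size_of::<UnitMarker>(),
        size_of::<Graph<Directed>>() - size_of::<Vec<(u32, u32)>>(),
    )
}
```

Writing `let m = UnitMarker;` compiles, whereas there is no expression at all that produces a `Directed`, which makes it obvious the type is meant to be used only as a type parameter.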
Here's an example of the cool stuff you can do when you statically know whether your graph is directed or undirected (what @2e71828 described above), and which you can't do by passing an enum: implementing methods only for directed or only for undirected graphs:
use std::marker::PhantomData;
/// Marker type for a directed graph.
#[derive(Clone, Copy, Debug)]
pub struct Directed;
/// Marker type for an undirected graph.
#[derive(Clone, Copy, Debug)]
pub struct Undirected;
struct Graph<Ty> {
    ty: PhantomData<Ty>,
}
impl Graph<Directed> {
    fn method_only_sensible_for_directed_graphs(&self) {
        todo!();
    }
}
impl Graph<Undirected> {
    fn method_only_sensible_for_undirected_graphs(&self) {
        todo!();
    }
}
It's also possible (via a trait implemented for the markers) to actually change the data (the types of fields or methods) based on what marker was chosen. For your graph example, there could be an Edge struct which stores a direction field which is an enum Direction { Forward, Reverse } for Directed, or () for Undirected. This avoids storing unnecessary data, and would be impossible to do with mere run-time checks — the size of a struct can't depend on run-time data.
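A minimal sketch of that idea, assuming a trait with an associated type. The names `EdgeKind`, `Dir`, `Edge` and `edge_sizes` are invented here for illustration and are not taken from petgraph.

```rust
use std::mem::size_of;

/// Direction of a stored edge, only needed for directed graphs.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Direction {
    Forward,
    Reverse,
}

/// Hypothetical trait implemented by the markers. The associated type
/// decides what the `direction` field of an edge actually is.
pub trait EdgeKind {
    type Dir: Copy + std::fmt::Debug;
}

pub struct Directed;
pub struct Undirected;

impl EdgeKind for Directed {
    type Dir = Direction;
}

impl EdgeKind for Undirected {
    // Nothing to store: `()` is zero-sized.
    type Dir = ();
}

/// The field type, and therefore the struct's size, depends on the marker.
pub struct Edge<Ty: EdgeKind> {
    pub a: u32,
    pub b: u32,
    pub direction: Ty::Dir,
}

/// Sizes of a directed and an undirected edge, in that order.
pub fn edge_sizes() -> (usize, usize) {
    (size_of::<Edge<Directed>>(), size_of::<Edge<Undirected>>())
}
```

An `Edge<Undirected>` is built with `direction: ()` and takes only the space of its two `u32` endpoints, while an `Edge<Directed>` also pays for the `Direction` plus padding.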