SlotMap IDs with types, sub-typing?

I'm thinking about how to model a document with slotmaps. My documents are sort of like HTML, but simpler. They have blocks, spans, images, lists, etc. If I do something like this:

new_key_type! { pub struct NodeId; }
struct Document {
    // every document node has data here
    nodes: SlotMap<NodeId, NodeBase>,
    // only spans have data here
    spans: SecondaryMap<NodeId, SpanData>,
    // only images have data here
    images: SecondaryMap<NodeId, ImageData>,
    ...
}

The NodeId is a loose type. I'd like to have something like ImageId or SpanId for when I know I'm dealing with that specific sub-type of node. But I don't see anything in the slotmap docs about how to handle this. It just shows SlotMaps and SecondaryMaps with a single key type.

I was thinking something like...

enum NodeId {
    Span(SpanId),
    Block(BlockId),
    Image(ImageId),
    ...
}

But then I have no idea how to connect that to the SlotMap and SecondaryMap types.

I haven't used slotmap, but from the docs I don't think this would work. A secondary map has the same key as the primary map, so having a different key type for the secondary map doesn't make sense to me.

When you access a secondary map, the type of the node (image, span) is part of the type for the secondary map, e.g., SecondaryMap<NodeId, SpanData>. So you already have strong type checking for accessing the node type stored in each secondary map.

If I did this in C++, OOP style, then instead of NodeId I might pass around a pointer to the base class Node. Then there would be subclasses of Node like Block, Span, Image, etc. In cases where I knew I was dealing with a specific type, I could use pointers to subclasses instead of pointers to the base, i.e., Block*, Image*, etc., instead of Node*. In Rust, having a SecondaryMap is kind of like having a subtype. It would be nice if that could be reflected in the ID type.

An OOP style "table" node could be like:

struct Table {
    std::vector<TableRow*> rows;
};

But here in Rust I'm losing type safety:

struct Table {
    rows: Vec<NodeId>
}

So something like TableRowId would be created by the owning Document and represent a guarantee that there was something in a SecondaryMap<?, TableRowStuff> that contained the extra info specific to table row nodes.

I think I understand your point, but there is no subtype relationship in terms of Rust types. Having a primary slot map with one or more secondary slot maps is semantically no different from having multiple HashMaps with the same key values, just implemented in a more optimized way.

By having a custom key type per group (primary map + its secondary maps) you create some level of type safety by preventing accidental access to one group using the keys of another group. That's all that is possible with the slotmap crate.

(This is unrelated to your specific question, but also note that in general, with Rust there is no subtyping in the sense of C++ subtyping. So it takes some getting used to when coming from C++.)

One possibility is to create a struct that wraps a group of slot maps (primary + secondaries) representing a document. The struct would have methods that add nodes of various types and that would provide the guarantee that one (or more?) secondary map(s) and the base primary map is populated as appropriate for each node type. Each such method could wrap the slot map's id in a struct that is specific to the node type, i.e., ImageId, and return that type. The struct would also include methods for accessing the data for a given type, given an id parameter for that type.
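
A rough sketch of that wrapper idea, for illustration only. To keep it dependency-free it uses std HashMaps and a counter as a stand-in for the SlotMap and SecondaryMaps (so no generations or key reuse), and every name here (insert_base, insert_image, image, node, the data structs and their fields) is made up for the example, not something from slotmap:

```rust
use std::collections::HashMap;

// Stand-in for a slotmap key. A real version would use `new_key_type!`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct NodeId(u32);

// Typed IDs with private fields: only `Document` can create them.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ImageId(NodeId);
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct SpanId(NodeId);

// "Upcasting" a typed ID to the loose one is always allowed.
impl From<ImageId> for NodeId {
    fn from(id: ImageId) -> Self { id.0 }
}
impl From<SpanId> for NodeId {
    fn from(id: SpanId) -> Self { id.0 }
}

// Hypothetical payloads, just enough to have something to store.
#[derive(Debug, Default)]
pub struct NodeBase { pub parent: Option<NodeId> }
#[derive(Debug)]
pub struct ImageData { pub url: String }
#[derive(Debug)]
pub struct SpanData { pub text: String }

#[derive(Default)]
pub struct Document {
    next: u32,
    nodes: HashMap<NodeId, NodeBase>,   // stand-in for SlotMap<NodeId, NodeBase>
    images: HashMap<NodeId, ImageData>, // stand-in for SecondaryMap<NodeId, ImageData>
    spans: HashMap<NodeId, SpanData>,   // stand-in for SecondaryMap<NodeId, SpanData>
}

impl Document {
    // Private: every node gets a base entry first.
    fn insert_base(&mut self) -> NodeId {
        let id = NodeId(self.next);
        self.next += 1;
        self.nodes.insert(id, NodeBase::default());
        id
    }

    // Populates the base map and the image map, then hands back a typed ID.
    pub fn insert_image(&mut self, url: &str) -> ImageId {
        let id = self.insert_base();
        self.images.insert(id, ImageData { url: url.to_string() });
        ImageId(id)
    }

    pub fn insert_span(&mut self, text: &str) -> SpanId {
        let id = self.insert_base();
        self.spans.insert(id, SpanData { text: text.to_string() });
        SpanId(id)
    }

    // Typed accessors: an ImageId can only be used to look up image data.
    pub fn image(&self, id: ImageId) -> Option<&ImageData> {
        self.images.get(&id.0)
    }
    pub fn span(&self, id: SpanId) -> Option<&SpanData> {
        self.spans.get(&id.0)
    }

    // Base data is reachable from any ID type.
    pub fn node(&self, id: impl Into<NodeId>) -> Option<&NodeBase> {
        self.nodes.get(&id.into())
    }
}
```

The accessors still return Option because a typed ID only records that the entry existed when the ID was handed out; if nodes can be removed later, the ID can go stale.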


As far as I can tell, you can use different key types for slotmap by just roundtripping through the data method on Key and the required From<KeyData> implementation. It will be up to you to keep all the secondary map keys in sync with the primary node storage situation, but that's true even if you use the same key type.

It also means there's no real type safety encapsulation if you expose slotmap::Key implementors to downstream: they can convert between your key types too.

Another challenge with this intuition is that the source of truth about a slotmap key is the base case (the primary map is what hands out every key), so you would have to "downcast" your keys from NodeId to SpanId, etc., every time you got a new key. So you'll need some sort of "privileged" code (even if freely converting between all key types wasn't possible).

One way to approximate this would be to use a private enum as the "base class", with private key types too, and only allow infallible "downcasting" where the slotmap is used directly.

new_key_type! { struct ImageKey; } // etc
enum NodeKey {
    // Used for Default and insertions into Document.nodes
    Bare(slotmap::KeyData),
    Span(SpanKey),
    Image(ImageKey),
}
impl From<slotmap::KeyData> for NodeKey { .. }
// delegate to variant payload
unsafe impl slotmap::Key for NodeKey { .. }

// Private "forceful downcasting" with more meaningful semantics
impl NodeKey {
    fn to_image(self) -> ImageKey {
        self.data().into()
    } // etc
}

Then for the public interface, wrap them up and limit their interface.

// Public versions (private fields)
pub struct ImageId(ImageKey); // etc
pub struct NodeId(NodeKey);
// Public upcasting
impl From<ImageId> for NodeId { .. } // etc
// Fallible and public "downcasting"
impl TryFrom<NodeId> for ImageId { .. } // etc
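
For what it's worth, here is roughly how the elided From/TryFrom bodies could look. This is only a sketch under assumptions: RawKey is a made-up stand-in for slotmap::KeyData so the snippet compiles without the crate, WrongKind is a hypothetical error type, and the variants hold the raw key directly instead of separate private key types.

```rust
use std::convert::TryFrom;

// Hypothetical stand-in for slotmap::KeyData.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct RawKey(u32);

// Private "base class" key that remembers which kind of node it names.
#[allow(dead_code)] // `Bare` is only needed for insertions, not shown here
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum NodeKey {
    Bare(RawKey),
    Span(RawKey),
    Image(RawKey),
}

// Public IDs with private fields.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NodeId(NodeKey);
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ImageId(RawKey);
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SpanId(RawKey);

// Hypothetical error for a failed "downcast".
#[derive(Debug, PartialEq, Eq)]
pub struct WrongKind;

// Public upcasting: always succeeds and keeps the kind tag.
impl From<ImageId> for NodeId {
    fn from(id: ImageId) -> Self { NodeId(NodeKey::Image(id.0)) }
}
impl From<SpanId> for NodeId {
    fn from(id: SpanId) -> Self { NodeId(NodeKey::Span(id.0)) }
}

// Public downcasting: succeeds only when the tag matches.
impl TryFrom<NodeId> for ImageId {
    type Error = WrongKind;
    fn try_from(id: NodeId) -> Result<Self, WrongKind> {
        match id.0 {
            NodeKey::Image(raw) => Ok(ImageId(raw)),
            _ => Err(WrongKind),
        }
    }
}
impl TryFrom<NodeId> for SpanId {
    type Error = WrongKind;
    fn try_from(id: NodeId) -> Result<Self, WrongKind> {
        match id.0 {
            NodeKey::Span(raw) => Ok(SpanId(raw)),
            _ => Err(WrongKind),
        }
    }
}
```

Because the kind lives in the key itself, the fallible conversion needs no lookup in the document.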

Then on Document you would do your downcasting where appropriate.

impl Document {
    // private
    fn insert_bare(&mut self, ..) -> NodeKey {
        self.nodes.insert(..)
    }

    // Public.  Consumers can use `.into()` to get a `NodeId`
    pub fn insert_image(&mut self, ..) -> ImageId {
        let ik = self.insert_bare(..).to_image();
        self.images.insert(ik, ..);
        ImageId(ik)
    }

    // You can take `NodeId` where appropriate, or use generics
    pub fn do_node_stuff<Key: Into<NodeId>>(&mut self, key: Key) {
        self.actually_do_node_stuff(key.into().0)
    }

    // This also reduces monomorph bloat (if you make `do_node_stuff`
    // generic for ergonomic reasons).
    fn actually_do_node_stuff(&mut self, key: NodeKey) {
         // the actual logic
    }
}

There actually is a way to do something like this with real subtyping, but it is very non-idiomatic, baroque, and leads to awful error messages, so I can't recommend it.

It is not clear what you are trying to achieve here. NodeId is the key type used for every table precisely to show that all the data under a given key belongs to the same node. How would you even use separately typed IDs?
Different IDs would make sense if the document structure were modelled more loosely, with an indirection:

new_key_type! { pub struct NodeId; }
new_key_type! { pub struct SpanId; }
new_key_type! { pub struct ImageId; }

enum Node {
    Span(SpanId),
    Image(ImageId),
}

struct Document {
    nodes: SlotMap<NodeId, Node>,
    spans: SlotMap<SpanId, SpanData>,
    images: SlotMap<ImageId, ImageData>,
}

Now, spans and images have their own ids and independent storage, and the property that all data is connected by a single NodeId is lost.
I'm not advocating one solution over the other, I'm just trying to show that they are different and keeping the same Id for different data is meaningful.
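
To make the difference concrete, here is a small sketch of how lookups go through the indirection. It is an illustration under assumptions, not slotmap code: plain Vecs with index-based IDs stand in for the three SlotMaps (so no removal and no generations), and add_span, add_image, span_of and the data fields are names invented for the example.

```rust
// Index-based stand-ins for the three slotmap key types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NodeId(usize);
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SpanId(usize);
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ImageId(usize);

// A node is only a typed pointer into one of the per-kind stores.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Node {
    Span(SpanId),
    Image(ImageId),
}

// Hypothetical payloads.
#[derive(Debug)]
pub struct SpanData { pub text: String }
#[derive(Debug)]
pub struct ImageData { pub url: String }

#[derive(Default)]
pub struct Document {
    nodes: Vec<Node>,       // stand-in for SlotMap<NodeId, Node>
    spans: Vec<SpanData>,   // stand-in for SlotMap<SpanId, SpanData>
    images: Vec<ImageData>, // stand-in for SlotMap<ImageId, ImageData>
}

impl Document {
    // Each insertion produces two unrelated IDs: one per store.
    pub fn add_span(&mut self, text: &str) -> (NodeId, SpanId) {
        let span = SpanId(self.spans.len());
        self.spans.push(SpanData { text: text.to_string() });
        let node = NodeId(self.nodes.len());
        self.nodes.push(Node::Span(span));
        (node, span)
    }

    pub fn add_image(&mut self, url: &str) -> (NodeId, ImageId) {
        let image = ImageId(self.images.len());
        self.images.push(ImageData { url: url.to_string() });
        let node = NodeId(self.nodes.len());
        self.nodes.push(Node::Image(image));
        (node, image)
    }

    // Direct, typed access when the kind is already known.
    pub fn span(&self, id: SpanId) -> Option<&SpanData> {
        self.spans.get(id.0)
    }
    pub fn image(&self, id: ImageId) -> Option<&ImageData> {
        self.images.get(id.0)
    }

    // Going from a loose NodeId to span data takes two hops.
    pub fn span_of(&self, id: NodeId) -> Option<&SpanData> {
        match self.nodes.get(id.0)? {
            Node::Span(span) => self.span(*span),
            Node::Image(_) => None,
        }
    }
}
```

Note that the NodeId and the SpanId of the same span are unrelated values here, which is exactly the lost "single NodeId" property mentioned above.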