Polymorphism with traits


#1

I’m looking to write a set of traits that make any data structure look like an XML DOM tree. This would allow easy reading, writing, and querying. I’ve come up with these traits and wonder whether they are idiomatic.

pub trait Node<'a> {
    fn as_attribute(&self) -> Option<&Self>
    where
        Self: Attribute<'a>,
    {
        None
    }
    fn as_element(&self) -> Option<&Self>
    where
        Self: Element<'a>,
    {
        None
    }
    fn as_text(&self) -> Option<&Self>
    where
        Self: Text<'a>,
    {
        None
    }
}

pub trait QName {
    fn namespace_uri(&self) -> Option<&str>;
    fn local_part(&self) -> &str;
}

pub trait NamedNode<'a>: Node<'a> {
    type QName: QName;
    fn name(&self) -> &Self::QName;
}

pub trait Attribute<'a>: NamedNode<'a> {
    type AttributeValue: Into<String>;
    fn value(&self) -> &Self::AttributeValue;
}

pub trait Element<'a>: NamedNode<'a> {
    type Attribute: Attribute<'a> + 'a;
    type AttributeIter: Iterator<Item = &'a Self::Attribute>;
    type Child: Node<'a> + 'a;
    type ChildIter: Iterator<Item = &'a Self::Child>;

    fn attributes(&'a self) -> Self::AttributeIter;
    fn children(&'a self) -> Self::ChildIter;
}

pub trait Text<'a>: Node<'a> {
    fn data(&self) -> &str;
}

I’ve not managed to provide a blanket implementation of Node for every Element, every Attribute and every Text, since those implementations conflict with each other: the compiler does not know that an Element is never a Text. So each type that implements Element would also need to implement Node by hand.

Perhaps there’s already a crate with similar traits.


#2

I’m not sure about existing crates, but have you considered modeling the DOM with an enum rather than using traits?
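
For illustration, a minimal sketch of what an enum-based model could look like. The names `Node`, `Element`, and `text_content` are made up for this example and are not taken from any existing crate; attributes are simplified to name/value string pairs.

```rust
// Hypothetical enum-based DOM; all names here are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Element(Element),
    Text(String),
}

#[derive(Debug, Clone, PartialEq)]
pub struct Element {
    pub name: String,
    pub attributes: Vec<(String, String)>,
    pub children: Vec<Node>,
}

impl Node {
    /// Returns the element if this node is one.
    pub fn as_element(&self) -> Option<&Element> {
        match self {
            Node::Element(e) => Some(e),
            Node::Text(_) => None,
        }
    }

    /// Returns the text data if this node is a text node.
    pub fn as_text(&self) -> Option<&str> {
        match self {
            Node::Element(_) => None,
            Node::Text(t) => Some(t.as_str()),
        }
    }

    /// Concatenates all text beneath this node, depth first.
    pub fn text_content(&self) -> String {
        match self {
            Node::Text(t) => t.clone(),
            Node::Element(e) => e.children.iter().map(Node::text_content).collect(),
        }
    }
}
```

Because the set of variants is closed, the compiler knows that an element is never a text node, and the `as_*` accessors need no trait machinery.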


#3

@vitalyd Yes, a concrete implementation might use enums. These traits could be implemented by those enums. The idea is to have traits so that DOM-related algorithms can be reused for different data models.


#4

I think one would generally model the entire DOM as a single enum. That would make implementing your traits messy, if not impossible (at least as they’re specified here) because not every variant of this enum fits the trait. Splitting the DOM into multiple enums seems gnarly and likely to lead to spaghetti code. But maybe I didn’t think this through hard enough.

What types of (DOM-like) data models are you thinking of abstracting over?

As for the existing traits, you might be able to use specialization to provide default method impls but then specialize them for concrete types.


#5

The goal is to have a DOM for rich XML documents. I’d like to derive the code from Relax NG, which can lead to hundreds of classes and attributes. The generated code would make access to the XML data model type-safe, so that nesting and cardinality rules are enforced.


#6

Traits are fundamentally open-world entities: setting up multiple blanket implementations is generally problematic. I think there are ways to use macros to make implementations less tedious without running into this issue.

In any case, I don’t think this function achieves what you (likely) intended:

    fn as_attribute(&self) -> Option<&Self>
    where
        Self: Attribute<'a>,
    {
        None
    }

The where clause does not provide evidence that Self: Attribute<'a>. Rather, it demands evidence upfront (whether the method returns Some or None has no bearing on this). Therefore, you can’t really use it to test at run time whether something is an attribute or not: if it isn’t, compilation simply fails.
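
A small sketch of that behaviour, with the lifetime parameter dropped for brevity. The types `Attr` and `TextNode` and the function `demo` are hypothetical and exist only for this example.

```rust
pub trait Attribute {
    fn value(&self) -> &str;
}

pub trait Node {
    // Callable only on types statically known to implement `Attribute`.
    fn as_attribute(&self) -> Option<&Self>
    where
        Self: Attribute,
    {
        None
    }
}

// Hypothetical node types for the demonstration.
pub struct Attr(pub String);
pub struct TextNode(pub String);

impl Attribute for Attr {
    fn value(&self) -> &str {
        &self.0
    }
}
impl Node for Attr {}
impl Node for TextNode {}

pub fn demo() -> bool {
    let attr = Attr("x".to_string());
    // Compiles because `Attr: Attribute` holds; the default body still
    // returns `None`, so the call tells us nothing at run time.
    let is_none = attr.as_attribute().is_none();

    // Rejected at compile time, because `TextNode: Attribute` does not hold:
    // let text = TextNode("t".to_string());
    // let _ = text.as_attribute();

    is_none
}
```

So the bound acts as a compile-time gate on who may call the method, not as a run-time check.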

What you probably want instead is something along the lines of MOPA: My Own Personal Any. That gives you the ability to conditionally downcast trait objects at run time.
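
As a rough sketch of the idea, here is the same kind of conditional downcast done with `std::any::Any` from the standard library instead of the mopa crate. The `as_any` method, the two node structs, and `text_of` are assumptions made for this example.

```rust
use std::any::Any;

pub trait Node {
    // Each implementor exposes itself as `&dyn Any` so callers can downcast.
    fn as_any(&self) -> &dyn Any;
}

// Hypothetical concrete node types.
pub struct ElementNode {
    pub name: String,
}
pub struct TextNode {
    pub data: String,
}

impl Node for ElementNode {
    fn as_any(&self) -> &dyn Any {
        self
    }
}
impl Node for TextNode {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Returns the text data if `node` is a `TextNode`, checked at run time.
pub fn text_of(node: &dyn Node) -> Option<&str> {
    node.as_any()
        .downcast_ref::<TextNode>()
        .map(|t| t.data.as_str())
}
```

Note that this works on a borrowed `&dyn Node`, so no `Box` is required, but it can only downcast to concrete types, not to another trait.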


#7

The MOPA documentation, limerick included, is very nice. I’d like to avoid using Box, though. Here is an attempt that hands out references through an enum instead:

pub enum NodeType<'a> {
    Element(&'a Element<'a>),
    Text(&'a Text),
}

pub trait Node<'a> {
    fn node(&'a self) -> NodeType<'a>;
    fn as_element(&'a self) -> Option<&'a Element> {
        match self.node() {
            NodeType::Element(e) => Some(e),
            NodeType::Text(_) => None,
        }
    }
    fn as_text(&'a self) -> Option<&'a Text> {
        match self.node() {
            NodeType::Element(_) => None,
            NodeType::Text(t) => Some(t),
        }
    }
}

pub trait Element<'a> {
    type ChildIter: Iterator<Item = &'a Node<'a>>;
    fn children(&'a self) -> Self::ChildIter;
}
pub trait Text {
    fn data(&self) -> &str;
}

The problem is that the associated type ChildIter has to be known in advance wherever Element is used as a trait object. The following version avoids the associated type, but it has no iterator:

pub enum NodeType<'a> {
    Element(&'a Element<'a>),
    Text(&'a Text),
}
    
pub trait Node<'a> {
    fn node(&'a self) -> NodeType<'a>;
    fn as_element(&'a self) -> Option<&'a Element> {
        match self.node() {
            NodeType::Element(e) => Some(e),
            NodeType::Text(_) => None,
        }
    }
    fn as_text(&'a self) -> Option<&'a Text> {
        match self.node() {
            NodeType::Element(_) => None,
            NodeType::Text(t) => Some(t),
        }
    }
}

pub trait Element<'a> {
    fn child(&'a self, child: usize) -> &'a Node<'a>;
    fn child_count(&self) -> usize;
}       
pub trait Text {
    fn data(&self) -> &str;
}

It would be nice if the children of Element could be provided by an iterator.


#8

Here is a (failing) attempt at adding an iterator that just calls child_count() and child() to iterate over Element children. It fails with the compiler error shown after the code.

struct ElementChildIterator<'a> {
    element: &'a Element<'a>,
    pos: usize
}
impl<'a> Iterator for ElementChildIterator<'a> {
    type Item = &'a Node<'a>;
    fn next(&mut self) -> Option<Self::Item> {
        if self.pos < self.element.child_count() {
            self.pos += 1;
            Some(self.element.child(self.pos - 1))
        } else {
            None
        }
    }
}
pub trait Element<'a> {
    fn child(&'a self, child: usize) -> &'a Node<'a>;
    fn child_count(&self) -> usize;
    fn children(&self) -> ElementChildIterator<'a> {
        ElementChildIterator {
            element: self,
            pos: 0
        }
    }
}

error[E0277]: the trait bound `Self: std::marker::Sized` is not satisfied
   |
93 |             element: self,
   |                      ^^^^ `Self` does not have a constant size known at compile-time
   |
   = help: the trait `std::marker::Sized` is not implemented for `Self`
   = help: consider adding a `where Self: std::marker::Sized` bound
   = note: required for the cast to the object type `trai::Element<'_>`

error: aborting due to previous error

I do not know why it is giving this error, because the member element of ElementChildIterator is a reference and should simply have the size of a pointer.


#9

The thread “Specifying that trait should be object-safe for purpose of casting to object” talks about why this doesn’t work.
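
In short: inside a default method, Self may itself be an unsized type, and turning `&Self` into a trait-object reference needs a sized source type so that a vtable can be attached. A minimal sketch of one way around it follows, assuming simplified traits without the lifetime parameter. `TextNode`, `Paragraph`, and `ElementChildIterator::new` are hypothetical additions for the example.

```rust
pub trait Node {
    fn as_text(&self) -> Option<&str> {
        None
    }
}

pub trait Element {
    fn child(&self, index: usize) -> &dyn Node;
    fn child_count(&self) -> usize;

    // `Self: Sized` is what allows `self` to coerce to `&dyn Element`.
    // The method is then not callable on `dyn Element` itself.
    fn children(&self) -> ElementChildIterator<'_>
    where
        Self: Sized,
    {
        ElementChildIterator { element: self, pos: 0 }
    }
}

pub struct ElementChildIterator<'a> {
    element: &'a dyn Element,
    pos: usize,
}

impl<'a> ElementChildIterator<'a> {
    // Entry point for callers that only hold a trait object.
    pub fn new(element: &'a dyn Element) -> Self {
        ElementChildIterator { element, pos: 0 }
    }
}

impl<'a> Iterator for ElementChildIterator<'a> {
    type Item = &'a dyn Node;

    fn next(&mut self) -> Option<Self::Item> {
        if self.pos < self.element.child_count() {
            let node = self.element.child(self.pos);
            self.pos += 1;
            Some(node)
        } else {
            None
        }
    }
}

// Hypothetical concrete types to exercise the iterator.
pub struct TextNode(pub String);
impl Node for TextNode {
    fn as_text(&self) -> Option<&str> {
        Some(&self.0)
    }
}

pub struct Paragraph {
    pub parts: Vec<TextNode>,
}
impl Element for Paragraph {
    fn child(&self, index: usize) -> &dyn Node {
        &self.parts[index]
    }
    fn child_count(&self) -> usize {
        self.parts.len()
    }
}
```

With this, `paragraph.children()` works on any sized implementor, and code that only has a `&dyn Element` can call `ElementChildIterator::new` directly. Neither path needs a `Box`.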