Building a runtime reflection system for Rust 🦀️ (Part 2)

Hey all! The second part of our series on building a reflection system is ready. In this one, we look at the various options (or lack thereof) for accessing attributes of Rust structs dynamically at runtime. Hope you enjoy!


The approach I’m using for a similar problem is defining a newtype that implements a Column trait for each attribute. The reflection API then is something like this:

trait HasColumn<C: Column> {
    fn col_ref(&self) -> &C;
    fn col_mut(&mut self) -> &mut C;
}

trait TryColumn {
    fn col_opt<C: Column>(&self) -> Option<&C>;
    fn col_opt_mut<C: Column>(&mut self) -> Option<&mut C>;
}

(Column and not Attribute in my case because I’m implementing relational algebra)

For your case, you’d need to solve the string to type problem before this is useful, but it sounds like that may have been part 1.

Oh cool! Yeah, the return type is converted in the shim methods into our internal types, so that is fine.

Do you have any tricks for automatically implementing either of the traits?

I’m using some relatively esoteric type-system tricks to pull that off: TryColumn for me is really Record, and it has an associated type that’s a frunk-style HList of all the columns that are present. There’s a blanket implementation of HasColumn based on this, which defers to col_opt and panics if it returns None (with a feature to use unreachable_unchecked instead).

Column implies Any, so col_opt can match on TypeId to do its work. For things like projections, column renames, and joins, I write adapter objects that delegate col_opt to their contained records.
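A minimal sketch of the TypeId-based lookup described above, assuming Column has Any as a supertrait. The names Name, Age, Height, Person and Join are hypothetical and only illustrate the idea; the HList machinery and the HasColumn blanket implementation are left out.

```rust
use std::any::Any;

/// Marker trait for column newtypes. `Any` as a supertrait allows downcasting.
trait Column: Any {}

// Hypothetical column newtypes, one per attribute.
#[derive(Debug, PartialEq)]
struct Name(String);
#[derive(Debug, PartialEq)]
struct Age(u32);
#[derive(Debug, PartialEq)]
struct Height(u32); // deliberately not stored in `Person`

impl Column for Name {}
impl Column for Age {}
impl Column for Height {}

trait TryColumn {
    fn col_opt<C: Column>(&self) -> Option<&C>;
}

struct Person {
    name: Name,
    age: Age,
}

impl TryColumn for Person {
    fn col_opt<C: Column>(&self) -> Option<&C> {
        // `downcast_ref` compares `TypeId`s, so at most one field matches `C`.
        (&self.name as &dyn Any)
            .downcast_ref::<C>()
            .or_else(|| (&self.age as &dyn Any).downcast_ref::<C>())
    }
}

/// Hypothetical adapter: a join delegates `col_opt` to the records it contains.
struct Join<L, R>(L, R);

impl<L: TryColumn, R: TryColumn> TryColumn for Join<L, R> {
    fn col_opt<C: Column>(&self) -> Option<&C> {
        self.0.col_opt::<C>().or_else(|| self.1.col_opt::<C>())
    }
}
```

With this shape, person.col_opt::<Age>() returns Some for a column the record holds and None for one it lacks, and an adapter such as Join needs no knowledge of the concrete columns.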

I’ve been meaning to clean things up a bit and push the code to GitHub; maybe I can make that happen this week.

While I appreciate the work you put into this, I think that generating accessors would be way less complicated and more idiomatic using a derive macro that directly knows about fields and is able to return references to them without further indirection.

Thanks for the feedback! That is the desired end experience for users.

You can currently do:

#[derive(PolarClass)]
struct Foo {
    #[polar(attribute)]
    x: u32
}

and the derive macro does *whatever it needs to* to make this work.

We've discussed making the derive macro register all attributes by default, and making them skippable, stuff like that.

There's still the question of how you implement that under the surface, which this article covers. How would you "return references without further indirection" for example? Would you derive a single get(&self, attr: &str) method? What would the return type be? In our case it could be Box<dyn ToPolar>.

What I like about the current option is that you can mix and match. The derive macro can register as many attributes as you want, but you can still provide custom attribute accessors, if necessary.

Yes, that's probably how I'd design it, personally, since we are talking about dynamic lookup (and a derive is supposed to implement a specific trait, so I would consider it surprising if for example it generated a different set of functions for different types with a different set of fields).

I would probably make it return &dyn Trait, so that no allocation/copying needs to be performed. As to what the specific trait can be, it could be Any, since again, we are talking about dynamic lookup, which requires (somewhat) dynamic typing. Or it could be a custom trait that knows how to downcast and/or provide generic operators more ergonomically than Any can. Or, if working with a closed set of types, it could be an enum that wraps some primitive type or a reference to one.
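A rough sketch of the first option, a single string-keyed getter that returns &dyn Any. The trait name Reflect and the hand-written impl are assumptions standing in for whatever a derive macro would actually generate; Foo mirrors the struct from the earlier post, with a second field added for illustration.

```rust
use std::any::Any;

/// Hypothetical trait that a derive macro could implement for each struct.
trait Reflect {
    /// Looks up a field by name and returns a borrowed, type-erased reference.
    fn get(&self, attr: &str) -> Option<&dyn Any>;
}

struct Foo {
    x: u32,
    label: String,
}

// Written by hand here; a derive would emit one match arm per field.
impl Reflect for Foo {
    fn get(&self, attr: &str) -> Option<&dyn Any> {
        match attr {
            "x" => Some(&self.x as &dyn Any),
            "label" => Some(&self.label as &dyn Any),
            _ => None,
        }
    }
}
```

The caller recovers the concrete type with downcast_ref, for example foo.get("x").and_then(|v| v.downcast_ref::<u32>()). Nothing is allocated or copied, since the returned reference borrows directly from the struct.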


Nice read!! A few comments:

The entire instance is pinned: this method takes Pin<&mut Self>.

Careful with Pin, and careful about advertising it too much: it is a subtle wrapper. For instance, the statement "it is pinned because the method takes Pin<&mut Self>" is wrong on its own: a value is pinned only if it sits behind a Pin-wrapped pointer, such as Pin<&mut Self>, and is also Pin-aware, that is, unable to unpin itself, that is, !Unpin.

  • And then once you get something correctly pinned, the whole thing becomes less ergonomic to work with. So I'd definitely favor the getter approach (which gets to be shared by all Instances of the same Class, whereas the self-referential pointers required every pinned Instance to carry them around).
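To make the Unpin caveat concrete, here is a small sketch with two hypothetical types, Movable and Anchored. For an Unpin type, Pin<&mut T> gives no pinning guarantee at all, because safe code can get the plain &mut back; only the type that opts out via PhantomPinned is actually held in place.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// All fields are `Unpin`, so this type is `Unpin` too.
struct Movable {
    data: String,
}

/// `PhantomPinned` makes this type `!Unpin`.
struct Anchored {
    data: String,
    _pin: PhantomPinned,
}

/// Safe code can "unpin" an `Unpin` value and move its contents out.
fn take_out(pinned: Pin<&mut Movable>) -> String {
    // `Pin::get_mut` is only available when the target is `Unpin`.
    let inner: &mut Movable = pinned.get_mut();
    std::mem::take(&mut inner.data)
}

/// For a `!Unpin` value, safe code only gets shared access to the contents.
fn read_anchored(pinned: Pin<&mut Anchored>) -> usize {
    // `pinned.get_mut()` would not compile here: `Anchored` is not `Unpin`.
    pinned.as_ref().get_ref().data.len()
}
```

Movable can be pinned with Pin::new(&mut value), which itself requires Unpin, while Anchored needs something like Box::pin. Getting a &mut Anchored out of the pin requires unsafe code, which is where the ergonomics start to suffer.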

An alternative to self-referential pointers is offset-based pointers, which in this case, now that I think of it, could make a lot of sense:

  • the offsets are type-based, so they can be stored within the class,

  • the offsets can be computed by a (proc-) macro,

  • they single-handedly [1] solve the issues pointed out by:

[1] At the meager cost of a #[repr(C)] attribute

Obviously, all this would be a micro-optimization more than anything else, but maybe one worth keeping in the back of one's head :slightly_smiling_face:
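The offset idea could look roughly like the sketch below. It assumes Rust 1.77 or later for std::mem::offset_of!, and the names Point, POINT_ATTRS, lookup and read_u64_at are hypothetical. The table is written by hand here where a proc-macro would generate it.

```rust
use std::mem::offset_of;

#[repr(C)]
struct Point {
    x: u32,
    y: u64,
}

/// Per-class table of (attribute name, byte offset), shared by every instance.
const POINT_ATTRS: &[(&str, usize)] = &[
    ("x", offset_of!(Point, x)),
    ("y", offset_of!(Point, y)),
];

/// Finds the byte offset registered for an attribute name.
fn lookup(name: &str) -> Option<usize> {
    POINT_ATTRS
        .iter()
        .find(|entry| entry.0 == name)
        .map(|entry| entry.1)
}

/// Reads a `u64` field located `offset` bytes into `instance`.
///
/// # Safety
/// `offset` must be the offset of a `u64` field of `Point`.
unsafe fn read_u64_at(instance: &Point, offset: usize) -> u64 {
    *(instance as *const Point as *const u8)
        .add(offset)
        .cast::<u64>()
}
```

Because an offset is relative to the start of the instance, it stays valid when the instance moves, so no pinning is needed. A real version would also have to record each field's type next to its offset so that the read can be checked.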


Another micro-optimization (super tiny, to be fair): I feel like this could be using fn(&Instance) -> PolarValue pointer-types rather than Arc<dyn Fn...>. It will save a few atomic {in,de}crements and one level of pointer indirection, and reduce the size of its containing struct by one usize :slightly_smiling_face:
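A small sketch of that size difference. Instance and PolarValue below are hypothetical stand-ins and not the article's real types, and get_x is an example getter.

```rust
use std::mem::size_of;
use std::sync::Arc;

// Hypothetical stand-ins for the article's types.
struct Instance {
    x: u32,
}

#[derive(Debug, PartialEq)]
enum PolarValue {
    Integer(i64),
}

/// Reference-counted closure: a fat pointer (data + vtable) to a heap allocation.
type SharedGetter = Arc<dyn Fn(&Instance) -> PolarValue>;
/// Plain function pointer: a single thin pointer, with no allocation or refcount.
type PlainGetter = fn(&Instance) -> PolarValue;

fn get_x(instance: &Instance) -> PolarValue {
    PolarValue::Integer(i64::from(instance.x))
}

/// Returns the sizes in bytes of (PlainGetter, SharedGetter).
fn getter_sizes() -> (usize, usize) {
    (size_of::<PlainGetter>(), size_of::<SharedGetter>())
}
```

Non-capturing closures also coerce to fn pointers, so the plain form still works for generated getters. It stops working as soon as a getter needs to capture state, which is the flexibility Arc<dyn Fn> pays for.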


See, this is why I love sharing this stuff. The feedback is always so helpful + interesting. :heart:

Thanks for the pointer on Pin. I was following the std::pin documentation from here: https://doc.rust-lang.org/std/pin/index.html#example-self-referential-struct on the self-referential struct. I guess the reason it doesn't show up in the above is that String is !Unpin? You're right, I should have been way clearer with the caveats - I got something working, and decided it was definitely not worth trying to make it correct!

That's a great alternative I didn't talk about (or implement). It did cross my mind when writing that exact section you quoted. I decided the #[repr(C)] was too intrusive, but honestly I don't have much rationale for why.

This is super cool... I'll be sure to keep this in mind once we're needing to squeeze out those last few nanoseconds and bytes :wink: