I don’t see why this has to be unsafe. You could, for instance, save type-erased getter and setter functions for each field that use nom to read from/write to the appropriate buffer position. You might also want to compile to some kind of bytecode so that the field-name string comparisons happen at compile time instead of runtime.
In my current understanding, when Rust code interfaces with C structs via ffi, we need unsafe.
In the Vec<u8> approach, we need to be able to allocate, initialize, read/write fields, and de-allocate things sorta like C-structs.
Because of #2, my intuition is "lots of unsafe is required for this."
I see how one could make the counterargument "well, this is different, since Rust is doing the reading, writing, and allocation, and has more info" -- but it is not obvious to me how to do this in code (nor have I seen any crate/tutorial for doing this).
FYI - Python objects are backed by a special hashmap where obj.x is translated to obj.__dict__["x"] (plus/minus some hooks like __getattribute__), and its performance characteristics are often good enough.
Also, if all strings are interned by your runtime, your string hashing and comparisons turn into a trivial std::ptr::eq() call.
If the hashmap lookup is still too expensive, you could add a small-size optimisation which uses a Vec<(String, UserValue)> and linear search when an object only has a few fields/methods (e.g. 20).
This is my fault for never formally stating this earlier, but by 'efficient', I was hoping that something like a.b.c.d gets compiled, by the interpreter, into bytecode that just jumps to the right memory location in one step.
In a Vec<u8> representation this would just be 12 bytes. So I would prefer something where we just increase the stack pointer by 12 bytes and use it; whereas in the approach you suggested, we would allocate a Vec.
Sorry if Vec<u8> is the wrong term -- what would be a better description of this?
I think one thing we would both agree on is that the way C lays out structs and the way your proposal lays out structs is not the same: in the C approach, everything is one contiguous block of memory, whereas your approach allocates things on the heap.
OK in that case you can't use anything like the original enum because enums have padding. Just allocate one big Vec<u8> for your stack, have a stack pointer, etc, and load values by using things like u64::from_ne_bytes (which doesn't require unsafe).
But if you're doing things like that then you're really micro-optimizing your interpreter, so it's no longer a "tiny interpreter".
This is generally not possible when a type's layout isn't known until runtime unless you have a JIT.
By "interpreter" -- and again, my terminology may be loose here, I was thinking: "something that takes as input &str and produces as output Vec<BytecodeInstr>" . You might be calling this a "JIT" (though I think it's too simple to be a "JIT").
"a.b.c.d" , given that we know the types A, B, C, D already can be compiled down to
addr(a) + (offset of b in A) + (offset of c in B) + (offset of d in C)
the addr(a) part we will only know when interpreting the BytecodeInstr
The (offset of b in A) + (offset of c in B) + (offset of d in C) part is a constant we can compute at the "&str -> Vec<BytecodeInstr>" stage
I guess your thinking goes like this: "a real JIT has to create machine code, and I just want tiny bytecode which I would then interpret". But the truth is: 90% of the work of writing a JIT which compiles to machine code is the creation of the memory model and an appropriate bytecode! After you have that, generating machine code (not very efficient machine code, but still) is more-or-less trivial.
Now, if you want to go further and add optimizations to that compiled code including profiling and other stuff… then it becomes much more complicated, sure.
But going from Vec<BytecodeInstr> to machine code is a trivial and obvious step.
That's why people are not trying to provide safe crates which make this step easy: since the next step (actual generation of machine code) is, by definition, unsafe… why try to avoid it in the first step?
My intuition, which might be wrong, is that the hard part of JIT is the optimizations part, as well as tracing and hot spotting; and that the gap between a "dumb bytecoder compiler" and a "JIT" is huge.
I guess the point about JITs was less about the JIT compilation to machine code and optimisation, and more about how the runtime needs to understand all the types that are available when a program is running so we can generate those fast accessors during execution and when an object's properties are changing.
The major thing about bytecode interpreters is that bytecodes are (generally) not statically typed. Instead, you'll have instructions like "read property X from the object at the top of the stack", and leave it up to the interpreter to figure out access and types. That almost always necessitates a hashmap or similar data structure that looks things up dynamically and stores data on the heap.
That said, if your programming language is statically typed, you know all the properties and types so the bytecode compiler can turn a.b.c into something like "load the variable at the top of the stack, push the element in the object's second slot onto the stack, push that object's third slot onto the stack".