What do you mean by "construct" here? If you haven't monomorphized that type into the binary, you can't literally construct it. And if you have monomorphized it, then you have all the layout stuff dealt with already.
What's the interface to this that you're attempting to provide?
Note also that tuples are repr(rust), so you cannot predict their layout, and thus there's no way to compute what the layout would be at runtime in order to give references to instances you've constructed yourself...
I was just fighting with the issue of how to combine N permutations of a set number of Sized types into a single ?Sized tuple in one box (i.e., reduce the number of heap pointers compared to a Vec<Box<dyn Trait>>). I started from a desired state and worked backwards from there until I reached this monstrosity [playground]
Also, the big issue I came across was that as soon as something is put in a Box, all associated types need to be scrubbed from it, or else it still carries a concrete type with it; so a Vec<Box<dyn Trait>> can only ever be monomorphized in a way that doesn't allow reconstruction. What you describe sounds like you're on your way to a giant enum...
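To illustrate the "scrubbing" point: once a value sits behind `dyn`, every associated type has to be named in the trait-object type itself, which is exactly why the concrete type can't be recovered later. A minimal sketch (my own example, not from either playground link):

```rust
// Once behind `dyn`, every associated type must be fixed in the
// trait-object type itself -- a bare `Box<dyn Iterator>` won't compile.
fn boxed_counter() -> Box<dyn Iterator<Item = u32>> {
    Box::new(0..3)
}

fn main() {
    let v: Vec<u32> = boxed_counter().collect();
    assert_eq!(v, vec![0, 1, 2]);

    // A Vec of such boxes is uniform only because Item is pinned to u32;
    // the original concrete iterator types are gone for good.
    let many: Vec<Box<dyn Iterator<Item = u32>>> =
        vec![Box::new(0..2), Box::new(vec![9, 9].into_iter())];
    let total: u32 = many.into_iter().flatten().sum();
    assert_eq!(total, 19);
}
```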
Well, the alternative option I started working towards was to make my own block allocation and divide it up to use as I see fit. That had the nice benefit that only "viewer" structs (defining locations of the data) needed to be put into an 'enum'. Also, the match to create would only require knowing N items taking up S size. It's still a very rough work in progress, but [playground]
3-6 DOF Euler, Verlet, Leapfrog, RK2, RK4 (+ other RK tableaux?) in any flavor of float type. The range of sizes for my enum solution was well over 10x, which seemed wasteful when 3-DOF Euler will cover the most common use cases.
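The waste comes from the fact that an enum is always as large as its biggest variant. A tiny demo of that size disparity, using hypothetical variant payloads I made up for illustration:

```rust
// Hypothetical integrator-state enum: the whole enum is as large as
// its biggest variant, so a small 3-DOF Euler value pays for RK4 storage.
enum State {
    Euler3([f32; 3]),        // 12 bytes of payload
    Rk4_6Dof([[f64; 6]; 4]), // 192 bytes of payload
}

fn main() {
    // The enum can never be smaller than its largest variant...
    assert!(std::mem::size_of::<State>() >= std::mem::size_of::<[[f64; 6]; 4]>());
    // ...so every Euler3 value wastes well over 10x its payload size.
    assert!(std::mem::size_of::<State>() > 10 * std::mem::size_of::<[f32; 3]>());
}
```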
Actually, I think on this point, our problems differ quite a bit. It seems like you have a bunch of different types, of possibly different size, all implementing the same trait, that you want to stuff into one continuous area of memory instead of fragmenting the heap.
My problem is a bit different in that after we fix a table, all records of a given table have the exact same size and exact same layout, and I stored a Vec where within a Vec, all the Records have precisely the same size / layout (whereas it seems you want a Vec of objects of possibly different sizes).
In my case, two different tables will likely have different sized / layout Records, but within a single table, all the Records are precisely the same layout / size.
Ahh, interesting... Well, unless I'm still missing something, it all comes down to not being able to return different types from match arms. Have you seen the tuple_list crate? As long as you can declaratively write what you want to put in it, it works (but that's a critical limitation).
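For anyone following along, the match-arm limitation looks like this, with type erasure as the usual (lossy) escape hatch; this is my own minimal sketch of the problem, not code from either of our playgrounds:

```rust
use std::fmt::Display;

// All match arms must evaluate to ONE type. Returning `u64` from one
// arm and `f64` from another is a compile error, so the usual escape
// hatch is to erase everything to a trait object.
fn pick(n: u32) -> Box<dyn Display> {
    match n {
        0 => Box::new(42u64),
        1 => Box::new(2.5f64),
        _ => Box::new("other"),
    }
}

fn main() {
    assert_eq!(pick(0).to_string(), "42");
    assert_eq!(pick(1).to_string(), "2.5");
    assert_eq!(pick(9).to_string(), "other");
}
```

Note the erasure is exactly what blocks reconstruction later: the caller only ever sees `dyn Display`.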
I think there's still quite a lot of similarity; I just happen to have the escape route of being able to fall back on traits. Eventually I'd like to reach a similar point as you - a runtime-selectable, fixed-layout tuple of structs (with all required compile-time guarantees) that can be further collected (as a uniform type) into an independent runtime manager.
I'm now wondering if procedural macros might offer some help in generating boilerplate code for building such variadic collections. I've been meaning to get around to learning those...
Here, in the `data: HashMap<String, Vec>`, the String is the column_name; everything in the column has the same size (`usize: type_size`), and its type is indicated by `u64: type_type`.
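If I understand the layout right, it's something like the following sketch. The names (`Column`, `row`, the byte encoding) are my own guesses at what you described, not your actual code:

```rust
use std::collections::HashMap;

// Hypothetical sketch of the column-first layout described above:
// each named column is raw bytes plus a per-element size and a type tag.
struct Column {
    type_size: usize, // size in bytes of one element
    type_type: u64,   // opaque tag identifying the element's type
    bytes: Vec<u8>,   // length is always a multiple of type_size
}

impl Column {
    // Raw bytes of the i-th element in this column.
    fn row(&self, i: usize) -> &[u8] {
        &self.bytes[i * self.type_size..(i + 1) * self.type_size]
    }
}

fn main() {
    let mut data: HashMap<String, Column> = HashMap::new();

    // Store two f32 values as raw little-endian bytes.
    let mut col = Column { type_size: 4, type_type: 1, bytes: Vec::new() };
    col.bytes.extend_from_slice(&1.5f32.to_le_bytes());
    col.bytes.extend_from_slice(&2.5f32.to_le_bytes());
    data.insert("mass".to_string(), col);

    let c = &data["mass"];
    let v = f32::from_le_bytes(c.row(1).try_into().unwrap());
    assert_eq!(v, 2.5);
}
```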
Because the problem has such a "clean" representation in a "column first" representation, I think there are certain structures that can be exploited in a "row first" / record representation.
I do not know if your problem satisfies this condition.
Because of (1) & (2), I think everything I care about can be represented using Cap'n Proto's schema language. I don't intend to use capnproto, but the fact it serializes so nicely again implies certain structures.
Again, I don't understand your problem well enough to comment on your case. In my case, I don't think it matters: whatever the fixed O(1) cost is, I pay it once per column. After that, all "dynamic ways to combine columns" happen at runtime.
This probably just comes from seeing so many graph/tree structures based on indices to avoid self-referential structs, but is there a way to wrap the column-first solution such that it appears/behaves as if it were row-first? Or are you specifically looking to work out a new format that is conducive to all uses?
I'd suggest that your type_size and type_type might need to be generics associated with the table: struct Table<TypeType, const TYPE_SIZE: usize>, or at least const values in some way, so they remain compile-time guaranteed. Thus, you may end up needing to encode your entire top-level "row" type in the table's type anyway, or as some form of attached constants. Also, is the u64 for type_type meant to take advantage of TypeId?
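Roughly what I mean by lifting type_size into the type (a sketch under my own naming assumptions; `Table`, `push`, and `len` are hypothetical):

```rust
// Sketch of lifting type_size into the type: the element size becomes
// a compile-time constant of the table, so record length is checked
// by the compiler rather than at runtime.
struct Table<const TYPE_SIZE: usize> {
    type_type: u64, // runtime type tag, as in the original design
    bytes: Vec<u8>,
}

impl<const TYPE_SIZE: usize> Table<TYPE_SIZE> {
    // A record is exactly TYPE_SIZE bytes -- wrong sizes won't compile.
    fn push(&mut self, record: [u8; TYPE_SIZE]) {
        self.bytes.extend_from_slice(&record);
    }

    fn len(&self) -> usize {
        self.bytes.len() / TYPE_SIZE
    }
}

fn main() {
    let mut t: Table<8> = Table { type_type: 7, bytes: Vec::new() };
    t.push(3.25f64.to_le_bytes());
    t.push(4.5f64.to_le_bytes());
    assert_eq!(t.len(), 2);
    assert_eq!(t.type_type, 7);
}
```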
I am not a database expert, so take this with a huge grain of salt: Cache locality.
Common ops are: delete a row, insert a row. Uncommon ops are: delete a column, add a new column. So if we have a table with 10 columns; in row-first, we modify at one place. In column-first, we modify 10 vecs.
On the read side, if we are searching for "everyone where name == 'John'", then, sure, column oriented wins as we only have to scan one column. However, if we are mapping through the records, and for every record we need to read multiple columns, then, once again, row-first has better cache locality.
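The two layouts in the trade-off above are the classic array-of-structs vs struct-of-arrays split. A small sketch contrasting them (field names are made up for illustration):

```rust
// Array-of-structs (row-first) vs struct-of-arrays (column-first):
// reading several fields of one record touches adjacent memory in AoS,
// but one separate allocation per column in SoA.
struct RecordAos {
    name_len: u32,
    age: u32,
}

struct TableSoa {
    name_len: Vec<u32>,
    age: Vec<u32>,
}

fn main() {
    let rows = vec![
        RecordAos { name_len: 4, age: 30 },
        RecordAos { name_len: 3, age: 41 },
    ];
    let cols = TableSoa { name_len: vec![4, 3], age: vec![30, 41] };

    // Row-first: both fields of record 1 sit next to each other.
    let aos_sum = rows[1].name_len + rows[1].age;
    // Column-first: the same read spans two separate Vec allocations.
    let soa_sum = cols.name_len[1] + cols.age[1];

    assert_eq!(aos_sum, soa_sum);
    assert_eq!(aos_sum, 44);
}
```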
I agree that the pseudocode I posted is trash. I don't know what the right solution is. It seems it needs to somehow encode the schema of the table, which involves encoding the type of each column, which may or may not involve enumerating each field: its name, its offset, its type. This is actually why I was looking at capnproto a bit earlier; the "schema" capnproto defines is very similar to the "schema" of a column.
I think this also implies you are right about using procedural macros. This does sound like a situation involving "running through the struct/enum" and outputting lots of stats. Possibly something like:
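To make the idea concrete, here's a hand-written sketch of what such a derive might expand to: a static schema table of per-field name/offset/size. Everything here (`Field`, `Schema`, the `Record` type) is hypothetical, and it assumes Rust 1.77+ for `std::mem::offset_of!`:

```rust
use std::mem::{offset_of, size_of};

// Hypothetical output a `#[derive(Schema)]` proc macro could generate:
// a static table enumerating each field's name, offset, and size.
struct Field {
    name: &'static str,
    offset: usize,
    size: usize,
}

#[repr(C)] // fixed layout so the offsets are meaningful
struct Record {
    id: u64,
    mass: f32,
}

// What the derive might expand to, written by hand:
impl Record {
    const SCHEMA: [Field; 2] = [
        Field { name: "id", offset: offset_of!(Record, id), size: size_of::<u64>() },
        Field { name: "mass", offset: offset_of!(Record, mass), size: size_of::<f32>() },
    ];
}

fn main() {
    assert_eq!(Record::SCHEMA[0].offset, 0);
    assert_eq!(Record::SCHEMA[1].offset, 8);
    assert_eq!(Record::SCHEMA[1].name, "mass");
}
```

The proc macro's job would just be walking the struct's fields and emitting that `SCHEMA` constant automatically.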
Ahhh! I'm sure I'm even less of an expert with databases, but that definitely seems important. If you care about rows, but data exists as columns, grabbing row 10,517 would mean making that index call to each column. Not great, especially if those are of a size where they may need to be cached to disk!
Totally! At the very least I've found that making assumptions about an ideal concept that could exist at runtime and trying to work backwards from there has helped me to learn more about Rust.
Are you looking to retain mutable ownership of the table at all times? If not, immutable structs with multiple overlapping localized views into segments of data might give some flexibility. Though that gives the added complexity of having to shadow variables into mutated versions of themselves any time you do want to add/remove columns...
This is the issue I was running into about not being able to monomorphize traits. As far as I was able to work out, if you want to be able to return a type from raw data, it needs to be either a fixed return type or an associated type on the trait. So, I could do this:
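A minimal sketch of the associated-type route I mean (my own illustration; `FromRaw` and the column types are made-up names): the trait pins down what comes back out, so each implementor can decode itself from raw bytes, but the choice is still fixed at compile time.

```rust
// Each implementor fixes its output via an associated type, so the
// decode is type-safe -- but also statically determined, which is the
// limitation discussed above.
trait FromRaw {
    type Out;
    fn from_raw(bytes: &[u8]) -> Self::Out;
}

struct F32Col;
impl FromRaw for F32Col {
    type Out = f32;
    fn from_raw(bytes: &[u8]) -> f32 {
        f32::from_le_bytes(bytes.try_into().unwrap())
    }
}

struct U16Col;
impl FromRaw for U16Col {
    type Out = u16;
    fn from_raw(bytes: &[u8]) -> u16 {
        u16::from_le_bytes(bytes.try_into().unwrap())
    }
}

fn main() {
    let raw = 1.25f32.to_le_bytes();
    assert_eq!(F32Col::from_raw(&raw), 1.25);
    assert_eq!(U16Col::from_raw(&7u16.to_le_bytes()), 7);
}
```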