Memory layout introspection

Rust allows conveniently querying the size of a type using std::mem::size_of, which is very helpful, but I may need to query more aspects of the memory layout of a type, for example the address and size of a private field or the address, size and value of a niche. What are the ways this could possibly be done?
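For reference, a minimal sketch of what std::mem can answer today, and what it can't:

```rust
use std::mem::{align_of, size_of};

fn main() {
    // Size and alignment are easy to query for any type...
    println!("size  = {}", size_of::<String>());
    println!("align = {}", align_of::<String>());
    // ...but there is no comparable query for the offsets of String's
    // private fields, or for the niche that Option<String> uses.
}
```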

This may be quite an unusual request, as rustc typically handles memory layouts automatically without any need for intervention, but for FFI it matters. Even in FFI this may be an unusual request, as repr(C) usually is sufficient, but I want to do better than repr(C). For my use case, the memory layout doesn't even need to be stable across compiler versions. It would be enough to have a way to query the memory layout of the current compiler version and feed this data into the program that generates the non-Rust code.

Mostly I want to pass Rust types as opaque values. The only things the non-Rust code can do with them are to move them around and pass them as arguments to functions implemented in Rust, so the only thing the non-Rust code needs to know about them is their size, which I can easily query using std::mem::size_of. But for some types, I want the non-Rust code to know more about their memory representation, so it can do certain operations on its own, without calling a function implemented in Rust, for example get the length of a string or match an option and extract the contained value.

Strings I could convert using into_raw_parts and from_raw_parts into a struct with repr(C) when passing them over the FFI boundary. If the layout of this struct is identical to the layout of String, this conversion would hopefully be optimized away, but if for some reason the layout of String changes, there would be a costly conversion. It would be preferable if I could skip doing the conversion in the Rust code, and instead have the non-Rust code access the fields at addresses determined by querying the memory layout of String.
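A minimal sketch of that conversion, assuming a hypothetical FfiString mirror type; since into_raw_parts is still unstable, the sketch hand-rolls it with ManuallyDrop:

```rust
use std::mem::ManuallyDrop;

// Hypothetical repr(C) mirror of String's (pointer, length, capacity) triple.
// The field order here is my own choice; it is not guaranteed to match the
// actual layout of String, which is exactly the problem being discussed.
#[repr(C)]
pub struct FfiString {
    ptr: *mut u8,
    len: usize,
    cap: usize,
}

impl FfiString {
    pub fn from_string(s: String) -> Self {
        // Hand-rolled stand-in for the unstable String::into_raw_parts.
        let mut s = ManuallyDrop::new(s);
        FfiString {
            ptr: s.as_mut_ptr(),
            len: s.len(),
            cap: s.capacity(),
        }
    }

    /// Safety: `self` must come from `from_string` and must not be used again.
    pub unsafe fn into_string(self) -> String {
        String::from_raw_parts(self.ptr, self.len, self.cap)
    }
}
```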

Options I could also convert into a type with repr(C), but this would be inefficient, as repr(C) doesn't optimize for niches. Option<std::io::Error> can fit in two registers, but a repr(C) alternative to Option cannot, and would instead be placed on the stack. If there was just a way to query the address, size and value of the niche of std::io::Error, I could again skip doing the conversion in the Rust code, and instead have the non-Rust code use this information.
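To make the size difference concrete, here is a small comparison against a hypothetical repr(C) stand-in with an explicit tag (COption is a made-up name; the exact numbers depend on the target and compiler version):

```rust
use std::mem::{size_of, MaybeUninit};

// Hypothetical repr(C) replacement for Option<T>: an explicit tag plus the
// payload, so no niche optimisation can kick in.
#[allow(dead_code)]
#[repr(C)]
pub struct COption<T> {
    is_some: bool,
    value: MaybeUninit<T>,
}

fn main() {
    println!("io::Error:          {}", size_of::<std::io::Error>());
    println!("Option<io::Error>:  {}", size_of::<Option<std::io::Error>>());
    println!("COption<io::Error>: {}", size_of::<COption<std::io::Error>>());
}
```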

I have a few ideas for what could be done:

  • Ask around and see if someone has a solution.
  • Parse the 'rlib' files of the standard library to figure out the memory layout of each type for the current compiler version.
  • Compile probe functions such as |x: String| x.len() and |x: Option<std::io::Error>| x.is_none() and disassemble the compiled code (see the sketch below).
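For the third idea, a sketch of what such a probe could look like as an exported function with a predictable symbol (the name is arbitrary):

```rust
// Compile this in release mode and inspect the generated assembly, e.g. with
// `cargo rustc --release -- --emit asm` or a tool like cargo-show-asm.
#[allow(improper_ctypes_definitions)]
#[no_mangle]
pub extern "C" fn probe_string_len(x: &String) -> usize {
    // The offset this loads from reveals where String keeps its length
    // for the current compiler version.
    x.len()
}
```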

I don't think you can/should rely on this. The layout of #[repr(Rust)] types (i.e. everything without a repr annotation) is implementation defined, can't be relied upon, and may change between compiler versions or runs. Parsing rlib files means you'll probably end up making invalid assumptions about layouts, which will lead you to writing/generating broken code.

Mario Ortiz Manero (I don't remember his u.rl.o username, but his website is NullDeref and he has lots of good content) brought this to my attention after reading an article I wrote several years back regarding plugins: you can't just say "I'm okay with a little UB". To quote a previous thread on unsafe:

Note that your #[repr(C)] version of String will still only contain a pointer and two usizes. So unless you are passing these things around in a really tight loop I doubt shuffling three numbers between registers will have a meaningful impact on performance.

One approach that comes to mind is writing your own custom derive which lets you query an object's memory representation at runtime.

The full implementation is probably larger than you can comfortably read in a Discourse comment, but this is my toy implementation on the playground. While it looks complex, code like this is really easy for a custom derive to generate.
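The playground code isn't reproduced here, but a rough sketch of the kind of code such a derive might generate could look like this (trait and type names are made up; offset_of! has been in std since 1.77, with the memoffset crate as the fallback on older compilers):

```rust
use std::mem::size_of;

// Hypothetical trait and descriptor type a #[derive(DescribeLayout)] could target.
pub struct FieldLayout {
    pub name: &'static str,
    pub offset: usize,
    pub size: usize,
}

pub trait DescribeLayout {
    fn layout() -> Vec<FieldLayout>;
}

#[allow(dead_code)]
struct Foo {
    a: u8,
    b: u32,
}

// Roughly what the derive would expand to for Foo.
impl DescribeLayout for Foo {
    fn layout() -> Vec<FieldLayout> {
        vec![
            FieldLayout {
                name: "a",
                offset: std::mem::offset_of!(Foo, a),
                size: size_of::<u8>(),
            },
            FieldLayout {
                name: "b",
                offset: std::mem::offset_of!(Foo, b),
                size: size_of::<u32>(),
            },
        ]
    }
}
```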

Once you have a representation of the memory layout, you can copy @Yandros's trick from safer_ffi and use a special cfg-gated function to generate code to consume it.
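Building on the DescribeLayout sketch above, the cfg-gated generator could be as simple as a function that only exists when a dedicated feature is enabled and writes the layout data out for the code generator to read (the feature name and output format here are made up):

```rust
use std::io::Write;

#[cfg(feature = "generate-layouts")]
pub fn dump_layout<T: DescribeLayout>(
    type_name: &str,
    out: &mut dyn Write,
) -> std::io::Result<()> {
    writeln!(
        out,
        "type {} size={} align={}",
        type_name,
        std::mem::size_of::<T>(),
        std::mem::align_of::<T>()
    )?;
    for field in T::layout() {
        writeln!(out, "  field {} offset={} size={}", field.name, field.offset, field.size)?;
    }
    Ok(())
}
```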

Again, all the same comments around #[repr(Rust)] being implementation defined/unreliable apply. You should probably look into the abi_stable crate if you want to pass Rust standard library types across the FFI boundary.

I don't think you will be able to reliably understand niches in C without using some sort of shim function that does x.is_none().
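For illustration, such a shim is only a couple of lines on the Rust side (the exported name is arbitrary):

```rust
// Exported shim so the non-Rust code never has to know where the niche lives.
#[allow(improper_ctypes_definitions)]
#[no_mangle]
pub extern "C" fn io_error_option_is_none(x: &Option<std::io::Error>) -> bool {
    x.is_none()
}
```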

You could parse std's source code and look out for the #[rustc_layout_scalar_valid_range_start(...)] or #[rustc_nonnull_optimization_guaranteed] attributes, and reimplement the rest of rustc's niche optimisation logic in your own code, but at that point I'd rather pull in rustc as a library and ask the type system directly.


Good point. The things I can do with size_of I do in the same compiler run that produces the program exporting the type through FFI, so I know that much is safe. I would love to have a function std::mem::niche_of to query the niche offset, size and value of a type.
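Purely to make that wish concrete, a hypothetical signature for such a query might look like the sketch below; nothing like this exists in std today, and it can't be implemented outside the compiler:

```rust
/// Hypothetical description of a type's niche: where it is, how wide it is,
/// and which otherwise-invalid value an enum like Option can use to encode
/// the absent variant.
pub struct Niche {
    pub offset: usize,
    pub size: usize,
    pub value: u128,
}

/// Hypothetical std::mem::niche_of-style query. Not implementable today:
/// the compiler keeps this information to itself.
pub fn niche_of<T>() -> Option<Niche> {
    unimplemented!("rustc does not expose niche information")
}
```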

If I do things in a build script on the other hand, that's a separate compiler run. If I read the rlib file, the file I read will be the same one used in compilation, so any data read from it will be consistent with the compiler run that produced it. But I guess the niche of each type is decided anew on each compiler run, and not recorded in the rlib file, so maybe the niche can't be found without making daring assumptions.

Unlike the ordinary use case for FFI, this is for implementing loops and such, which should ideally be optimized just as well as doing the same thing without FFI, preferably by inlining, or at least by making highly efficient FFI calls. Inlining is, as I described, part of the same problem, as it requires making assumptions about memory layout.

Your approach looks like it could be useful for registering user-defined types with the JIT compiler, which will be an important feature once I'm ready to implement that. But it doesn't solve the problem of querying the memory layout of already existing types. Your example code merely specifies assumptions about the layout of String, as if String had a stable layout.

I looked into it. It defines an alternative type for each standard type, using repr(C) and providing methods to convert between them. That is the same thing I'm already doing, and exactly what I'm wondering whether I can do better than.

The point was to find a way to generate this shim (probably from the build script). The shim itself gets inlined, whereas a call to is_none as a foreign function does not. And not that it makes a difference, but the non-Rust code is not written in C; it is generated with Cranelift.

Using rustc as a library is an interesting idea, but it has its downsides. A program that pulls in rustc as a library is itself a separate compiler run, which does not guarantee the same results as the compiler run that builds the program, and it only works on nightly.

Plenty of food for thought.

This necessarily will be opt-in -- no, you can't look at the private fields on my types -- so you probably want to make a custom derive for a new trait that you define. That could generate code that uses the memoffset crate to soundly get the locations of things. (You'll see that Ralf has contributed a bunch to that crate to make it sound using the relatively new ptr::addr_of.)

