Get offsets of the fields of Vec<u8>

I am manually writing a JIT compiler for x64 using dynasm for an emulator, and I need to read a value from a Vec. But to be able to read from the Vec, I need to get, at least, the offset of the field containing the buffer pointer. But, of course, the layout of Vec is not stable.

For now, I have the following function for getting the layout of Vec (playground):

/// Return the offsets of the fields `ptr`, `length` and `capacity`.
fn get_vec_u8_fields_offset() -> (usize, usize, usize) {
    use std::mem::{align_of, size_of, transmute};

    // build a vec with distinct ptr, length and capacity.
    let mut vec: Vec<u8> = Vec::with_capacity(2);
    vec.push(1);

    // ensure they are distinct.
    assert!(![0, 1, 2].contains(&(vec.as_ptr() as usize)));

    // ensure the layout of Vec is 3 usize values.
    assert_eq!(size_of::<*mut u8>(), size_of::<usize>());
    assert_eq!(size_of::<Vec<u8>>(), size_of::<[usize; 3]>());
    assert_eq!(align_of::<Vec<u8>>(), align_of::<[usize; 3]>());

    // SAFETY: the checks above guarantee this is okay
    let fields: &[usize; 3] = unsafe { transmute(&vec) };

    // find the index of each field.
    let len = fields.into_iter().position(|x| *x == 1).unwrap();
    let cap = fields.into_iter().position(|x| *x == 2).unwrap();
    let ptr = 0 + 1 + 2 - len - cap;

    // convert the index to byte offset.
    let s = size_of::<usize>();
    (ptr * s, len * s, cap * s)
}

I tried to make it as resilient as I could, by panicking instead of returning a hard-to-debug wrong value, in case the layout of Vec changes significantly some day.

Is there any flaw in my code? Is there a better way of doing it? Ideally I would like the function to be const, but I could not come up with a way that works. The closest I got is this, but it needs Vec::len and Vec::capacity to be const (and I don't see why they could not be).

I would do it rather differently. If the information is not available at compile-time, make it dynamic! Specifically, I would write trivial FFI-safe functions that return the buffer pointer and the length, and which can easily be called by the JIT. Something like this:

#[no_mangle]
pub unsafe extern "C" fn vec_buf_ptr(vec: *mut Vec<u8>) -> *mut u8 {
    unsafe {
        (*vec).as_mut_ptr()
    }
}

#[no_mangle]
pub unsafe extern "C" fn vec_len(vec: *const Vec<u8>) -> usize {
    unsafe {
        (*vec).len()
    }
}

Then emit a call to these functions and use their return value whenever you need to access the contents of the Vec.

Yes, I could do that. I am already doing something similar, but the function reads from the Vec and other memory regions. But I am trying to optimize the code emitted, and accessing the vector directly would be faster.

Since it's you generating the code, couldn't you do that anyway? I mean, inlining is a pretty basic optimization pass in compilers; what backend are you using for codegen? It may already have support for inlining.

I am using dynasm, which basically just emits machine code to a buffer. So the only way of inlining a function is to write its assembly by hand, and for that I need the layout of Vec.

Hmm, but don't you already have the assembly if the function is already compiled? I would imagine you need to put such "runtime" code in a separate support library (compiled before the JIT runs) anyway, in which case you should be able to find the corresponding symbol and get the code.

A more convenient (and I guess more robust) way would be to use an optimizing backend such as cranelift.

Using runtime code would normally involve a function call. Theoretically I could try to parse the machine code of that function, transform it so that it uses the same registers as the surrounding code, and write the result to the buffer.

But that process is much harder than my original question (maybe a superset, even).

And the runtime code doesn't need to be in a separate support library, as it is a JIT.

I am considering, in the future, implementing another backend for my emulator using cranelift, for cross-platform support, etc. But, as far as I know, cranelift cannot inline an external function, so I would still have to at least implement the function in CLIR, which would still need the layout of Vec.

But that's not why I suggested the external library.

That is precisely what I was suggesting.

But you don't need any of that. You can find the instruction that performs the actual load/store and infer the offset from it.

Yes, but rustc has a cranelift backend, so this can be automated.

Maybe it would be better to store your vec as raw parts instead of Vec<u8>, and only (temporarily) reconstitute the vec when you need to operate on it on the Rust side, then disassemble it again and overwrite the raw values. That way you'll have a guaranteed layout.
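
A minimal sketch of that idea follows. The type name RawBuf and its methods are hypothetical, chosen only for illustration. The Vec is taken apart once, rebuilt with Vec::from_raw_parts only for the duration of a closure, and the raw values are written back afterwards, so JIT-emitted code can always read the pointer and length from plain fields.

```rust
use std::mem::{self, ManuallyDrop};

/// Hypothetical buffer stored as raw parts with a fixed field order.
/// Invariant: the fields always describe an allocation owned by a `Vec<u8>`.
#[repr(C)]
pub struct RawBuf {
    ptr: *mut u8,
    len: usize,
    cap: usize,
}

impl RawBuf {
    /// Take over the allocation of `vec` without freeing it.
    pub fn from_vec(vec: Vec<u8>) -> Self {
        let mut vec = ManuallyDrop::new(vec);
        RawBuf { ptr: vec.as_mut_ptr(), len: vec.len(), cap: vec.capacity() }
    }

    /// Turn the raw parts back into an owned `Vec<u8>`.
    pub fn into_vec(self) -> Vec<u8> {
        let this = ManuallyDrop::new(self);
        // SAFETY: the fields came from a `Vec<u8>` (see the invariant above).
        unsafe { Vec::from_raw_parts(this.ptr, this.len, this.cap) }
    }

    /// Temporarily rebuild the `Vec`, run `f` on it, then store the
    /// (possibly reallocated) raw parts again.
    pub fn with_vec<R>(&mut self, f: impl FnOnce(&mut Vec<u8>) -> R) -> R {
        // An empty buffer is swapped in first, so `self` stays valid
        // (but empty) if `f` panics.
        let mut vec = mem::replace(self, RawBuf::from_vec(Vec::new())).into_vec();
        let result = f(&mut vec);
        *self = RawBuf::from_vec(vec);
        result
    }
}

impl Drop for RawBuf {
    fn drop(&mut self) {
        // SAFETY: same invariant as in `into_vec`.
        unsafe { drop(Vec::from_raw_parts(self.ptr, self.len, self.cap)) }
    }
}
```

The fields are kept private here so that safe code outside the module cannot break the invariant. Generated code would read them through their offsets instead.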

Yes, that may be the more reasonable approach. I can create a struct like,

struct MyVec {
    ptr: *const u8,
    len: usize,
    capacity: usize
}

and then just get the offsets using offset_of!, like I already do in the rest of the code.

But I still wonder if my original approach is bad.

It seems to be quite defensive, but it doesn't check that the pointer value can be found in the struct, so it would miss some pointer manipulation tricks.
Additionally, it's always possible that we're doing something to the values that just happens to work with the ones you're trying. It's quite unlikely that something like that would actually happen, and for Vec specifically it would also depend on the precise interpretation of this passage from the documentation:

Most fundamentally, Vec is and always will be a (pointer, capacity, length) triplet. No more, no less. The order of these fields is completely unspecified, and you should use the appropriate methods to modify these. The pointer will never be null, so this type is null-pointer-optimized.

But in the end it's always better to use the sanctioned methods instead of transmutes.

An additional wrinkle is that these fields are not really usizes so it's dangerous to write any bitpatterns into them other than those that have been obtained from a Vec. E.g. we could add more niches to Vec which would make additional bitpatterns instant-UB.
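
For illustration, here is a sketch of a variant of the probe that also looks for the pointer value itself and requires each value to appear exactly once. The function name probe_vec_u8_offsets and the choice to return None instead of panicking are assumptions made for this example. It still relies on a layout that the standard library does not specify, so it only detects more mismatches; it guarantees nothing.

```rust
use std::mem::{align_of, size_of};

/// Returns the byte offsets of (ptr, len, cap) inside `Vec<u8>`, or `None`
/// if the Vec does not look like three words that each hold one of the
/// three values exactly once. Hypothetical helper for illustration only.
pub fn probe_vec_u8_offsets() -> Option<(usize, usize, usize)> {
    const WORD: usize = size_of::<usize>();

    // The Vec must be exactly three pointer-sized, pointer-aligned words.
    if size_of::<Vec<u8>>() != 3 * WORD
        || align_of::<Vec<u8>>() != align_of::<usize>()
        || size_of::<*const u8>() != WORD
    {
        return None;
    }

    let mut vec: Vec<u8> = Vec::with_capacity(2);
    vec.push(1);
    let (ptr, len, cap) = (vec.as_ptr() as usize, vec.len(), vec.capacity());

    // The three values must be distinct, otherwise the search is ambiguous.
    if ptr == len || ptr == cap || len == cap {
        return None;
    }

    // SAFETY: size and alignment were checked above, and the Vec is alive.
    let words: [usize; 3] =
        unsafe { std::ptr::read((&vec as *const Vec<u8>).cast::<[usize; 3]>()) };

    // Find the byte offset of `value`, requiring exactly one match.
    let find = |value: usize| {
        let mut hits = words
            .iter()
            .enumerate()
            .filter(|(_, w)| **w == value)
            .map(|(i, _)| i * WORD);
        match (hits.next(), hits.next()) {
            (Some(offset), None) => Some(offset),
            _ => None,
        }
    };

    Some((find(ptr)?, find(len)?, find(cap)?))
}
```

If the pointer were stored in some transformed form, find(ptr) would return None and the caller could fall back to a function call instead of using a wrong offset.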

and then just get the offsets using offset_of!, like I already do in the rest of the code.

Using repr(C) is an easier way to get a guaranteed layout.
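
As a sketch of how the two fit together, assuming the MyVec struct from earlier in the thread with #[repr(C)] added: the constant names and the load_usize_at helper are hypothetical, and the helper only imitates in Rust the load that JIT-emitted code would perform at base + offset.

```rust
use std::mem::offset_of;

/// With `repr(C)` the fields are laid out in declaration order.
#[repr(C)]
pub struct MyVec {
    pub ptr: *const u8,
    pub len: usize,
    pub capacity: usize,
}

// Compile-time offsets that can be baked into the emitted machine code.
pub const PTR_OFFSET: usize = offset_of!(MyVec, ptr);
pub const LEN_OFFSET: usize = offset_of!(MyVec, len);
pub const CAP_OFFSET: usize = offset_of!(MyVec, capacity);

/// Reads one word at `base + offset`, as an emitted load instruction would.
///
/// # Safety
/// `base` must point to a live `MyVec` and `offset` must be one of the
/// constants above.
pub unsafe fn load_usize_at(base: *const MyVec, offset: usize) -> usize {
    unsafe { base.cast::<u8>().add(offset).cast::<usize>().read() }
}
```

offset_of! would report correct offsets without repr(C) too. The attribute additionally fixes the order, so the offsets are predictable rather than chosen by the compiler.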
