Understanding pointer offset/wrapping_offset safety

What this is saying is that if I've got some variable x, I can't use pointer arithmetic to somehow get a pointer to some unrelated variable, y.

A common example of this is when people take pointers to two unrelated variables on the stack (aka "from different allocations"), calculate the difference between the two, and later on try to reuse that difference for offsetting a pointer.

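let x: u32 = 42;
let y: u32 = 0;

let x_ptr = &x as *const u32;
let y_ptr = &y as *const u32;
// Distance in bytes between two unrelated stack variables.
let difference = y_ptr as isize - x_ptr as isize;

unsafe {
  // UB: the result leaves the allocation that x_ptr was derived from.
  // offset() counts in units of the pointee, so go through a byte pointer.
  let ub_pointer_to_y = x_ptr.cast::<u8>().offset(difference).cast::<u32>();
}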

The reason this is UB is that it breaks LLVM's ability to reason about aliasing, which in turn means some optimisations will generate broken code. It also opens the door to changing memory you shouldn't have access to (e.g. heap corruption).
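For contrast, here is a minimal sketch of pointer arithmetic that stays inside one allocation, which is the case offset() is meant for. The function names element_after_first and distance_between are made up for illustration; the only real APIs used are as_ptr(), add() and offset_from().

```rust
/// Returns the element `steps` places after the first one by offsetting a raw
/// pointer, staying inside the single allocation that backs `values`.
/// (Hypothetical helper, for illustration only.)
fn element_after_first(values: &[u32], steps: usize) -> Option<u32> {
    if steps >= values.len() {
        return None; // would leave the allocation, so refuse
    }
    let first: *const u32 = values.as_ptr();
    // SAFETY: `steps < values.len()`, so the result points at a live element
    // of the same allocation that `first` was derived from.
    unsafe { Some(*first.add(steps)) }
}

/// Distance from element `from` to element `to`, in units of `u32`.
/// (Hypothetical helper, for illustration only.)
fn distance_between(values: &[u32], from: usize, to: usize) -> isize {
    // Indices up to and including `len` are fine: one past the end is allowed.
    assert!(from <= values.len() && to <= values.len());
    let base = values.as_ptr();
    // SAFETY: both pointers are derived from `values` and stay within (or one
    // past the end of) its allocation, which is what `offset_from` requires.
    unsafe { base.add(to).offset_from(base.add(from)) }
}
```

Because both pointers come from the same slice, a difference computed this way can safely be reused as an offset later, unlike the x and y example above.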

The short answer is "yes". You got one big allocation from whatever mmap function you used and can do whatever you want with it, then you'll be passing out smaller allocations to your caller and the caller will only be allowed to touch the memory you gave them.

In general, if some function gives you a pointer to something, you can't use pointer arithmetic to get an address to something outside the object being pointed to. The one exception is when you are given a pointer to an array, in which case it's fine to use pointer arithmetic to get a pointer one past the last item (as long as you never dereference it), because that's how for-loops are often implemented in C++ (e.g. in for (auto p = std::begin(my_vec); p != std::end(my_vec); p++) { ... }).
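The same one-past-the-end idea can be sketched in Rust. This is only an illustration: sum_with_pointers is a made-up name, and in real code you would just iterate over the slice. as_ptr_range() is the standard slice method that returns the start pointer and the one-past-the-end pointer.

```rust
/// Sums a slice by walking a raw pointer from the first element up to, but
/// not including, the one-past-the-end pointer.
/// (Hypothetical helper, for illustration only.)
fn sum_with_pointers(values: &[u32]) -> u32 {
    // `range.end` is one past the last item: fine to compare against,
    // never fine to dereference.
    let range = values.as_ptr_range();
    let mut current = range.start;
    let mut total = 0u32;
    while current != range.end {
        // SAFETY: `current` lies in `start..end`, so it points at a live
        // element, and stepping by one lands at most on `end`.
        unsafe {
            total += *current;
            current = current.add(1);
        }
    }
    total
}
```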

Assuming you don't accidentally go out of bounds and only ever store Ts (mixing sizes means your offsetting needs to be smarter), that's perfectly sound :+1:

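I'd just use start_of_region.add(i * mem::size_of::<T>()) if start_of_region is a byte pointer (*mut u8). If it's already a *mut T, add() counts in units of T, so start_of_region.add(i) is all you need. I haven't really seen wrapping_offset() used in practice, for the reasons you've already mentioned (i.e. 64-bit pointers are huge).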
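As a rough sketch of that indexing, assuming the region only ever stores Ts: the Region type, its fields and both method names are hypothetical, and how you obtain start (mmap or otherwise) is left out entirely.

```rust
use std::mem;

/// Hypothetical view of a region that holds `capacity` slots of `T`,
/// starting at `start`. Obtaining the memory is out of scope here.
struct Region<T> {
    start: *mut T,
    capacity: usize,
}

impl<T> Region<T> {
    /// Pointer to slot `i`, or `None` if `i` is out of bounds.
    fn slot(&self, i: usize) -> Option<*mut T> {
        if i >= self.capacity {
            return None;
        }
        // SAFETY (assumed): `start` points at `capacity` slots of `T` in one
        // allocation, and `i < capacity`, so the result stays inside it.
        // `add` on a `*mut T` already counts in units of `T`.
        Some(unsafe { self.start.add(i) })
    }

    /// The same slot computed from a byte pointer, where the offset has to be
    /// scaled by `size_of::<T>()` by hand.
    fn slot_via_bytes(&self, i: usize) -> Option<*mut T> {
        if i >= self.capacity {
            return None;
        }
        let bytes = self.start as *mut u8;
        // SAFETY (assumed): same reasoning as `slot`, expressed in bytes.
        Some(unsafe { bytes.add(i * mem::size_of::<T>()) } as *mut T)
    }
}
```

Both methods return the same address; the bounds check is what keeps the offset inside the one big allocation.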

I don't think that has anything to do with pointer offsetting, but it would probably break things downstream. Your consumer will be assuming they can do whatever they want with the memory they've been given until they call some sort of free() function, and that assumption is what makes their unsafe code (e.g. reads and writes) sound.

If you then go and retract that ability without letting them know, you violate that assumption, and they'll have a bad time the next time they try to use that newly mprotect-ed memory.
