What guarantees does Rust make for addresses of pointers to ZST?

Hi. I am experimenting with pointer tagging and have a question about zero-sized types.

I have noticed that when I create a Box<T> (or &T) where T is a ZST, the inner raw pointer is always equal to that type's alignment.

#[repr(align(2))]
struct A2;

#[repr(align(16))]
struct A16;

#[repr(align(1024))]
struct A1024;

macro_rules! print_ptr {
    ($ty:expr) => {
        let x = Box::new($ty);
        println!("{x:p}");
        
        let y = &$ty;
        println!("{y:p}");
    }
}

fn main() {
    print_ptr!(());
    print_ptr!(A2);
    print_ptr!(A16);
    print_ptr!(A1024);
}

prints

0x1
0x1
0x2
0x2
0x10
0x10
0x400
0x400

I know that Rust guarantees that pointers to ZSTs are aligned to the type's alignment. But does it guarantee that their value is always the one I have observed? And if it doesn't guarantee that, can I freely swap the address of any pointer to a ZST?

For example, does the following code contain any UB?

fn main() {
    let x = Box::new(());
    // intentionally forget original pointer
    let _ = Box::into_raw(x);
    
    // let's guess what the address was.
    // alignment is correct, so what could go wrong?
    let raw: *mut () = 0x100 as _; // is this safe?
    let unit_ref: &() = unsafe {
        &*raw // is this safe?
    };
    
    let x = unsafe {
        Box::from_raw(raw) // is this safe?
    };
    
    assert_eq!(*x, ());
    
    drop(x);
}

And how does Strict Provenance play with any of this?

Use Case

If this all sounds like an XY problem, here is a simplified version of my use case.

I want to take a Box<dyn Foo>, turn it into a raw pointer, and split the *mut dyn Foo into a pointer to the data and a pointer to the vtable. I would then store the vtable pointer somewhere and tag the data pointer using its lower bits. The problem arises when the data pointer points to a ZST (with low alignment). Using the vtable I can determine the size and alignment of the original type, but I don't know what assumptions I can make about the data pointer. Can I freely "forget" it, and when I want to reconstruct the original dyn Foo, can I just produce any pointer with sufficient alignment and cast it into a & or Box to dyn Foo?

In this scenario I think I can preserve the provenance of the original pointer, so Strict Provenance should not be a problem here.

No, it does not. The fact that the address is equal to the alignment is merely an implementation detail. What Box::new() is doing here, you can also do with std::ptr::dangling(), but that function too does not guarantee any particular value. Address equal to alignment is just the simplest way to get a non-zero (non-null) aligned address.
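A minimal sketch of what is and is not promised here. It uses NonNull::dangling() (the older stable counterpart of std::ptr::dangling()) and a hypothetical helper named dangling_addr; only non-nullness and alignment are asserted, and the concrete value is merely printed.

```rust
use std::mem::align_of;
use std::ptr::NonNull;

#[repr(align(16))]
struct A16;

/// Hypothetical helper: address of a dangling-but-aligned pointer for `T`.
fn dangling_addr<T>() -> usize {
    NonNull::<T>::dangling().as_ptr() as usize
}

fn main() {
    // Documented properties: non-null and well-aligned.
    let addr = dangling_addr::<A16>();
    assert_ne!(addr, 0);
    assert_eq!(addr % align_of::<A16>(), 0);

    // The same two properties hold for a boxed ZST.
    let boxed = Box::new(A16);
    let boxed_addr = &*boxed as *const A16 as usize;
    assert_ne!(boxed_addr, 0);
    assert_eq!(boxed_addr % align_of::<A16>(), 0);

    // The concrete values are an implementation detail, so they are
    // only printed here and never compared against a constant.
    println!("{addr:#x} {boxed_addr:#x}");
}
```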

And if it doesn't guarantee that, can I freely swap the address of any pointer to a ZST?

Yes. This is equivalent to moving the ZST. ZSTs are not only able to live at special dangling addresses; for example, in

struct Foo;
#[repr(C)]
struct Bar {
    x: i32,
    foo: Foo,
    y: i32,
}

there is definitely a Foo in the middle of Bar. It's just that, if you want, there can be a Foo anywhere that’s aligned, too.
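To make that concrete, here is a small sketch that prints where each field of Bar lives. The helper field_addresses is a name made up for this illustration. Because Bar is #[repr(C)], the zero-sized foo field sits right after x, at the same address as y.

```rust
use std::mem::size_of;

struct Foo;

#[repr(C)]
struct Bar {
    x: i32,
    foo: Foo,
    y: i32,
}

/// Hypothetical helper: addresses of the three fields, in declaration order.
fn field_addresses(bar: &Bar) -> (usize, usize, usize) {
    (
        &bar.x as *const i32 as usize,
        &bar.foo as *const Foo as usize,
        &bar.y as *const i32 as usize,
    )
}

fn main() {
    let bar = Bar { x: 1, foo: Foo, y: 2 };
    let (x, foo, y) = field_addresses(&bar);

    assert_eq!(size_of::<Foo>(), 0);
    // With repr(C), `foo` is placed after the 4 bytes of `x` ...
    assert_eq!(foo, x + 4);
    // ... and, being zero-sized, it shares its address with `y`.
    assert_eq!(foo, y);

    println!("x: {x:#x}, foo: {foo:#x}, y: {y:#x}");
}
```

A pointer obtained this way carries the provenance of bar, unlike a pointer forged from an integer.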

And how does Strict Provenance play with any of this?

From https://doc.rust-lang.org/std/ptr/index.html#strict-provenance:

But it is still sound to:

...

  • Forge an allocation of size zero at any sufficiently aligned non-null address. i.e. the usual “ZSTs are fake, do what you want” rules apply but this only applies for actual forgery (integers cast to pointers). If you borrow some struct’s field that happens to be zero-sized, the resulting pointer will have provenance tied to that allocation, and it will still get invalidated if the allocation gets deallocated. In the future we may introduce an API to make such a forged allocation explicit.

So, in your example, the integer-to-pointer cast suffices to follow the rules (though it isn't a use of the Strict Provenance APIs per se).

If I understand correctly, they’re saying that the one thing you can't do is use a ZST pointer that was part of a non-zero-sized allocation that no longer exists.
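As a rough sketch of the forgery rule quoted above: the helper forge_zst_ref below is hypothetical (not a std API) and only accepts zero-sized types at non-null, sufficiently aligned addresses. It is marked unsafe because conjuring a value of an arbitrary ZST can still break invariants that the type's author relies on.

```rust
use std::mem::{align_of, size_of};

#[repr(align(16))]
struct A16;

/// Hypothetical helper: forges a reference to a zero-sized `T` at `addr`.
///
/// # Safety
/// The caller must be allowed to conjure a `T` out of thin air, i.e. `T`
/// must not be a token type whose mere existence proves something.
unsafe fn forge_zst_ref<T>(addr: usize) -> Option<&'static T> {
    if size_of::<T>() != 0 || addr == 0 || addr % align_of::<T>() != 0 {
        return None;
    }
    let ptr = addr as *const T;
    // SAFETY: `T` is zero-sized and `ptr` is non-null and aligned, which is
    // the forged zero-size allocation case described in the std::ptr docs.
    Some(unsafe { &*ptr })
}

fn main() {
    // The question's example: a `()` at 0x100.
    let unit: &() = unsafe { forge_zst_ref::<()>(0x100) }.expect("aligned and non-null");
    assert_eq!(*unit, ());

    // Rejected: misaligned for A16, or not a ZST at all.
    assert!(unsafe { forge_zst_ref::<A16>(0x108) }.is_none());
    assert!(unsafe { forge_zst_ref::<u8>(0x100) }.is_none());
    assert!(unsafe { forge_zst_ref::<A16>(0x100) }.is_some());

    // The Box docs state the same requirement for zero-sized values:
    // the pointer has to be non-null and sufficiently aligned.
    let boxed: Box<()> = unsafe { Box::from_raw(0x100 as *mut ()) };
    assert_eq!(*boxed, ());
}
```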


If you ask Rust to allocate it for you -- such as a local variable -- it'll be at some aligned address, but you're not guaranteed anything more. I keep being tempted to put them all at isize::MIN, for example.

But note that you must preserve ZST addresses in general, because someone might have intentionally made one in a specific place. For example, the slice iterators make sure that even if you have a &[()] you're iterating, you still get the address that was in the slice, not some arbitrary dangling address.
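A small sketch of that slice-iterator behaviour, using a forged address (1234, chosen arbitrarily) and a hypothetical helper element_addrs. It shows what the current standard library does for a non-empty &[()]; the thread does not establish this as a documented guarantee.

```rust
use std::slice;

/// Hypothetical helper: the address of every element the iterator yields.
fn element_addrs(s: &[()]) -> Vec<usize> {
    s.iter().map(|r| r as *const () as usize).collect()
}

fn main() {
    let forged = 1234 as *const ();
    // SAFETY: `()` is zero-sized with alignment 1, so a non-null address is
    // enough for a slice of three elements (total size is zero bytes).
    let s: &[()] = unsafe { slice::from_raw_parts(forged, 3) };

    assert_eq!(s.as_ptr() as usize, 1234);
    // Every element reference points at the slice's own address,
    // not at some unrelated dangling address.
    assert_eq!(element_addrs(s), vec![1234, 1234, 1234]);
}
```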


I'm not sure we even guarantee that anywhere, and we don't really uphold it everywhere either, especially for zero-length ZST slices.

E.g.

#![feature(strict_provenance)]
#![feature(split_as_slice)]

use std::slice;
use std::ptr;


fn main() {
    let a: &[()] = unsafe { slice::from_raw_parts(ptr::without_provenance(1234), 1) };
    
    let mut split_iter = a.split(|_| true);
    assert_eq!(a.as_ptr().addr(), split_iter.as_slice().as_ptr().addr());
    split_iter.next();
    assert_eq!(a.as_ptr().addr(), split_iter.as_slice().as_ptr().addr());
    split_iter.next();
    assert_eq!(a.as_ptr().addr(), split_iter.as_slice().as_ptr().addr());
}

results in

thread 'main' panicked at src/main.rs:16:5:
assertion `left == right` failed
  left: 1234
 right: 1

That's fine because there aren't any values contained in the zero-length slice. Not randomly "relocating" a ZST only matters when you are referencing a "real" ZST value. If there's no value, there's nothing "at" the address to be "relocated" unexpectedly.

Thank you all for your explanations! In the meantime I actually understood how pointer tagging (based on alignment) works, so now I see that my question was misguided. For my use case it doesn't matter what size a given type is; I just have to align it properly.
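For later readers, a minimal sketch of alignment-based tagging as I understand it. The names tag_mask, tag_ptr, untag_ptr, Payload and Marker are made up for this example. The tag lives in the low bits that alignment leaves at zero, and the pointer is adjusted with wrapping_add / wrapping_sub so its provenance is kept (on recent toolchains, map_addr expresses the same thing more directly).

```rust
use std::mem::align_of;

#[repr(align(8))]
struct Payload(u32);

#[repr(align(8))]
struct Marker; // zero-sized, but still 8-aligned

/// Bits that are always zero in a well-aligned `*mut T`.
fn tag_mask<T>() -> usize {
    align_of::<T>() - 1
}

/// Stores `tag` in the low bits of `ptr`, keeping its provenance.
fn tag_ptr<T>(ptr: *mut T, tag: usize) -> *mut u8 {
    assert!(tag <= tag_mask::<T>(), "tag does not fit in the alignment bits");
    debug_assert_eq!((ptr as usize) & tag_mask::<T>(), 0);
    (ptr as *mut u8).wrapping_add(tag)
}

/// Splits a tagged pointer back into the original pointer and the tag.
fn untag_ptr<T>(tagged: *mut u8) -> (*mut T, usize) {
    let tag = (tagged as usize) & tag_mask::<T>();
    (tagged.wrapping_sub(tag) as *mut T, tag)
}

fn main() {
    // A sized payload: three free bits with alignment 8.
    let raw = Box::into_raw(Box::new(Payload(7)));
    let (ptr, tag) = untag_ptr::<Payload>(tag_ptr(raw, 5));
    assert_eq!((ptr, tag), (raw, 5));
    // SAFETY: `ptr` is the pointer returned by `Box::into_raw` above.
    assert_eq!(unsafe { Box::from_raw(ptr) }.0, 7);

    // A ZST works the same way: only the alignment matters, not the size.
    let raw = Box::into_raw(Box::new(Marker));
    let (ptr, tag) = untag_ptr::<Marker>(tag_ptr(raw, 3));
    assert_eq!((ptr, tag), (raw, 3));
    // SAFETY: `ptr` is the pointer returned by `Box::into_raw` above.
    drop(unsafe { Box::from_raw(ptr) });
}
```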
