Is the null pointer guaranteed to be aligned?


#1

I am using bit flags with AtomicPtr<T>, and I may store null in the AtomicPtr. The type T has enough alignment that I can use the low alignment bits for bit flags. However, I am not sure about the null pointer. Is it guaranteed to be aligned?

NOTE: currently, null is always 0 thus always aligned. But is this a guarantee for the future?
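To make the question concrete, here is a minimal sketch of the pattern being described, under stated assumptions: `Node`, `FLAG`, `store_tagged` and `load_tagged` are hypothetical names, not from any real project, and `SeqCst` is just a conservative ordering choice. The low bit is only free for a null pointer if the null address is itself aligned.

```rust
use std::sync::atomic::{AtomicPtr, Ordering};

/// Hypothetical node type; align(2) means bit 0 of any valid address is zero.
#[repr(align(2))]
pub struct Node {
    pub value: u32,
}

/// Low bit used as the flag (assumes align_of::<Node>() >= 2).
const FLAG: usize = 1;

/// Store `ptr` in `slot`, with the flag bit set or cleared.
pub fn store_tagged(slot: &AtomicPtr<Node>, ptr: *mut Node, flag: bool) {
    let addr = ptr as usize;
    // If `ptr` is null, this only holds when the null address is aligned.
    debug_assert_eq!(addr & FLAG, 0, "pointer must be aligned");
    let tagged = if flag { addr | FLAG } else { addr };
    slot.store(tagged as *mut Node, Ordering::SeqCst);
}

/// Load the slot once and split the word into (pointer, flag).
pub fn load_tagged(slot: &AtomicPtr<Node>) -> (*mut Node, bool) {
    let addr = slot.load(Ordering::SeqCst) as usize;
    ((addr & !FLAG) as *mut Node, addr & FLAG != 0)
}
```

With a null pointer stored together with the flag, `load_tagged` should hand back a pointer for which `is_null()` is true, as long as null really is aligned.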


#2

If you are writing code like the following:

use std::sync::atomic::AtomicPtr;

struct Context {
    flags: Flags,
}

// non-null pointer (`flags` is some existing Flags value)
let context = Box::new(Context { flags });
let ptr = AtomicPtr::new(Box::into_raw(context));

// null pointer
let ptr = AtomicPtr::<Context>::new(std::ptr::null_mut());

you don’t need to worry about alignment in the null pointer case, because a null pointer means there is no data that has to be aligned.


#3

I’d assume so. Rust can transparently call into C (including passing raw pointers to C functions) so Rust’s concept of a null pointer will need to be the same as C’s. That means it’ll always have a numeric value of 0 and should always be aligned because 0 is a multiple of all possible alignments.


#4

Yeah, but memory layout of NULL is not necessarily zero in C.


#5

No. My code stores bit flags with the pointer itself, not in the pointed memory.
It is like this:

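use std::sync::atomic::{AtomicPtr, Ordering};

// Alignment is at least 2 (not necessarily exactly 2), so bit 0 of any
// valid *mut Node<T> is free to use as a flag.
#[repr(align(2))]
struct Node<T> {
    item: T,
    next: AtomicPtr<Node<T>>,
}

fn pop() {
    ...
    if first.is_null() {
        return None;
    }
    let node = &*first;
    // SeqCst is a conservative choice of ordering here.
    let ptr = node.next.load(Ordering::SeqCst);
    if ptr as usize & 1 == 1 {
        // Already marked as logically removed.
        cleanup();
    } else if node
        .next
        .compare_exchange(ptr, (ptr as usize | 1) as *mut _, Ordering::SeqCst, Ordering::SeqCst)
        .is_ok()
    {
        // We set the mark bit, so we may take the item.
        let item = node.take_item();
        cleanup();
        return Some(item);
    }
    ...
}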

This bit means something like logical removal.

I think in this case I could still store this flag somewhere else.

However, there are some cases in which a pointer and a flag need to be operated on atomically. Each read or write must then touch a single word that holds both.
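As a rough illustration of why the flag has to live in the same word as the pointer, here is a sketch of a marking step done with one compare-and-exchange loop. The names `MARK` and `try_mark` are hypothetical, the orderings are an assumption, and `T` is assumed to have an alignment of at least 2.

```rust
use std::sync::atomic::{AtomicPtr, Ordering};

/// Mark bit stored in the lowest address bit (assumes align_of::<T>() >= 2).
const MARK: usize = 1;

/// Try to set the mark bit on `next`.
/// Returns Ok(previous unmarked pointer) if this call set the mark,
/// or Err(()) if the pointer was already marked.
pub fn try_mark<T>(next: &AtomicPtr<T>) -> Result<*mut T, ()> {
    let mut current = next.load(Ordering::Acquire);
    loop {
        if current as usize & MARK != 0 {
            return Err(());
        }
        let marked = (current as usize | MARK) as *mut T;
        // Pointer and flag change together, in one atomic step.
        match next.compare_exchange_weak(current, marked, Ordering::AcqRel, Ordering::Acquire) {
            Ok(_) => return Ok(current),
            Err(actual) => current = actual,
        }
    }
}
```

Because the check and the update happen on one word, no other thread can observe the new flag paired with an old pointer or the reverse.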


#6

According to the C11 standard, section 6.3.2.3, #3:

An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant. If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

(Originally found here, and verified)


#7

Thank you for the explanation. I fully understand what you want now.

Unfortunately, I’m not a member of the core team, so I don’t know whether there is any guarantee about the exact value of the null pointer returned from std::ptr::null() and std::ptr::null_mut().

At least, the null pointer is defined using 0 at this moment.
I checked the definition of the null pointer in ptr.rs, but I couldn’t find any description about the exact definition of the null pointer in Rust.

I also checked the definition of NonNull in ptr.rs, which is defined using NonZero.
That may mean the core team treats the null pointer as 0 internally.


#8

I think it’s absolutely fine to assume that null is 0 and will be for Rust’s entire lifetime.

Rust doesn’t support exotic hypothetical or extinct platforms, and there doesn’t seem to be any reason to change null representation ever.


#9

The way null pointers are usually implemented is by having them point to the beginning of a memory page without read, write or execute bits set. This is why you get a SIGSEGV when trying to interact with data via the pointer rather than anything else. This doesn’t mean the page is at address 0.

However, like @kornel made note of, that not being the case (the page not being at 0) is something that’s pretty much extinct and exclusive to museum pieces these days. And as @mohrezaei showed, C11 has defined what it means to be considered a null pointer (I’m not sure whether C99 defines this as well, which would solidify the notion further).

So with the above, you can pretty much bet your house that the null pointer is aligned, as it will point to a page start and an aligned address (0 is aligned by definition).


#10

Yes, the C standard requires the compiler to convert the integer constant 0 to a null pointer. However, this does not guarantee that the memory representation of the null pointer has all bits zeroed. It is possible that some architecture maps the null pointer to, say, 0x3F0D. In that case the compiler would have to convert the integer 0 to the pointer 0x3F0D.

But thank you Kornel. This will help me a lot.


#11

Right, anything that relies on the binary representation of pointers is by definition less portable. That said, I’d say it’s acceptable to say that a library only supports platforms where the NULL pointer is 0 which is true for every non-academic modern architecture—and probably most academic ones, even, say, CHERI capabilities.


#12

As far as I know, Rust pointers are not C pointers.

If I remember correctly, Rust pointers are comparable to C pointers cast to integers, so unlike C pointers they are really just integer addresses.

References are a different matter: they are roughly C pointers with additional lifetime checks.

But this also means that Rust has no fancy parts around pointers, i.e. converting 0 to a pointer will not necessarily yield a null pointer, and comparing it to other Rust 0-pointers yields true. (Only std::ptr::null and similar have that guarantee.) Also, there is (and will be) no guarantee that the null pointer is aligned, as this could make it impossible to support some hypothetical platforms.

But as @medusacle already pointed out, architectures with non-0 null pointers are almost nonexistent today. I highly doubt that this will change, and doubt even more that Rust will ever support such an architecture.

As a compromise I would:

  1. just assume it’s 0
  2. add a test which uses std::ptr::null/std::ptr::null_mut for your bitmap type
    and checks that it’s aligned

#13

Done. I added this function to my project:

use std::mem::align_of;
use std::ptr::null_mut;

#[inline(always)]
pub fn check_null_align<T>() {
    debug_assert!(null_mut::<T>() as usize % align_of::<T>() == 0);
}

#14

I’m not really familiar with the C NULL spec, but Rust uses the Zeroable trait for the NPO (null pointer optimization): Box internally holds a NonZero(*mut T), so Option<Box<T>> has exactly the same memory representation as Box<T>. I think a non-zero null pointer would break everything that relies on this assumption, including libcore itself.
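The layout optimization mentioned here can be observed from safe code. The sketch below uses hypothetical helper names (`option_is_pointer_sized`, `to_option`). It only shows that `None` is squeezed into the pointer's unused null value; on its own it does not prove which bit pattern null has.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// True when wrapping a non-null pointer type in `Option` costs no extra
/// space, i.e. `None` reuses the null pointer value.
pub fn option_is_pointer_sized<T>() -> bool {
    size_of::<Option<NonNull<T>>>() == size_of::<*mut T>()
        && size_of::<Option<Box<T>>>() == size_of::<*mut T>()
}

/// Convert a raw pointer into an `Option`; a null pointer becomes `None`.
pub fn to_option<T>(ptr: *mut T) -> Option<NonNull<T>> {
    NonNull::new(ptr)
}
```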


#15

First, pointers are allowed to point to objects at unaligned addresses, and you can read / write to those addresses using the _unaligned pointer methods. So, in general, you cannot assume that a pointer is going to point to an aligned address. Only if you know where the pointers come from, you might be able to assume that they are properly aligned.
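For reference, a small sketch of the unaligned accessors, using `std::ptr::write_unaligned` and `std::ptr::read_unaligned`. The function name `roundtrip_unaligned` and the buffer layout are made up for illustration.

```rust
use std::ptr;

/// Write a u32 at offset 1 of a byte buffer and read it back.
/// That address is not guaranteed to be 4-byte aligned, so the plain
/// `ptr::write` / `ptr::read` would not be sound here.
pub fn roundtrip_unaligned(value: u32) -> u32 {
    let mut buf = [0u8; 8];
    // Bytes 1..5 lie inside the 8-byte buffer, so the access is in bounds.
    let p = unsafe { buf.as_mut_ptr().add(1) } as *mut u32;
    unsafe {
        ptr::write_unaligned(p, value);
        ptr::read_unaligned(p)
    }
}
```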

Others have commented that there are “non-zero” optimizations guaranteed by Rust that prevent changes to this. There is, however, a subtle difference in the naming of the wrappers that support these optimizations: the ones for integers are called NonZeroU..., while the one for pointers is called NonNull.

The truth is that the null pointer bit pattern is not guaranteed by Rust. This might be explicitly guaranteed or not guaranteed in the first RFC from the Unsafe Code Guidelines WG, but the WG has not started discussing this issue yet; that should happen in early 2019.

Having said this, currently, the only null pointer “value” that LLVM supports is the “all zeros” bit pattern. This is true both for C’s null pointer constant and for run-time memory addresses that compare equal to this null pointer constant, which are not necessarily required to be the all-zeros bit pattern by the C standard.

So while Rust might not guarantee this right now, it is true that std::ptr::null() returns 0x0 on all platforms that Rust supports today, and it might be true for all the platforms that your code runs on and might care about in the future (e.g. your platform requires atomics support, narrowing down the list). So it is kind of up to you to do a risk assessment about this and make a choice whether you want to rely on this or not.


#16

Well, I produce all the pointers in such a code. I allocate memory which needs to be aligned for regular reads and writes, so I have this guarantee. My concern was with the null pointer, however. I have inserted this assertion in constructors:

use std::mem::align_of;
use std::ptr::null_mut;

#[inline(always)]
pub fn check_null_align<T>() {
    debug_assert!(null_mut::<T>() as usize % align_of::<T>() == 0);
}

However, this function will only catch errors when testing or running in debug mode. I might consider changing the debug_assert to assert so that no one ever faces weird bugs; instead, the assertion will fail in such cases.
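A sketch of what the `assert!` variant could look like when placed in a constructor. `TaggedSlot` is a hypothetical type, and the extra check that `T` has at least one spare low bit is an addition for illustration, not something from the code above.

```rust
use std::mem::align_of;
use std::ptr::null_mut;
use std::sync::atomic::{AtomicPtr, Ordering};

/// Hypothetical wrapper around an AtomicPtr whose low bit carries a flag.
pub struct TaggedSlot<T> {
    ptr: AtomicPtr<T>,
}

impl<T> TaggedSlot<T> {
    /// Panics in release builds too if the assumptions do not hold.
    pub fn new() -> Self {
        // Assumption added for this sketch: T leaves at least one low bit free.
        assert!(align_of::<T>() >= 2, "T needs a spare low bit");
        // Same check as check_null_align, but not compiled out in release.
        assert!(
            null_mut::<T>() as usize % align_of::<T>() == 0,
            "null pointer is not aligned for T"
        );
        TaggedSlot { ptr: AtomicPtr::new(null_mut()) }
    }

    /// True while the slot holds an untagged null pointer.
    pub fn is_empty(&self) -> bool {
        self.ptr.load(Ordering::SeqCst).is_null()
    }
}
```

The checks run once per construction, so their cost is likely negligible next to the allocation work a real constructor would do.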


#17

I’ll raise this with the unsafe code guidelines group. In general, we don’t want people writing paranoid code about platforms we might never support.

To better understand what you are trying to do: why are you using “bit flags” (are you tagging pointers?) even when the pointer is null? Do you want to be able to tell two different kinds of null apart? Note that if you flip a bit in the null pointer and call ptr.is_null(), that will return false.
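The last point is easy to check. A minimal sketch, with `tag` and `tagged_null_is_null` as hypothetical helpers, assuming a platform where null is address 0 (true for every target Rust supports today):

```rust
use std::ptr;

/// Set the lowest bit of a pointer's address (hypothetical helper).
pub fn tag<T>(p: *mut T) -> *mut T {
    (p as usize | 1) as *mut T
}

/// Tag a null pointer and ask whether it still counts as null.
pub fn tagged_null_is_null() -> bool {
    let tagged: *mut u64 = tag(ptr::null_mut());
    tagged.is_null()
}
```

Here `tagged_null_is_null()` is expected to return false, since the tagged value has a non-zero address.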


#18

I clear the bit flags before I test whether it is null. After thinking about it a lot, I realized that in some cases a non-zero null should not be a problem (e.g. the pointer is never null, or it is either null or marked, in which case null can be replaced by some dummy pointer). In other cases, it can be replaced by indirection, but this could be less efficient and would consume more space.
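Clearing the flags before the null test might look like the sketch below. `FLAG_MASK`, `untag`, `is_flagged` and `is_null_ignoring_flag` are hypothetical names, and a single flag bit is assumed.

```rust
/// Bits of the address reserved for flags (one bit in this sketch).
const FLAG_MASK: usize = 1;

/// Strip the flag bits, leaving the plain pointer.
pub fn untag<T>(p: *mut T) -> *mut T {
    (p as usize & !FLAG_MASK) as *mut T
}

/// True if the flag bit is set, whatever the pointer part is.
pub fn is_flagged<T>(p: *mut T) -> bool {
    p as usize & FLAG_MASK != 0
}

/// Null test that ignores the flag bits.
pub fn is_null_ignoring_flag<T>(p: *mut T) -> bool {
    untag(p).is_null()
}
```

This is only correct if null has no flag bits set to begin with, which is exactly the alignment assumption under discussion.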

For instance, one case is a sorted linked list. Nodes are logically removed first. When one is traversing the list, one removes physically the logically removed nodes. This check must be done when loading the next field as a single atomic operation. In this case I already use an extra level of indirection, but it would require one extra word. A new allocation when marking would also be required. (I already make this new allocation when marking and this is not a grave case)


#19

Interesting, so IIUC what you are actually doing is having a linked list of pointers, where null means that there is no next node, and then there is another sentinel value (e.g. null with a tag bit set) that denotes that this node is the root node. Is that correct?


#20

Almost. I do have a sentinel node, and in the file I linked it carries a dummy pointer that says it is the sentinel. However, this is not where the bit flag is used. The bit flag is used to mark a node as logically removed (see field “next”).

There are also some channels which use a bit flag to tell one side of the connection that the other disconnected: