Pointer identity

I want to use pointers to identify different objects.

  1. Do different Boxes to a non-zero-sized type always point to different addresses?
  2. Do different Rc's (even for a zero-sized type) always point to different addresses?

Here are tests I tried:

use std::ptr;
use std::rc::Rc;

struct Foo;

let a = Box::new(Foo);
let b = Box::new(Foo);
assert!(!ptr::eq(&*a, &*b)); // Error

let a = Box::new(0);
let b = Box::new(0);
// Is it safe to assume no optimization is performed to break this?
assert!(!ptr::eq(&*a, &*b)); // OK (Is this always true?)

let a = Rc::new(Foo);
let b = Rc::new(Foo);
assert!(!ptr::eq(&*a, &*b)); // OK (Is this always true?)

Note: My English may sound unnatural.:smiling_face_with_tear:

Yes, boxes are guaranteed unique as long as they are not zero-sized.

As for Rcs, they are unique even if zero-sized, because the allocation also stores a counter for how many clones of the Rc you have, so the allocation is always at least usize sized. (Of course, clones of the same Rc have the same address.)

2 Likes

I got it! Thank you.

Only alive boxes are guaranteed unique. So if you drop a box then create a new one they may have same address

5 Likes

Although I can theoretically imagine that this might not be guaranteed for ZSTs which do not require dropping in an implementation without weaks. A ZST doesn't hold any data, so if destruction has no side effects, then IIUC the reference count isn't actually needed, and could be omitted as an optimization. The Rc could then always return the same dangling pointer regardless of how many other copies there are, and simply not track the number of instances, and avoid allocating a counter altogether. I guess this can also be true in the current implementation when the compiler can prove that there is no need for tracking weaks.

2 Likes

See also std::rc::Rc::ptr_eq. I wonder why there's no such method for Box.

1 Like

I guess the rationale can be that Rc is specifically designed for sharing. Thus, it is pretty much expected that there may be multiple Rcs that do or do not point to "the same" (as in "identity") value. In contrast, Box is uniquely-owning, and as such, Boxes are always logically distinct entities with their own identity. So, it's a more common thing to want to query whether two Rcs are the same (e.g. for optimization or other purposes), while one does not usually want to ask if two boxes happen to point to the same location.

So, is it safer not to rely on Rc<ZST> to identify objects?

The following code is why I wanted to use Rcs to ZST:

struct Node<T> {
    elem: T,
    next: *mut Node<T>,
    prev: *mut Node<T>,
    alive: Rc<Cell<bool>>,
}

struct Dummy;

pub struct LinkedList<T> {
    front: *mut Node<T>,
    back: *mut Node<T>,
    id: Rc<Dummy>,
}

#[derive(Clone)]
pub struct Index<T> {
    node: *mut Node<T>,
    id: Rc<Dummy>,
    alive: Rc<Cell<bool>>,
}

impl<T> LinkedList<T> {
    pub fn new() -> Self {
        Self {
            front: ptr::null_mut(),
            back: ptr::null_mut(),
            id: Rc::new(Dummy),
        }
    }
    pub fn push(&mut self, elem: T) -> Index<T> {
        let alive = Rc::new(Cell::new(true));
        /* ... */
        Index { node: new_back, id: Rc::clone(&self.id), alive }
    }
    pub fn pop(&mut self) -> Option<T> {
        /* ... */
        popped_node.alive.set(false);
        /* ... */
    }
    pub fn get(&mut self, index: &Index<T>) -> Option<&T> {
        // The node at `index` has not been removed.
        if index.alive.get() {
            // `index` was obtained from `self`.
            if Rc::ptr_eq(&self.id, &index.id) {
                return unsafe { Some(&(*index.node).elem) };
            }
        }
        None
    }
} 

However, I realized that I need to keep track of whether the LinkedList has been dropped or not, so I decided to use Rc<bool> instead. :sweat_smile:

Not sure about your question, but I think an one idiomatic way to create a unique identifier usually is to increment an AtomicU64 using .fetch_add(1, Relaxed). Then you don't need to worry about how/if actual allocation takes place.

2 Likes

Incidentally, if you already have Rcs, then don't use raw pointers and unsafe for implementing a linked list. Use Rcs for forward pointers and Weaks for backward pointers to avoid circular references.

Looking into the source of Rc::new in std, you can see that Rc always allocates (the internal RcBox is never a ZST). As I don't see why this should/would be changed (and because Rc also needs to provide methods such as strong_count), I think it's safe to rely on Rc.

However, one thing I'm not fully sure about is how std::ptr::addr_of_mut behaves when a ZST is involved. See it's use here.


The Nomicon says about ZST's, that

[…] pointer offsets are no-ops, […]

So I do believe that Rc::as_ptr gives a usable result for the sake of comparison. Note that Rc::ptr_eq doesn't even compare the address of the ZST; it just just compares the address of the RcBox.

I thought it would be easier to use the address as an ID than writing dedicated code to generate unique IDs.
But I'll also try using AtomicU64 to generate IDs. Thank you for the suggestion.

I have used both approaches (unique IDs and comparing memory addresses).

Another thing that can be tricky when working with addresses is that storing a raw pointer makes any struct !Send and !Sync. And converting a pointer to an usize may come with other issues sometimes (e.g. the pointer metadata will be lost). So in some contexts, I just gave up on working with addresses. But I really decide on a case-by-case basis currently.

1 Like

Rcs are only used for sharing information (ID and aliveness of nodes).
I want to mutate values in nodes, but using Rc<RefCell> as links is somewhat cumbersome.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.