Working with identity (comparing equality of references/pointers)

I know that the compiler doesn't actually do it currently, but it would not be surprising to me if merging the two integers into a single address was an allowed optimization in the following code.

fn main() {
    let s1 = 1;
    let s2 = 1;
    let s = vec![&s1, &s1, &s1, &s2, &s2];
    println!("Distinct count: {}", count_distinct(s));

You keep coming back to address identity. But that's simply not the tool that you need to use. You need to compare regions, i.e. include address and size. Rust has zero-sized types and so comparing objects by address only is by definition broken. Maybe it didn't match your expectation, but it's not a bug in the language, it's not a defect in the documentation either, it's just something that you will have to live with.

In other words, you should write your code in a way that it's as general as possible, so that it works with all types uniformly. You should not expect that people won't call your code with () or struct Empty or an empty slice.


And how about:

fn main() {
    let s1 = vec![];
    let s2 = vec![];
    let s = vec![&s1, &s1, &s1, &s2, &s2];
    println!("Distinct count: {}", count_distinct(s));

The Vec is immutable, so it shouldn't matter that it's two different Vecs.

My point is: Where to stop here? I could even argue that

fn main() {
    let s1 = 1;
    let s2 = 2;
    let s = vec![&s1, &s1, &s1, &s2, &s2];
    println!("Distinct count: {}", count_distinct(s));

could be optimized to use a single address, as the value is never read.

Does that mean, in consequence, I can't use memory addresses for checking identity at all?

That comment is not helpful. Comparing size doesn't make sense when I have a known size, and comparing size just for the special case of ZSTs would be unnecessary overhead for a corner case that doesn't seem to make a lot of sense in most practical cases. Plus, it doesn't help in case the base address doesn't differ due to optimization, as in the example brought up by @alice.

Edit: Also note that generic type parameters are Sized by default, as I said here.

It's fine if checking "identity" isn't possible in Rust, i.e. requires an auto-increment integer or something like that. I could do it. It's easy. But it's overhead.

Well, ultimately I don't really know what the guarantees about memory address comparisons are, so it's hard to say. But I do in general think that identity comparisons are difficult to define in a reasonable way in Rust.

1 Like

I think being able to compare memory addresses (or ranges) for equality can be a helpful tool for a couple of algorithms. And I wouldn't be surprised if some crates already do that in order to solve certain tasks efficiently.

Now if Rust really doesn't give any guarantees, then these (hypothetical) implementations could break in future. Of course, it would be a mistake of those (yet hypothetical) crates to assume you can compare base addresses (or ranges) in order to do identity checks.

I would like if there was any form of statement in the reference that allows to deduce in which cases (if any cases at all) it is possible to use pointer comparison (or wide-pointer comparison, including length) for identity checks, because that could be helpful for a couple of algorithms and would avoid having to add otherwise useless auto-increment integers to data structures.

Even in case Rust decides to not give any guarantees, such a hint could be helpful, and it would motivate me to include runtime-overhead in my code to make my code sound.

If you can't find official documentation or an RFC, it's probably not considered normative / guaranteed, so if you want to be robust, take the conservative route. Rust sometimes gives an explicit non-guarantee (e.g. #[repr(Rust)] layout), but more often just does not give a positive guarantee on topics that are undecided or where flexibility is still desired.


Even if addresses are a safe choice for your situation (heap allocated, live, not resized?), and even if you're talking about Sized, non-ZST types, you need to at least keep track of the type / only do this for a restricted set of types (vs. a *const () or usize or whatnot) to ensure identity. A struct and a field inside that struct can have the same address.


Why wouldn't it make sense?

If sizes are known at compile time, they would almost surely get optimized away. If they aren't known at compile time, then again, it is necessary, for example in the case of slices of length 0.

I'm curious what your use case is that could be broken by this? The only one I can imagine is to avoid repeating a slow computation or to avoid an expensive equality check, and neither of these is affected by the kind of optimization you're talking about. If two variables have the same address then their memory contents are identical, so who cares if they may have had different names?

1 Like

You guessed right: My use case is avoiding repeating a computation (or network access). And you are right: In my particular use case, false positives aren't a problem if the memory contents are equal. Thanks for pointing that out! I wasn't aware of that while writing my previous posts.

But there might be other use cases (see below); and there might also be an issue with false negatives (see farther below).

Yeah, it seems to be difficult, but I also think it's rarely needed. Maybe one particular use case could be the Entity-Component-System pattern if all entities have certain data attached to it (e.g. there is a special component that all entities have). Then it could make sense to not use an integer to define the entity, but an allocated piece of memory, where the base address (and Sized type) could be used to identify the entity (e.g. in hash tables), and the allocated piece of memory can be used to store extra information for that entity.

Yes, I was aware of that. In my Playground code above, I use a PhantomData field to keep track of the type (at compile-time).

Edit: And the PhantomData even captures the lifetime of the immutable reference.

That means I can count on the following?

let x = 1;
let y = &x;
let z = &x;
assert_eq!(y as *const _, z as *const _);

Or would the compiler only ensure that while I hold a reference the contents wont move (but there is no guarantee about y and z being equal)?

If y and z may differ, then this would have an impact for my use-case, because extra computation (or even I/O) could happen, where it is not desired.

Because for a particular type, the size is constant.

Slices are !Sized. You are right, if my type is ?Sized, then it is necessary to compare the length. Of course, also the type needs to be the same, as otherwise a struct could be equal to its first (or only) field. As I said, I kept track of the type through PhantomData.

The only remaining issue is with ZSTs or other cases where Rust doesn't give guarantees on pointer

  • stability,
  • uniqueness.

And checking the size won't help me here.

Following the links provided by @quinedot above, the topic indeed seems to be more tricky than it might appear at a first glance.


As pointed out a couple of times in this thread, the issue isn't really limited to ZSTs, as such optimizations could happen with other types too (except where there are explicit guarantees that they don't happen). Thus, I would like to revoke my comment on the Rustonomicon being misleading.

You can count on it.

Topics this fundamental can be hard to find normative citations for (as they've been hashed out pre-RFC-era and there is still no specification), but there does happen to be RFC 68 that talks about it (somewhat indirectly).


I think this one falls under a much broader caveat, though I don't know if it's explicitly written anywhere either: most operations on addresses, even though safe, are not useful in general.

For example, what does this do?

    let (p, q);
        let x = 4;
        p = &x as *const _ as usize;
        let y = 5;
        q = &y as *const _ as usize;
    dbg!(p == q);

It looks like right now it's false, but there's nothing guaranteeing that -- the compiler could notice that x and y have disjoint lifetimes, and use the same storage for them, at which point it could be true. (It might even do this today in async fn's desugaring; I don't recall.)

Or, more specifically, you can construct easy examples of other rules that imply that "different" ZSTs can have the same address -- consider [[(); usize::MAX]; usize::MAX], for example. (Yes, that's a valid type..)

Well, that's why the nomicon calls it exotic :slightly_smiling_face:

But I do think it'd be good to add a note of some kind there about addresses, since a ZST can have the same address as even a non-ZST.

From all your comments, I finally came up with a nicer approach that neither involves unsafe nor restricts usage to Sized types (and also doesn't do any dirty tricks such as coercing a pointer to usize). That's my current approach:

pub struct RefId<'a, T: ?Sized>(pub &'a T);

impl<'a, T: ?Sized> Clone for RefId<'a, T> {
    fn clone(&self) -> Self {

impl<'a, T: ?Sized> Copy for RefId<'a, T> {}

impl<'a, T: ?Sized> PartialEq for RefId<'a, T> {
    fn eq(&self, other: &Self) -> bool {
        (size_of_val(self.0) == 0 && size_of_val(other.0) == 0)
            || (self.0 as *const T == other.0 as *const T)

impl<'a, T: ?Sized> Eq for RefId<'a, T> {}

impl<'a, T: ?Sized> Hash for RefId<'a, T> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        if size_of_val(self.0) != 0 {
            (self.0 as *const T).hash(state)

It treats zero-sized types as well as empty slices (of the same type) as equal.

As an exercise, I decided to create my very first Rust crate (refid) which provides the above newtype (RefId) for any reference &'a T (with T: ?Sized) that allows identity checks through pointer comparison. Feel free to criticize my code or any metadata, but please also keep in mind that these are my very first steps with

Thanks to @H2CO3 for bringing up the issue with comparing regions instead of base addresses. Actually I don't need to manually compare the length as Rust can do that when comparing wide-pointers; but I did include special checks for zero sizes (see the full source, including some tests).

By the way: Coercing a pointer to a usize (which I'm not doing anymore) wouldn't lose the length information because the compiler actually refuses to do that coercion (see Playground).

Also thanks to all others for helping me understanding Rust's behavior better. I still feel like there are some open questions, and I'm sorry if my repeated questions for normative references come off as overly criticizing. I love Rust; I just would like to be sure that my code works as intended, and I feel like when dealing with pointers, some things aren't obvious (and/or undefined yet).

I hope I could express why identity checks (without extra runtime cost) may be a nice thing to have, and maybe I'm not the only one who has stumbled upon this problem. Perhaps my crate can help to solve this gently (e.g. by using HashSet<RefId(K)>).

1 Like

See also GitHub - mbrubeck/by_address: Rust wrapper type that implements hashing and comparison by pointer address (references implement Deref so this generalizes your approach)


So I'm not the only/first one who had this idea :grinning_face_with_smiling_eyes:. Supporting Deref seems to make sense in many cases. I think the by_address crate doesn't handle the zero-size case though? @mbrubeck

(Update: I published a new version 0.0.3 of refid, which also supports smart pointers with Deref, but still differs in regard to empty memory ranges, see below.)

use by_address::ByAddress;
use refid::RefId;

fn main() {
    let v = vec![5, 6, 7];
    assert_eq!(RefId(&v[0..0]), RefId(&v[1..1]));
    assert_ne!(ByAddress(&v[0..0]), ByAddress(&v[1..1]));

Maybe treating these two slices as non-equal might have advantages in certain cases though?

Nope, definitely not. The first I ran into it was doing something like -- since for ZSTs, overlap checks can't tell if you got duplicates.

This is a valid approach, but may surprise others in niche circumstances, so it would be good to highlight in documentation. Doing something else would surprise a different set of people, so maybe this is a vacuous point. You need the documentation regardless. :slight_smile:

Mmm... not quite I'm afraid. This covers unsized types where the pointer metadata is length (slices and wrappers of slices, such as str and Path and so on). But with a wide pointer to dyn Trait (or a wrapper there-of), the metadata is a vtable pointer, and two pointers to the same type may have different vtable pointers. So it's possible to get some false negatives (e.g. two references to the same instance that underwent unsized coercion in different code generation units).

There's a deeper consideration here that I'll just mention in passing: different types cannot be compared by this crate. So if I coerce a &i32 to a &Debug and a &Display, they can't be compared. Maybe this is what you want, maybe it isn't. There are probably also some complications around dyn upcasting, should that land.

Anyway, since you're generalizing over metadata types, you can't rely on the wide pointer comparison; you'll want to not compare pointer metadata. Use size_of_val to compare the sizes; that will take care of the length of slice-like DSTs. (Everything else has the same size for the same type.)

That leaves the question of how to compare the pointers without comparing metadata. Which leads to...

You can't coerce a wide pointer into a usize, but you can coerce a wide pointer into a thin pointer, and then coerce the thin pointer into a usize.

You don't need usize, but as per above, you do need to get (just) the data pointer. Taking it all together, I think you want something like this:

match (size_of_val(self.0), size_of_val(other.0)) {
    (0, 0) => true,
    (x, y) if x != y => false,
    _ => self.0 as *const T as *const u8 == other.0 as *const T as *const u8

We're in the same boat; I love Rust but thinks it needs some more guarantees nailed down. Sadly my impression is this won't improve in the short term :slightly_frowning_face:.

One thing you definitely can't do is read data behind the pointer; it's dangling.

From what I've read in the nomicon and std::ptr::read docs, I don't think trying to read a dangling pointer to a zero sized type is UB. It can't be null or unaligned, but dangling is fine.

1 Like

At least in regard to base addresses, this is also true for non-ZSTs, as @quinedot said earlier:

But I guess that &() as *const _ may still be equal to the pointer of any empty slice (while the pointers of two empty slices can't be equal if the empty slices start/end at different addresses). Anyway, it feels surprising but likely makes sense. A note in the docs might help to reduce confusion, unless the problem is more general anyway.

I did mention it here in the struct's doc. Maybe I should add the note in the impl Eq or impl PartialEq also? How do you usually comment notes on equality: in Eq or PartialEq?

And how can you make yellow or red boxes with an info-symbol (or auto-fullscreen flashing + alarm-sounding boxes :stuck_out_tongue_winking_eye:)? I didn't figure out yet how that works (if it is possible with rustdoc).

Okay, thanks for this info! I didn't know. I will look more into wide-pointers to understand metadata other than size info. However, I think that in some (or even all?) cases it may be useful to treat wide-pointers with vtable pointers as different (because method calls will execute different code, right?) So two values with different metadata aren't "identical" in that case (depends on defintion of "identical", of course).

I just found the section "Memory model" in Rust's reference. :innocent:

I was thinking of these

  • Reference
  • Nomicon
    • (I now think the ZST case is being addressed in the paragraph below, but pretty indirectly IMO; that paragraph could use a rewrite.)

But when following up to this comment found this

Which is more specific and presumably normative.

So thank you for the correction!

Okay, after typing up one response and reading it over to check my assumptions, I think we're running into the deeper consideration I mentioned in passing. Let's look at:

struct Wrapper(i32);
fn f() {
    let w = Wrapper(0);
    let x: &dyn Debug = &w.0;
    let y: &dyn Debug = &w;
    let z: &dyn Debug = coerce(&w);
    // In some other module:
    // fn coerce(w: &Wrapper(i32)) -> &dyn Debug { w }

x and y have the same type, but they have different vtables because they have different (erased) base types. When it comes to comparison, which type do you care about? I think any reasonable person would agree that x and y execute different code. So I think you're saying you want them to compare differently. Let's call this the "base type objective"; we'll come back to it.

Now let's consider y and z. If coerce happened to be compiled in a different codegen unit, y and z may have different vtable pointers, even though they are both the same type, and have the same (erased) base type as well. Do you want these to compare differently? I can't imagine you do, regardless of what you think about x and y. That's what the issue I linked is about. If you rely on "vtable address equality matches base type equality", you will have false negatives.

(One could argue that the different vtables for the same base type also means they "execute different code", technically, but I don't think that's very reasonable in this context. I'll assume you don't care about that.)

So what are the options?

You could give up on the base type objective; that is what my suggestion does. That does mean x and y are considered equal; here, equality means they have the same type, plus the same address and size (or are ZSTs). But they may have different erased types, and thus observable behaviors, in the case of dyn Trait.

You could keep your current code. That means x and y are considered different. But it also means that y and z will occasionally and unpredictably and probably undeterministic-across-compiles compare as different. I have no idea if this is an acceptable trade-off or how common it is. Maybe you just document this gotcha around dyn types and move on. Maybe you limit yourself to Sized types.

Is there a third way that achieves the base type objective? I am not aware of any good way to achieve this. If my impressions of the vtable issue is correct, you could compare vtable contents. But I'm not sure my impressions are correct, it would be unsafe regardless, and it would also rely on non-normative implementation details.

Anything else I've thought of involves limiting the applicability of your data structure pretty severely, as you're basically looking at TypeId and/or dyn Trait downcasting.

Is this a bug or just unfortunate? It's not currently mentioned in the issue (I'll try to follow up later), but wide pointer comparison is the outcome of an RFC. The intention of that RFC was that you could compare base types by vtable address; at least that's my reading, and what others implied in the PR comments. So, I would argue that it's a bug.

Will it be fixed then? I'm afraid this is one of those situations where I'm not too confident it will, based on reading the issue itself; i.e., an area where I don't think Rust's guarantees are improving.

Does this gotcha apply to anything else? Yes, it applies to const and fn pointers too. You can read more in the issue and the things it links to.

(Edits: Minor typos corrected. Additional note, I don't think fn and const are in violation of an RFC; I did follow up and commented about the wide pointer RFC in the issue.)