Working with identity (comparing equality of references/pointers)

As per the issue and its dupes, you'll be in good company :slight_smile:. Though my suggestion only trades that problem for possible false positives when comparing trait objects of a struct and of its contents, in the case where both have the same size (which seems even more niche).

I've often wished for a NotDyn bound. Sized is a poor substitute. To base it off of DynMetadata, you would need to be able to test for associated type non-equality, which isn't quite the same thing... though maybe implementing with associated type X could be considered an opt-out for any other type Y, and T: Trait<Assoc != Y> could be sugar for T: Trait + !Trait<Assoc=Y> or something.

I think that's a ways off though. Note that negative bounds are not implemented on unstable yet, even. (I think coherence is in the process of landing in nightly.) And disjoint associated type coherence is delayed until the broader issue of mutually exclusive traits is tackled.

Those are the wrappers around types like str that I mentioned. Their metadata is the same though, yep.

As for future possibilities, they'll have to work at least somewhat similarly due to the existence of methods like size_of_val. Unless we get a new Sized-like implicit-auto-trait.

Some comments implied that, but I don't actually know that the contents of the vtable are the same, so I can't answer this definitively. Another possibility is to unify the vtables when linking. There are trade-offs between compile time and run time cost.

That's a valid view, and one that motivated the acceptance of the RFC. And I'm not arguing against it, but there are other valid views.

Incidentally, you're throwing out behavior in the ZST case.

#[derive(Debug)] struct S;
#[derive(Debug)] struct T;

let (s, t) = (S, T);
// Maybe the same address
let v: Vec<&dyn Debug> = vec![&s, &t];

#[repr(transparent)] #[derive(Debug)] struct U(T);
// Definitely the same address
let u = U(T); // &u, &u.0

You may also have a bad time with "identity" due to optimization. Consider:

    const I: i32 = 0;
    let i = 0;
    let a: &dyn Debug = &i;  // Base is i32 (stack address)
    let b: &dyn Debug = &&i; // Base is &i32 (stack address)
    let c: &dyn Debug = &a;  // Base is &dyn Debug (stack address)
    let e: &dyn Debug = &I; // Base is i32 (unstable address)
    let d: &'static dyn Debug = &0; // Base is i32 (static address)

These all "behave the same" but compare differently. Is it okay if optimizations make some of them compare the same?
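A minimal sketch of what "behave the same but compare differently" means for two of those bindings. The helper `data_addr` is a name made up here; it drops the metadata and keeps only the data address. The inequality relies on `i` and `r` being two distinct, live, non-zero-sized locals.

```rust
use std::fmt::Debug;

/// Hypothetical helper: keep only the data address, discard any metadata.
fn data_addr<T: ?Sized>(r: &T) -> *const () {
    (r as *const T).cast::<()>()
}

fn main() {
    let i = 0;
    let r = &i;
    let a: &dyn Debug = &i; // base is the i32
    let b: &dyn Debug = &r; // base is the &i32
    // Same Debug output for both...
    println!("{:?} {:?}", a, b);
    // ...but the data addresses are compared separately.
    println!("{}", data_addr(a) == data_addr(b));
}
```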

I'm not really making a point here, other than pointing out it's probably always going to be a "best effort" type of situation.

(Though if you want all things that "behave the same" to compare the same, you'll have a worse time due to the halting problem :slight_smile:)

You are right (I think) that testing the associated type is different from negative impls. However, there is a way (with Rust nightly) to implement a NotDyn trait. That is because there are only three concrete types used as pointer metadata: () for Sized types, usize for slices and str (and wrappers around them), and DynMetadata<Dyn> for trait objects.

We can thus create a trait NotDynMetadata as follows:

trait NotDynMetadata {}
impl NotDynMetadata for () {}
impl NotDynMetadata for usize {}

And then, using #![feature(ptr_metadata)], we can easily define our NotDyn trait:

trait NotDyn {}
impl<T> NotDyn for T
where
    T: ?Sized,
    <T as Pointee>::Metadata: NotDynMetadata,
{
}

What do you think? I made a small example on Playground for testing it.

Yeah, I mistook negative impls for negative bounds. The first is just a guarantee for non-implementation. But luckily, we don't seem to need negative impls or negative bounds for an implementation of NotDyn, as shown above.

Hmmm, okay. Either way, it would make pointer comparison more complex.

Not every value that behaves the same should be considered identical. I stated the opposite: Every value that is identical should behave the same. (Otherwise, there would be the issue with the halting problem indeed :grinning_face_with_smiling_eyes:.)

If you are right that let x = 1; let y = &x; let z = &x; (or let z = y as a more relaxed case) ensures that y as *const _ == z as *const _, then it is possible to create two values (*y and *z) that are "identical" in the sense that they share the same memory location. Then, working with wrappers like RefId or ByAddress makes sense with today's Rust.

Of course, that won't mean that we will know in all cases whether two values are identical, but we would know at least in some cases (and in the future, those cases will hopefully be explicitly mentioned in the reference or other normative documentation).
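For illustration, a rough sketch of such a wrapper. This `RefId` is a simplified stand-in written for this post (restricted to Sized `T`), not the actual type discussed in the thread or the by_address crate's code.

```rust
use std::hash::{Hash, Hasher};
use std::ptr;

/// Hypothetical wrapper: equality and hashing go by address, not by value.
struct RefId<'a, T>(&'a T);

impl<T> PartialEq for RefId<'_, T> {
    fn eq(&self, other: &Self) -> bool {
        ptr::eq(self.0, other.0)
    }
}

impl<T> Eq for RefId<'_, T> {}

impl<T> Hash for RefId<'_, T> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the address so that Hash stays consistent with PartialEq.
        (self.0 as *const T).hash(state)
    }
}

fn main() {
    let x = 1;
    let w = 1; // equal value, but a different local
    let y = &x;
    let z = &x;
    println!("{}", RefId(y) == RefId(z));
    println!("{}", RefId(&x) == RefId(&w));
}
```

Because `T` is part of the wrapper's type, two `RefId`s can only be compared when they refer to the same type.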

I noticed that the repository of by_address lists 2.0.0 as the last version, while the most recent release on crates.io is 1.0.4 (actually, 2.0.0 has been yanked, I just saw). Does anyone know why?


Update: Looks like version 2.0.0 was meant to perform a double-dereference on deref (returning T::Target instead of T), but the method body wasn't changed (see changeset). Not sure what that means. Deref always confuses me :sweat_smile:.

Changing only the type but not the method body works precisely because of deref coercion. If your method is declared to return &<T as Deref>::Target, then if your method body actually returns a &T, it will be converted to &<T as Deref>::Target.
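Here is a small sketch of that effect. `Wrapper` is a made-up type for this post, not the by_address source; only the declared return type asks for the double dereference, while the body still returns `&self.0`.

```rust
use std::ops::Deref;

/// Hypothetical wrapper around a pointer-like type `T`.
struct Wrapper<T>(T);

impl<T: Deref> Deref for Wrapper<T> {
    type Target = T::Target;

    fn deref(&self) -> &Self::Target {
        // `&self.0` has type `&T`. The declared return type is
        // `&T::Target`, so deref coercion converts it at this point.
        &self.0
    }
}

fn main() {
    let w = Wrapper(Box::new(5));
    // `*w` goes through Wrapper and Box in a single step.
    println!("{}", *w);
}
```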

Ah, function results are a coercion site. Thanks!

Good point, so long as Pointee is an automatic trait which cannot be overridden, the set of possible NonDyn traits is known. This makes it not only something useful to Rust programmers in their own projects, but something the language could guarantee is accurate.

At which point, some of the special-casing of Sized with regards to dyn Trait could be replaced with NonDyn, and NonDyn would become a super-trait of Sized for backwards compatibility. (This is the main part that needs to be fleshed out in more depth IMO.)

In the context of RFCs, I could imagine some hesitancy around the indirectness and having a core trait be NonSomething, but I'm not on any teams, so who knows.

Even if a fourth type of metadata was added later, that wouldn't be a huge deal-breaker I guess? Then the semantics of NonDyn would be more like SizedOrSlice (e.g. a fourth type of metadata would not be covered by the NonDyn aka SizedOrSlice trait until the respective code was updated).

Like I said, if it's named SizedOrSlice, then it doesn't contain any Non… :innocent: Not sure if it sounds better though.

Why would that be necessary? Because if I guarantee a type to be Sized, it should also be guaranteed to be NotDyn? Wouldn't that happen automatically? Not sure I'm seeing the whole picture.


Update: Ah, got it. This doesn't work if debug_not_dyn expects a NonDyn:

fn wrapper<T: Debug>(reference: &T) {
    debug_not_dyn(reference)
}

And T being Sized implicitly should guarantee it, but the compiler doesn't know it.

Trying to define the following causes a conflict:

impl<T: Sized> NotDyn for T {}

My thinking was around dyn safety, where you can have a method like:

fn foo(self) where Self: Sized;

And the trait will still be dyn safe as a whole because this dyn unsafe method is not callable for (non-Sized) trait objects. However, this is a poor substitute in some cases, like perhaps

fn bar<T: AsRef<()>>(&self, t: T) -> i32 where Self: Sized;

which might make perfect sense to implement for str or [T], but is still not dyn safe due to the generics. But if instead the dyn safety logic was based on NonDyn, we could change this to:

fn bar<T: AsRef<()>>(&self, t: T) -> i32 where Self: NonDyn {
  // If you're going to make this change to an existing trait
  // you'll have to add a default body, in case the trait was
  // implemented for DSTs already (which could exclude this body).
  // (But don't have to exclude it!  Even though it was uncallable.)
  0
}

And the trait would still be dyn safe, but we could also now call the method for str and [T]. However, our old version of bar (with the Sized bound) has to remain dyn safe for backwards compatibility. Moreover, implementations (including generic ones) which specified where Self: Sized on the method have to keep compiling to ensure backwards compatibility.

Likewise, trait Trait: Sized has to continue to inhibit dyn safety, and generic implementations bound on Sized have to keep compiling.
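For reference, today's stable behavior (the Sized-based version) can be sketched like this. The trait `Named` and its methods are invented for the example: the generic method is hidden from trait objects by `where Self: Sized`, and as a side effect it is also uncallable on `str`.

```rust
trait Named {
    fn name(&self) -> String;

    // Generic, so it cannot go through a vtable. The `Self: Sized` bound
    // hides it from `dyn Named`, which keeps the trait dyn safe.
    fn tagged<T: AsRef<str>>(&self, tag: T) -> String
    where
        Self: Sized,
    {
        format!("{}:{}", self.name(), tag.as_ref())
    }
}

struct Unit;

impl Named for Unit {
    fn name(&self) -> String {
        "unit".to_string()
    }
}

// Implementing the trait for an unsized type is allowed.
impl Named for str {
    fn name(&self) -> String {
        self.to_string()
    }
}

fn main() {
    let obj: &dyn Named = &Unit;
    println!("{}", obj.name());
    println!("{}", Unit.tagged("a"));
    println!("{}", "text".name());
    // "text".tagged("a") is rejected, because `str` is not `Sized`.
}
```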

I guess it's not strictly necessary to be a super-trait so long as the compiler logic keeps taking Sized into account in all of these situations, but it seems much cleaner to me to add the super-trait so that Sized directly implies NonDyn as per existing super-trait behavior. And just generally speaking, reasoning about dyn safety shouldn't care about Sized per se.

There's also the whole possibility of some ability to pass/return DSTs coming to Rust some day; that's an aspect I haven't tried to dive into. At a minimum it's another situation where Sized will probably become the wrong tool for the job (and we'll then yearn for a StackPassable trait or something).

Just because I stumbled upon it, I wanted to share the following:

Apparently, even the Rust compiler itself uses the concept of defining equality through reference equality (aka "identity" or pointer equality) in some cases, see impl PartialEq for rustc_middle::ty::TyS<'_>, for example.

(Found this when following the link in this post.)

In other words, using the Sized bound to enable dyn safety is semantically wrong and unnecessarily reduces generalizability. I hadn't thought about that before, but it makes sense.

That works because a type representation is never zero-sized unless your language only has a single type (in which case it would always be equal to itself and you wouldn't need to represent it anyway).

To be clear, I agree there are use cases for a concept of "object identity" and there are also ways to implement that using pointer comparisons. The point of my earlier post is not that this is a fool's errand or the goal is useless, but that this is simply not a native concept to Rust and you should expect to do some work to marry the concepts because the language hasn't done it for you (unlike some other languages).

Part of that work may certainly be defining what exactly you mean by "identity" for things like unsized and zero-sized types. If you want to be generic over "all types" (whatever that means) then you often have to do quite a bit more work than if you can simply deal with one type as in the example you give from rustc.

I think that is necessary but not sufficient. There need to be certain guarantees regarding stability of pointer values for that to work. It's not only a ZST issue.

Yeah, I see. There seem to be a lot of caveats.

That is the high-level part, yes. But it also makes me wonder how (and where) the low-level pointer comparison is defined in Rust. I think it's just bitwise comparing the pointers, but is that written down somewhere?

Unless the latter is clearly defined (and certain stability/uniqueness rules are known), we can't judge whether pointer comparison is or isn't a useful approach for whatever definition of "identity".
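One thing that is documented: for wide pointers, `std::ptr::eq` (like `==` on raw pointers) also compares the metadata, not just the address. A small sketch with slices, where `same_addr` is a helper made up for this post that compares addresses only:

```rust
use std::ptr;

/// Hypothetical helper: compare data addresses only, ignoring metadata.
fn same_addr<T: ?Sized, U: ?Sized>(a: &T, b: &U) -> bool {
    (a as *const T).cast::<()>() == (b as *const U).cast::<()>()
}

fn main() {
    let arr = [1, 2, 3];
    let whole: &[i32] = &arr;
    let part: &[i32] = &arr[..2];
    // Same start address, different length metadata.
    println!("{}", ptr::eq(whole, part));
    println!("{}", same_addr(whole, part));
}
```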

What seems clear now is that in the general case, pointer comparison may have different rules than what would be useful for an "identity" op. In the case of Sized types, the issue doesn't seem to be the pointer comparison though, but the question of when (base) addresses are known to be equal or not equal at all. And that isn't trivial not just in the case of ZSTs but may also cause problems in other cases (as shown in some examples in the thread above).

However, I do believe that for non-zero sized values on the heap that are Sized, things are quite safe.

You also need to add the caveat that both values are of the same type: A struct and its first member can both be Sized and reside at the same address. They may even have the same size if there are no other members of the struct.
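A quick sketch of that caveat. `Outer` and `addrs` are names invented here, and `#[repr(C)]` is used only so that the first field is guaranteed to sit at offset 0.

```rust
/// Hypothetical struct with a single field; repr(C) puts `first` at offset 0.
#[repr(C)]
struct Outer {
    first: u32,
}

/// Returns the address of the struct and the address of its first field.
fn addrs(o: &Outer) -> (usize, usize) {
    (o as *const Outer as usize, &o.first as *const u32 as usize)
}

fn main() {
    let o = Outer { first: 7 };
    let (outer_addr, field_addr) = addrs(&o);
    // Two values of different types, one address.
    println!("{}", outer_addr == field_addr);
    println!("{}", std::mem::size_of::<Outer>() == std::mem::size_of::<u32>());
}
```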

Yes, you are right. But that's easy to guarantee by capturing the original type, see the current RefId type, which is dependent on T (and thus also captures the lifetime if T is a reference).

Rc and owning smart pointers in general (which are used in that context AFAICT) unconditionally heap-allocate. Those addresses aren't moving around.

Can the pointer held by a regular & reference to a stack allocated local variable change during the lifetime of that reference? I don't think so. If it can, I'd like to see example code demonstrating it.

I don't think so, but the point is that with a smart pointer, you can move around ownership conceptually, while preserving the address. I.e. the following is guaranteed to pass the assertion:

let rc_before = Rc::new(42);
let ptr_before = &*rc_before as *const i32;
let rc_after = rc_before;
let ptr_after = &*rc_after as *const i32;
assert_eq!(ptr_before, ptr_after);

Meanwhile, the following isn't:

let val_before = 42;
let ptr_before = &val_before as *const i32;
let val_after = val_before;
let ptr_after = &val_after as *const i32;
assert_eq!(ptr_before, ptr_after);

Here, there is no reference existing anymore though, right? (Or does the temporary &val_before not get dropped until the end of the whole block?)

I think the reference created by &val_before is a temporary, so it only lives as long as the immediately enclosing expression. However, even if it did exist until the end of the scope, the code would compile, since the borrow isn't actually used after it's converted to a raw pointer.