Confusion with `Any` again

Today, I got confused by the following code:

use std::any::Any;
use std::sync::Arc;

fn foo(value: &Arc<dyn Any>) {
    if (&**value).is::<i32>() {
        println!("Passed value is an i32.");
    }
    if (&*value).is::<i32>() {
        println!("Works also.");
    }
    if value.is::<i32>() {
        println!("This too.");
    }
    if (&&&&value).is::<i32>() {
        println!("That too.");
    }
    if <dyn Any>::is::<i32>(&**value) {
        println!("Here it's more strict.");
    }
    if <dyn Any>::is::<i32>(&*value) {
        println!("This won't happen."); // This is the only line which doesn't get executed.
    }
}

fn main() {
    let x: Arc<dyn Any> = Arc::new(1i32);
    foo(&x);
}

(Playground)

Output:

Passed value is an i32.
Works also.
This too.
That too.
Here it's more strict.

My understanding is that the is method is only implemented for dyn Any and not for Any. So when I write value.is::<i32>(), then it cannot be executed on &Arc<dyn Any>, and deref-coercion is performed twice? I can rely on that, I guess, because the receiver of a method isn't a coercion site?

However, the Rust reference states:

For method calls, the receiver (self parameter) can only take advantage of unsized coercions.

And going from T to dyn U is an unsized coercion. I guess that's what happens when I write <dyn Any>::is::<i32> in regard to the function argument (e.g. &*value).

I'm still confused. What does that cited sentence in the reference mean? How and when is the receiver "unsized coerced"?

Method calls dereference repeatedly as much as needed; the compiler tries really hard to find a matching method. Thus:

  • case 1 explicitly dereferences (twice, the second of which is a Deref coercion) to yield dyn Any, explicitly references once to yield &dyn Any, and is() is called on that, which works, because <dyn Any>::is takes &self.
  • case 2 dereferences then references immediately, i.e. it does nothing, so you still have &Arc<dyn Any>. Then "try hard to find the method" mode kicks in, and two more dereferences finally find the method on dyn Any.
  • case 3 is exactly the same.
  • case 4 also dereferences as long as it can.
  • case 5 explicitly asks if &**value, which is a reference to the inner dyn Any, is an i32, which it is.
  • case 6 explicitly asks if &*value, which is an &Arc<dyn Any>, is an i32. There the pointee type is Arc<dyn Any>, which is Sized and 'static, so the reference can be coerced to &dyn Any directly. That is what happens, and of course Arc<dyn Any> is not the same type as i32, so the type check fails. You can see this in action if you change the concrete type to Arc<dyn Any>: Playground.
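To illustrate the "dereferences as long as it can" behaviour outside of Any, here is a minimal sketch. The helper name len_through_refs is made up for illustration; the only assumption is that method lookup peels off references and follows Arc's Deref impl until it reaches a type that has the method.

```rust
use std::sync::Arc;

// Hypothetical helper: stack extra references on top of an `Arc<String>`.
fn len_through_refs(s: &Arc<String>) -> usize {
    // The receiver has type `&&&&Arc<String>`. Method lookup strips the
    // references, then goes through `Arc`'s `Deref` impl, until it reaches
    // `String`, where `len(&self)` is found.
    (&&&s).len()
}

fn main() {
    let s = Arc::new(String::from("hello"));
    assert_eq!(len_through_refs(&s), 5);
    // The same call with the dereference written out by hand.
    assert_eq!(String::len(&*s), 5);
}
```

None of the intermediate types (references, Arc) has a len method of its own, so the lookup only succeeds once it arrives at String.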

Oh right, &* is a no-op here.

I would say it asks if **value is of type i32 (but passes **value as a reference, i.e. &**value).


What I don't understand is why &Arc<dyn Any> sometimes gets coerced to &dyn Any and sometimes not:

use std::any::Any;
use std::sync::Arc;

fn foo(value: &Arc<dyn Any>) {
    if value.is::<i32>() {
        println!("We got an i32...");
    }
    if <dyn Any>::is::<Arc<dyn Any>>(value) {
        println!("... and an Arc.");
    }
}

fn main() {
    let x: Arc<dyn Any> = Arc::new(1i32);
    foo(&x);
}

(Playground)

In the first case here, value doesn't get coerced to &dyn Any, but in the second case it does. Why, and where in the documentation could this be explained?

I don't think that's right. In the first case, method-call auto-dereferencing performs two *s (the second of which goes through Arc's Deref impl), and then it succeeds in finding the method on the exact type dyn Any.

In the second case, there is no such eager auto-dereferencing, because you are calling the method using UFCS, so value itself is coerced to &dyn Any immediately.

I had to search for UFCS, didn't know the term.

I think the point is that the coercion from &Arc<i32> to &dyn Any only happens in a function argument, but not in the receiver of a method call:

use std::any::Any;
use std::sync::Arc;

fn main() {
    //(&Arc::new(1i32)).is::<i32>(); // refuses to compile
    assert_eq!((&Arc::new(1i32) as &dyn Any).is::<i32>(), false); // compiles
    assert_eq!(<dyn Any>::is::<i32>(&Arc::new(1i32)), false); // also compiles
}

(Playground)

I got confused by this:

When exactly is the receiver coerced and how? What does the documentation mean here?

Another example:

struct S;
trait T {}
impl T for S {}

impl dyn T {
    fn foo(&self) {
        println!("foo");
    }
}

fn main() {
    let s = S;
    //s.foo(); // fails
    (&s as &dyn T).foo(); // works
}

(Playground)
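For comparison, a minimal sketch of the path-call form, assuming the same S and T as above. The helper name call_via_path is made up for illustration, and foo returns a string here only so that the result can be checked. In the path form the receiver seems to be treated like an ordinary argument, so &S can be unsized to &dyn T.

```rust
struct S;
trait T {}
impl T for S {}

impl dyn T {
    // Returns a marker string so that the call can be checked.
    fn foo(&self) -> &'static str {
        "foo"
    }
}

// Hypothetical helper: call `foo` through a path instead of method syntax.
fn call_via_path(s: &S) -> &'static str {
    // Argument position: `&S` is coerced to `&dyn T` (unsized coercion).
    <dyn T>::foo(s)
}

fn main() {
    let s = S;
    // s.foo(); // still rejected: no `foo` is found for the receiver `S`
    assert_eq!(call_via_path(&s), "foo");
    assert_eq!((&s as &dyn T).foo(), "foo");
}
```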

Sorry for my confusion.

This one doesn't surprise me.

But I agree your latest one doesn't seem to jibe with the reference.

I never understood that line either. Let's see if we can figure it out! To the blame-mobile :racing_car: :dash:

That... doesn't address the particular line in question, but seems particularly relevant to your latest post!

Later:

Comment about which unsizing coercions are actually performed is still relevant, but this is generally an improvement. I'm happy to have this merged with an issue created (either here or on the main repo) if we can't reach a decision now.

It never got filed but should be.


I didn't see any discussion or explanation about that line though.

...plays a bunch....

Fully qualified syntax doesn't actually disable auto-deref of function args as far as I can tell. That includes for the receiver of methods. It doesn't do auto-ref or array unsizing like method call resolution does. Maybe it just makes the receiver act like any other parameter?

Definitely you can do other coercions on receiver, like &mut T to &T, with or without fully qualified syntax.
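A minimal sketch of what I was playing with, using a made-up Counter type and helper names. It is only meant to show what I observed, not to state what is guaranteed: with a path call, the receiver accepts &mut T where &T is expected and also goes through a Deref coercion, but no reference is added automatically.

```rust
struct Counter(u32);

impl Counter {
    fn get(&self) -> u32 {
        self.0
    }
}

// Hypothetical helper: `&mut Counter` is accepted where `&Counter` is expected.
fn via_mut_ref(c: &mut Counter) -> u32 {
    Counter::get(c)
}

// Hypothetical helper: `&Box<Counter>` is deref-coerced to `&Counter`.
fn via_box(b: &Box<Counter>) -> u32 {
    Counter::get(b)
}

fn main() {
    let mut c = Counter(3);
    let boxed = Box::new(Counter(7));

    assert_eq!(via_mut_ref(&mut c), 3);
    assert_eq!(via_box(&boxed), 7);

    // No auto-ref in a path call, so this would not compile:
    // Counter::get(c);

    // Method-call syntax adds the reference itself.
    assert_eq!(c.get(), 3);
}
```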

Yeah, I still can't make any sense of that particular line. And I think exactly what fully-qualified syntax changes should be made more clear too. The current wording around "avoiding deref" is misleading IMO.

With the help of your comments, in particular "unsized coercion only happens if we run out of ways to deref", I understand the first part now, i.e. why value.is::<i32>() works the way I want, and I also found the corresponding section in the reference:

Method-call expressions

The first step is to build a list of candidate receiver types. Obtain these by repeatedly dereferencing the receiver expression's type, adding each type encountered to the list, then finally attempting an unsized coercion at the end, and adding the result type if that is successful. Then, for each candidate T, add &T and &mut T to the list immediately after T.

For instance, if the receiver has type Box<[i32;2]>, then the candidate types will be Box<[i32;2]>, &Box<[i32;2]>, &mut Box<[i32;2]>, [i32; 2] (by dereferencing), &[i32; 2], &mut [i32; 2], [i32] (by unsized coercion), &[i32], and finally &mut [i32].
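To convince myself of the candidate order, I tried a small sketch along these lines. The traits Which and SliceOnly are made up for illustration. If the list is searched in the order quoted above, the array impl should be found before the slice impl, and a method that only exists for slices should still be reachable through the final unsized coercion.

```rust
// Hypothetical trait implemented for both the array and the slice type.
trait Which {
    fn which(&self) -> &'static str;
}

impl Which for [i32; 2] {
    fn which(&self) -> &'static str {
        "array"
    }
}

impl Which for [i32] {
    fn which(&self) -> &'static str {
        "slice"
    }
}

// Hypothetical trait implemented for the slice type only.
trait SliceOnly {
    fn slice_only(&self) -> usize;
}

impl SliceOnly for [i32] {
    fn slice_only(&self) -> usize {
        self.len()
    }
}

fn main() {
    let boxed: Box<[i32; 2]> = Box::new([10, 20]);

    // `&[i32; 2]` comes before `&[i32]` in the candidate list.
    assert_eq!(boxed.which(), "array");

    // Only reachable through the unsized coercion at the end of the list.
    assert_eq!(boxed.slice_only(), 2);

    // Starting from a slice, only the slice impl can match.
    let slice: &[i32] = &*boxed;
    assert_eq!(slice.which(), "slice");
}
```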

I find the "Then, […] add […] to" a bit confusing, which might read easier as "Then, […] insert […] into", but it's unambiguous as it is, just made me stumble for a short moment.

So that relieves me, and I would say I can rely on the observed behavior in my project. :relieved:

However,

You are right that &Arc<dyn Any> coerces to &dyn Any, but there are two ways to perform that coercion:

Way 1

Unsized coercion:

Coercion types

  • TyCtor(T) to TyCtor(U), where TyCtor(T) is one of
    • &T
    • […]

and where U can be obtained from T by unsized coercion.

namely T to dyn U, i.e. &Arc<_> to &dyn Any in this case, which would make the is method call evaluate to true and print the line.

Way 2

Dereference:

Coercion types

  • &T or &mut T to &U if T implements Deref<Target = U>.

This also gives an &dyn Any, but it would result in a different TypeId, so the is method call would evaluate to false and the line would not be printed.

So there are two ways to perform the coercion. How can I know which one will be used? I didn't find any "coercion precedence" in the reference (yet), other than for method-call expressions.
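Here is a minimal sketch of the two ways side by side. The function names are made up for illustration, and the first assertion only records what the compiler appears to do today, not something the reference promises.

```rust
use std::any::{Any, TypeId};
use std::sync::Arc;

// Hypothetical helper: let the compiler choose the coercion at a coercion site.
fn id_at_coercion_site(value: &Arc<dyn Any>) -> TypeId {
    let erased: &dyn Any = value;
    erased.type_id()
}

// Hypothetical helper: force "Way 2" by dereferencing explicitly.
fn id_after_explicit_deref(value: &Arc<dyn Any>) -> TypeId {
    let inner: &dyn Any = &**value;
    inner.type_id()
}

fn main() {
    let x: Arc<dyn Any> = Arc::new(1i32);

    // Observed behaviour: the Arc itself is what gets type-erased (Way 1).
    assert_eq!(id_at_coercion_site(&x), TypeId::of::<Arc<dyn Any>>());

    // With the explicit dereference, the inner value is what we get.
    assert_eq!(id_after_explicit_deref(&x), TypeId::of::<i32>());
}
```

Both helpers produce a &dyn Any, so the type alone does not tell the two apart; only the TypeId does.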

Edit #1: And documentation issues aside, I wonder if there is a (good?) reason for a higher precedence of unsized coercion into dyn in the argument position while there is a higher precedence for Deref coercion in the receiver position?

Edit #2: Actually in the receiver position there is no coercion into dyn at all (it seems), but only array to slice unsizing coercions if I understand the second part of your post correctly. Anyway, I still think there are two ways to coerce in the argument position, which is what I tried to demonstrate here: Playground.


I still have to dig through the other part of your post (on "only array to slice unsizing coercions"), which is a different issue, I guess.

Hmm, good point! You've answered your own question though -- it seems unsized coercion is preferred over deref.

Well, you've deduced it for some cases anyway. Are there exceptions? How about the preference between any other potentially overlapping coercions? As far as I know, the order of preference isn't even documented, much less any exceptions to it. So we couldn't have deduced this from the documentation.

And the documentation isn't normative anyway. You can deduce from the documentation that method resolution can coerce a deeply nested Whatever: Any to dyn Any, but that deduction would apparently be false.

Such is life without a spec. Annoys me often. Results in hours of deducing-the-rules-by-trial. Wonder if I'd be better off becoming an expert in the implementation of rustc. No idea how stable any of these undocumented, under-spec'd behaviors are, practically or from a lang/trait team perspective.

Yes, that is my understanding.

As an aside, this came up in a post a while ago, and I created an issue on the reference, but it doesn't seem to have been looked at.

So, it seems the unsizing in method calls is just converting to slices (presumably to allow slice methods to be called on arrays).

Yes, I saw that:

Coercion types

Coercion is allowed between the following types:

  • […]

  • T_1 to T_3 where T_1 coerces to T_2 and T_2 coerces to T_3 (transitive case)

    Note that this is not fully supported yet.

  • […]

Even if the documentation isn't normative, I think that the current behavior should be documented.

I also think it would be good to think about which behavior is best. Not that I see any particular problem with the current rules (apart from being confusing perhaps?), but it might be wise to think about what's best before giving additional guarantees in the documentation.

Although changing the precedence would be a breaking change (of non-documented behavior), I still think it could be done in a new edition (if there's a good reason to do so, I'm just hypothetically speaking now).


I guess it would be reasonable to open an issue for the Rust Reference, stating that transitive coercion requires specifying precedence rules to avoid ambiguities?

I decided to open an issue for the Rust Reference (and hope I didn't misunderstand anything, but if I did, please correct me).

You could add this simpler, non-transitive example you noted before, or I might later.

(Edit: Done)
