What is the reason for using `&**` here?

Hi, Rustaceans! I have a question here.

In the indexing implementation of std's Vec (mod.rs - source), I found a usage of &**self.

My intuition is that this is to get a &[T], and the path might be:

&Vec --*--> Vec --*--> &[T] --&--> &[T]

If so, then why not simply use **? We could get a &[T] by:

&Vec --*--> Vec --*--> &[T]

The path is &Vec<T> --*--> Vec<T> --*--> [T] --&--> &[T]. Using just ** would give an expression of type [T], which results in a compilation error: a mismatched type, an attempt to move a value out from behind a shared reference, and/or an attempt to store a value of an unsized type.
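Here is a minimal sketch of that path. The helper name `as_slice` and the `i32` element type are made up for illustration and are not taken from the std source:

```rust
// `as_slice` is a hypothetical helper, named here only for illustration.
fn as_slice(v: &Vec<i32>) -> &[i32] {
    // *v   : Vec<i32>  (plain dereference of the shared reference)
    // **v  : [i32]     (goes through Vec's Deref impl; an unsized place)
    // &**v : &[i32]    (re-borrow of that place)
    &**v
}

fn main() {
    let v = vec![1, 2, 3];
    let s = as_slice(&v);
    assert_eq!(s.len(), 3);
    assert_eq!(s[0], 1);
    // Returning `**v` without the leading `&` is rejected by the compiler:
    // the signature promises `&[i32]`, but `**v` has the unsized type `[i32]`.
}
```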

I have to admit I don't fully get this. The signature of Deref is fn deref(&self) -> &Self::Target. But the actual deref operation is self -> Self::Target.

The “*foo” syntax, when implemented by Deref::deref, desugars to *Deref::deref(&foo).

Ahh that's the trick then.

Edit: I don't know, perhaps it would be useful to say this somewhere in the Deref docs.

It does!

  • In immutable contexts, *x (where T is neither a reference nor a raw pointer) is equivalent to *Deref::deref(&x).
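A small sketch of that equivalence on a user-defined type. `Meters`, `via_star` and `via_desugared` are hypothetical names introduced only for this example:

```rust
use std::ops::Deref;

// `Meters` is a hypothetical newtype, used only for illustration.
struct Meters(u32);

impl Deref for Meters {
    type Target = u32;

    fn deref(&self) -> &u32 {
        &self.0
    }
}

// Reads the inner value with the `*` operator.
fn via_star(m: Meters) -> u32 {
    *m
}

// Reads the same value with the desugared form written out by hand.
fn via_desugared(m: Meters) -> u32 {
    *Deref::deref(&m)
}

fn main() {
    assert_eq!(via_star(Meters(7)), via_desugared(Meters(7)));
}
```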

Sorry, I haven't looked at the docs for a while. Thanks for the answer!

But the documentation (Vec in std::vec - Rust) states that a deref operation on a Vec<T> should return a &[T] reference, not a [T].

@imic sorry if I've confused things. Once you've dereferenced once to get a Vec, then the next * actually gets expanded to *deref(&x). Then the final & cancels out the * from the expansion to give you &[_] in this case.

Well…

If Foo implements Deref, and you have x: Foo, then *x is of type Foo::Target.

The type Target in the implementation for Vec<T>: Deref is [T], not &[T].

The return type of the Deref::deref method is &[T], but that fits the picture of *vec desugaring to *Deref::deref(&vec). Here

  • vec: Vec<T>
  • &vec: &Vec<T>
  • Deref::deref: fn(&Vec<T>) -> &[T]
  • so, Deref::deref(&vec): &[T]
  • and finally *Deref::deref(&vec): [T]
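The steps in this list can be written out with explicit type annotations. This is only a sketch; `sum_steps` is a hypothetical helper and `i32` stands in for `T`:

```rust
use std::ops::Deref;

// Hypothetical helper: sums a vector while spelling out each step's type.
fn sum_steps(vec: Vec<i32>) -> i32 {
    let r: &Vec<i32> = &vec; // &vec: &Vec<T>
    let s: &[i32] = Deref::deref(r); // Deref::deref(&vec): &[T]
    // `*s` has type `[i32]`, which is unsized, so it cannot be bound to a
    // local with `let`; it can only be used in place or re-borrowed.
    let again: &[i32] = &*s;
    again.iter().sum()
}

fn main() {
    assert_eq!(sum_steps(vec![1, 2, 3]), 6);
}
```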

By the way, in light of this desugaring, to achieve the equivalent of Deref::deref(some_expr), but using the “*” operator, you’ll need to write &**some_expr.

Of the two *’s, the right one operates on some_expr, which has some primitive shared reference type &…, so only the left * is desugared.

&**some_expr hence desugars to &*Deref::deref(&*some_expr), and the operation &* on a shared reference is a no-op in each case, which means this is essentially Deref::deref(some_expr), as expected.
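One way to check this claim is to compare the two results as pointers. The following is a sketch with a hypothetical helper `same_slice`, using `Vec<u8>` as the example type:

```rust
use std::ops::Deref;

// Hypothetical helper: checks that both spellings yield the same slice.
fn same_slice(some_expr: &Vec<u8>) -> bool {
    let via_star: &[u8] = &**some_expr;
    let via_call: &[u8] = Deref::deref(some_expr);
    // Same start address and same length: both refer to the same slice.
    std::ptr::eq(via_star, via_call)
}

fn main() {
    let v = vec![1u8, 2, 3];
    assert!(same_slice(&v));
}
```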


This also gives an easy intuitive meaning to the &**self in the original question: It’s a way to achieve the same as a call to the deref function.

By the way, if you find this syntax weird (and also rarely see it), that’s because a lot of dereferencing in Rust can happen automatically. For the code in question, it’s mainly done to avoid accidentally defining a recursive function; method resolution or implicit coercions of references (those two things are the possibilities for automatic dereferencing in Rust) wouldn’t help here.

However, the code can be written arguably more cleanly, by specifying the implementor of the index method, the type [T]. The function

impl<T, I: SliceIndex<[T]>, A: Allocator> Index<I> for Vec<T, A> {
    type Output = I::Output;

    #[inline]
    fn index(&self, index: I) -> &Self::Output {
        Index::index(&**self, index)
    }
}

could thus be re-written more explicitly as

impl<T, I: SliceIndex<[T]>, A: Allocator> Index<I> for Vec<T, A> {
    type Output = I::Output;

    #[inline]
    fn index(&self, index: I) -> &Self::Output {
        <[T] as Index<I>>::index(&**self, index)
    }
}

or, slightly less explicitly but still specifying the required self-type of the ::index method, as

impl<T, I: SliceIndex<[T]>, A: Allocator> Index<I> for Vec<T, A> {
    type Output = I::Output;

    #[inline]
    fn index(&self, index: I) -> &Self::Output {
        <[T]>::index(&**self, index)
    }
}

Now that the first argument’s type for <[T]>::index is clearly indicated, we can use automatic dereferencing and simply write

impl<T, I: SliceIndex<[T]>, A: Allocator> Index<I> for Vec<T, A> {
    type Output = I::Output;

    #[inline]
    fn index(&self, index: I) -> &Self::Output {
        <[T]>::index(self, index)
    }
}
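The std impl above uses the unstable Allocator parameter, so it can't be copied into ordinary code as-is. As a rough stable-Rust sketch of the same idea, here is a hypothetical wrapper type `MyVec` (not the real Vec) that derefs to [T] and forwards indexing the same way:

```rust
use std::ops::{Deref, Index};

// `MyVec` is a hypothetical wrapper, not the real `Vec`.
struct MyVec<T>(Vec<T>);

impl<T> Deref for MyVec<T> {
    type Target = [T];

    fn deref(&self) -> &[T] {
        &self.0
    }
}

impl<T, I: std::slice::SliceIndex<[T]>> Index<I> for MyVec<T> {
    type Output = I::Output;

    fn index(&self, index: I) -> &Self::Output {
        // Naming `[T]` as the implementor rules out accidental recursion
        // and lets `self: &MyVec<T>` coerce to `&[T]` automatically.
        <[T]>::index(self, index)
    }
}

fn main() {
    let v = MyVec(vec![10, 20, 30]);
    assert_eq!(v[0], 10);
    assert_eq!(v[1..].len(), 2);
}
```

Writing plain `Index::index(self, index)` in that body would instead pick the impl for `MyVec<T>` itself and recurse, which is the pitfall mentioned earlier.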

This is undeniably SOME explanation! Quite thorough, quite neat. By the way, those thoughts/deductions/inspirations on Deref are so impressive that, honestly, I would have expected to find this kind of explanation in a valuable book! Thanks!!! @steffahn

At first glance, I thought the operator * in Rust would work similarly to other languages, in which all operators are syntactic sugar and will eventually be replaced by their overloaded implementations. Now I think that's wrong, and "operator resolution" seems more complex than in other languages.

Yes, it’s noteworthy that *’s desugaring contains a usage of * itself, so you clearly cannot desugar everything. And indeed, it will only be desugared if it’s not applied to a type where the implementation is compiler-provided: Those types where it’s magically compiler-implemented are &T, &mut T and Box<T>.[1] The first two need to be compiler-implemented because * is called on those types in the desugarings, and Box<T> allows moving out of *x expressions.
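A short sketch of the Box<T> special case mentioned here; `unbox` is a hypothetical helper name:

```rust
// Hypothetical helper: moves the `String` out of its box via `*`.
fn unbox(b: Box<String>) -> String {
    // Allowed because `*` on `Box<T>` is built into the compiler. With a
    // purely `Deref`-based `*`, this would be a move out of a borrow and
    // would not compile.
    *b
}

fn main() {
    let s = unbox(Box::new(String::from("hi")));
    assert_eq!(s, "hi");
}
```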

It’s also noteworthy that the desugaring depends on whether the access is mutable or immutable, which is a somewhat nontrivial process to determine in general, but intuitively makes sense in most straightforward cases. For more details on the full picture, read about place expressions in the reference.

Also, a second syntax (again backed by a pair of traits) that behaves very similarly is indexing, where x[n] desugars to *Index::index(&x, n) on immutable access (and to *IndexMut::index_mut(&mut x, n) on mutable access).
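To make the indexing desugaring concrete, here is a sketch where immutable access goes through Index and mutable access goes through IndexMut. `load` and `store` are hypothetical helpers:

```rust
use std::ops::{Index, IndexMut};

// Hypothetical helper: reads element `n` with the spelled-out form.
fn load(v: &Vec<i32>, n: usize) -> i32 {
    // Same as `v[n]` in an immutable context.
    *Index::index(v, n)
}

// Hypothetical helper: writes element `n` with the spelled-out form.
fn store(v: &mut Vec<i32>, n: usize, value: i32) {
    // Same as `v[n] = value`, which is a mutable access.
    *IndexMut::index_mut(v, n) = value;
}

fn main() {
    let mut v = vec![1, 2, 3];
    store(&mut v, 1, 9);
    assert_eq!(load(&v, 1), v[1]);
}
```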

In fact, most operators in Rust are compiler-implemented for a set of “primitive” types, but in most cases this has little or no effect on the language semantics; only for * must the desugaring really happen conditionally, otherwise we’d recurse indefinitely.

In case you’re interested in more details on how Box<T> is magical/special w.r.t. the * operator, see this thread: Dereferencing a Boxed value


  1. Those types still do implement Deref, and the desugaring might be used after all if they are used indirectly, in a generic function. E.g. a fn foo<T: Deref> could use *x in the function body on a value x: T, and this operation will use the Deref desugaring, even if foo happens to be called with something like T = &u8. ↩︎
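A sketch of the generic case from this footnote. `read_byte` is a hypothetical function, with the Target fixed to u8 to keep the example small:

```rust
use std::ops::Deref;

// Hypothetical generic helper: the body only knows `T: Deref`, so `*x`
// here goes through `Deref::deref`, whatever `T` turns out to be.
fn read_byte<T: Deref<Target = u8>>(x: T) -> u8 {
    *x
}

fn main() {
    // Called with `T = &u8`, a type whose `*` is otherwise built in.
    assert_eq!(read_byte(&7u8), 7);
    // Called with `T = Box<u8>`.
    assert_eq!(read_byte(Box::new(9u8)), 9);
}
```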

Well, it is kind of implied by the name of the operation. de-referencing turns a pointer into its pointed value. If de-referencing resulted in a reference, then it would make plain references unusable (since there would be no way to get to the referred value if all you could ever perform were the identity operation &T -> &T).

As for this, I agree, and I believe that the implementation of the indexing operator on Vec is actually handled by the compiler (tracing the call chain of Vec's indexing was my initial goal).

That's because I found that the very last effective expression in Vec's indexing call chain is T[n], as per index.rs - source.

It's only compiler-implemented for slices, not for Vec.

There are some niche effects with regard to borrow checking that this compiler-provided implementation for indexing allows. (E.g., given slice: &mut [T], it allows you to create &mut slice[i].field1 and &mut slice[j].field2 and use them at the same time.)
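A sketch of that borrow-checking effect. `Pair` and `bump` are hypothetical names, and the example relies on the built-in slice indexing described above:

```rust
// `Pair` is a hypothetical struct used only for illustration.
struct Pair {
    left: i32,
    right: i32,
}

// Hypothetical helper: holds two mutable borrows into the slice at once.
fn bump(slice: &mut [Pair], i: usize, j: usize) {
    let a = &mut slice[i].left;
    let b = &mut slice[j].right;
    // Both borrows are live here. The fields differ, so the two borrows
    // never alias, even when `i == j`.
    *a += 1;
    *b += 10;
}

fn main() {
    let mut ps = [Pair { left: 0, right: 0 }, Pair { left: 0, right: 0 }];
    bump(&mut ps, 0, 1);
    assert_eq!(ps[0].left, 1);
    assert_eq!(ps[1].right, 10);
}
```

With a `&mut Vec<Pair>` parameter instead of a slice, the same body should be rejected, since each index expression would then borrow the whole Vec through IndexMut::index_mut.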

I've found this thread really interesting and helpful for me.

One more thing: if you have a fat pointer type like x: &[T] then &*x recovers the extra reference information in addition to the pointer, i.e. the slice's length. So even though *x 'forgets' the length, the compiler actually doesn't and re-adds it when you re-reference.

Is this a correct description?

Yes. I believe that currently, any unsized type in a place expression must have come from a dereference (of a wide reference/pointer), so the information is always available.

(Probably the compiler is smart enough to not actually do anything to the value itself, but you can't count on this, e.g. *const [u8] and &[u8] are allowed to have different layouts.)
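A tiny sketch of the observation being discussed, with a hypothetical helper `len_after_reborrow`:

```rust
// Hypothetical helper: re-borrows through `*` and reports the length.
fn len_after_reborrow(x: &[u8]) -> usize {
    // `*x` is a place of type `[u8]`; `&*x` is again a wide `&[u8]`
    // that carries the same length as `x`.
    let y: &[u8] = &*x;
    y.len()
}

fn main() {
    let data = [1u8, 2, 3];
    assert_eq!(len_after_reborrow(&data), data.len());
}
```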

I'm not sure what you are trying to say with this. [T] and dynamically-sized types still have a length (unknown at compile time, but known at runtime). I don't think it is meaningful to say that [T] "forgets" its length. The fact that it's stored as a fat pointer is merely an implementation detail (it's convenient for various reasons); conceptually, it would make much less sense to view &[T] as magic that somehow "guesses" the "non-existent" or "forgotten" length of the pointed slice.