Method lookup: Deref or Borrow?

I'm working on a Rust presentation for my colleagues and there's a part I'm unable to explain. In Rust, it's possible to call "related" methods on objects, like

  • All &[T] methods when working on a Vec<T>
  • All &str methods when working on a String
  • ... Path .. PathBuf, etc.

I want to understand why this is possible so I'm able to explain it. I found this explanation here

When looking up a method call, the receiver may be automatically dereferenced or borrowed in order to call a method. [...] Then, for each candidate T, add &T and &mut T to the list immediately after T.

This makes sense to me, at least the "borrowed" and "&T" parts. chunks is not a method of Vec, but &vec returns a slice which has a chunks method so we can call it on a Vec. IIUC, there's something automatic going on when we use the . to call a method.

I read about std::borrow::Borrow and reference but it doesn't seem to provide an answer. In fact, everything I read seems to tell me that the answer is std::ops::Deref

impl<T, A: Allocator> ops::Deref for Vec<T, A> {
    type Target = [T];

But why would it be? I'm not dereferencing anything here, I'm borrowing (&) it! In C++, dereferencing (aka *) is used to get the value behind a pointer. Does "dereferencing" mean something different in Rust?

If v has type Vec<T>, then the expression &*v has type &[T]. Thus, you get the slice methods by first dereferencing (which means to apply the Deref trait since v is not a reference), and then borrowing the output.
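As a minimal sketch of that statement (the function names are made up for illustration), the explicit `&*v` form and the plain method call behave the same, because `first` is defined on slices and not on `Vec`:

```rust
// Explicit form: dereference the Vec (through `Deref`), then borrow the result.
fn first_explicit(v: Vec<i32>) -> Option<i32> {
    let slice: &[i32] = &*v;
    slice.first().copied()
}

// Sugared form: method lookup inserts the same `&*` automatically.
fn first_sugar(v: Vec<i32>) -> Option<i32> {
    v.first().copied()
}

fn main() {
    println!("{:?}", first_explicit(vec![1, 2, 3]));
    println!("{:?}", first_sugar(vec![1, 2, 3]));
}
```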


std::borrow::Borrow is a completely ordinary trait with no magic whatsoever. Borrowing in Rust via &… expression is a primitive operation that has nothing to do with that trait. (One main use-case of this trait is in the API of HashMap, so you can lookup e.g. a String key type using a &str key.)
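To illustrate the HashMap use-case mentioned above, here is a small sketch (the map contents and the function name are invented for the example). `HashMap::get` accepts a `&Q` for any `Q` where the key type implements `Borrow<Q>`, so a `&str` can be used to look up a `String` key:

```rust
use std::collections::HashMap;

// Looks up one present and one absent key using plain `&str` values.
fn demo_lookup() -> (Option<u32>, Option<u32>) {
    let mut ages: HashMap<String, u32> = HashMap::new();
    ages.insert(String::from("alice"), 30);

    // No `String` is allocated for the lookup: `String: Borrow<str>`, so `Q = str`.
    let hit = ages.get("alice").copied();
    let miss = ages.get("bob").copied();
    (hit, miss)
}

fn main() {
    println!("{:?}", demo_lookup());
}
```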


The fact that &vec “returns a slice” is an inaccurate observation: what &vec actually produces is a &Vec<T>, which can be implicitly coerced into a slice, i.e. into &[T]. This so-called “deref coercion” is also based on the Deref trait. Method resolution itself is unrelated to this coercion and is defined directly in terms of the Deref trait, as explained in the reference page you linked. More precisely, that page defines it in terms of “dereferencing”; arguably it could be more explicit and mention the Deref trait directly instead of only linking to the page about the dereferencing operator.
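A short sketch of the difference, with made-up function names: `&v` on its own is a `&Vec<i32>`, and it only becomes a `&[i32]` where a coercion site (a type annotation or a function parameter) asks for one.

```rust
// Accepts a slice; passing `&Vec<i32>` works through deref coercion.
fn total(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

fn demo_coercion() -> (usize, i32) {
    let v = vec![1, 2, 3];

    let as_vec_ref: &Vec<i32> = &v; // no coercion: this really is a &Vec<i32>
    let as_slice: &[i32] = &v;      // deref coercion, triggered by the annotation

    // `capacity` exists only on Vec, so it is available on `as_vec_ref` only.
    let _ = as_vec_ref.capacity();

    // `total(&v)` coerces &Vec<i32> to &[i32] at the argument position.
    (as_slice.len(), total(&v))
}

fn main() {
    println!("{:?}", demo_coercion());
}
```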

The example on that page is

For instance, if the receiver has type Box<[i32;2]>, then the candidate types will be Box<[i32;2]>, &Box<[i32;2]>, &mut Box<[i32;2]>, [i32; 2] (by dereferencing), &[i32; 2], &mut [i32; 2], [i32] (by unsized coercion), &[i32], and finally &mut [i32].

The step marked “by dereferencing” works due to the fact that Box<[i32; 2]>: Deref<Target = [i32; 2]>.
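As a sketch of that candidate list in action (the function name is hypothetical), slice methods such as `len` and `first` can be called directly on a `Box<[i32; 2]>`; the lookup goes through the dereferencing step and then the unsized coercion before it finds them at `&[i32]`:

```rust
// Both calls resolve to methods of `[i32]`, reached via
// Box<[i32; 2]> -> [i32; 2] (dereferencing) -> [i32] (unsized coercion).
fn boxed_len_and_first(b: Box<[i32; 2]>) -> (usize, Option<i32>) {
    (b.len(), b.first().copied())
}

fn main() {
    println!("{:?}", boxed_len_and_first(Box::new([7, 8])));
}
```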


In case you don’t want to present the full method resolution algorithm in detail (which would be rather niche knowledge), the main takeaways (but likely still too long of a list) are that method calls can

  • add indirection, so you can call Foo’s &self or &mut self methods on a value x: Foo like x.method() instead of (&x).method() or (&mut x).method()
  • remove indirection, so
    • on Copy types you can e.g. call i32’s self-methods on a &i32 value
    • (on non-Copy types, trying to call a self method on a reference value would find the method, too, but then fail to obtain an owned value behind a reference)
    • since Box is an owning indirection, a self-method of Foo can be called on x: Box<Foo> though
  • as a combination of the above, transform the form of indirection, so you can call Foo’s &self method on a x: &mut Foo value, or call &self methods on a Rc<Foo> or Box<Foo>; or &Rc<Foo>, or &Box<Foo>, etc…
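The cases in this list can be sketched with a hypothetical type `Foo` that has one method per receiver kind (all names below are made up for the example):

```rust
use std::rc::Rc;

// Hypothetical type with a `&self`, a `&mut self` and a `self` method.
struct Foo {
    n: i32,
}

impl Foo {
    fn get(&self) -> i32 {
        self.n
    }
    fn bump(&mut self) {
        self.n += 1;
    }
    fn consume(self) -> i32 {
        self.n
    }
}

fn demo_indirection() -> (i32, i32, i32, i32, i32) {
    // Adding indirection: same as (&mut x).bump() and (&x).get().
    let mut x = Foo { n: 1 };
    x.bump();
    let a = x.get();

    // Removing indirection on a Copy type: i32::abs takes `self`.
    let n: i32 = -5;
    let r: &i32 = &n;
    let b = r.abs();

    // Transforming indirection: a `&self` method on a `&mut Foo`...
    let m: &mut Foo = &mut x;
    let c = m.get();

    // ...and on an Rc<Foo>.
    let shared = Rc::new(Foo { n: 10 });
    let d = shared.get();

    // Box is an owning indirection, so a `self` method can move out of it.
    let boxed = Box::new(Foo { n: 20 });
    let e = boxed.consume();

    (a, b, c, d, e)
}

fn main() {
    println!("{:?}", demo_indirection());
}
```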

In this context,

  • removing indirection, aka “dereferencing” can be customized using a trait (Deref), which means that
    • custom pointer or smart pointer types can be supported like Rc or Box are in the above discussion (except for Box’s superpower of allowing owned access)
    • the Vec type and String-like types (i.e. including PathBuf/OsString), which are not necessarily considered “pointers”, use the Deref trait to allow methods of their corresponding slice types on their growing, owned version. So a &self method of str can be called on types like String or &String or &mut String, too.
  • (the support for adding indirection cannot be customized, it can only add a & or &mut borrow)
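A minimal sketch of customizing dereferencing, assuming a made-up wrapper type `Handle<T>`: once it implements Deref, method lookup walks from `Handle<String>` to `String` and on to `str`, so a `str` method can be called on the wrapper directly.

```rust
use std::ops::Deref;

// Hypothetical smart-pointer-like wrapper, for illustration only.
struct Handle<T> {
    value: T,
}

impl<T> Deref for Handle<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.value
    }
}

// `to_uppercase` is a `&self` method of `str`. It is found after two
// dereferencing steps: Handle<String> -> String -> str.
fn upper_through_handle(text: &str) -> String {
    let h = Handle { value: String::from(text) };
    h.to_uppercase()
}

fn main() {
    println!("{}", upper_through_handle("hello"));
}
```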

The overall effect (and intention) is that you’ll (almost) never have to write something like (&foo).bar(), (&mut foo).bar(), (*foo).bar(), or (&*foo).bar()/(&mut *foo).bar() or even (&****foo).bar() (i.e. multiple dereferencing steps) manually. In comparison to other languages such as C++, this mechanism helps remove some additional friction that would otherwise come up from the fact that Rust has a lot of different pointer types, and it makes Rust’s foo.bar() method call syntax cover the use cases of both foo.bar() and foo->bar() in C++.


So, “dereferencing” refers to whatever the dereferencing operator “*x” can do, and that is determined by the Deref trait. I don’t know about the conventional use of the word “dereferencing” in C++, but C++ also allows arbitrary overloading of the * operator, so it’s somewhat comparable.

I’m not sure what your context here is / what you’re referring to. A particular piece of code?


Whoah, thank you for the answers. It's very educational!

I think my confusion comes from the word "dereferencing". If you take this example

let v = vec![1, 2, 3, 4];
for pair in v.chunks(2) { ... }

v is not a reference or a pointer, so there's nothing to dereference, no indirection to remove. (This is where I'm wrong because of my strict definition of "dereference")

However, you're telling me that the compiler tries all kinds of things for the method lookup, one of them being "remove indirection" which is calling Deref::deref. This seems about right:

let v = vec![1, 2, 3, 4];
for pair in Deref::deref(&v).chunks(2) {}

Calling deref seems to share the same behavior. With Deref::deref(&v) I can only call the slice methods, not the Vec ones. That makes sense.

As for your question, I thought I was borrowing it because that's what I usually use (& and as_slice) to get a slice.

let test = &v;       // Actual reference to Vec
let test: &[_] = &v; // Slice

But chunks() does borrow its referent (the slice itself in this case).

But the built-in language feature called "borrowing", which yields a reference to the value being borrowed, is not the same thing as the Borrow trait or the Borrow::borrow() method. The latter is a sort of generalization of the former – it's a cheap conversion between references. The &value expression only ever yields &T when the type of the value is T; in contrast, the whole point of Borrow is to allow an additional conversion from &T to some other reference type &U.


The key thing is that the compiler is always willing to try taking a reference to the variable, i.e. taking &v, and then dereferencing. The relevant steps for

let v = vec![1, 2, 3, 4];
for pair in v.chunks(2) { ... }

are

  1. v is of type Vec<T>, which does not itself have a chunks method.
  2. Try taking &v, which is of type &Vec<T>. If there was a chunks() method on Vec taking &self, it would be found now.
  3. Try dereferencing: go from &v to &*v = Deref::deref(&v). That's of type &[T], which does have a chunks() method, and we're done.
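One way to observe the order of these steps is a small experiment with a made-up trait `Describe`, implemented for both `Vec<i32>` and `[i32]`. This is only a sketch: because the `Vec` implementation is found at step 2, the lookup never reaches the slice one unless the dereference is written out.

```rust
// Hypothetical trait used only to see which candidate type wins.
trait Describe {
    fn describe(&self) -> &'static str;
}

impl Describe for Vec<i32> {
    fn describe(&self) -> &'static str {
        "Vec"
    }
}

impl Describe for [i32] {
    fn describe(&self) -> &'static str {
        "slice"
    }
}

// Found at `&Vec<i32>` (step 2), before any dereferencing happens.
fn on_vec() -> &'static str {
    let v = vec![1, 2, 3, 4];
    v.describe()
}

// Starting from `&[i32]`, only the slice implementation matches.
fn on_slice() -> &'static str {
    let v = vec![1, 2, 3, 4];
    (&*v).describe()
}

fn main() {
    println!("{} {}", on_vec(), on_slice());
}
```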

FYI, to talk of actual implementation, and the number of indirections to the data, dereferencing a Vec does actually remove a layer of indirection. The Deref::deref method converts &Vec<T> to &[T]. The former is a reference to a vector of T, and a vector itself consists of an owning pointer to T, as well as length and capacity information. The number of indirections until you reach the Ts is 2 in &Vec<T> and 1 in &[T].

Yet, when reasoning about dereferencing, it’s actually common to talk about the target types, not necessarily the references. I.e. what I mean is, you can think of “dereferencing” Vec<T> to [T] instead of the actual &Vec<T> to &[T] conversion. This approach then allows generalizing over dereferencing for shared-reference access, mutable-reference access, and owned access (the last one only works with Box though, and with Copy types). The first two are then covered by the Deref and DerefMut traits. “Dereferencing Vec<T> to [T]” conceptually thus includes both

  • the conversion &Vec<T> to &[T], and
  • the conversion &mut Vec<T> to &mut [T]
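Both conversions can be written out by hand. In this sketch (function names are made up), `contains` needs only shared access and goes through `Deref::deref`, while `sort` needs mutable access and goes through `DerefMut::deref_mut`:

```rust
use std::ops::{Deref, DerefMut};

// Shared access: &Vec<i32> -> &[i32] via Deref.
fn contains_explicit(v: Vec<i32>, x: i32) -> bool {
    <[i32]>::contains(Deref::deref(&v), &x)
}

// Mutable access: &mut Vec<i32> -> &mut [i32] via DerefMut.
fn sorted_explicit(mut v: Vec<i32>) -> Vec<i32> {
    <[i32]>::sort(DerefMut::deref_mut(&mut v));
    v
}

// The same operation with the usual method call syntax.
fn sorted_sugar(mut v: Vec<i32>) -> Vec<i32> {
    v.sort();
    v
}

fn main() {
    println!("{}", contains_explicit(vec![3, 1, 2], 1));
    println!("{:?}", sorted_explicit(vec![3, 1, 2]));
    println!("{:?}", sorted_sugar(vec![3, 1, 2]));
}
```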

Note that method resolution works on that level, too. It will first reason about such a dereferencing step as going from Vec<T> to [T], and only at a later stage, the actual implementation in terms of either deref or deref_mut, or primitive / compiler-implemented implementations of dereferencing (for reference types and/or for Box) are chosen.


So the concrete example v.chunks(2) with v: Vec<i32> works as follows.

First prepare the list of types to consider. This list is

  1a. Vec<i32>
  1b. &Vec<i32>
  1c. &mut Vec<i32>
  2a. [i32]
  2b. &[i32]
  2c. &mut [i32]

where each step na to (n+1)a is done via dereferencing, and na to nb or nc work by adding & or &mut operators. In terms of types, we arrive at 2a because Vec<i32>: Deref<Target = [i32]> is implemented.

Now we find the first one of these types that has a chunks method. The chunks method we’re after is part of an impl<T> [T] block and is a &self method. So self is of type &[T], which matches 2b in the list above with T == i32. We’ll skip the part where no other (in-scope) chunks method exists for any (earlier) type in the list. To get to 2b, the step is thus:

  • one dereferencing from v: Vec<i32> to *v: [i32], and
  • one referencing/borrowing from *v: [i32] to &*v: &[i32]

Finally, we can talk about how to desugar the expression &*v that we arrived at.

The first key insight to understand what’s going on without getting confused is that *v isn’t a value that we could use directly and e.g. assign to a variable. Instead it’s only a “place”, or put differently, the expression “*v” is a “place expression”; value vs. place is a distinction somewhat similar to the concept of “rvalue” vs. “lvalue” in C++.

Now that that’s out of the way, we’ll start actually considering the desugaring of *. Unless we’ve managed to do nothing with it at all, a place such as the one produced by *v is accessed somehow. In this case, the access is by borrowing it immutably via the & operator. More complex accesses could however also involve creating some more places before we access the result. Something like &(*foo).bar to access a field or &mut (*foo)[42] doing an index operation are just 2 possibilities, and trying to explain every case can get technical. Anyway, this is clearly immutable access, so we desugar *v using the rule for immutable access, aka a so-called “immutable place expression context”. This way, *v becomes *Deref::deref(&v).
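The two "more complex access" shapes mentioned here can be sketched as follows, with a made-up `Point` struct standing in for whatever `foo` points to. The first access only reads, so it is an immutable context; the second one takes `&mut`, so the dereference of the Vec goes through DerefMut instead.

```rust
// Hypothetical struct, only here to have a field to borrow.
struct Point {
    x: i32,
    y: i32,
}

fn demo_places() -> (i32, i32, Vec<i32>) {
    let foo = Box::new(Point { x: 1, y: 2 });
    // `*foo` is a place; we borrow one of its fields immutably.
    let field: &i32 = &(*foo).y;

    let mut v = vec![0; 3];
    // `*v` is a place of type [i32]; indexing it and taking `&mut`
    // makes this a mutable place expression context.
    let slot: &mut i32 = &mut (*v)[1];
    *slot = 42;

    (foo.x, *field, v)
}

fn main() {
    println!("{:?}", demo_places());
}
```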

Put together now, &*v thus becomes &*Deref::deref(&v). Note how the desugaring of * itself contained yet another usage of *, but this usage of * was for dereferencing &[i32] to [i32]. Dereferencing of &T or &mut T (or Box<T>) will not be further desugared, because otherwise we’d never stop desugaring; instead, this operation is supported directly by the compiler as a primitive operation.

Finally, it’s our last job to interpret the meaning of any remaining primitive dereferencing expressions. In this case, that’s fortunately really easy, since we do &*… on a value of type &T. This operation is essentially a no-op, so we might as well just remove it entirely. Thus we arrive at Deref::deref(&v) for &*v, and thus v.chunks(2) is desugared to <[i32]>::chunks(Deref::deref(&v), 2).
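The end result of the desugaring can be checked against the sugared call. In this sketch (function names are made up), the chunk size 2 is passed along as the second argument of the fully qualified call:

```rust
use std::ops::Deref;

// What one normally writes.
fn chunks_sugar(v: Vec<i32>) -> Vec<Vec<i32>> {
    v.chunks(2).map(|c| c.to_vec()).collect()
}

// The fully desugared form of the same call.
fn chunks_desugared(v: Vec<i32>) -> Vec<Vec<i32>> {
    <[i32]>::chunks(Deref::deref(&v), 2)
        .map(|c| c.to_vec())
        .collect()
}

fn main() {
    println!("{:?}", chunks_sugar(vec![1, 2, 3, 4]));
    println!("{:?}", chunks_desugared(vec![1, 2, 3, 4]));
}
```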


With all this reviewed in detail, I hope you can appreciate what I meant by my discussion at the beginning of this post about how such a dereferencing operation can both be thought of in terms of converting Vec<T> to [T] or &Vec<T> to &[T], depending on how you look at it, since both interpretations played a role in the desugaring process.

Looking at the overall result, Deref::deref(&v) starts with Vec<T> (though not as a value but as a place, so the Vec is not moved ^^), borrows it (which adds a level of indirection), and dereferences it (which removes one level of indirection), so the overall effect is that the resulting &[T] has the same number of levels of indirection to the underlying T as the Vec<T> did. Thinking about having a v: Vec<i32> on the stack, and producing the temporary value &*v: &[i32] to put onto the stack too, that operation will in fact (after inlining) not involve any dereferencing/loading through addresses in the assembly at all, but simply copy a pointer and a length, apart from the loads and stores needed to access the stack. If these values are in registers, however (a Vec fits into 3 registers, and a &[T] into 2 registers), then it really is just some copying after all.
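A rough way to see the "copy a pointer and a length" claim from safe code is to compare the data pointer and length of the slice with those of the Vec. This is a sketch with a made-up function name; the sizes printed in main are whatever the current target uses, since the exact layout of Vec is not something the example relies on.

```rust
use std::mem::size_of;

// The slice produced by `&*v` points at the same buffer as the Vec
// and carries the same length.
fn shares_buffer(v: Vec<i32>) -> bool {
    let s: &[i32] = &*v;
    s.as_ptr() == v.as_ptr() && s.len() == v.len()
}

fn main() {
    println!("{}", shares_buffer(vec![1, 2, 3, 4]));
    // Sizes are printed rather than asserted, as they depend on the target.
    println!(
        "Vec<i32>: {} bytes, &[i32]: {} bytes",
        size_of::<Vec<i32>>(),
        size_of::<&[i32]>()
    );
}
```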

Also note that while for Vec or String, the Deref::deref operation does remove a level of indirection, this doesn’t have to be the case for Deref implementations in general. The only (partial) example of this in the standard library is for Cow<'_, T>, where dereferencing a Cow<'_, i32> for example will, in case the Cow is of the Cow::Owned variant, convert &Cow<'_, i32> to &i32 in the deref implementation which is just going to be adding an offset (assuming the enum tag comes before the payload) to the pointer that represents the &Cow<'_, i32> at run-time.
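A small sketch of the Cow case (function names are invented): method lookup goes from `&Cow<'_, i32>` through Cow's Deref implementation to `i32`, so `i32::abs` can be called on it whether the Cow is owned or borrowed.

```rust
use std::borrow::Cow;

// `abs` is a `self` method of i32, reached through `Cow: Deref<Target = i32>`.
fn abs_through_cow(c: &Cow<'_, i32>) -> i32 {
    c.abs()
}

// Owned variant: the i32 lives inside the Cow itself.
fn owned_case() -> i32 {
    let c: Cow<'_, i32> = Cow::Owned(-5);
    abs_through_cow(&c)
}

// Borrowed variant: the Cow holds a reference to an i32 stored elsewhere.
fn borrowed_case() -> i32 {
    let n = -7;
    let c: Cow<'_, i32> = Cow::Borrowed(&n);
    abs_through_cow(&c)
}

fn main() {
    println!("{} {}", owned_case(), borrowed_case());
}
```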
