Excuse me if it is a stupid question:
struct G { n: i32 }
let g = G { n: 10 };
let d = (&g).n;
if let i32 = d { println!("ddd") };
Here &g is a pointer, so why can I access its field named n when it is just a pointer?
For the below explanation, let me introduce another variable to give the reference a name.
struct G { n: i32 }
let g: G = G { n: 10 };
let r: &G = &g;
let d: i32 = r.n;
println!("d is {d}");
Other languages such as C require you to dereference a pointer in order to access fields, as in d = (*r).n; and even introduce a new operator to make this operation more readable: d = r->n. Rust takes a different approach. You can still be explicit and use
let d: i32 = (*r).n;
(Here, r is &G, and dereferencing gives *r: G, whose field n can then be accessed.)
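To make the equivalence concrete, here is a minimal sketch using the same G as above. The helper function names are made up for illustration; all three functions read the same field.

```rust
struct G {
    n: i32,
}

// Explicit dereference, the way C's (*r).n would be written.
fn read_explicit(r: &G) -> i32 {
    (*r).n
}

// Field access directly on the reference; the compiler inserts the dereference.
fn read_auto(r: &G) -> i32 {
    r.n
}

// Auto-deref is applied as many times as needed, here through two references.
fn read_nested(rr: &&G) -> i32 {
    rr.n
}

fn main() {
    let g = G { n: 10 };
    let r = &g;
    println!("{} {} {}", read_explicit(r), read_auto(r), read_nested(&r));
}
```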
But where C requires you to write r->n, Rust chose to let you re-use the r.n notation. This is made possible by rules that automatically insert dereferencing operators for field accesses as well as for method calls, as necessary for the field or method to be found. Of course this is a trade-off. One downside is that the operation becomes less explicit. Another is that cases where multiple types (say, G and &G) both have a method of the same name become ambiguous in principle, and are resolved by a slightly nontrivial order of priority in practice. (The complicated full set of rules really only applies to methods, since the types &T and &mut T, which are also considered for method calls, don’t have any fields.)
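As a small illustration of that priority order, the sketch below implements a made-up trait, Describe, for both G and &G. The trait and the function names are assumptions for this example only. Because the method takes self by value, the receiver type decides which implementation is picked.

```rust
struct G {
    n: i32,
}

// Hypothetical trait, implemented for both G and &G to create the "ambiguity".
trait Describe {
    fn describe(self) -> String;
}

impl Describe for G {
    fn describe(self) -> String {
        format!("G by value, n = {}", self.n)
    }
}

impl Describe for &G {
    fn describe(self) -> String {
        format!("&G, n = {}", self.n)
    }
}

// The receiver has type &G, which matches the impl for &G without any
// dereferencing, so that impl is chosen.
fn via_reference(r: &G) -> String {
    r.describe()
}

// The receiver has type G, so the impl for G is chosen.
fn via_value(g: G) -> String {
    g.describe()
}

fn main() {
    let g = G { n: 1 };
    println!("{}", via_reference(&g));
    println!("{}", via_value(g));
}
```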
In practice, IMO the benefits of the convenience are probably worth it. The “ambiguity”-related problems mentioned above come up occasionally, but not too often (e.g. this recent thread is a problem encountered because of it). The downside of code being less explicit, in particular the fact that a dereferencing operation might be hidden, is a lot more reasonable in Rust than in, say, C because of Rust’s memory safety: you don’t need to carefully review every single instance of dereferencing in Rust. And in unsafe Rust, where you can dereference raw pointers (like you’d do in C all the time), this convenience feature goes away for those raw pointers, and you’d need to explicitly write something like (*p).n again if p: *const G is a raw pointer, so the dereferencing stays explicit and easy to spot in code review.
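A minimal sketch of the raw-pointer case, using the same G. The function name is invented for illustration, and the pointer is derived from a valid reference so that the dereference is sound.

```rust
struct G {
    n: i32,
}

fn read_via_raw(g: &G) -> i32 {
    // A reference coerces to a raw pointer.
    let p: *const G = g;
    // Writing p.n here would not compile: raw pointers get no auto-deref.
    // SAFETY: p was just created from a valid reference, so it points to a live G.
    unsafe { (*p).n }
}

fn main() {
    let g = G { n: 10 };
    println!("{}", read_via_raw(&g));
}
```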
The whole reason C has the -> operator instead of simply combining * with . is to avoid the need for parentheses in (*p).v.
The only reason you need those parentheses is that *p.v is ambiguous and requires precedence rules to distinguish (*p).v from *(p.v) (it means the latter of course).
The only reason it is ambiguous in the first place is that * is a prefix rather than a suffix operator.
Pascal solves this much more nicely: the dereference operator is suffix ^ and you write unambiguously either p^.v or p.v^.
I'm still confused about this. Here is an example:
let mut a = String::from("things");
let g = G { z: &mut a };
let d = &g;
(d.z).clear()
[E0596] Error: cannot borrow `*d.z` as mutable, as it is behind a `&` reference
╭─[command_480:1:1]
│
4 │ let d = &g;
· ─┬
· ╰── help: consider changing this to be a mutable reference: `&mut g`
5 │ (d.z).clear()
· ──────┬──────
· ╰──────── `d` is a `&` reference, so the data it refers to cannot be borrowed as mutable
·
· Note: You can change an existing variable to mutable like: `let mut x = x;`
───╯
When I call (d.z).clear(), I guess there is a process like this:
let k = d.z;
k.clear()
According to the automatic dereferencing rule, calling d.z is actually calling (*d).z, so k is &mut a. So if k is &mut a, why can't it call clear?
I have a guess about the process: the compiler sees clear() first and needs (d.z) to be a mutable borrow. For (d.z) to be a mutable borrow, it needs d to be a mutable borrow. If the compiler treats d as a mutable borrow, it does (*d).z.clear(). But it sees d = &g, which is not a mutable borrow, so the error above happens.
What is really happening here?
There are two ways to answer the question of what’s going on. The first is: try to get an understanding of Rust’s ownership and borrowing principles, in particular that a reference cannot give both shared and mutable access to a value. What does that mean for &G where G contains a &mut String? Well, the &G can obviously be shared (you can just copy it), so consequently mutation of the String will be prohibited.
The other way to answer it is to try to figure out the exact steps and rules that make the compiler give an error message in this case. That might be interesting for a deeper understanding, but it’s really only useful as a subsequent step once you have understood the basic principles. I’ll try to give a short rundown of the details below.
So, one intermediate step is to talk about references-to-references, which is similar to the situation above. Running into the type & &mut T every so often is not unusual, and it’s important to learn that this type isn’t all that different from &&T or &T in terms of what it allows you to do.
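A short sketch of what a & &mut T allows, with an invented helper name: reading through both references works, while mutation through the outer shared reference is rejected.

```rust
// Reading through a shared reference to a mutable reference is fine;
// auto-deref goes through both layers to reach String::len.
fn len_through_shared(r: &&mut String) -> usize {
    r.len()
}

fn main() {
    let mut s = String::from("things");
    let m = &mut s;
    let r: &&mut String = &m;
    println!("{}", len_through_shared(r));
    // r.clear(); // would not compile: the target is behind a `&` reference
    m.clear(); // fine: r is no longer used, and m itself is a &mut String
    println!("{}", s.len());
}
```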
Now onto the details. As you noted correctly, the automatic dereferencing rules mean that the compiler treats d.z as (*d).z, which is a &mut String. However, the auto-deref (and auto-ref) rules only determine which desugaring is used, and for which type the field (or method) is accessed (or called). They do not mean that this operation will actually succeed.
The interesting, and initially often somewhat confusing, property of expressions like *d or (*d).z is that they are so-called “place expressions”, referring not to some to-be-computed value but to an existing value in some existing place. There are different things one can do to a place expression, best characterized by the typical trifecta of access in Rust, “owned”, “mutably borrowed” and “immutably borrowed”. You can
move the value out of the place,
borrow the place mutably, or
borrow the place immutably.
And in many cases, not all (and possibly even none) of these operations will succeed.
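The three kinds of access can be sketched on a field of a local variable, where all of them succeed. The Pair struct and the function name are assumptions made up for this example.

```rust
struct Pair {
    name: String,
    count: i32,
}

fn three_accesses() -> (usize, String, i32) {
    let mut p = Pair { name: String::from("a"), count: 1 };

    // Immutably borrow the place p.name.
    let shared: &String = &p.name;
    let len_before = shared.len();

    // Mutably borrow the place p.name (allowed because p is declared mut).
    let unique: &mut String = &mut p.name;
    unique.push('b');

    // Move the value out of the place p.name; that field is now uninitialized.
    let owned: String = p.name;

    // The other field is still initialized and can be read.
    let count = p.count;

    (len_before, owned, count)
}

fn main() {
    println!("{:?}", three_accesses());
}
```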
Now for the concrete example, (*d).z.clear() is indeed the correct desugaring in terms of all the types matching. Let me even write String::clear((*d).z) to make it clearer that the .clear() also doesn’t do further dereferencing or referencing. It expects a &mut String, and that’s the type of (*d).z.
On the other hand: how do we access (*d).z here? As a first approximation, we move it (by passing it to a function). The question is then whether the place (*d).z can be moved. There are two cases where this can be allowed:
either: the place directly accesses some field (or field of a field, arbitrarily deep) of a local variable. The value can be moved out of this field, and this (part of the) local variable is then left uninitialized, which restricts what you can do with it in subsequent code until you re-initialize it. Importantly, places reached by dereferencing references (&mut T or &T), or more generally any type that can be dereferenced (smart pointers such as Arc<T>, for example) except Box<T>, do not allow you to move out of their target. Box<T> is special / magical in that it is the only type that does allow this; so essentially you can arbitrarily combine field accesses and dereferencing of Boxes, but nothing else.
or: the type of the place implements Copy. In this case, the value will (logically) be copied instead of moved (which is essentially the same thing at run-time). This case could arguably be mentioned first, because it always applies if possible. The place only needs to be accessible immutably in this case.
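The two cases can be sketched as follows. The Holder struct and the function names are invented for illustration; the commented-out function shows the kind of move that is rejected.

```rust
struct Holder {
    text: String,
    id: u32,
}

// Moving out of a field of a local variable is allowed.
fn move_from_local() -> String {
    let h = Holder { text: String::from("local"), id: 1 };
    let t = h.text; // h.text is now uninitialized
    let _id = h.id; // the other field can still be read
    t
}

// Box is the one pointer type whose target can be moved out by dereferencing.
fn move_from_box() -> String {
    let b = Box::new(Holder { text: String::from("boxed"), id: 2 });
    let h: Holder = *b;
    h.text
}

// A Copy value can be read out from behind a shared reference.
fn copy_from_reference(r: &Holder) -> u32 {
    r.id
}

// This would not compile: String is not Copy, and the place is reached by
// dereferencing a reference.
// fn move_from_reference(r: &Holder) -> String { r.text }

fn main() {
    println!("{} {}", move_from_local(), move_from_box());
    let h = Holder { text: String::new(), id: 7 };
    println!("{}", copy_from_reference(&h));
}
```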
The type of (*d).z is &mut String, which is not a Copy type, and the place involves dereferencing a non-Box<_> type, so moving is out of the question. The error message looks different though, not complaining about the impossibility of moving (*d).z but about mutably borrowing *(*d).z!
The missing step is: &mut T values are implicitly “re-borrowed” in many situations, including essentially always when &mut T is the result of dereferencing, and also always when passing &mut T to a function expecting &mut T. In other words: it definitely happens here. Re-borrowing just means: create the place-expression to the target, and then mutably borrow that. I.e. foo gets re-borrowed as &mut *foo.
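Re-borrowing is easiest to see when the same &mut reference is passed to a function more than once. This is a minimal sketch with made-up function names; if m were moved by the first call, the later calls would not compile.

```rust
fn append(s: &mut String, suffix: &str) {
    s.push_str(suffix);
}

fn build() -> String {
    let mut text = String::from("a");
    let m: &mut String = &mut text;
    append(m, "b"); // implicitly re-borrowed, as if written append(&mut *m, "b")
    append(&mut *m, "c"); // the same re-borrow, written out explicitly
    append(m, "d"); // m is still usable because it was re-borrowed, not moved
    text
}

fn main() {
    println!("{}", build());
}
```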
In our case, we thus get String::clear((*d).z) turning into String::clear(&mut *(*d).z). Here it is: we now no longer want to move anything, but we want to mutably borrow the place *(*d).z (which the compiler writes as *d.z in the error message, simplifying the notation using auto-deref). Now, when can a place be mutably borrowed and when only immutably? This is where the immutability of &G references comes into play:
Dereferencing (and indexing, array[ix]) steps: For mutable access to be allowed, all dereferencing (and indexing) steps involved must support mutable access, as indicated by the relevant type implementing not only the Deref trait but also DerefMut (and for indexing not only Index<Idx> but also IndexMut<Idx>). This is where *(*d).z fails: the step of dereferencing d to *d dereferences &G, which is only Deref, not DerefMut. This is what the error message means by
`d` is a `&` reference, so the data it refers to cannot be borrowed as mutable
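For comparison, here is a sketch of the variant that the compiler's help text suggests. The definition of G with a lifetime parameter is an assumption, since the question does not show it, and the function names are invented. With d being a &mut G, every dereferencing step supports mutable access.

```rust
// Assumed definition; the original question does not show this struct.
struct G<'a> {
    z: &'a mut String,
}

fn clear_via_mut_ref() -> String {
    let mut a = String::from("things");
    let mut g = G { z: &mut a };
    let d = &mut g;
    // Desugars to String::clear(&mut *(*d).z); both dereferences are of
    // &mut references, so the mutable borrow is allowed.
    d.z.clear();
    a
}

fn len_via_shared_ref() -> usize {
    let mut a = String::from("things");
    let g = G { z: &mut a };
    let d = &g;
    // Reading through the shared reference works; d.z.clear() would not.
    d.z.len()
}

fn main() {
    println!("{:?} {}", clear_via_mut_ref(), len_via_shared_ref());
}
```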
By the way, the conditions listed above are only half of what’s checked. E.g. I haven’t listed any conditions under which no access at all to a place expression would be allowed. The other half is borrow-checking and variable initialization checks: for the variable that your place expression is based on[1], the borrow-checker tracks whether it was entirely or partially borrowed, mutably or immutably, and/or moved out of (i.e. no longer initialized) [or maybe never initialized in the first place]. The granularity here is not super high, but the compiler is smarter than merely considering each variable as a whole. That said, considering only the variable as a whole is often a good approximation for you as a user anyway, and you can safely learn about more fine-grained tracking of the borrowed/initialized status later. In this case, the complete variable d is initialized and not borrowed at all, so there are no further limitations from this point.
On the other hand, to give some straightforward negative examples: while d is still going to be used later, the place g could not be accessed mutably (or moved), nor could any places like g.z or *g.z, because g is immutably borrowed. Similarly (but also slightly differently), a variable not marked mutable cannot be mutably borrowed. In the case of &mut T variables or fields, though, their target can still be accessed mutably. This is a bit of a special rule as well, but many people consider mut annotations on variables essentially a strong lint only anyway, so don’t worry about the details here too much.
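The special rule for &mut T variables can be sketched like this (the function name is made up): the binding z is not marked mut, yet its target can still be mutated through it.

```rust
fn mutate_through_non_mut_binding() -> String {
    let mut a = String::from("things");
    // z itself is not declared mut, so z cannot be reassigned or borrowed
    // mutably, but the String it points to can still be modified.
    let z = &mut a;
    z.clear();
    z.push('x');
    // By contrast, with `let b = String::new();`, calling b.push('x')
    // would not compile, because b would have to be borrowed mutably.
    a
}

fn main() {
    println!("{}", mutate_through_non_mut_binding());
}
```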
[1] There are also place expressions that don’t start out with a local variable but that start by dereferencing or field-accessing a given (computed) value, in which case this second part becomes (at least mostly) irrelevant. This computed value is implicitly considered to be put in a so-called temporary (variable), which of course is then properly initialized and not borrowed, so no further problems arise, besides perhaps the lifetime of the temporary vs. how long you want to borrow it, but lifetimes are a separate thing I don’t go into in this post.
Thank you so much. After reading it again and again, I think I now understand references and dereferencing much more deeply. You are a born teacher.