Documentation about the special treatment of impl Deref for Box

Hello everyone!


It seems the dereference operator will be my stumbling block for a long time.
I have gotten used to living with it, but I regularly get stuck reading the std sources.

There is an interesting article about the special case of the Deref implementation for Box in Rust: "Rust Tidbits: Box Is Special" (In Pursuit of Laziness).
It is not new, but it still seems relevant and at least sheds some light on what is going on.

Usually the code is the main source for understanding how something works. Maybe it would be clearer if the code itself (a comment on the implementation) contained some explanation of the seemingly magic recursive behavior and why it works?

PS To be honest, I did not get why the code contains a "stub"/"fake" implementation rather than some kind of "implemented natively by the compiler" marker saying that the behavior is implemented at the compiler level and the code does not matter.

Thanks in advance for clarification.

I don't understand what your main question actually is (or if there is one), and the premise of your PS is false.

Box's Deref impl isn't a fake/stub implementation. It's simply that the dereferencing operator isn't syntactic sugar for Deref::deref() in the case of Box. It's instead a direct language built-in (just like arithmetic operators are on primitive numeric types). So you can use it to properly implement Deref for Box, because the *-dereferencing of boxen doesn't itself go through the Deref impl.
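One way to see that `*` on a Box is a language built-in rather than a call to Deref::deref is that it can move the value out of the box, which a function returning `&T` could never do. A minimal sketch (the function names `unbox` and `bump` are made up for illustration):

```rust
/// Takes ownership of the box and moves the String out of it.
fn unbox(b: Box<String>) -> String {
    // Built-in dereference: the value is moved out of the heap allocation.
    // `Deref::deref` only returns `&String`, so it could not give up ownership.
    *b
}

/// The built-in dereference also yields a mutable place directly,
/// without going through `DerefMut::deref_mut`.
fn bump(mut b: Box<i32>) -> i32 {
    *b += 1;
    *b
}
```

With a user-defined smart pointer, the equivalent of `unbox` is rejected by the compiler, because there `*b` is sugar for a `Deref::deref` call and only produces a borrowed place.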

Thanks for the response!

Box's Deref impl isn't a fake/stub implementation.

It seems I used the wrong words in my question. I just wanted to emphasize that, looking at the code of the implementation, one would expect recursive behavior, but there is no recursion in a real application.

It does not seem to be just my impression, if only because such an article exists :man_shrugging:

It's instead a direct language built-in

Doesn't that mean that it (the Deref implementation for Box) does not behave the way it is written? In other words, implementing an alternative LolBox with the same syntax will lead to a stack overflow. From a developer's point of view, it is unclear why similar code behaves differently, isn't it?

Example directly from the article:

use std::ops::Deref;

struct LolBox<T>(T);

impl<T> Deref for LolBox<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &**self
    }
}

It does behave exactly like it's written. It's just not called where you would expect it to be called.

But if you call it using the Deref::deref(box) syntax, it is called as usual, and you can even step through it (in an unoptimized build; optimized builds are subject to the as-if rule).
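To make the two paths concrete, here is a small sketch; the helper names `read_with_operator` and `read_with_trait` are invented for this example. Both return the same value, but only the second one actually enters the library's `deref` function:

```rust
use std::ops::Deref;

/// Reads the boxed value with the operator only.
fn read_with_operator(b: &Box<i32>) -> i32 {
    // The first `*` strips the reference, the second is the built-in
    // box dereference. `<Box<i32> as Deref>::deref` is not called.
    **b
}

/// Reads the boxed value through an explicit trait call.
fn read_with_trait(b: &Box<i32>) -> i32 {
    // This does call `<Box<i32> as Deref>::deref`, whose body in turn
    // uses the built-in dereference, so it terminates.
    *Deref::deref(b)
}
```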

Yes, but how is that different from addition for integers or multiplication for floats?

Box is not unique there.

Thanks for the response!

It's just not called where you would expect it to be called.

And that is counterintuitive, isn't it? I do understand that these mechanics are not easy to understand and remember in a day, and that intuition is not something one should rely on, but
my point is just this: if a new developer copies the code the way it is written in std (one might think that if it works in std, it can be copied as a best practice) and uses it in syntactically the same way (*box vs *lol_box), it will work differently for a tricky reason.

Yes, but how is that different from addition for integers or multiplication for floats?
Box is not unique there.

To be honest, I just have not run into "such trouble" with addition, but I'm not arguing against clarification in all places. I asked about the first case that struck me hard while reading the source code.
Why not clarify all the code that has special treatment (developer time and resources aside)?

Rust already has a lot of explanations in its docs and annotations that at least give the reader a hint. I have also seen issues where statements were rephrased to be friendlier to new Rustaceans. Wouldn't it be better if a developer reading std code got a hint about such scenarios?

Precisely because of that. Comments that describe obvious things are as much of a drag on productivity as a lack of comments in non-trivial places.

That being said, I think this one is tricky enough to justify an explanation. i8/i16/i32/i64 look special: they are not enums or structs, they are primitives, obviously special. But Box looks like a regular struct; its special status is non-obvious.

But this suggestion should go into the bug tracker, I assume.


Consider for comparison:

  • Built-in macros, which are defined with => {{ /* compiler built-in */ }}.
  • Box::new, which used to be defined as box x and is today defined as #[rustc_box] Box::new(x).
  • Box::drop, which is defined as // FIXME: do nothing, drop is currently performed by the compiler.

The difference between deref and drop is that you can actually call the deref function directly, and thus actually execute the code in that function. When you dereference a box, that's directly implemented by the compiler rather than calling <Box as Deref>::deref.

When looking at impl Deref for Box, there's no local indication that there's anything interesting going on (other than if there weren't, it'd be unconditional recursion). You have to know that Box is marked with #[lang = "owned_box"] and that this annotation confers a built-in implementation of dereferencing in order to track why there's magic here, as opposed to it being locally evident if there were a #[lang] or #[rustc] attribute applied, or if the definition were just a // compiler built-in comment. (It's not, though, as stated above, so the comment, if added, would be in addition to the apparently recursive definition.)

Even the primitive integers, which are primitives so you're more primed to expect magic compiler built in implementations than with Box, have some indication on their apparently recursive implementations, as they're marked with #[rustc_inherit_overflow_checks], indicating the presence of compiler magic.
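The same pattern shows up in user code whenever an operator impl bottoms out in a built-in operator on a primitive. A sketch with a hypothetical newtype `Meters` (not from the standard library):

```rust
use std::ops::Add;

// Hypothetical newtype, used only for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(i32);

impl Add for Meters {
    type Output = Meters;

    fn add(self, other: Meters) -> Meters {
        // `+` on the inner i32 values is built into the language, so this
        // terminates. Writing `self + other` here would instead call this
        // very function again and recurse without end.
        Meters(self.0 + other.0)
    }
}
```

The primitive impls in the standard library look like the recursive form only because, for primitives, the operator never dispatches through the trait in the first place.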

Ultimately, I don't think it's an issue of any real note, other than as a curiosity; people are relatively unlikely to go looking at the definition of Box::deref, and especially unlikely to do so without noting that it's weirdly unconditionally recursive or at least that Box has a #[lang] attribute and gets special properties from that.

But I also agree that adding a small comment saying the implementation is built into the compiler would be a minor but real improvement to the code. There's no risk of the comment becoming outdated, as any change that invalidated it would necessarily also change the impl.

The alternative would be to add a fn box_deref (and probably move fn box_free) to the intrinsics module, then call it; the intrinsics:: path then obviously communicates the built-in quality in a self-documenting manner. (This is essentially how unchecked_add and the other unchecked math impls work for the primitives already, except with an actual intrinsic function instead of a function in the intrinsics module that is implemented using built-in syntax.)

If they look at the definition of Box itself, they'll see the #[lang] annotation, which will give them some pause, along with the Drop and new magic which is more clearly marked. If they still copy the Deref impl, though, they'll get the unconditional_recursion lint as a mitigation pointing out that what they've written doesn't make sense.
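For a user-defined wrapper, the non-recursive way to write the impl is to reach the field directly rather than going through `*`. A minimal sketch, reusing the LolBox name from the article (the helper `first_char` is made up for the example):

```rust
use std::ops::Deref;

struct LolBox<T>(T);

impl<T> Deref for LolBox<T> {
    type Target = T;

    fn deref(&self) -> &T {
        // `&**self` would recurse here: the first `*` strips the reference
        // and the second one is sugar for a call to this very function.
        // Borrowing the field directly avoids the `*` operator altogether.
        &self.0
    }
}

/// Auto-deref in method calls goes LolBox<String> -> String -> str.
fn first_char(b: &LolBox<String>) -> Option<char> {
    b.chars().next()
}
```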

The wording is aggressively edited in documentation comments, since they are properly user-facing. Internal comments are mostly just edited for accuracy, as they're aimed at stdlib developers rather than users. Much of the stdlib code is written nearly the same way it would be in user code (the exception being the use of unstable features), but the exception is anything with #[lang] semantics, where you're not really expected to be working against that code without knowing what the #[lang] attribute does in that case.


Thank you for such a detailed explanation!

Oh, it seems those "signs" were already there, but until you highlighted them I had not even noticed them.
I will spend time reading up on the lang and rustc attributes for the sake of self-education.
I had never read about the intrinsics module; I will read about it as well.

Thank you so much for your time!

No. Again, *some_box doesn't call Box::deref. The implementation does behave exactly like it's written. It uses the primitive built-in dereferencing operator to obtain a reference to the inner value of the box, and returns it. And since built-in dereferencing does NOT call Deref::deref for Box, there is no infinite recursion. There is no magic. It doesn't get any simpler than that.
