This is a required skill to learn for coding in Rust, IME. The biggest thing, for me, was learning to look at the implemented traits.
I searched around a bit and couldn't find anything. I have now found explanations here and on SO. I'd expect something like that (syntax and common usage) to be explained in official docs.
Presumably it's not possible to get away from implementation details because the language is too low level (it's all memory oriented) and you need lots of plumbing to do things that are high concept/level.
I was thinking about this and why error messages in other languages are not quite as insane. For instance, Python also has types, but it won't spit out type hierarchies. It'll just state when a member that should have been there was not there, i.e. it'll say clearly what it needs. Usually that makes things an easy fix.
I'm not sure what you mean by type hierarchy. The Rust error showed the needed type vs actual type, and if the nested generic types were not shown then it would be much more difficult to figure out.
My guess is that types in Python are simpler. So comparatively, yes, Rust does have a complex type system. But it is not more complex than C++, which has comparable features (while Python does not).
Rust errors are pretty good, but can be improved and @ekuber filed an issue for the case you brought up:
It's not an atomic thing, it's a combination of two operations, which do show up if you search for them individually.
also listed in the rust book: B - Operators and Symbols - The Rust Programming Language
The combination does a deref (which makes it a place expression) and then taking a reference to the value at that place.
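To make the two steps concrete, here is a minimal sketch. The `double` helper is made up for illustration; the point is only that `&*` is a `*` followed by a `&`, not a separate operator.

```rust
// Hypothetical helper: it only accepts a plain `&i32`.
fn double(n: &i32) -> i32 {
    *n * 2
}

fn main() {
    let boxed: Box<i32> = Box::new(21);

    // Step 1: `*boxed` dereferences the Box and names the i32 stored inside it
    //         (a place expression; nothing is moved or copied yet).
    // Step 2: `&` borrows that place, producing a `&i32`.
    let r: &i32 = &*boxed;
    println!("{}", double(r)); // prints 42

    // The same two steps turn a `String` into a `&str`.
    let owned = String::from("hello");
    let s: &str = &*owned;
    println!("{}", s.len()); // prints 5
}
```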
An alternative is that libraries should explain how things are working under the hood in their documentation because 100% guaranteed you're going to have to deal with anything and everything when using it.
And with that I mean actual written documentation, not the autogenerated stuff on docs.rs.
I think for context, dioxus recently added signals using something like a garbage collector to avoid lifetimes on Components. It's quite fascinating, and I think it's also fair to say it's the bleeding edge: they're figuring this out and their API isn't stable yet.
As a result, it seems these complex types will leak into the error output when things go wrong.
It's a tradeoff, and I'd love to see more diagnostic annotations like `#[diagnostic::on_unimplemented]` be developed and used effectively by crates to make that tradeoff better. (I don't know how to use them effectively yet! Are there docs/blogs on this?)
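For anyone else wondering what that attribute looks like in practice, here is a rough sketch. The `Render` trait and `show` function are invented for this example (they are not from dioxus), and as far as I know the attribute needs Rust 1.78 or newer.

```rust
// `Render` is a made-up trait, used only to show where the attribute goes.
#[diagnostic::on_unimplemented(
    message = "`{Self}` cannot be rendered",
    label = "this type does not implement `Render`",
    note = "implement `Render` for `{Self}`, or convert it to a `String` first"
)]
trait Render {
    fn render(&self) -> String;
}

impl Render for String {
    fn render(&self) -> String {
        self.clone()
    }
}

// Hypothetical function with a bound on the annotated trait.
fn show<T: Render>(value: &T) -> String {
    value.render()
}

fn main() {
    println!("{}", show(&String::from("ok")));

    // Uncommenting the next line should fail to compile, and the error should
    // carry the custom message, label, and note instead of the generic
    // "trait bound is not satisfied" wording.
    // show(&42_i32);
}
```

The attribute is only a hint to the compiler, so a crate can add it without changing any behavior.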
It's actually worth raising an issue about your experience in dioxus' GitHub, they will likely be interested to hear your feedback.
These errors are often obtuse, but they're explicit and verbose and logical, and credit to the rust team (shout-out to @ekuber in particular) for just how good the diagnostics are in many cases. It's a phenomenal amount of effort that goes into making rust's errors so good.
Thank you for your efforts improving rust diagnostics!
As for visualizations there is a neat library that might help visualize things called RustViz.
It only works with simple programs but might be helpful.
I learned about it literally in the Rust book. Like some people already said, if the problem was `&*` you'd figure that it's something to do with a deref implementation. Not saying the error message couldn't improve, but it's not like this is undocumented either.
The comparison with Python is probably a little unusual. There are more differences than similarities between the languages. Python doesn't have custom smart pointers! That's what the `&` and `*` operators are for [1]. Python also doesn't have generics, and its type system is fully dynamic; type annotations are only suggestions. As far as I know, its type coercion capabilities are limited to numerics.
Some Python examples...
`int` and `float` coercions work:
>>> int(4) / float(3.14)
1.2738853503184713
>>> float(3.14) + int(4)
7.140000000000001
`str` and `bytes` coercions do not:
>>> b"foo" + "bar"
Traceback (most recent call last):
File "<python-input-16>", line 1, in <module>
b"foo" + "bar"
~~~~~~~^~~~~~~
TypeError: can't concat str to bytes
>>> "bar" + b"foo"
Traceback (most recent call last):
File "<python-input-19>", line 1, in <module>
"bar" + b"foo"
~~~~~~^~~~~~~~
TypeError: can only concatenate str (not "bytes") to str
If you choose to either decode the `bytes` or encode the `str`, then the concatenation will work (no type coercion):
>>> b"foo".decode("utf8") + "bar"
'foobar'
>>> b"foo" + "bar".encode("utf8")
b'foobar'
If you get the encode/decode backwards (which is a mistake that is way too easy to make), then it doesn't work again:
>>> b"foo".encode("utf8") + "bar"
Traceback (most recent call last):
File "<python-input-22>", line 1, in <module>
b"foo".encode("utf8") + "bar"
^^^^^^^^^^^^^
AttributeError: 'bytes' object has no attribute 'encode'. Did you mean: 'decode'?
>>> "bar".decode("utf8") + b"foo"
Traceback (most recent call last):
File "<python-input-23>", line 1, in <module>
"bar".decode("utf8") + b"foo"
^^^^^^^^^^^^
AttributeError: 'str' object has no attribute 'decode'. Did you mean: 'encode'?
These error messages are simplistic because the types involved are simplistic. This does not feel any easier to me than Rust.
If "type hierarchy" means the kind of type composition you can do with generic types, then that doesn't exist in Python by definition.
The Rust language design evolved a long time ago, moving its language-built-in smart pointers [2] to standard library types like `Box<T>` and `Arc<T>`. In doing so they opened the door for other libraries to implement their own smart pointers as well!
I have hit the error in OP on occasion with the more common standard library smart pointers. There's an unstable syntax for this with `Box<T>`. Which is cool, I guess, but it doesn't help for any other smart pointer types. I would be in favor of simplifying pattern matching with all smart pointers, whatever that might look like. Type errors are a part of library API surface and a fundamental shortcoming in Rust - #17 by CAD97 is a little better, especially if error messages can reliably nudge users in this direction.
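For what it's worth, on stable the usual way around this for `Option<Box<T>>` is to go through `Option::as_deref` before matching. A small sketch, with a made-up `classify` function:

```rust
// Hypothetical function: classify an optional boxed number on stable Rust.
fn classify(b: &Option<Box<i32>>) -> &'static str {
    // `as_deref` turns `&Option<Box<i32>>` into `Option<&i32>`,
    // so the pattern only has to deal with a plain reference.
    match b.as_deref() {
        Some(&n) if n < 0 => "negative",
        Some(_) => "non-negative",
        None => "nothing",
    }
}

fn main() {
    println!("{}", classify(&Some(Box::new(-3)))); // prints negative
    println!("{}", classify(&Some(Box::new(7)))); // prints non-negative
    println!("{}", classify(&None)); // prints nothing
}
```

This only helps when the smart pointer sits directly inside an `Option`; deeper nesting still needs manual derefs.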
This is a bit of a fib on my part. These operators are for references in general, not just smart pointers. And Python does have references! It's just that they are hidden from you. This design leads to the problem that deep copy tries to solve when function calls mutate objects passed by reference. Which in turn leads to problems because recursive objects are not copyable! This hidden complexity exists and is visible despite the effort to hide it. Joel Spolsky calls this "The law of leaky abstractions". ↩︎
The `~T` smart pointer type was replaced with `Box<T>` in 0059-remove-tilde - The Rust RFC Book. There was a "prehistoric" garbage-collected smart pointer type `@T` that is described and compared to `Rc<T>` in 0256-remove-refcounting-gc-of-t - The Rust RFC Book. The `@T -> Gc<T>` transition took place in Remove `@` syntax for Gc'd pointers by alexcrichton · Pull Request #14835 · rust-lang/rust · GitHub. And FWIW, `Rc<T>` was added way, way back in add task-local reference counted smart pointers by thestinger · Pull Request #6241 · rust-lang/rust · GitHub ↩︎
I think Python definitely has smart pointers under the hood. The difference is that it doesn't bother you with it when you're working on a different abstraction level.
Python has inheritance and the fact that generics don't exist means something you find in a container does not have to do what you expect it to do.
I have to say that I never had a need to seriously use either of these nor do I think it was ever really made clear what would be a good use of them. And I'm not exactly new to Rust (read a handful of books as well).
There's an unstable syntax for this with `Box<T>`
match b {
Some(box n) if n < 0 => {
It doesn't look like it's adding anything really. Why wouldn't it be `Some(n)` and make the fact that the number is in a box entirely transparent?
We seem to be speaking past each other.
I acknowledged that Python hides references, while also pointing out that it doesn't hide them effectively. It still causes a lot of problems in practice. Inheritance is not parametric type composition, and these should not be confused as equivalent.
It's literally adding syntax.
I think "why not make it transparent" is a great question, and the answer directly relates to something I touched on in my last reply as a side note. I think you're trying to get at why the pointer itself can't be transparent, or said another way, why is the pointer opaque?
The answer is because of the way patterns work. Match ergonomics made some important improvements, and it's the reason you write `Some(n)` instead of `&Some(ref n)` [1] in your match patterns when smart pointers are not involved. There has clearly been improvement toward making the references transparent! Just not enough (yet).
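Side by side, the two spellings look like this. Both function names are made up; the patterns themselves are what matter.

```rust
// Pre-match-ergonomics style: spell out the reference and the `ref` binding.
fn len_explicit(opt: &Option<String>) -> usize {
    match opt {
        &Some(ref s) => s.len(),
        &None => 0,
    }
}

// Modern style: the default binding modes infer the same thing.
fn len_ergonomic(opt: &Option<String>) -> usize {
    match opt {
        Some(s) => s.len(), // `s` is a `&String` here; nothing is moved
        None => 0,
    }
}

fn main() {
    let name = Some(String::from("ferris"));
    println!("{}", len_explicit(&name)); // prints 6
    println!("{}", len_ergonomic(&name)); // prints 6
}
```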
As for that thing I touched on in my last reply, here it is again for more visibility: The Law of Leaky Abstractions – Joel on Software.
edit: Oh wow, I completely forgot about `ref`! Combined with the pre-NLL borrow checker, I sometimes feel like an old man complaining that "kids have it so easy these days!" I am appreciative of the ergonomics progress made in the compiler since 1.0. ↩︎
Indeed, they are a hard part of Rust, and one that many people are looking to improve. However, keep in mind that it's so, so much easier to complain about something than to actually improve it.
That's a fact about your experience, not about smart pointers: they're extremely common and integral in the Rust ecosystem.
Because that's counter to everything Rust believes in. What if you're just going to box it again? What if it's an unsized type? What if it's a huge type and moving it is costly? What if it's a generic and any of the above could be true? You can't just make decisions like that globally, and Rust tries very hard not to.
We can safely make assumptions like "calling Deref is cheap", because that's always true for well behaved implementations (though not technically guaranteed). Assuming "they want this unboxed" is a much more dangerous assumption that'd lead to much confusion and frustration.
The compiler can already see that if `T: Deref<Target=U>` then `t.u_method()` is valid, and can chain `deref()` arbitrarily many times (or at least twice, which I imagine covers 99% of cases). So I don't think it would be unreasonable for it to look at OP’s example and say “you gave me a `T`, but I expected a `U`, and through a finite (and small) number of repeated applications of `deref` I can turn a `T` into a `U`, and therefore I will suggest `&*`”. It seems like largely the same machinery that already exists in deref coercion; someone™ just™ needs to write the plumbing.
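Here is a small sketch of the existing machinery being described. `Wrapper` and `shout` are hypothetical stand-ins for the `T` and the function expecting a `U`; nothing here is from OP's code.

```rust
use std::ops::Deref;

// `Wrapper` is a hypothetical smart pointer, standing in for the `T` above.
struct Wrapper<U>(U);

impl<U> Deref for Wrapper<U> {
    type Target = U;
    fn deref(&self) -> &U {
        &self.0
    }
}

// Hypothetical function that only accepts the innermost type.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let nested: Wrapper<Box<String>> = Wrapper(Box::new(String::from("hi")));

    // Method calls auto-deref through every layer: Wrapper -> Box -> String.
    println!("{}", nested.len()); // prints 2

    // Deref coercion does the same at a function argument:
    // &Wrapper<Box<String>> -> &Box<String> -> &String -> &str.
    println!("{}", shout(&nested)); // prints HI

    // Where coercion does not kick in, the reborrow is spelled out by hand,
    // with one `*` per layer being peeled off.
    let inner: &String = &**nested;
    println!("{}", inner); // prints hi
}
```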
I was talking about the general case that the compiler can't always tell you how to get from one to the other type and therefore there should not be an expectation that it will always do that.
In this particular case, yes, sure, it's possible. I didn't focus on this particular case because a ticket was already filed for it and mentioned in this thread, so that would have been redundant.
Is this the relevant bit?
By default, identifier patterns bind a variable to a copy of or move from the matched value depending on whether the matched value implements `Copy`.
So `Some(n)` (where `n` is boxed) would then have to copy the boxed value, but that is not always possible?
I thought `n` would be able to bind to the same boxed value there, which I guess is wrong.
Sure, but I'm not exactly in a position to improve it. Those who are should be aware that there's still miles to go here.
That's my point. The expectation should not be that this is always possible but doing it everywhere where it is even remotely possible has such immense dividends for developer experience that it would be silly not to push on this.
Yes, that’s part of it. Because the type inside `Option` may not be copyable, you still need a way to get a reference to it. That’s what the `ref` in the `Some(ref n)` pattern is all about.
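A quick sketch of that, using a `String` as the non-copyable type. The function name is made up for the example.

```rust
// Hypothetical function: measure the string, then hand the Option back.
fn len_then_return(name: Option<String>) -> (usize, Option<String>) {
    // Matching by value with `Some(n)` would move the String out of `name`.
    // `ref n` borrows it in place instead.
    let len = match name {
        Some(ref n) => n.len(),
        None => 0,
    };
    // `name` was only borrowed above, so it can still be moved out here.
    (len, name)
}

fn main() {
    let (len, name) = len_then_return(Some(String::from("ferris")));
    println!("{} {:?}", len, name); // prints 6 Some("ferris")
}
```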
Another part is that this leads to another question: why `ref n` and not `&n`?
The answer is again because of the way patterns work: `Some(&n)` matches `Option<&i32>`, but not `Option<Box<i32>>` and not `Option<&mut i32>`, even though these are "morally equivalent" and the reference should be somehow inferred.
But the case that makes this clearer is that the `Some(&n)` pattern also does not match `Option<i32>`. And how could it? `i32` is not a reference. `Some(ref n)` resolves the dilemma, allowing the caller to bind an inner reference through the `Option` in all cases.
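Putting the three cases next to each other (function names are invented for the example; the commented-out arms are the ones that should be rejected by the compiler):

```rust
fn from_ref(opt: Option<&i32>) -> i32 {
    match opt {
        Some(&n) => n, // `&n` peels off the `&`, leaving a plain i32
        None => 0,
    }
}

fn from_value(opt: Option<i32>) -> i32 {
    match opt {
        // Some(&n) => n, // rejected: i32 is not a reference
        Some(ref n) => *n, // `ref` makes `n` a `&i32` pointing into `opt`
        None => 0,
    }
}

fn from_box(opt: Option<Box<i32>>) -> i32 {
    match opt {
        // Some(&n) => n, // rejected: a Box is not a `&` reference
        Some(ref n) => **n, // `n` is a `&Box<i32>`, so two derefs reach the i32
        None => 0,
    }
}

fn main() {
    println!("{}", from_ref(Some(&7))); // prints 7
    println!("{}", from_value(Some(7))); // prints 7
    println!("{}", from_box(Some(Box::new(7)))); // prints 7
}
```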
The default binding modes of match ergonomics let you ignore some of these pattern quirks in most cases; `Some(n)` just works. But at the same time, it can give users the wrong impression.
Python doesn't have custom smart pointers!
I think Python definitely has smart pointers under the hood. The difference is that it doesn't bother you with it when you're working on a different abstraction level.
The keyword being "custom". Python uses reference-counting everywhere, just like how Java uses garbage-collection everywhere. Rust gives you the choice of what memory scheme to use. Some functions expect a specific memory allocation strategy, and operate with the assumption that the strategy is being used. Many APIs don't require anything in particular, so they'll just ask for `&T`, which can be acquired from any smart pointer. But a function like `add_pointer_to_vec` will specifically require a `Vec<T>`, so you can't pass in a `Box<[T]>`.
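A sketch of the difference. `describe` is made up, and the signature I give `add_pointer_to_vec` is only my guess at what such a function would look like.

```rust
use std::rc::Rc;
use std::sync::Arc;

// Only reads the value, so it asks for `&i32` and nothing more.
fn describe(n: &i32) -> String {
    format!("value = {}", n)
}

// Assumed signature: it has to grow the collection, so it insists on a Vec.
fn add_pointer_to_vec<T>(v: &mut Vec<T>, item: T) {
    v.push(item);
}

fn main() {
    // Any smart pointer can hand out a `&i32` through deref coercion.
    println!("{}", describe(&Box::new(1))); // prints value = 1
    println!("{}", describe(&Rc::new(2))); // prints value = 2
    println!("{}", describe(&Arc::new(3))); // prints value = 3

    let mut boxed_slice: Box<[i32]> = vec![1, 2].into_boxed_slice();
    boxed_slice[0] = 10; // elements can be changed in place...
    // add_pointer_to_vec(&mut boxed_slice, 3); // ...but this is rejected: not a Vec

    // Converting back to a Vec is an explicit step.
    let mut v: Vec<i32> = boxed_slice.into_vec();
    add_pointer_to_vec(&mut v, 3);
    println!("{:?}", v); // prints [10, 2, 3]
}
```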
I have to say that I never had a need to seriously use either of these nor do I think it was ever really made clear what would be a good use of them.
I'm guessing you've never used any memory allocation besides `Vec` or `String` then? It's at least good to know that those types really don't need to be used for simple tasks in Rust. But I can assure you that those types are useful, and used.
Python has inheritance
Rust doesn't have inheritance, so this only highlights another difference. The rest of the sentence doesn't do anything to justify the idea that Python handles long type names better than Rust does. The point that @parasyte made was that because Rust has generics, the type names are longer, which might explain why the error was confusing.
There is probably a higher-level language hiding within Rust that doesn't involve such careful consideration of memory allocation, but the goal of Rust is high-performance, so it does require you to specify which allocation scheme works best for your project.
I don't feel I've had an adequate explanation of their tradeoffs and I didn't really have the time or the need to play with them. I remember a book which had a table with all of them (Arc, Cow, Rc, etc.) but going through my library I can't find it anymore.
I think one of the core dichotomies which is stronger in Rust than in other languages is the difference between being a library user and a library author. As a library user you would like to live without any of those allocation methods (or if they do pop up lots of examples to copy-paste your way through), without lifetimes and any stuff like that.
The problem is when the abstraction between these two bleeds through.
I read all the changes that went in under the hood in this blogpost above and it is wild.
Sometimes the differences matter enough that you just have to know the type. For example, you can't freely mutate the contents of an `Arc<T>`, but you can mutate a `Box<T>`. Cloning a `Box` requires making a separate memory allocation, but cloning an `Arc` doesn't.
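A minimal sketch of both differences, using only standard library calls. The values are arbitrary.

```rust
use std::sync::Arc;

fn main() {
    // A Box owns its value exclusively, so mutation through it is direct.
    let mut boxed = Box::new(1);
    *boxed += 1;

    // Cloning a Box allocates a second, independent copy.
    let boxed_clone = boxed.clone();
    println!("{}", std::ptr::eq(&*boxed, &*boxed_clone)); // prints false

    // Cloning an Arc only bumps a reference count; both handles share one allocation.
    let mut shared = Arc::new(1);
    let other = Arc::clone(&shared);
    println!("{}", Arc::ptr_eq(&shared, &other)); // prints true

    // `*shared += 1;` is rejected: Arc implements Deref but not DerefMut.
    // `get_mut` hands out a `&mut` only while this is the sole handle.
    println!("{}", Arc::get_mut(&mut shared).is_none()); // prints true

    drop(other);
    if let Some(n) = Arc::get_mut(&mut shared) {
        *n += 1;
    }
    println!("{}", shared); // prints 2
}
```

In shared code the more common route is interior mutability (for example `Arc<Mutex<T>>`), which is one more type that ends up visible in signatures and error messages.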
You won't be able to do that in Rust. These things are all important to using Rust, in both libraries and executables - by design.