Why are negative value literals expressions?

Since I ran into this the other day, I was wondering: why does Rust treat a negative number literal as a unary minus applied to a positive literal, i.e. as an expression, rather than as a single token?

As is, this has consequences ranging from potentially puzzling:

-9i32.is_positive()   // ERROR: cannot apply unary operator `-` to type `bool`

macro_rules! tt { ($l:tt) => { } }
tt!(3)                                           // fine
tt!(-3)                                          // ERROR (`-` and `3` are two token trees)

to potential silent error paper-cuts:

-9i32.abs()          != (-9i32).abs()            // -9 (!) vs. 9, respectively
-1i32.rotate_left(1) != (-1i32).rotate_left(1)   // -2 vs. -1, respectively
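
For anyone who wants to check the two lines above, here is a minimal sketch. The helper function names are made up for this illustration; only `abs` and `rotate_left` come from the standard library.

```rust
// Helper names are hypothetical; they only wrap the expressions quoted above.

/// Parsed as -(9i32.abs()): the method call binds tighter than unary minus.
fn abs_unparenthesized() -> i32 {
    -9i32.abs()
}

/// With parentheses, `abs` is called on the value -9.
fn abs_parenthesized() -> i32 {
    (-9i32).abs()
}

/// Parsed as -(1i32.rotate_left(1)), i.e. -(2).
fn rotate_unparenthesized() -> i32 {
    -1i32.rotate_left(1)
}

/// Rotating the all-ones bit pattern of -1 leaves it unchanged.
fn rotate_parenthesized() -> i32 {
    (-1i32).rotate_left(1)
}

fn main() {
    assert_eq!(abs_unparenthesized(), -9);
    assert_eq!(abs_parenthesized(), 9);
    assert_eq!(rotate_unparenthesized(), -2);
    assert_eq!(rotate_parenthesized(), -1);
    println!("precedence checks passed");
}
```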

If it is merely to simplify parsing, it would seem to me that the pitfalls outweigh the benefits (?).
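
On the macro side, a small sketch of what the macro actually sees. `count_tts!` and `is_literal!` are hypothetical helper macros written for this post. The sketch shows `-3` arriving as two token trees, while the `literal` fragment specifier accepts an optional leading minus.

```rust
// Counts the token trees it is given (hypothetical helper macro).
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

// True when the whole input matches the `literal` fragment specifier.
macro_rules! is_literal {
    ($l:literal) => { true };
    ($($other:tt)*) => { false };
}

fn plain_tts() -> usize {
    count_tts!(3) // a single literal token
}

fn negative_tts() -> usize {
    count_tts!(-3) // `-` followed by `3`
}

fn negative_is_literal() -> bool {
    is_literal!(-3) // `literal` allows a leading minus
}

fn ident_is_literal() -> bool {
    is_literal!(x) // an identifier is not a literal
}

fn main() {
    assert_eq!(plain_tts(), 1);
    assert_eq!(negative_tts(), 2);
    assert!(negative_is_literal());
    assert!(!ident_is_literal());
    println!("macro checks passed");
}
```

So a macro that needs to accept `-3` as one unit can ask for `$l:literal` instead of `$l:tt`.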

I don't know the answer, but it behaves according to what I would expect here. I normally think of unary minus as having the same precedence as plus and minus. Compare to −1² being −(1²) in mathematics, and a + b.to_int() being a + (b.to_int()), not (a+b).to_int().

Similarly, if I type -a.to_int(), I definitely want -(a.to_int()) and not (-a).to_int(), and it would be really inconsistent if -9i32.is_positive() behaved differently from -a.to_int().

On that I have to disagree: -1 to me is a single, defined integer value (Rust parsing notwithstanding) that can be translated directly to an all-1s bit pattern (on two's-complement machines) at compile time, whereas -a is an expression that needs to be evaluated at run time.

Edit: Come to think of it, the current approach is inconsistent in its own way: -128i8 shouldn't be a thing - it would be the negation of 128i8, but that is outside the value space of an 8-bit signed integer.

I think you're somewhat missing the point, which is that -𝛼.method() is always parsed as -(𝛼.method()) and never (-𝛼).method() regardless of whether 𝛼 is a literal, a variable, or some more complicated expression. That seems nicer to me than having a different rule for the - in negative literals. (It's also how it works in Python, which is the language I use most often.)
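
A rough sketch of that consistency, using `pow` so it mirrors the -9² example. The function names are invented for illustration.

```rust
/// Literal operand: parsed as -(9i32.pow(2)).
fn literal_case() -> i32 {
    -9i32.pow(2)
}

/// Variable operand: parsed the same way, as -(a.pow(2)).
fn variable_case(a: i32) -> i32 {
    -a.pow(2)
}

/// Parentheses are needed to square the negative value itself.
fn parenthesized_case() -> i32 {
    (-9i32).pow(2)
}

fn main() {
    assert_eq!(literal_case(), -81);
    assert_eq!(variable_case(9), -81);
    assert_eq!(parenthesized_case(), 81);
    println!("literal and variable agree");
}
```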

That is a bit weird, yes, but it still works mathematically, so it doesn't bother me that much personally. (I.e., 127i8.wrapping_add(1).wrapping_neg() == -128i8)
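
A quick check of that identity, as a sketch using only standard `i8` methods. The function names are made up.

```rust
/// 127 + 1 wraps around to the minimum value of i8.
fn wrapped_max_plus_one() -> i8 {
    127i8.wrapping_add(1)
}

/// Negating i8::MIN with wrapping arithmetic gives i8::MIN back.
fn wrapped_negation() -> i8 {
    wrapped_max_plus_one().wrapping_neg()
}

fn main() {
    assert_eq!(wrapped_max_plus_one(), i8::MIN);
    assert_eq!(wrapped_negation(), -128i8);
    // Without wrapping there is no representable result: +128 does not fit in i8.
    assert_eq!(i8::MIN.checked_neg(), None);
    println!("wrapping checks passed");
}
```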

I get that that's the way it is. The disagreement is that I, personally, think that a negative value should be a single literal and token. The way it is, it's literally (heh) impossible to write literals for half the value space of signed integers.

IMHO that's more than a bit weird: I'm not writing 127i8.wrapping_add(1), I'm writing 128i8, and if I leave it at that I get a compiler error. I still get an error if I write std::ops::Neg::neg(128i8). Yet I am able to write -128i8 and suddenly it works, even though by rights it should be treated the same.

I can see this from both the mathematical and parsing points of view.

Ultimately it looks like there's a tension between the two, and resolution seems to require picking one or the other.

Having said all that, going with the mathematical view would require context-sensitive parsing in order to make an exception for numeric literals. Context-sensitive parsing is a big no-no for modern programming languages (and yes, that means that I exclude C++ from that group of modern languages), for several reasons. One is implementation complexity. Another is the theoretical limits of what you can do with the grammar of such a language (more expressive generally means more expensive to compute and manipulate). And probably a third is simplicity of usage: it is much easier to remember a rule that applies everywhere than it is to remember "in situation A this goes, but in situation B that other thing goes, because of reasons".

I don't think that treating -9 as an operator and a literal 9 is in any way in conflict with a "mathematical" perspective. In mathematics there's no difference between the number -9 and the unary minus operator applied to the number 9; they're the same thing.

If you see -9! or -9² in a mathematical context you should certainly not interpret them as (-9)! and (-9)².


meta discussion

That said, this is really nothing more than a convention. Rust had to choose something and it went with the Python (more mathematical?) interpretation instead of the Ruby (more pragmatic?) one. There's nothing to gain here from arguing about how it should have been different.

(I suppose there could be a rule that negative literals always have to be parenthesized.)

That we agree on.

But I also think that the point that there's no literal for half of the value space of any i-literal (e.g. i8, isize etc.) is a valid one, and it creates a couple of weird corner cases such as the one outlined above.

I still think that having -128i8 compile despite being a bit weird is a better trade off than having -9i32.is_positive() parse differently from -a.to_int().

Note that 128i8 not compiling is actually just a lint that you can ignore with #[allow(overflowing_literals)].
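
As a sketch of what that looks like in practice (function names invented for illustration): with the lint allowed, the out-of-range literal wraps, while the negated literal compiles without any attribute. As far as I understand, that is because the lint takes the leading minus into account when checking the range.

```rust
/// With the lint allowed, the out-of-range literal wraps to -128.
#[allow(overflowing_literals)]
fn wrapped_literal() -> i8 {
    128i8
}

/// No attribute needed: the lint accepts a negated literal that fits.
fn negated_literal() -> i8 {
    -128i8
}

fn main() {
    assert_eq!(wrapped_literal(), -128);
    assert_eq!(negated_literal(), i8::MIN);
    println!("lint checks passed");
}
```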

I think so too, which is ultimately why, given all the above, the correct decision was made.

Doesn't mean it doesn't lead to weirdness.

The correct solution is banning all unlabelled literals. Don't Computer Science classes teach labelling any more? :grimacing:

Define labeled literal? Do you mean something like 100u8?

Fine, have it your way - I guess I'll just have to make my own language, then! With blackjack. And negative value literals. In fact, forget the blackjack! :robot::anger:

:wink:

I have always appreciated the symmetry of -9 == 0 - 9. So in fact, a negative number is just two non-negative integers and an operator.

Do remember that negative numbers are a relatively recent concept in mathematics. The earliest known uses are around 200 A.D. in China.

Wait up a bit. I only just got used to the idea that 0 is a number :slight_smile:

Note, however, that 0 - 128i8 doesn't compile out of the box, either. :wink:

Also, 3.9e-8 is parsed as a single token/literal, even though it stands for 3.9 × 10⁻⁸, a much more complex expression, including a negative value in the exponent. No unary operator required there. So the mathematical argument doesn't seem entirely water-tight, either.

Edit: INB4: I do acknowledge the arguments for the unary operator, besides it making parsing things like

- /* auto-generated comment */ 28i8

much less hairy.
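
Both observations can be poked at with a short sketch. `count_tts!` is a hypothetical helper macro that counts token trees, and the function names are made up too.

```rust
// Counts the token trees it is given (hypothetical helper macro).
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

/// The exponent sign is part of the float literal, so this is one token.
fn float_tts() -> usize {
    count_tts!(3.9e-8)
}

/// A leading minus is still a separate token in front of the literal.
fn negative_float_tts() -> usize {
    count_tts!(-3.9e-8)
}

/// The comment sits between the operator and its operand and is simply skipped.
fn negated_across_comment() -> i8 {
    - /* auto-generated comment */ 28i8
}

fn main() {
    assert_eq!(float_tts(), 1);
    assert_eq!(negative_float_tts(), 2);
    assert_eq!(negated_across_comment(), -28);
    println!("token checks passed");
}
```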

... Then again, 3.9e+compute_exponent() would be fun to see... :nerd_face:

Yikes. I think you may have just made his point for him (at least to me).

So -128i8 will not parse or, if you disable the lint you recommended, it will have the value -((-128)i8), which is 128i8, which is (-128)i8 (via overflow). This seems like a crazy thing to have the compiler go through.

I'm not 100% sure if you're calling 200 AD recent, or using "Antiphrasis" for effect.

Given that the use of "counting numbers", 1, 2, 3..., can be traced back to 3000 BC or so, I would say that the first known use of zero, and hence "whole numbers", in about 600 AD is pretty recent.

Strangely enough, use of negative numbers can be traced back to about 100 BC.

Which means that for a long time we had all the integers but with a hole in the middle at zero.

Obviously it's important to know if I owe you a goat or you owe me a goat. So negative goats are important.

But why waste your time inventing a symbol for no goats? It makes no sense to label nothing.