"attempt to subtract with overflow" panic depending on order

I never came across this in any other language, so I'm just curious. When running in debug mode, this panics with "attempt to subtract with overflow"

let mut x: usize = 1;
x = x - 2 + 1;

but this is not

let mut x: usize = 1;
x = x + 1 - 2;

So it seems like the expression on the right-hand side is evaluated from left to right, and whenever an intermediate result is negative it panics? What would be the problem with first evaluating the whole right-hand side and then checking whether it's negative? If somebody could point me to some more info about this, that'd be great.

1 Like

So the idea is to accept every calculation that would produce no problem if arithmetic always produced mathematically correct results, and to reject the others?

That property, like most interesting properties of programs, is undecidable, so there would be no problem, except that you would never know whether the compiler will accept a program that is correct in that sense or not.

1 Like

https://doc.rust-lang.org/reference/expressions.html#expression-precedence

The precedence of Rust operators and expressions is ordered as follows, going from strong to weak. Binary Operators at the same precedence level are grouped in the order given by their associativity.

So, because these operators are left-associative, it would be:

x = (x - 2) + 1;

vs

x = (x + 1) - 2;
3 Likes

Left-to-right evaluation of operations of equal precedence is used in Rust, as in most other languages with infix operators. Under the so-called "as if" rule, the compiler is perfectly free to exploit associativity or commutativity, but only if that does not change the semantics of the code. And panicking is part of the semantics of the language, so if some code panics under some condition, making that panic not happen under that condition is not a change the compiler is permitted to make.

5 Likes

Evaluating a whole expression before bounds checking would require larger-precision arithmetic; for instance, evaluating x * 1000 / 1001 without overflow in the intermediate values would require more bits than x has.
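A minimal sketch of that point (the function names here are mine, not from the thread): the direct form can overflow in the intermediate product, while widening to u128 avoids it at the cost of wider arithmetic.

fn scaled_narrow(x: u64) -> u64 {
    // The intermediate x * 1000 overflows u64 for large x, so this panics in
    // debug builds (and wraps in release) even though the final result would fit.
    x * 1000 / 1001
}

fn scaled_wide(x: u64) -> u64 {
    // Doing the multiplication in u128 keeps the intermediate exact; the result
    // is always <= x, so it fits back into u64.
    (x as u128 * 1000 / 1001) as u64
}

fn main() {
    println!("{}", scaled_narrow(1_000_000));  // fine for small x
    println!("{}", scaled_wide(u64::MAX / 2)); // fine even for large x
    // scaled_narrow(u64::MAX / 2) would panic: "attempt to multiply with overflow"
}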

3 Likes

But in some cases that extra precision isn't needed (the original code would work fine if you just used wrapping arithmetic, and might even be optimized into x - 1).
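For the record, here is the wrapping version of the original example (my own sketch, not code from the thread):

fn main() {
    let x: usize = 1;

    // x - 2 + 1 panics in a debug build because the intermediate x - 2 underflows.
    // With wrapping arithmetic the intermediate wraps to usize::MAX, the final + 1
    // wraps back, and the overall result is the expected 0, i.e. x - 1.
    let y = x.wrapping_sub(2).wrapping_add(1);
    assert_eq!(y, 0);
    println!("{y}");
}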

I wonder whether a language even exists where expressions that can be computed that way are accepted, while the ones that would require arbitrary-precision arithmetic are rejected.

And @hannes tells us that such an approach is not just typical but common ("I never came across this in any other language"), which kinda doesn't match my experience at all.

I know languages that detect overflow (Pascal with range checking is one example, and it has existed for more than half a century), I know languages that ignore overflow (C or JavaScript), I even know languages that have arbitrary-precision arithmetic (Python or Scheme), but a language of the kind the topic starter asserts is common and typical… no idea! Not a single example comes to mind!

That would be a very different system of arithmetic types. I would love it if we moved from types that need every operation checked to types that return sufficiently-large results that only need checking on truncation. But that's not a trivial thing to do at all. For example, quick, what's 569936821221962380720³ + (−569936821113563493509)³ + (−472715493453327032)³? How much space do you need to evaluate it?

If you just want to do something in arbitrary precision, there's always the num-bigint crate on lib.rs.
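For instance (a sketch of my own, assuming a num-bigint dependency in Cargo.toml), the sum of cubes above can be evaluated exactly:

use num_bigint::BigInt;

// Cube by repeated multiplication, so no extra traits need to be in scope.
fn cube(n: &BigInt) -> BigInt {
    n * n * n
}

fn main() {
    let a: BigInt = "569936821221962380720".parse().unwrap();
    let b: BigInt = "-569936821113563493509".parse().unwrap();
    let c: BigInt = "-472715493453327032".parse().unwrap();

    // Each cube needs far more than 128 bits, so no fixed-width integer could
    // hold the intermediates even though the final sum is tiny.
    let sum = cube(&a) + cube(&b) + cube(&c);
    println!("{sum}");
}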

5 Likes

Thanks for all the remarks, they make good sense to me!

To clarify, I just put a trivial example to show the point. In my actual code there was a complicated expression and it was not immediately clear which terms are positive and which terms are negative. I had to rearrange the terms, starting with the positive ones for it to run in debug mode. That just surprised me :grin:

2 Likes

I don't see how anything like this could possibly work in Rust. Presumably this would mean that x + 1 (where x is u32) would no longer have type u32, which seems totally incompatible with the way it works now.

1 Like

There's a reason it's not an RFC. I wish more languages did that, but it's probably too drastic for Rust.

And yes, it would. But it'd also mean that, for example, (x + y) / 2 gets you back to the type you started with, without overflow issues and without needing a special u32::midpoint (https://doc.rust-lang.org/std/primitive.u32.html#method.midpoint).
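Concretely (my own sketch, not from the linked documentation), the naive average overflows while a manually widened one does not:

// Naive average: the intermediate x + y can overflow even though the true
// average always fits in u32, so this panics in debug builds for large inputs.
fn avg_naive(x: u32, y: u32) -> u32 {
    (x + y) / 2
}

// Widening the intermediate sum to u64 sidesteps the overflow entirely.
fn avg_wide(x: u32, y: u32) -> u32 {
    ((x as u64 + y as u64) / 2) as u32
}

fn main() {
    println!("{}", avg_naive(2, 4));                  // fine for small inputs
    println!("{}", avg_wide(u32::MAX, u32::MAX - 2)); // u32::MAX - 1
    // avg_naive(u32::MAX, u32::MAX - 2) would panic: "attempt to add with overflow"
}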

1 Like

It would be great, but how would you actually implement that at the machine-code level?

Note that midpoint is nothing special: it just calculates ((x ^ y) >> 1) + (x & y) instead of (x + y) / 2.

Converting one expression into another is a non-trivial and, in general, undecidable task, especially once you add things like special instructions that let you calculate the midpoint more efficiently (yes, modern CPUs have them, although usually only for SIMD).
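A quick sanity check of that identity (my own sketch), comparing the bit trick against a widened reference:

fn main() {
    // (x & y) keeps the bits the two numbers share; (x ^ y) >> 1 adds half of the
    // bits where they differ, so the sum is floor((x + y) / 2) and no intermediate
    // ever exceeds max(x, y), hence no overflow.
    let bit_trick = |x: u32, y: u32| ((x ^ y) >> 1) + (x & y);
    let reference = |x: u32, y: u32| ((x as u64 + y as u64) / 2) as u32;

    for &(x, y) in &[(0, 0), (1, 2), (7, 8), (u32::MAX, u32::MAX - 1), (u32::MAX, 0)] {
        assert_eq!(bit_trick(x, y), reference(x, y));
    }
    println!("identity holds for the sampled pairs");
}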

The same way LLVM already does it if you just use a wider intermediate type: https://llvm.godbolt.org/z/5nT48YW7o.

It's just an optimization issue. Even cranelift will optimize out unnecessary large intermediate results in some cases: https://github.com/bytecodealliance/wasmtime/blob/50d82f22e5ee5e0e9ef70a1e86d963e0fbc1cc04/cranelift/filetests/filetests/egraph/arithmetic.clif#L342-L357.

Yes and no. The fact that some calculations can be done that way, but so very few, while the majority would lead to a huge slowdown, would be immensely confusing.

We have a chicken-and-egg problem here: introducing such a radical change is scary until we know how often it would push people into writing extremely inefficient code, and we won't know that until the change is actually implemented.

I'd rather the default be to actually give something correct, even in release mode. I shouldn't need to know tricks just because I want to compute (a[0] + 2 * a[1] + a[2]) / 4.
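The workaround today is manual widening, something like this (my own sketch; the element type u8 and the function name are assumptions):

// Weighted average (a[0] + 2 * a[1] + a[2]) / 4 of three u8 samples.
// Done directly in u8 the sum can overflow, so the intermediate is widened to
// u16; the final result always fits back into u8.
fn smooth(a: [u8; 3]) -> u8 {
    ((a[0] as u16 + 2 * a[1] as u16 + a[2] as u16) / 4) as u8
}

fn main() {
    assert_eq!(smooth([255, 255, 255]), 255);
    assert_eq!(smooth([0, 255, 0]), 127);
    println!("ok");
}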

It'd always be possible to truncate the intermediate results if you want fast results instead of correct ones: if you truncate at each step, you never hit the "huge slowdown". And a profiler would clearly show the cost if it ever became a problem.

I don't think it'd be that surprising, either. It's only things like division that cause intermediates to grow. If you do a wrapping conversion at the end, you can do as much multiplication/addition/subtraction as you want without needing larger intermediate results.

2 Likes

I'm not sure why this surprised you? What do you think should have happened instead?

First off, subtraction is not commutative or associative.

Second, the whole "checking overflow" feature is in place to catch bugs. If you are writing operations that produce invalid intermediate results, the only place to catch those errors is right there and then.

If you expect the compiler to "wait" until "the end" of evaluation, then in all seriousness, when should it stop? Should it wait for the entire execution of main and then trace back meticulously to the point where some error happened? Should it arbitrarily decide the scope where it's "the best"? That doesn't make a modicum of sense to me.

3 Likes

I don't see how this is related to the topic. This is about the addition of integers, which is commutative and associative; -2 + 1 is equivalent to 1 - 2, and, further, the integers are closed under subtraction.
The point is that the compiler expects a natural number (including 0) and not an integer.

Well, one possibility would be to wait until the end of the evaluation of the expression on the RHS of the statement. If you tell me that checking after every single arithmetic operation is the only reasonable thing for the compiler to do, then I accept that, because I don't have the time to study deeply what a compiler should or should not do. I'm sure people could come up with a hundred different things a compiler could do in this situation, but they might all be highly inefficient, complex, undecidable, or whatnot.

And this means that statement boundaries suddenly become semantically meaningful; in particular, they start to influence possible side effects. It would then be illegal, for example, for the compiler to "inline" (internally) one such statement into the next, since that would remove a possible panicking spot.
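To make that concrete (my own sketch; the wide-signed evaluation only simulates the hypothetical rule, it is not how Rust behaves):

fn main() {
    let x: usize = 1;

    // Simulating the proposed "check only at the end of the statement" rule by
    // evaluating the whole RHS in a wider signed type and converting at the end:
    let whole_rhs = usize::try_from(x as i128 - 2 + 1).unwrap(); // ok: final value is 0

    // The same computation split across two statements: the first statement's
    // final value is already negative, so even the proposed rule would have to
    // panic here (today's Rust panics in debug builds as well):
    //     let tmp = x - 2;
    //     let whole_rhs = tmp + 1;

    // Because the two spellings would now behave differently, the compiler could
    // no longer merge the second form into the first; statement boundaries would
    // have become part of the semantics.
    println!("{whole_rhs}");
}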

4 Likes

No it's not. x - 2 is parsed as a subtraction, not as the addition of (-2).

Unsigned integers are not closed under subtraction. usize is unsigned.
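A small illustration of that point (my own sketch):

fn main() {
    // i64 is closed under this subtraction: 1 - 2 is simply -1.
    let signed: i64 = 1 - 2;
    assert_eq!(signed, -1);

    // usize has no representation for -1, so the same subtraction has no valid
    // result; checked_sub reports that as None instead of panicking.
    let x: usize = 1;
    assert_eq!(x.checked_sub(2), None);

    // In a debug build, plain x - 2 panics with "attempt to subtract with overflow".
    println!("signed = {signed}, checked = {:?}", x.checked_sub(2));
}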

1 Like

I thought that the Rust compiler would reduce both down to x = x - 1;
I thought the - 2 and + 1 parts are just constants and would fold away.

I think that is what gcc would do with C, but I have no reference besides my fading memories.

Does this math happen at runtime? Or does the math all optimize away, leaving only the panic code in the final binary?
Oh, the simple little things can be so confusing.

You are confusing operational semantics with optimizations.