Is there a clean way to write overflowing adds?

Maybe it's only a matter of terminology that I'm not familiar with, but I still don't get it. If I work with unsigned numbers, then -x (where - is the unary minus!) is not defined. Hmmmm, unless I use two's complement. Hm!! But let's try this:

fn main() {
    let x: u8 = 13;
    println!("{}", -x);
}

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
error[E0600]: cannot apply unary operator `-` to type `u8`
 --> src/main.rs:3:20
  |
3 |     println!("{}", -x);
  |                    ^^ cannot apply unary operator `-`
  |
  = note: unsigned values cannot be negated

For more information about this error, try `rustc --explain E0600`.
error: could not compile `playground` due to previous error

P.S.: I should have tried in release mode. But same result: Playground.

No.

Yes. What you are thinking about is Unspecified Behavior. Undefined Behavior means that any other part of the program can do absolutely whatever it wants. Integer overflow can erase your hard disk and still claim to be conforming.

It's §6.5.5 in C11:

2 Likes

Also note (in release mode):

#[allow(arithmetic_overflow)]

fn main() {
    let x: u8 = 13;
    let y: u8 = 15;
    println!("{}", x-y); // works
    println!("{}", x+(-y)); // doesn't work!
}

(Playground in release mode!)

Do I need two's complement to compute x-y? I don't think so. (But maybe that's a matter of terminology, and I'm wrong.)

:scream:

I prefer two's complement overflow to having my hard disk erased!! But seriously, with practical C compilers that exist, is it possible that anything else than the result is unspecified? And how about future C compilers. Is it likely that something like changing a value at another memory address could happen? Or an unexpected jump?

It's a bitwise operation :man_shrugging:. It generalizes to any base for that matter, e.g. 25 is congruent to -75 mod 100; the 10's complement of 25 is 99 - 25 + 1 which equals 75.

If you don't like subtracting but don't mind adding, you might find it useful in other contexts.

Subtraction of two unsigned 5-digit decimal numbers, converted
to addition by utilizing 10's complement

  20521       20521
- 17638 ==> + 82362
            -------
              02883 (mod 100000)

AFAIK, the bitwise operation that is called "two's complement" is what the unary minus does in Rust. And that isn't working on u8, for example. A "wrapping substraction" of two unsigned numbers isn't a two's complement (according to my knowledge, but I'm not too familiar with these terms).

You can fix this line by manually calculating the complement of y instead of using unary minus:

println!("{}", x + (!y + 1));
2 Likes

But then it's me doing the two's complement operation, and not Rust.

They're same. Only difference is that in C there's no safe section.

See how the very function you wrote compiles with the clang -O3 mode.

It optimizes out the entire if statement into unconditional return false;. Since signed integer overflow is UB, compiler assumes it never happens. Since it never happens the overflowing_add() function unconditionally returns false.

5 Likes

I mean, beyond bitwise, it's mathematical. Rust not supporting unary - on unsigned integers doesn't really have anything to do with it. I'll give it one more shot before I have the leave the computer, here's a decimal subtraction that underflows by "carrying forever":

  ...000173
- ...000425
-----------
  ...999748

If we take the (infinitely large) 10's complement of the result, we get ...000252. And sure enough, 173 - 425 == -252. If we concern ourselves only with the lowest 6 digits with an unsigned interpretation, the answer is 999748. But that's congruent to -252 (mod 1,000,000). You could consider any number with a 9 or maybe a 5..=9 in the most significant digit to be negative; under those systems, the interpretation of 999748 is -252. But it's just an interpretation; the math works either way.

It's an operation that can be defined in terms of the p-adic representation of integers[1], and modular systems like the binary you're used to or a 6-digit decimal system (your odometer?) or whatever are basically a fixed precision p-adic representation -- analogous to a fixed number of fractional digits, but instead you get congruence up to a power of p. Fixed precision to the left of the decimal point, if you will. In such a fixed system, interpretation of the numbers as positive or negative is arbitrary; you're working with congruences.

The math holds up even if Rust hasn't been designed around it. Rust also panics if you try to clamp with a max < min, when we could have defined it as the median of the three inputs. C'est la vie.


  1. Or rationals or extensions of the rationals (i.e. p-adic numbers); if we stick to rationals or integers though, p need not be prime. ↩︎

1 Like

C++ puzzle: what will this print when you type in 50000?

#include <iostream>
#include <cstdint>
using namespace std;

int main() {
    uint16_t a;
    cin >> a;
    uint32_t b = a * a;
    cout << b << "\n";
}

Try it out here:
http://cpp.sh/9ilpc

1 Like

Yes. Practical C compilers optimize aggressively today. I know mostly of LLVM which is the backend for Clang; it actively exploits undefined behavior for performing many optimizations, and you will get bitten in some way or another eventually. Usually not by having your hard drive erased, but by some "obvious" operation yielding counter-intuitive results (such as action at a distance, whereby a bug in one part of the code seemingly causes another, correct part of the code to crash).

TL;DR: don't ever rely on UB.

4 Likes

Anyway, since conversion between signed and unsigned types is only implementation-defined (i.e. theoretically need not be consistent across platforms/compilers but at least not UB), the following is a simple yet correct (as in not UB and works in practice too) solution assuming 2's complement for signed integers:

bool overflowing_add(int x, int y, int *z) {
    unsigned ux = x, uy = y;
    unsigned uz = ux + uy;
    *z = uz;
    if (x >= 0 && y >= 0) {
        return *z < x;
    }
    if (x < 0 && y < 0) {
        return *z > x;
    }
    return false;
}

For extra effect, you can make the return value tri-state and differentiate between over- and underflow using its sign.

Not since C++20, in C++20 it's modular arithmetic for conversions.

well but then that doesn't invalidate my implementation, because it gives a stronger guarantee, doesn't it? (anyway I was trying to write a C and not a C++ answer.)

Yes, I was saying this always works in C++20 onwards (I think you're right that in C it's still implementation defined).

Simpler version:

bool overflowing_add(int a, int b, int *c) {
    *c = unsigned(a) + unsigned(b);
    return (b < 0) != (*c < a);
}
2 Likes

I still don't see why we need any kind of "complement" (e.g. two's complement) to define overflowing behavior for unsigned integers. In case of overflow, we simply truncate. Building a complement isn't needed.

In the decimal system: 000173 - 000425 is …9999999748, and by truncating we get 999748. Of course, we could afterwards interpret it using the two's (edit: or ten's) complement, but that's not what Rust (or the CPU) does when substracting two unsigned numbers (I would assume). The CPU will just drop the last carry.

But maybe the term "two's complement" is defined or used in a more abstract way than I'm thinking. For me, it's either

  • an operation: invert all bits bit-wise and add one (with normal carrying rules)

or

  • an interpretation: interpret an unsigned number as a negative number (with the absolute value that you get when performing the two's complement operation above).

It still irritates me that two's complement is mentioned in the reference book in regard to unsigned number overflow arithmetic. I see how we could interpret an u8 using the two's complement, but we could also interpret it in any other way. It's not related to how the overflowing operation is actually carried out (in my opinion).

But I've neither studied math nor computer science, so my understanding of the term(s) might be wrong. Wikipedia says:

Two's complement is a mathematical operation on binary numbers, and is an example of a radix complement. It is used in computing as a method of signed number representation.

That's not Rust reference, it's the Book. I agree it makes no sense to talk about two's complement in this context.

Ewwwww… Isn't unsigned overflow well-defined?

Ooops, sorry, I missed that.