When can usize or u64 overflow?

You can keep the checks, but add a comment indicating that they could be revisited if a performance-related reason to remove them ever comes up.

As mentioned above, the documentation of Arc is also relevant here. I haven't checked, but if it doesn't document the overflow behavior, I can imagine reasonable alternatives; for example, the Arc could go into some kind of "at least one handle must have been leaked, so I'll never be freed, but you can keep using me" special state. Or maybe that wouldn't be a viable option, since Arc has an API that exposes the weak and strong counts, and such new behavior could break code working with those... who knows?
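For context, those counts are already observable through Arc's stable API, so any new "sticky" state would be user-visible; a minimal illustration:

use std::sync::Arc;

fn main() {
    let a = Arc::new(42);
    let b = Arc::clone(&a);
    let w = Arc::downgrade(&a);

    // Both counts are part of the stable public API, so existing code
    // may already depend on their exact values.
    assert_eq!(Arc::strong_count(&a), 2);
    assert_eq!(Arc::weak_count(&a), 1);

    drop(b);
    assert_eq!(Arc::strong_count(&a), 1);
    drop(w);
}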

In case you would later go ahead and remove the checks, you might also need to add comments to more places in your code that must no longer be touched without reviewing the safety consideration that the overflow needs to stay impossible. I.e. it's not only documentation overhead at the place where the implementation occurs, but also in other places, plus some maintenance overhead because it might need to be reviewed when touching the code again. So unless there is an actual performance benefit, be it minor but at least measurable, it might be a bad tradeoff.


Interesting point.

Yeah, a sticky reference count is a perfectly reasonable choice, and works especially nicely with smaller-width count variables.

For example, it can be perfectly reasonable to say "look, if you hit 65535 simultaneous owners of that, we'll just never release it". Because, TBH, how many of those does a realistic program have that aren't just kept around forever anyway?

The one difficulty (and IIUC the reason Arc does the nonatomic abort) with sticky reference counts is that there isn't any atomic way to implement the desired semantics. If you want to do something clever, you need a fetch_update style CAS loop, which will hurt perf of refcount inc/dec, which is generally considered to be hot and worth optimizing (e.g. using Relaxed for some operations instead of always using AcqRel).
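For illustration, a sketch of what a sticky increment could look like with fetch_update (which compiles down to a compare-exchange loop); the extra CAS on every clone is exactly the cost being discussed. The helper name is made up:

use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};

// Increment a reference count, but leave it pinned at usize::MAX once it
// ever reaches that value ("sticky"/saturating semantics).
// Returns the previous value.
fn sticky_increment(count: &AtomicUsize) -> usize {
    count
        .fetch_update(Relaxed, Relaxed, |c| {
            if c == usize::MAX {
                None // already saturated: leave the count untouched
            } else {
                Some(c + 1)
            }
        })
        .unwrap_or(usize::MAX)
}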

For comparison's sake, thread::scope also has a similar theoretical soundness hole given usize::MAX (or more) spawned threads, but takes a different approach.

// Arc
let prev = count.fetch_add(1, Relaxed);
if prev > isize::MAX as usize {
    intrinsics::abort();
}

// thread::scope
let prev = count.fetch_add(1, Relaxed);
if prev > isize::MAX as usize {
    // IIUC could be Relaxed, as this doesn't synchronize
    // access to anything, but uses a shared codepath
    count.fetch_sub(1, Release);
    panic!();
}

You actually can't overflow the count here with a single thread::scope, as the isize::MAX threads you can successfully create can each only begin to create one more, leaving the count at most usize::MAX - 1 when the main thread checks it. However, by nesting a fresh thread::scope, it's possible to create the couple of additional threads necessary to partially create the threads in the outer scope and get the count to wrap.


What would an intrinsic look like to address this use case? Increment is "just" an atomic saturating add(1), but decrement is significantly more complicated: a sub(1) unless the existing value is !0.

I am not a CPU engineer. However, I will guess that IF we were to get hardware instructions for this, it would probably be along the lines of "compare not and op", i.e. CAS except 1) it succeeds if the previous value is not the provided prev value and 2) instead of providing a next value, you select one of the noncomparing simple atomic operations. It still produces the old value as an output.

This seems unlikely, though.
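Lacking such an instruction, the decrement side can today only be emulated with the same kind of CAS loop; a sketch (a made-up helper, not anything Arc actually does):

use std::sync::atomic::{AtomicUsize, Ordering::{Relaxed, Release}};

// Decrement the count unless it is saturated at usize::MAX, in which case
// the object is considered leaked and the count stays put forever.
// Returns the previous value. Callers must hold a reference, so the count
// is at least 1 and the subtraction cannot underflow.
fn sticky_decrement(count: &AtomicUsize) -> usize {
    count
        .fetch_update(Release, Relaxed, |c| {
            if c == usize::MAX {
                None // saturated: never decrement, never free
            } else {
                Some(c - 1)
            }
        })
        .unwrap_or(usize::MAX)
}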

I just noticed that a u64 would not overflow in practice (assuming each increment is an actual operation that can't be optimized away, and the value is never increased otherwise or by more than 1 at a time). So how about something like this:

fn count_up(counter: &mut usize) {
    if std::mem::size_of::<usize>() < 8 && *counter == usize::MAX {
        panic!("counter overflow");
    }
    *counter += 1;
}

On 64-bit platforms, the std::mem::size_of::<usize>() < 8 should short-circuit at compile time, resulting in the check only being done on 32-bit or 16-bit platforms.
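Another way to spell the same platform condition (equivalent in effect for today's 16/32/64-bit targets) is cfg!, which expands to a compile-time boolean constant:

fn count_up(counter: &mut usize) {
    // cfg! expands to a literal true/false, so on 64-bit targets the whole
    // condition is a constant `false && …` and the check is compiled away.
    if cfg!(not(target_pointer_width = "64")) && *counter == usize::MAX {
        panic!("counter overflow");
    }
    *counter += 1;
}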

In full honesty: just eat the check (especially if it's on &mut and not &Atomic). The cost of an untaken branch is nearly zero.

The Rust stdlib used to elide the check preventing allocations from exceeding isize::MAX bytes on 64-bit targets, under the assumption that such an allocation would always fail.

First off, this actually meant that you could cause UB from safe code, with the only prevention being the actual call to the allocator failing... but under optimization, such an "impossible" allocation could succeed (e.g. if it was elided), allowing a too-large offset to occur, causing UB.

More relevant to this discussion, though, is that I changed that — now Layout manipulation always (without unsafe) maintains that the (rounded up) size is ≤ isize::MAX. This resulted in an average compile-time regression of 0.5% (median 0.3%) across the primary perf suite... in a file where a functionally noöp change can cause an average 1% regression (with a primary range of [-2% .. +5%]).

Your code is less hot than that. Including the check won't hurt. (And if it does show up in profiling somehow, you can always adjust it later.)
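As an aside, that Layout guarantee is observable from stable code today; a small illustration:

use std::alloc::Layout;

fn main() {
    // A size that exceeds isize::MAX (after rounding up to the alignment)
    // is rejected instead of producing an oversized layout…
    assert!(Layout::from_size_align(isize::MAX as usize + 1, 1).is_err());
    // …and so is an array layout whose total size would exceed it.
    assert!(Layout::array::<u16>(isize::MAX as usize).is_err());
}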


I totally agree I should keep the check in my code, particularly because this code doesn't run very often.

I should have made clear that my question was more out of curiosity than about a real problem here. However, I often stumble upon situations like, ":thinking: wait, don't I need an overflow check here?" And then I end up writing .checked_add, .checked_sub, .checked_mul, etc. everywhere. And then I wonder why Rust doesn't do these checks by default on all operations. :man_shrugging:

Maybe the point is that many integer overflows do not cause UB. In my example above (linked on top of this post), the overflow would either panic (in debug mode) or wrap around (in release mode). In my case, there is no unsafe code which relies on the integer being calculated correctly. However, the consequence of an overflow would be that a deadlock can happen. But even in less severe cases (e.g. where simply a wrong result is presented to the user but program flow isn't going wrong), I still think that programmers will want to avoid erroneous calculations.

This leaves me with the impression that +, -, *, / should be used much more rarely than .checked_add, .checked_sub, .checked_mul, .checked_div, .checked_pow, .checked_….
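For a concrete picture of what that ends up looking like, a sketch with a made-up checked_mul_add helper that chains the checked variants through Option:

// Compute a * b + c, propagating overflow as None instead of panicking
// (debug) or wrapping (release).
fn checked_mul_add(a: u64, b: u64, c: u64) -> Option<u64> {
    a.checked_mul(b)?.checked_add(c)
}

fn main() {
    assert_eq!(checked_mul_add(3, 4, 5), Some(17));
    assert_eq!(checked_mul_add(u64::MAX, 2, 0), None);
}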

In which situations can you really be sure that an addition or multiplication doesn't overflow? I feel like these situations are pretty rare.

I think you could look at this as part of the more general question: how can you really be sure that any of your code is actually correct?

I do think that enabling -C overflow-checks=yes is the right choice even in release mode for many, many programs -- after all, one can always write wrapping_* in places where the profiler says that the checks are actually a problem.

But if you don't have any unsafe code, then overflows are "just" logic errors, not UB, so in some sense they're not any worse than accidentally doing a + 1 where it's supposed to be a - 1. And the advantage of wrapping_add as the release-mode default is that x - a + a at least gets you back to x even if it wrapped internally (unlike saturating, for example).
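A tiny illustration of that round-trip property (the concrete numbers are just an example):

fn main() {
    let x: u8 = 10;
    let a: u8 = 200;

    // With wrapping arithmetic the intermediate result wraps around…
    let wrapped = x.wrapping_sub(a);
    // …but adding `a` back recovers the original value exactly.
    assert_eq!(wrapped.wrapping_add(a), x);

    // Saturating arithmetic does not round-trip: 10 - 200 saturates to 0,
    // and 0 + 200 gives 200, not the original 10.
    assert_eq!(x.saturating_sub(a).saturating_add(a), 200);
}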


There is an alternative, namely using arbitrary-precision integers rather than i64 etc. The Zarith library in OCaml has an interesting approach to this where sufficiently small integers have an unboxed representation; otherwise gmp is used.
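For a Rust flavor of the same idea, a sketch assuming the num-bigint crate as a dependency (any arbitrary-precision crate works similarly):

// Assumes num-bigint in Cargo.toml.
use num_bigint::BigUint;

// 100! has 158 decimal digits, far beyond u64, but a bignum simply cannot overflow.
fn factorial(n: u32) -> BigUint {
    (1..=n).fold(BigUint::from(1u32), |acc, i| acc * BigUint::from(i))
}

fn main() {
    println!("{}", factorial(100));
}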

IMO these functions are more for the case where an overflow may legitimately happen and you want to handle it, rather than for debugging.

Normally what you should be doing is having an idea of what the ranges of your values are and why they fit in your data type. If you want to make sure again at runtime that what you think is true is really true, for debugging purposes, you should add assertions:

assert!(x <= 1000000);
assert!(y <= 1000000);
x + y

Hmmm, yeah I see. But for formulas that are a bit more complex than a simple addition or multiplication, it's pretty hard to track the boundaries of all intermediate results. And if I add assert!(x <= Y) in a lot of places, I could also just use the .checked_ operations. But yeah, an assert! at the beginning of a longer computation would be more efficient than turning on overflow/underflow checks for every operation within it.

In that case the risk of overflow is higher, but checked operations don't protect you! If you ship your software with a bunch of checked operations, and your checked_add overflows for a user, it's a bug! When your computation is complicated, that makes it even more important that you carefully think about / prove why everything fits in i32 or whatever.

What protects you to some extent is testing -- and in debug mode by default you have overflow checks enabled.

Adding assertions makes sense because your assertions can (and often should) be stricter than just checking arithmetic doesn't overflow.
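For instance (with made-up numbers and a made-up function), a domain bound that is much tighter than "the arithmetic doesn't overflow":

// Average of temperature readings in tenths of a degree Celsius.
fn average_tenths(readings: &[i64]) -> i64 {
    assert!(!readings.is_empty());
    for &r in readings {
        // Stricter than an overflow check: this rejects readings that are
        // numerically fine but physically nonsensical.
        assert!((-1000..=1000).contains(&r), "implausible reading: {r}");
    }
    readings.iter().sum::<i64>() / readings.len() as i64
}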

I am afraid that keeping track of the ranges of possible values is an important part of programming with fixed-size data types -- if you don't want to deal with that, use bignums, as suggested by @user16251.

ibig also does that.
