Vector len() - 1 comparison for usize

Not sure what to expect at runtime from a self.value.len() - 1 condition when len is 0, so the result would be -1.
Both sides are usize, and there is no warning at compilation even though usize does not allow negative values.

if some_usize < self.value.len() - 1 { // if len is 0?
    // ..
}

You'll get an underflow in production and a panic in dev.
Use saturating_sub() instead.
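
For example, applied to the snippet above (same hypothetical names):

if some_usize < self.value.len().saturating_sub(1) { // len() == 0 saturates to 0 instead of underflowing
    // ..
}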

3 Likes

Thanks for the advice.
Just curious: why doesn't the compiler notice it, or rust-analyzer, or at least clippy? Where could this wrong condition be useful, such that developers aren't warned by default?

I think that this is a design decision made by the Rust devs.
Using raw arithmetic operators is kinda like using unsafe arithmetic with the potential to over- or underflow.
If you want checked arithmetic, you can use the checked_*() methods of the primitive types (or, like in this case, the saturating_*() ones or wrapping_*() ones if you need it).
As to why Rust defaults to the "unsafe" way using the arithmetic operators, I can only speculate.
I would think that this is historical, due to backwards compatibility with early versions, and maybe also due to ergonomics and performance reasons.
But you'd need to ask a compiler dev here.
Imho Rust should default to using checked_*() as per its safety philosophy. But it does not.
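
For illustration, a minimal sketch of the three families (the value is just an example):

let len: usize = 0;

let c = len.checked_sub(1);    // None: the underflow is detected and you handle the Option
let s = len.saturating_sub(1); // 0: clamps at the minimum of the type
let w = len.wrapping_sub(1);   // usize::MAX: wraps around, explicitly opted into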

1 Like

You have the clippy::arithmetic_side_effects lint, which will warn you about uses of operators that may overflow or panic.
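
It's an allow-by-default restriction lint, so you need to opt in, e.g. at the crate root:

#![warn(clippy::arithmetic_side_effects)]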

3 Likes

Thanks, there's a lot to learn yet;
I expected that everything in Rust would be super strict by default :slight_smile:

It tanks performance and introduces stack unwinding in trivial code. The real flaw is that len() should return isize instead of usize, so underflow would never be a real issue.

2 Likes

I disagree. A negative length does not make sense.
And why would it cause stack unwinding?
You should handle the returned Option, not unwrap() it.

2 Likes

Sigh. I somehow guessed this response would come up. From a larger angle of view, one should always use signed types for numbers that represent an actual numeric value, and use unsigned types if you are fiddling with the bits. And len() obviously belongs to the former category.

The fact that a length of -1 does not make sense is irrelevant. The range of reasonable return values is not the biggest factor when deciding a return type. Would you argue that i8::leading_ones should return u8 instead of u32? I think not.
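
(To illustrate: i8::leading_ones can only ever return 0..=8, yet std gives it a u32 return type.)

let n: u32 = (-1i8).leading_ones(); // all 8 bits are ones
assert_eq!(n, 8);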

And don't even get me started on how usize represents double the range. Not only is it just 1/31 (~3%) more on a logarithmic scale (which is what actually matters) on 32-bit, but you almost never use that part anyway.

You can't seriously be suggesting this. It just means the ordinary arithmetic operators are useless. Rust is a tool for actual industrial usage, not "research only by design".

2 Likes

Yes, I am suggesting that this would be useful to me.
No, I am not suggesting that Rust change this now or in the future.

On the contrary. An over- or underflow is rarely a desired outcome of an arithmetic operation. Hence having to deal with an Option instead of a silent over- or underflow is imho more stable and explicit.

PS: In my actual use cases, for this very reason, I always end up using the checked_*, saturating_* or wrapping_* methods anyway, so yes, to me the current arithmetic operators are quite useless.

And I use it exactly for this: In industry.

Potentially overflowing arithmetic is not like unsafe. It's not UB, the results are defined and specified -- panic or wrapping based on the overflow-checks flag. (It's also not hard-coded based on the profile, just defaulted based on the profile. So if you want to panic on --release, you can configure things that way.)
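
If you want that, it's a one-line profile setting (sketch of a Cargo.toml snippet):

# in Cargo.toml
[profile.release]
overflow-checks = true   # keep panicking on overflow in release builds too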

Still a source of bugs though, yes.

RFC 560 introduced panicking overflow and is probably a good starting point if anyone wants to dig into the history and early discussions.

5 Likes

I think it would be best for len() to be generic and return any type you want: usize, isize, u8, u32, etc. The best type depends on the use case. For example, if I write a chess game, I want to index the chessboard using u8, not usize. The fact that len() and friends return only usize causes me to change almost all the integers in my programs to usize to avoid casting.
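
For what it's worth, a sketch of what that looks like today with an explicit conversion (assuming edition 2021, where TryFrom is in the prelude; the names are made up):

let board = vec![0u8; 64];              // e.g. a chessboard
let n: u8 = u8::try_from(board.len())
    .expect("length fits in u8");       // only fails if the length doesn't fit in u8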

2 Likes

There is no reason to sigh.

It makes perfect sense that the return type of Vec::len() is the same type that can be used for indexing, and that it is unsigned, because it makes no sense for it to be negative.

This is exactly what type safety is about: if some of your variables are unsigned (because negative values don't make sense) and some are signed (because negative values are meaningful), this helps prevent confusing them and gives hints to readers of the code what they are used for.

It also allows some lints: if you write my_vec.len() < 0, clippy will warn you.

Yes, unsigned types can more easily lead to unintended underflow, but that can be prevented by paying attention to the one operation that can cause underflow: subtraction.

In your example, simply replace

if some_usize < self.value.len() - 1 { // if len is 0?
    // ..
}

by

if some_usize + 1 < self.value.len() { // if len is 0? No problem!
    // ..
}

and you're good!

6 Likes

From a "larger angle of view", one should use bigint ratios for everything because what if somebody down the line multiplies it by 4 quadrillion? And what if I want to divide it and get a perfect result?

...except, that's on the user down the line to deal with. Because we're not responsible for crippling everyone now so that somebody later might have a somewhat easier time. To me, your "larger angle of view" is actually quite a narrow, though far-sighted one.

1 Like

We have different interpretations of type safety then. I actively use i32 even when I know the value logically can't be negative. And you can go debate with Google, I guess.

And no, not every subtraction is easy to spot and/or easy to refactor into an addition.

That is a strawman argument. I never said anything about bigints. Multiplying by 1e64 is an absurd scenario; subtracting one from a length is a common operation. If you are deliberately equating these two operations, I have to doubt your good faith.

And how is returning isize crippling anyone? Because you can't index a slice with it? Yeah, that's just another problem with Rust I would say.

Edit: If by "crippling everyone" you mean changing the API from usize to isize now, then yes, it would be. The API is obviously frozen. But I never said the API should change now; I said it should have been isize. It's a mistake that can't be undone.

C++ has a very different type safety culture; C has used int for just about everything for years now.

Okay, how about multiplying by 2? Also quite common, but for any bounded integer type it's guaranteed to overflow on half of the values in its range. Those odds get worse and worse the larger the multiplier.
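
(Concretely: 128u8.checked_mul(2) is already None; every u8 from 128 up overflows when doubled.)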

And about division, that too is a common operation, and yet you're not advocating for f64.

No, I mean you're forcing downstream users to try and figure out what the invariants are, since they're not encoded in your types. And you're forcing them to waste time and space efficiency in the process.

The problem of making sure your types are more generally useful is best solved by giving users the tools to generalize them, not by doing so for them.

2 Likes

I think most of the argument applies to Rust as well. (Maybe except the part about UB.)

That's just another bad argument. I literally preemptively refuted this. Half the range does not mean half the occurrence rate.

Because integer division gives a meaningful algebraic result. But 0usize - 1 absolutely doesn't.

If pattern types are accepted and implemented, I might agree with encoding the numeric range in the return type. But in current Rust, the benefit is marginal.

  1. Common sense exists.
  2. If you see len() -> isize and get confused, that's mostly because the current official implementation uses usize, so you may wonder what the difference is. But if the official one had used isize to begin with, I don't think it would be confusing.
  3. Doc comments exist. Writing out /// Return value is not negative is easy. Reading it is even easier.

The link points to a discussion of integers in C++, which recommends using signed integers in general. I don't know how integers behave in C++, but the linked text seems to imply that signed integers are checked for overflow while unsigned integers are not. I don't think that applies to Rust.

At least in my code, when using usize for sizes and indices of collections, overflow is unlikely, since on the platforms I target, usize is u64, and there would be other problems long before that overflows.

And I am genuinely curious to see an example where subtraction cannot be easily spotted or refactored.

2 Likes

That's... uh... 32-bit? Speaking of hardly ever used...

In any case, this argument is quite unsatisfactory. There are plenty of situations where you do use that upper region, even more if you include intermediate results.

n / 0 has a meaningful algebraic result? And yes, that's a very particular case. So is subtraction from 0.

In any case, there's still the obvious pitfalls that could occur if you forget to factor in inaccuracy, just like subtraction has this pitfall.

  • Common sense has proven a remarkably bad fallback in situations like these; after all, common sense dictates that 0 - 1 on an unsigned type is bad.
  • Sure, just like the C community understands that everything is int. That doesn't make it good.
  • Rust is designed to make signatures as informative as possible. If all it takes is docs, then why are we using Rust? We should all be using Python without type annotations! Except even that is frowned upon nowadays.

1 Like