Is there a way to allow indexing `Vec` by `i32` in my program?

And what about using some unsigned integer like u8, u16, or u32 for indexing?
Would an RFC along these lines be reasonable?
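
For reference, here is a rough sketch of what already works today with unsigned index types, without any change to the language. The helper names `get_by_u8` and `get_by_u32` are made up for illustration; only `usize::from`, `usize::try_from`, and `slice::get` come from the standard library.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: indexes a slice with a `u8`.
/// Widening `u8` to `usize` can never fail, so `From` is available.
fn get_by_u8<T>(items: &[T], index: u8) -> Option<&T> {
    items.get(usize::from(index))
}

/// Hypothetical helper: indexes a slice with a `u32`.
/// std provides no `From<u32> for usize`, because `usize` may be only
/// 16 bits wide on some targets, so the conversion is fallible.
fn get_by_u32<T>(items: &[T], index: u32) -> Option<&T> {
    let index = usize::try_from(index).ok()?;
    items.get(index)
}
```

Both helpers return `None` instead of panicking when the index is past the end, which is a design choice of this sketch rather than anything the thread settled on.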


Seems like a more reasonable idea indeed! :slight_smile:

There is at least one, which is that right now using something for indexing tells inference "BTW, this is a usize†", and it's likely that allowing more types there would break some existing programs. That's allowed breakage, but not something that people really want to do, so I think it'll end up blocked on some kind of "this is usize unless you're really sure it's not" feature, but that's hard.

† Well, or a Range<usize> or whatever, but the point still stands.
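
To make the inference point concrete, here is a small sketch. The function names are invented for this example. The index variable has no suffix and no annotation, so the only thing that fixes its type is its use as an index.

```rust
use std::any::TypeId;

/// Reports whether the argument's type is exactly `usize`.
/// Being generic, it places no constraint of its own on inference.
fn is_usize<T: 'static>(_: &T) -> bool {
    TypeId::of::<T>() == TypeId::of::<usize>()
}

/// Indexes with an unsuffixed integer variable and reports
/// the element read plus whether the index was inferred as `usize`.
fn index_with_unsuffixed_literal() -> (i32, bool) {
    let v = vec![10, 20, 30];
    let i = 1; // no suffix and no annotation
    let x = v[i]; // the indexing expression pins down the type of `i`
    (x, is_usize(&i))
}
```

If `Vec` also accepted, say, `i32` indices, code shaped like this would have more than one candidate integer type for `i`, which is the breakage risk described above.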


It will not produce any garbage; it will panic, which indicates a logic bug, the same way out-of-bounds indexing does. And casting signed integers to usize is more likely to produce garbage results than a panic.

Please don't speak for everyone so categorically. While I agree that using signed integers for indexing is usually an indication of bad design, in some cases it can be convenient to keep a cursor as a signed integer, e.g. if you often perform offset operations on it.

I'm afraid this will become OT quite soon.

You didn't read my sentence. I said, let's assume a function. I am not saying let's use the index function of the vec. You get me? Good.

Only if your vec is filled with over 10e18 elements, which is not the case 99.9999% of the time, so almost never.

And again: it was meant as a reaction to the fictional function which produces garbage with negative values, not the index function. Please read my sentence more carefully. Sorry if I write overcomplicated sentences or garbage; English is not my native language.

But this has nothing to do with indexing? Offsets are a completely different topic, and yes, signed values are the right thing to use there.

Isn't that exactly why indexing should allow signed integers to be used? Supposing that you have to start out with an i32, for some reason, casting gives you a bad error message, while a native impl Index<i32> could give you a better one.

What's worse is that in rare cases, casting to usize before indexing could silently produce incorrect behavior. Usually this isn't possible, because a negative signed integer cast to usize will always be > isize::MAX (even if the original integer type was smaller), and heap allocation sizes are limited at isize::MAX (…I don't remember where this is specified, though). But there might be ways to obtain an &[u8] slice that large, perhaps using a wrapper for OS mmap (there are definitely some 32-bit platforms that let you mmap a 2GB chunk of memory). In that case, you might be able to successfully index with a negative i32 after casting it to usize, even though a negative index would still be semantically nonsensical, and likely not what you wanted. A native impl Index<i32> could explicitly panic on negative indices rather than relying on the usize equivalent being out of range.
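
As a rough illustration of what such an impl could look like in user code: the orphan rule stops you from implementing `Index<i32>` directly on `Vec<T>` outside the standard library, so this sketch uses a made-up wrapper type, `SignedIndexVec`. The panic messages are also just placeholders.

```rust
use std::convert::TryFrom;
use std::ops::Index;

/// Hypothetical newtype around `Vec<T>` that accepts `i32` indices.
struct SignedIndexVec<T>(Vec<T>);

impl<T> Index<i32> for SignedIndexVec<T> {
    type Output = T;

    fn index(&self, index: i32) -> &T {
        // Reject negative values explicitly instead of relying on the
        // wrapped-around `usize` happening to be out of range.
        if index < 0 {
            panic!("index is negative: {}", index);
        }
        let index = usize::try_from(index).expect("index does not fit in usize");
        // The usual bounds check still applies to the converted index.
        &self.0[index]
    }
}
```

With this, `SignedIndexVec(vec![10, 20, 30])[1i32]` reads the second element, and a negative index panics with a message that names the real problem.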


It's all about runtime vs compile-time errors. When you try to index with a signed type, you get an error at compile time and have to think about what to do next.
The other way, it may (or may not) panic at runtime. You decide which is better :slight_smile:

We are discussing indexing and not some fictional functions, so I don't quite get why you have even mentioned it.

Or if a logic error produces a big negative number (e.g. due to some bit-twiddling), or if we use mmap as described by @comex. Either way, having a 100% consistent panic is much better than a 99% chance that it will "almost never" produce a garbage result.

And you still haven't answered the initial question: how is trying to use a negative index fundamentally different from out-of-bounds indexing?

I was talking about having a cursor with a signed integer type, so you'll be able to write:

fn apply_offset(&mut self, offset: isize) {
    self.cursor += offset;
}

fn get_item(&self) -> &Self::Item {
    &self.buffer[self.cursor]
}

Without any boilerplate `as` conversions.
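
For comparison, here is roughly what the same cursor needs in current Rust. This is only a sketch: the `Cursor` struct and its generic parameter `T` are assumptions standing in for whatever type the snippet above belongs to (it uses `Self::Item` there).

```rust
use std::convert::TryFrom;

/// Hypothetical cursor type; field names mirror the snippet above.
struct Cursor<T> {
    buffer: Vec<T>,
    cursor: isize,
}

impl<T> Cursor<T> {
    fn apply_offset(&mut self, offset: isize) {
        self.cursor += offset;
    }

    fn get_item(&self) -> &T {
        // This is the conversion step that signed indexing would remove.
        // `try_from` fails on negative values instead of wrapping around.
        let index = usize::try_from(self.cursor).expect("cursor is negative");
        &self.buffer[index]
    }
}
```

Writing `self.cursor as usize` instead would be shorter, but a negative cursor would then surface as an out-of-bounds index rather than as a negative one.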

Sorry. What does UB mean?

Undefined behavior


It doesn't make any sense. That's the difference. Please tell me, what would you expect when you try to index a vector/array with a negative value?

Indexing with negative values could also be confused with the Python syntax, where this means indexing from the end of the vector (list, in Python terminology).


As I've already written several times explicitly, I expect the expression to panic, i.e. exactly the same behavior as when we try to get the 100th value from a vector of length 10. Although an alternative to panicking could indeed be to copy the Python behavior: index from the end of the slice/vector, and panic if the resulting position is out of bounds.
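
A minimal sketch of the Python-style alternative, written as a free function over slices. The name `get_from_either_end` is made up, and it returns `Option` rather than panicking, purely to keep the example easy to check.

```rust
/// Hypothetical helper: non-negative indices count from the start,
/// negative ones from the end, as with Python lists.
fn get_from_either_end<T>(items: &[T], index: isize) -> Option<&T> {
    let resolved = if index >= 0 {
        index as usize // a non-negative isize always fits in usize
    } else {
        // `unsigned_abs` avoids overflow for `isize::MIN`;
        // `checked_sub` yields `None` when the index reaches past the start.
        items.len().checked_sub(index.unsigned_abs())?
    };
    items.get(resolved)
}
```

For a three-element slice, `-1` resolves to the last element and `-3` to the first, while `-4` and `3` are both out of range.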

Okay, and what do you think is better now: panicking at runtime, or failing to compile and getting the error at compile time?

I think it's better to allow vec[signed_int] and panic on negative values than to force or recommend that users write vec[signed_int as usize]. You will get zero compile-time errors for the latter and will have a non-zero chance of subtle errors.
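
To show what the cast actually does, here is a small sketch with invented function names. An `as` cast from a signed integer to usize sign-extends and reinterprets the bits; it never fails at compile time or at runtime.

```rust
/// What `index as usize` produces for a signed input.
fn cast_index(index: i32) -> usize {
    index as usize
}

/// Hypothetical lookup that casts first and bounds-checks afterwards.
/// For ordinary slices a negative input ends up far out of range, so the
/// negative index is only caught indirectly by the bounds check.
fn lookup_after_cast(items: &[u8], index: i32) -> Option<&u8> {
    items.get(index as usize)
}
```

Here `cast_index(-1)` is `usize::MAX`, and every negative `i32` maps to a value above `isize::MAX`, which is why the cast usually ends in an out-of-bounds failure rather than a wrong element.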


Okay. That is your opinion.
I think the problem in your thinking is that you really want to use a signed integer, and I don't get why.
When the value can never be negative, don't use a signed integer. You gain nothing from it. Nothing.

Do you think that compile-time errors are a bad thing? They are here to help you, to indicate that there might be a problem with your code and that you could do it better.

I don't know who you are or what background you have, and I'd really like to discuss this further with you, but I'm quite desperate and don't know how to explain this any better, so I'll stop here. If you are really convinced that this should be possible, then start an RFC or whatever. GLHF.


Let's recap all of this.

Indexing a vec with a type other than usize is not possible. Why? Because some people decided that only non-negative values make sense, and usize is a good candidate for the size of a vector.

Do you want to use a signed value nevertheless? Cast it, but make sure that you check the bounds (e.g. check if the value is less than 0).
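
One way to do that check, as a sketch: let a fallible conversion reject negative values and let `slice::get` handle the upper bound. The helper name `get_signed` is hypothetical.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: converts a signed index with an explicit check
/// and returns `None` for negative or out-of-range values.
fn get_signed<T>(items: &[T], index: i32) -> Option<&T> {
    // `try_from` fails for negative values instead of wrapping around.
    let index = usize::try_from(index).ok()?;
    // `get` performs the upper-bound check without panicking.
    items.get(index)
}
```

The caller then decides what a `None` means, instead of the cast deciding silently.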

Do you want further material? See https://github.com/rust-lang/rust/issues/29010

In the end I want to say: respect the types and their limits. Always make sure to double-check before a cast.


Agreed, we are not getting anywhere. So let's finish this discussion.

Again, as was shown several times, you get a significantly reduced amount of `as` casting boilerplate in some cases (arguably quite rare ones, and yes, it's debatable whether we should consider changes for them). So it does not matter how many times you say "nothing"; it will not change the fact.

And again, vec[signed_int as usize] will not produce any compile-time errors, and the recommendation to use `as usize` is quite common. Yes, some users will consider changing their code, but a significant number will just slap on `as usize` without thinking much and will carry on with a strengthened feeling that Rust is needlessly verbose.

This link was already posted by me...

What confuses me about usize is that it feels like a type designed specifically for pointer arithmetic use cases. So for newcomers (like me) it feels more natural to start with a generic i32/u32.

It's just the maximum possible memory location - not pointer arithmetic specifically.

Hypothetically, if you represented all of your RAM as a &[u8], then you would absolutely need (at least) usize to index it. Having usize stops your code from having to think about the address size of your target machine.
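
A tiny sketch of that relationship, with a made-up function name: on any given target, usize is as wide as a (thin) pointer, so it can count every byte the address space can hold.

```rust
use std::mem::size_of;

/// Reports whether `usize` is exactly as wide as a thin pointer
/// on the compilation target.
fn usize_matches_pointer_width() -> bool {
    size_of::<usize>() == size_of::<*const u8>()
}
```

A fixed-width type such as u32 gives no such guarantee: it is too narrow to address all memory on a 64-bit target and wider than necessary on a 16-bit one.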
