Why does Rust have [ ] indexing?

From the very beginning it seemed kind of strange to me that you can index any Vec foo as foo[x] and get a panic. How is that better than foo.get(x).unwrap(), which seems more explicit?

It's a shorthand for a common operation. Many times, the index is a value that you have just retrieved from the collection itself, so you know it is valid, and error handling is just noise.

Especially in numerical code where you might have lines like this, the .get().unwrap() equivalent detracts from readability a lot:

x[i][j] *= y[i][j];

While it's important to offer a way to do things that doesn't panic, it's also important to think about what people will realistically do on errors. If something like 99% of the time people will just say "shut up compiler; I checked that already", then it can actually be better for the short-named one to be the panicking one, with a separate try_blah-style method available as an option.
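
The standard library already uses this naming pattern in places. As one example, RefCell has a panicking borrow_mut and a Result-returning try_borrow_mut. The helper function below is a hypothetical name, used only to show the two side by side.

```rust
use std::cell::RefCell;

/// Returns true if a second mutable borrow is refused while the first is live.
fn second_borrow_is_refused(cell: &RefCell<Vec<i32>>) -> bool {
    // Short name: would panic if the cell were already borrowed.
    let _first = cell.borrow_mut();
    // try_ variant: reports the conflict as a Result instead of panicking.
    cell.try_borrow_mut().is_err()
}
```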

Note that this is generally only true when it's easy to look before you leap (http://wiki.c2.com/?LookBeforeYouLeap), like it is with array indexing, where algorithms using indexes generally already ensure they're in-range. For things where checking the precondition is difficult or impossible, it's better to couple leaping with looking (http://wiki.c2.com/?CoupleLeapingWithLooking) and not have the panic-by-default design.

(Obligatory link: https://boats.gitlab.io/blog/post/2017-12-27-things-explicit-is-not/)

I suspect that if somebody had come to me extolling the virtues of this newfangled Rust language and I had then found out that the way one does array indexing is foo.get(x).unwrap(), my response would have been "Are you frikken joking me?" And I would have looked no further.

Simple array indexing has been a part of almost every language almost everybody uses since forever.

I like array indexing like that. I like that my program panics and dies when I step out of bounds. I find my bugs quicker that way.

What is not to like about it?

I guessed the general point, that this is mainly for the sake of cleanliness. But it still feels a little bit opposite to pretty much all the rest of Rust. Wouldn't it be better if it were called something like foo.get_unwrapped(), so that it is more explicitly clear to the user that it may panic?

It's not clear to me how foo[ x ] is not explicit enough.

It's an array index operation. It might be out of bounds. So it would be expected to panic in a memory safe language.

We have the best of both worlds in Rust. If you really want to catch out-of-bounds access and handle it somehow, use .get() and check the result.
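
A minimal sketch of that checked style, assuming a hypothetical helper that falls back to a default value when the index is out of range:

```rust
/// Returns the element at `index`, or `fallback` when the index is out of bounds.
fn element_or(values: &[i32], index: usize, fallback: i32) -> i32 {
    match values.get(index) {
        // `get` returns Some(&element) for a valid index...
        Some(&value) => value,
        // ...and None otherwise, so no panic is possible here.
        None => fallback,
    }
}
```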

I'm honestly not sure what the best answer would be either. I'm mostly thinking of complete newcomers to Rust who might stick with [ x ] without ever using .get(x).
Anyway, I get the point. But I think at least some warning would be nice. Clippy warns about things like unwrap_or(foo()) and suggests unwrap_or_else(|| foo()). I think it's justified for this to show up as a warning at least.

Hmm...

But the implication of such a clippy warning is that one should replace every foo[ x ] in one's code with foo.get(x).unwrap(). And perhaps check the result.

Which I don't think is necessary.

It would be suggesting to make the code harder to read.

You can always suppress warnings. I think it'd be nicer for that to be on by default, and if someone is sure they want to use [ x ] they can always turn it off.
I'm just saying clippy warns about far more trivial things than this.

By using list[ix] you are explicitly saying "grab the ix'th item and blow up if it doesn't exist". It's well known that indexing into something may fail with an index-out-of-bounds, so the unwrap() doesn't actually provide any more "explicitness" while also adding a non-trivial amount of noise to the code.

If list.get(ix).unwrap() and list[ix] are exactly equivalent, then what benefit do we gain by using the longer form? More characters ≠ more explicit.
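
One way to check that claim is to run both forms and compare the outcomes. The sketch below uses std::panic::catch_unwind for that; the function name is hypothetical. The two forms return the same value or both panic, although the panic messages differ (an index-out-of-bounds message versus an unwrap-on-None message).

```rust
use std::panic;

/// True if both forms return the same value, or both panic, for the given index.
fn forms_agree(list: &[i32], ix: usize) -> bool {
    let indexed = panic::catch_unwind(|| list[ix]);
    let unwrapped = panic::catch_unwind(|| *list.get(ix).unwrap());
    match (indexed, unwrapped) {
        (Ok(a), Ok(b)) => a == b,
        (Err(_), Err(_)) => true,
        _ => false,
    }
}
```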

Half of the indexing code I write looks like this:

let oldest = self.render_samples[(self.render_sample_index + 1) 
            % ACTIVE_SAMPLES];

There's no improvement to this kind of code by using .get().unwrap(). It helps to look at real code to see how something is useful, rather than trying to extrapolate from vague principles with no context.
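
To make that snippet concrete, here is a self-contained sketch around it. The struct, its field types, and the value of ACTIVE_SAMPLES are assumptions made for illustration; only the indexing expression comes from the post.

```rust
const ACTIVE_SAMPLES: usize = 4; // assumed size, for illustration only

/// Hypothetical stand-in for the poster's type.
struct Renderer {
    render_samples: [f32; ACTIVE_SAMPLES],
    render_sample_index: usize,
}

impl Renderer {
    /// The slot after the most recently written one holds the oldest sample.
    fn oldest(&self) -> f32 {
        // `% ACTIVE_SAMPLES` keeps the index below ACTIVE_SAMPLES,
        // so the bounds check on this access never fails.
        self.render_samples[(self.render_sample_index + 1) % ACTIVE_SAMPLES]
    }
}
```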

You say that as if [ x ] is the less desirable thing to do.

It's just an opinion but I disagree.

I might suggest a clippy warning that says:

"foo.get(x).unwrap()" is unnecessarily verbose and confusing. Suggest using "foo[ x ]"

That is a fair enough suggestion if you're using the unwrap(). I'm talking about the case without the unwrap(), as in

println!("{}", foo[0]);

could trigger a warning

foo[0] may panic, consider using if let Some(val) = foo.get(0) { ... }

is what I'm suggesting.

I understand that the entire topic is trivial and unimportant. But a friend who is new to Rust asked me why Rust doesn't have nulls and forces you to match on Option, but lets you hit a good old index-out-of-bounds exception on any vector without any warning, and it seemed like something that would be a little counterintuitive to a newcomer.

Imagine how awful this code would look if using .get().unwrap(). Or even worse if one handled the error:

pub fn convolution_safe(sample: &[f32], coeff: &[f32]) -> Vec<f32> {
    let mut out: Vec<f32> = vec![0.0; sample.len() - coeff.len() + 1];
    for i in 0..out.len() {
        let mut acc: f32 = 0.0;
        let window = &sample[i..i + coeff.len()];
        for j in 0..window.len() {
            acc += window[j] * coeff[j];
        }
        out[i] = acc;
    }
    out
}

As it happens the way to fix that is to rewrite it like so:

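// Note: `Fast` is not defined in this snippet. It appears to be a fast-math
// float wrapper, such as the one provided by the fast-floats crate.
pub fn convolution_serial(sample: &[f32], coeff: &[f32]) -> Vec<f32> {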
    sample
        // Get sequence of windows that "slide" over the sample data
        .windows(coeff.len())
        // Form the dot product of every window in the sequence
        .map(|window| {
            window
                .iter()
                .zip(coeff)
                .fold(Fast(0.), |acc, (&x, &y)| acc + Fast(x) * Fast(y))
                .get()
        })
        // Map produces an iterator so we have to assemble a vector from that.
        .collect()
}

Which not only gets rid of the array indexing, if that offends one's aesthetic, but also turns out to enable better-optimized, faster code.
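
For readers without the Fast wrapper at hand, the same iterator structure can be sketched with plain f32 arithmetic. This variant and its name are illustrative, not the poster's code, and it makes no claim about matching the optimized version's speed. Note that slice::windows panics if coeff is empty.

```rust
/// Sliding dot product of `coeff` over `sample`, using plain `f32` arithmetic.
/// Returns an empty Vec when `sample` is shorter than `coeff`.
pub fn convolution_plain(sample: &[f32], coeff: &[f32]) -> Vec<f32> {
    sample
        // Sequence of overlapping windows, each `coeff.len()` elements long
        .windows(coeff.len())
        // Dot product of each window with the coefficients
        .map(|window| window.iter().zip(coeff).map(|(&x, &y)| x * y).sum::<f32>())
        .collect()
}
```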

I would have to think a while before I see how this relates to null references.

But riddle me this:

If we carry this train of thought far enough then all the arithmetic operators, +, -, *, /, etc can also fail with overflow, divide by zero, etc.

Clearly they should be done away with as well and replaced with "a.add(b).unwrap()" and so on. Or cause endless clippy warnings.

Would you really like to see that ?
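
For a sense of what that would look like, the integer types do offer checked methods such as checked_add, checked_mul and checked_div, each returning an Option. A small sketch with a hypothetical function name:

```rust
/// Computes (a + b) * c / d with every step checked.
/// Returns None on overflow or division by zero instead of panicking or wrapping.
fn checked_expr(a: i32, b: i32, c: i32, d: i32) -> Option<i32> {
    a.checked_add(b)?.checked_mul(c)?.checked_div(d)
}
```

Compare that with the unchecked spelling, (a + b) * c / d.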

In fact, clippy does have a warning for this: integer_arithmetic.

What it does

Checks for integer arithmetic operations which could overflow or panic.

Specifically, checks for any operators ( + , - , * , << , etc) which are capable of overflowing according to the Rust Reference, or which can panic ( / , % ). No bounds analysis or sophisticated reasoning is attempted.

Why is this bad

Integer overflow will trigger a panic in debug builds or will wrap in release mode. Division by zero will cause a panic in either mode. In some applications one wants explicitly checked, wrapping or saturating arithmetic.

Known problems

None.

Example

a + 1;

I think this is a fair question.
Indeed, [] access looks like it goes against Rust's safety goal.
But I think it is important not only for syntax but for another goal of Rust: performance.

Checking the index on every access cannot be performant, especially when most accesses are sequential, driven by a counter that is already bounded by the Vec size.
I have written tons of code like this in C++, and I don't recall ever feeling I should use the checked interface, nor the last time I had to fix a bug in something like this (maybe the way such code is usually written makes it safe as a whole).

I would never inflict this upon an entire project, let alone make it enabled by default for all Rust users.

There is always a trade-off between correctness and convenience, and enabling integer_arithmetic and indexing lints by default would incur a monumental inconvenience to the user and decrease in code quality/readability for something that is a non-issue in practice.

The integer_arithmetic lint would be mainly intended for use in niche places where you can't afford accidental overflows or panics. Like if I were writing a crypto library and overflowing when I wasn't expecting it could lead to security flaws.

This is not true. The code below, though really unidiomatic Rust, doesn't produce any bounds-check instructions, so it has no performance penalty compared to the equivalent C/C++ code.

pub fn foo(arr: &mut [i32]) {
    let len = arr.len();
    let mut i = 0;

    while i < len {
        arr[i] = some_logic(i);
        i += 1;
    }
}

Rust as a language is designed with a powerful optimizing compiler in mind. If you clearly and correctly do sequential access through a counter already bounded by the Vec size, the release build will not perform any bounds checks and will be at least as fast as the C code. If you make a mistake, it panics instead of triggering UB and getting compiled into a no-op function as an optimization.
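
The more idiomatic spelling sidesteps the question, because an iterator over the slice never indexes at all. A sketch of the same loop, where some_logic is a hypothetical stand-in (the original post does not define it):

```rust
/// Hypothetical stand-in for the `some_logic` function used above.
fn some_logic(i: usize) -> i32 {
    (i as i32) * 2
}

/// Same effect as the `while` loop, written with iterators.
pub fn foo_iter(arr: &mut [i32]) {
    // `enumerate` supplies the counter; `iter_mut` hands out each slot in turn,
    // so there is no index expression left to bounds-check.
    for (i, slot) in arr.iter_mut().enumerate() {
        *slot = some_logic(i);
    }
}
```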

I think you misunderstood my previous post. I did not mention C++ to compare Rust with C++. And I am well aware that Rust aims for zero-cost abstractions and relies on lots of compiler optimization to reach C-like performance with high-level code.
The reason I mentioned C++ is that it is a language I know well. If the question were "Why use vec[i] on a std::vector in C++ instead of vec.at(i)?", the answer would be exactly what I gave: it performs better, and most accesses are safe because the code already relies on the vector size.
I was just trying to draw a parallel: the same story should apply to foo[x] and foo.get(x).unwrap(), as originally asked.

Had the evolution of Rust gone differently, it’s possible we would have ended up with x[i] returning a Result, so you’d see a lot of x[i]?s around in cases where the lookup is known not to fail.
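
The closest thing available today is .get(i)? inside a function that returns Option. The sketch below uses a hypothetical function to show roughly how such code reads; it is not a proposal for how indexing would have been designed.

```rust
/// Sums the first and last elements; returns None for an empty slice.
fn first_plus_last(x: &[i32]) -> Option<i32> {
    // Each `?` returns None early if the lookup (or the subtraction) fails.
    let first = x.get(0)?;
    let last = x.get(x.len().checked_sub(1)?)?;
    Some(first + last)
}
```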

But the ? operator is a relatively recent addition, and the indexing operator had to be designed before anyone had even thought of it. At this point, there’s not much point in debating how it should work rather than how it does work: it’s ingrained in so much code that changing its behavior is infeasible.
