Why is slice end_index logic as it is?

jtagcat · February 5, 2023, 9:56pm

I am new to Rust, coming from Go.

I was doing rustlings primitive_types4.rs and slicing does not make sense to me.

let a = [1, 2, 3, 4, 5];
let nice_slice = &a[TODO];
assert_eq!([2, 3, 4], nice_slice);

The starting index makes sense, counting from 0th. Resulting in [1..TODO].

.. for me means 'through', confirmed by Bash:

$ echo {1..3}
1 2 3

The Rust Book says:

We create slices using a range within brackets by specifying [starting_index..ending_index].

a[3] == 4. 4's index is 3. Thus, I'd expect [1..3] to work, but it actually results in [2, 3].

As a second guess, I'd try [1..3] with the thinking of 'from index 1, result.len() == 3' ('I want n items.'), the result is the same, [2, 3].

For me, actual Rust behaviour ([1..4]) is the weirdest. I could define it as 'start index is start index; end index is start_index + result.len(), or the last item's index + 1'. If I have an output I want in mind, it requires the most thinking of the three.

How does this make sense for practical use? I see myself making off-by-one errors with this. Is it heritage from C? Should I be thinking differently?

scottmcm · February 5, 2023, 10:04pm

That's not "confirmed by"; that's "what Bash happens to do".

Half-open ranges -- start ≤ x < end -- are best for programming problems. The usual link about it, from over 40 years ago now, is EWD831, which is titled as being about counting from zero but is actually about the different ways of specifying a range.

If you want inclusive, then you can use 1..=3 instead of 1..4, but I strongly suggest trying to get used to the half-open ranges instead.
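To make the two spellings concrete, here is a minimal sketch using the array from the original question. The helper names middle_half_open and middle_inclusive are made up for illustration; only the range syntax itself comes from the language.

```rust
/// Hypothetical helper: elements at indices 1, 2, 3 via a half-open range.
fn middle_half_open(a: &[i32; 5]) -> &[i32] {
    &a[1..4] // start <= i < end, so 4 is excluded
}

/// Hypothetical helper: the same elements via an inclusive range.
fn middle_inclusive(a: &[i32; 5]) -> &[i32] {
    &a[1..=3] // start <= i <= end, so 3 is included
}

fn main() {
    let a = [1, 2, 3, 4, 5];
    assert_eq!(middle_half_open(&a), [2, 3, 4]);
    assert_eq!(middle_inclusive(&a), [2, 3, 4]);
    println!("{:?}", middle_half_open(&a));
}
```

Both forms select the same three elements; the difference is only in how the end bound is written.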

Half-open has the huge advantage that it splits things nicely without overlap or missing things. So if you split 0..n at k, you get 0..k and k..n -- no ±1 fixups needed.
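A small sketch of that splitting property, assuming a made-up helper split_range that just builds the two ranges; slice::split_at from the standard library follows the same convention.

```rust
use std::ops::Range;

/// Hypothetical helper: split 0..n at k into 0..k and k..n.
fn split_range(n: usize, k: usize) -> (Range<usize>, Range<usize>) {
    (0..k, k..n)
}

fn main() {
    let a = [1, 2, 3, 4, 5];
    let k = 2;
    let (left, right) = split_range(a.len(), k);

    // The two halves cover every element exactly once.
    assert_eq!(left.len() + right.len(), a.len());
    assert_eq!(&a[left], [1, 2]);
    assert_eq!(&a[right], [3, 4, 5]);

    // split_at uses the same index for the end of one half and the start of the other.
    let (head, tail) = a.split_at(k);
    assert_eq!(head, [1, 2]);
    assert_eq!(tail, [3, 4, 5]);
}
```

The same k appears as the end of the first range and the start of the second, with no +1 or -1 anywhere.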

This is particularly true for slicing since Rust uses 0-based indexing. The indexes in an n-length slice are [0, n).
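As a rough illustration, the valid indices of a slice are exactly 0..len. The helper valid_indices below is hypothetical, not a standard library function.

```rust
use std::ops::Range;

/// Hypothetical helper: the half-open range of valid indices for a slice.
fn valid_indices<T>(s: &[T]) -> Range<usize> {
    0..s.len()
}

fn main() {
    let a = [1, 2, 3, 4, 5];

    // Every index in 0..len refers to an element.
    assert!(valid_indices(&a).all(|i| a.get(i).is_some()));

    // The end bound itself is not a valid element index.
    assert!(a.get(a.len()).is_none());

    // Slicing with the full range gives back the whole slice.
    assert_eq!(&a[0..a.len()], &a[..]);
}
```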

In languages where indexing is 1-based you can argue that [1, n] is ok, but I think even there it's better to use (0, n] for the same splitting reasons.

H2CO3 · February 5, 2023, 10:06pm

It's not 3 items, it's 3 - 1 items. A half-open interval x..y has exactly y - x elements, which simplifies the most typical use cases.
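The element count can be checked directly. This is only a sketch; half_open_len and inclusive_len are names invented here.

```rust
/// Hypothetical helper: number of indices in the half-open range x..y.
fn half_open_len(x: usize, y: usize) -> usize {
    (x..y).len()
}

/// Hypothetical helper: number of indices in the inclusive range x..=y.
fn inclusive_len(x: usize, y: usize) -> usize {
    (x..=y).count()
}

fn main() {
    let a = [1, 2, 3, 4, 5];

    // 1..3 has 3 - 1 = 2 elements, which is why &a[1..3] is [2, 3].
    assert_eq!(half_open_len(1, 3), 2);
    assert_eq!(a[1..3].len(), 2);

    // The inclusive form needs a +1 in the length calculation.
    assert_eq!(inclusive_len(1, 3), 3);

    // 1..4 has 4 - 1 = 3 elements.
    assert_eq!(half_open_len(1, 4), 3);
}
```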

How is Bash's behavior at all relevant when discussing Rust?

tspiteri · February 5, 2023, 10:13pm

It may be helpful to visualize this by thinking of the indices as if they are pointing to the space between elements. For example

0        1        2        3        4
| apple  | banana | cherry | date   |

That way, 0..1 refers to [apple], and 1..4 refers to [banana, cherry, date]. 0..0 or 2..2 both refer to an empty slice [].
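The same picture written as code, as a sketch. The generic helper between is made up for this example and simply slices from one gap to another.

```rust
/// Hypothetical helper: the elements lying between gap `start` and gap `end`.
fn between<T>(items: &[T], start: usize, end: usize) -> &[T] {
    &items[start..end]
}

fn main() {
    let fruit = ["apple", "banana", "cherry", "date"];

    assert_eq!(between(&fruit, 0, 1), ["apple"]);
    assert_eq!(between(&fruit, 1, 4), ["banana", "cherry", "date"]);

    // Starting and ending at the same gap encloses nothing.
    assert!(between(&fruit, 0, 0).is_empty());
    assert!(between(&fruit, 2, 2).is_empty());
}
```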

BurntSushi · February 5, 2023, 10:20pm

See also: Why does `regex::Match::end` return length + 1? · Discussion #866 · rust-lang/regex · GitHub

I linked to Dijkstra's note in my answer there as well, but also elaborated on it in a more concrete fashion in the context of the regex crate. The short summary is that the biggest problem with m..n meaning "include both m and n in the range" is that you've lost the ability to easily specify a range that is empty. Namely, in your paradigm, 1..1 is the range containing one element, 1. So when a regex (or whatever) returns a match in your string that has zero length, what does its span look like? You can of course invent whatever convention you want for such cases, but I think you'll find any such convention to be far less intuitive or convenient than the status quo.

jtagcat · February 5, 2023, 10:25pm

Thanks, that makes sense. Turns out Go does the same; it just never occurred to me that it's (in math notation) [start_index, finish_index).

I've read everything in absolute indexes. Reading 0..n as [0, 1, 2, ..., n], not [0, 1, 2, ..., n-1]. Then in the same universe, splitting 0..n at j would give 0..j and j+1..n.

quinedot · February 5, 2023, 10:29pm

Rust.

let a = [1, 2, 3, 4, 5];
let s = &a[1..4];
println!("{s:?}");
// [2, 3, 4]

Go.

a := [5]int{1, 2, 3, 4, 5}
s := a[1:4]
fmt.Println(s)
// [2 3 4]

Edit: Ah, you realized this while I was putting it together, heh.

BurntSushi · February 5, 2023, 10:30pm

No, that's insufficient because it assumes every empty range is equivalent. Searching the empty regex on the haystack abc would return 0..0, 1..1, 2..2 and 3..3.
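To see why the position of an empty range matters, here is a sketch that uses only the standard library's String::replace_range rather than the regex crate. The helper insert_at is hypothetical; it replaces an empty span with a marker, which amounts to inserting at that position.

```rust
use std::ops::Range;

/// Hypothetical helper: replace `span` in `haystack` with `marker`.
/// With an empty span (k..k) this inserts the marker at position k.
fn insert_at(haystack: &str, span: Range<usize>, marker: &str) -> String {
    let mut s = haystack.to_string();
    s.replace_range(span, marker);
    s
}

fn main() {
    // All four spans are empty, yet each names a different position in "abc".
    assert!((0..0).is_empty() && (3..3).is_empty());
    assert_eq!(insert_at("abc", 0..0, "|"), "|abc");
    assert_eq!(insert_at("abc", 1..1, "|"), "a|bc");
    assert_eq!(insert_at("abc", 2..2, "|"), "ab|c");
    assert_eq!(insert_at("abc", 3..3, "|"), "abc|");
}
```

Collapsing every empty range into a single "empty" value would lose exactly the information these four calls depend on.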

