Why can the string slice length be used as a starting index?

Hi. I'm new to Rust and confused by this code. Why doesn't the code panic if I specify a starting index that equals the length of the string slice? Does it mean that an index equal to the length is not out of bounds?

#[cfg(test)]
mod tests {
    #[test]
    fn it_works() {
        assert_eq!(5, "hello".len());
    }

    #[test]
    #[should_panic]
    fn it_out_of_bounds() {
        let t = &"hello"[6..];
        assert_eq!("", t);
    }

    #[test]
    fn it_got_empty_string() {
        let t = &"hello"[5..];  // why does this work?
        assert_eq!("", t);
    }
}

This is working as intended: x..y is exclusive of y.

If we map out the numbers produced by the following we'll see this:

  • 0..5: 0, 1, 2, 3, 4 (Check: 0 <= 0 <= len(), 0 <= 5 <= len())
  • 0..=5: 0, 1, 2, 3, 4, 5 (Check: 0 <= 0 <= len(), 0 <= 5 < len())

If we change 5.. to 5..=5 then we get a panic, because the end of an inclusive range must be strictly less than len().
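The two checks above can be tried without panicking by using str::get, which returns None where indexing would panic. This is only a minimal sketch; the helper names try_exclusive and try_inclusive are made up for illustration.

```rust
/// Tries the exclusive range `start..end` on `s`.
/// Returns `None` where `&s[start..end]` would panic.
fn try_exclusive(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}

/// Tries the inclusive range `start..=end` on `s`.
/// Returns `None` where `&s[start..=end]` would panic.
fn try_inclusive(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..=end)
}
```

With these helpers, try_exclusive("hello", 0, 5) and try_inclusive("hello", 0, 4) both select the whole string, while try_inclusive("hello", 5, 5) is rejected because it would have to include the byte at index 5.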

Also to explain a bit further, x..x is a range that contains zero items (indices). It is safe to create an empty slice, and it is also safe to index with an empty range.

Thanks for your reply! Maybe I didn't describe my problem clearly. I wonder why 5.. works but 6.. doesn't. I thought 5 was out of range just like 6. In my opinion, the starting index should be one of 0, 1, 2, 3, 4. So I'm curious why 5 can be used as a starting index.

Thanks for your reply! If x..x contains zero items, why doesn't 6..6 work?


Going one past the end to create an empty slice is fine, but no further than that.
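A small sketch of that rule, using str::get so that a rejected range shows up as None rather than a panic. The helper name empty_slice_at is hypothetical.

```rust
/// Tries to take the empty sub-slice `x..x` of `s`.
/// Returns `None` instead of panicking when `x` is past `s.len()`.
fn empty_slice_at(s: &str, x: usize) -> Option<&str> {
    s.get(x..x)
}
```

For "hello", positions 0 and 5 both give an empty string, while position 6 is rejected.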


Yeah thanks for the clarification :smile:.

The start and end of the range must still be within the bounds of the slice. The same applies to slicing an empty array with the range 0.., which is allowed. If it were not, there would be no way to slice an empty array or vector.
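The empty-array case can be sketched like this, again with the non-panicking get method. The helper name tail_from is an assumption made for the example.

```rust
/// Slices `v` from `start` to its end.
/// Returns `None` where `&v[start..]` would panic, i.e. when `start > v.len()`.
fn tail_from(v: &[i32], start: usize) -> Option<&[i32]> {
    v.get(start..)
}
```

Calling tail_from on an empty vector with start 0 yields an empty slice, and start 1 is rejected. On a three-element slice, start 3 is the last accepted position.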

One idea that helped me to think about array indexes was to change from thinking of the indexes as being for particular elements, to being between the elements. So if there’s an array with N elements, 0 comes just before the first, then there are indexes between all the elements, 1, 2, 3, 4, and finally index N just after the last element. The indexes 0 to N inclusive are then the valid values for creating .. slices, both for the start and the end of the range.

The index N is just as special as 0, since both sit at the outer edges of the array, with no real elements beyond them.

Thinking about it this way, a range x..x is empty since it starts and ends at the same point, i.e. between two elements, rather than spanning any elements. The empty slice isn't special in this way of thinking; it is just that any of the indexes, 0 to N inclusive, can be used to create empty slices.
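Under this "positions between elements" model, the bounds rule for start..end fits in one line. The sketch below states it as a predicate and compares it with what slice get accepts; both function names are made up for illustration.

```rust
/// The bounds rule for `start..end` on a slice with `len` elements,
/// stated in terms of the boundary positions 0 to `len` inclusive.
fn is_valid_range(len: usize, start: usize, end: usize) -> bool {
    start <= end && end <= len
}

/// Checks that the rule above agrees with `get` for every start and end
/// from 0 up to one position past `v.len()`.
fn rule_matches_get(v: &[u8]) -> bool {
    let n = v.len();
    (0..=n + 1).all(|start| {
        (0..=n + 1).all(|end| v.get(start..end).is_some() == is_valid_range(n, start, end))
    })
}
```

For a five-element slice, is_valid_range(5, 5, 5) holds while is_valid_range(5, 6, 6) does not, which matches 5..5 working and 6..6 panicking.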


@parasyte Thank you for pointing out this special situation I didn't think about.
@Douglas Thanks for your nice idea!

I found this description from the official documents:

Panics if begin or end does not point to the starting byte offset of a character (as defined by is_char_boundary), if begin > end, or if end > len.

And this:

The start and end of the string (when index == self.len()) are considered to be boundaries.
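To see which byte indexes count as boundaries, one can simply ask is_char_boundary about every index from 0 to len() inclusive. A minimal sketch, with a hypothetical helper named boundaries:

```rust
/// Lists every byte index of `s`, from 0 up to and including `s.len()`,
/// that `is_char_boundary` accepts.
fn boundaries(s: &str) -> Vec<usize> {
    (0..=s.len()).filter(|&i| s.is_char_boundary(i)).collect()
}
```

For "hello" this gives all six positions 0 through 5. For "h\u{e9}" (an "h" followed by a two-byte "é"), index 2 is missing because it falls inside a character, and any index greater than len() is never a boundary.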

If both indexes are considered to be boundaries, it is reasonable that 5.. works correctly, just like in this picture:

[image]

It changed my mind about the slice index. :grinning:


Another way of seeing it:

min .. range indexing on a slice is sugar for min .. slice.len()

So indexing with slice.len() .. is just like indexing with slice.len() .. slice.len(), which, as @OptimisticPeach said, is a perfectly fine empty range of indices, with which you can index your slice given that both indices are "in bounds", as you now know.

  • this can be seen as natural when considering split_at() operations:

    slice[a .. c] ~ concat(slice[a .. b], slice[b .. c])

    for any b in the a ..= c range, so that, for instance, you can split
    "xyz" as "x" + "yz", or "xy" + "z" or even "xyz" + "" (as well as "" + "xyz")

    If we did not make the "" cases valid, many algorithms would be much more annoying to write, since these edge cases would need special handling.

Whereas indexing with (slice.len() + 1) .. is like indexing with (slice.len() + 1) .. slice.len(), which doesn't make sense.
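The split_at() view can be sketched as follows: split a string at every allowed position, including both ends, and the two halves always join back into the original. The helper name all_splits is hypothetical, and the is_char_boundary filter is there only because split_at panics inside a multi-byte character.

```rust
/// Splits `s` at every valid byte position in `0..=s.len()`,
/// returning the (left, right) halves for each position.
fn all_splits(s: &str) -> Vec<(&str, &str)> {
    (0..=s.len())
        .filter(|&b| s.is_char_boundary(b)) // split_at panics off a char boundary
        .map(|b| s.split_at(b))
        .collect()
}
```

For "xyz" this produces four pairs, starting with ("", "xyz") and ending with ("xyz", ""), so the empty halves come out of the same code path as the others.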


Obligatory Dijkstra reference: https://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF

It's very handy that a..b splits exactly (nothing skipped or duplicated) into a..k and k..b for all possible splits, including the ones that have an empty half. (Or even two empty halves if you "split" 2..2 into 2..2 and 2..2.)
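The same property can be checked on plain integer ranges. In this sketch (the function name splits_exactly is made up), chaining a..k and k..b is compared against a..b for every split point k.

```rust
/// Returns true if, for every split point `k` in `a..=b`, walking `a..k`
/// and then `k..b` visits exactly the same numbers as walking `a..b`.
fn splits_exactly(a: usize, b: usize) -> bool {
    (a..=b).all(|k| (a..k).chain(k..b).eq(a..b))
}
```

This holds for ordinary ranges such as 2..7 and also for the degenerate 2..2, where both halves are empty.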

