Windows method for &str? Is &str a slice?

Hi,

I just tried to call the slice method windows on a &str, and I was surprised to see that it is not implemented for &str.

I think my problem is that I learned to call &str a string slice. Is &str really a slice? Did I perhaps learn that wrong at some point?

How could I transform a &str into an actual slice that would allow me to iterate using windows?

Thanks :)

&str is a "string slice", that's true. This means that it's more specific than other kinds of slices, and so does not support every operation that slices do.

The key question that needs to be answered is: iterate over windows of what? Bytes? Codepoints? Grapheme clusters?
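To make the question concrete, here is a minimal sketch of how the same text differs by unit. The helper names `byte_and_char_counts` and `char_windows` are made up for illustration and are not part of the standard library. Grapheme clusters are left out because they need an external crate.

```rust
/// Hypothetical helper: length of the same text in UTF-8 bytes and in `char`s.
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

/// Hypothetical helper: windows of `size` consecutive `char`s.
/// `str` has no `windows`, so the chars are first collected into a Vec,
/// which is a real slice underneath. Panics if `size` is 0, like `windows`.
fn char_windows(s: &str, size: usize) -> Vec<String> {
    let chars: Vec<char> = s.chars().collect();
    chars.windows(size).map(|w| w.iter().collect()).collect()
}

fn main() {
    // "h\u{e9}llo" is "héllo"; the é takes two bytes in UTF-8.
    let text = "h\u{e9}llo";
    println!("{:?}", byte_and_char_counts(text)); // (6, 5)

    // Windows of two chars: hé, él, ll, lo
    for w in char_windows(text, 2) {
        println!("{w}");
    }
}
```

Byte windows over the same string would split the é in half, which is why the unit has to be chosen explicitly.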

Right, I see the problem now.

My use case right now is pretty simple: I am dealing only with ASCII characters, so I guess bytes satisfy my requirements.

  1. str is essentially [u8], but with the guarantee that the bytes are valid UTF-8.

  2. So &str is essentially &[u8] whose content is guaranteed to be valid UTF-8.

  3. char in Rust is 4 bytes wide. It represents a Unicode scalar value: any code point from 0 to 0x10FFFF, excluding the surrogate range 0xD800 to 0xDFFF.

  4. UTF-8 encodes each character in 1 to 4 bytes, so you can't jump to the n-th character of a str by index; you have to decode the UTF-8 sequentially from the start.

  5. If you just want to iterate over windows of bytes (u8), you can call .as_bytes() on the str to get the backing bytes as a &[u8].
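For the ASCII-only case, point 5 could look roughly like the sketch below. The function name `ascii_windows` is hypothetical. It assumes the input really is ASCII and panics otherwise, since a byte window over multi-byte characters might not be valid UTF-8.

```rust
/// Hypothetical helper: every window of `size` consecutive bytes, viewed as &str.
/// Assumes ASCII input (asserted below). Panics if `size` is 0, like `windows`.
fn ascii_windows(s: &str, size: usize) -> Vec<&str> {
    assert!(s.is_ascii(), "this sketch only handles ASCII input");
    s.as_bytes()
        .windows(size)
        // Any run of ASCII bytes is valid UTF-8, so this conversion cannot fail here.
        .map(|w| std::str::from_utf8(w).expect("ASCII is valid UTF-8"))
        .collect()
}

fn main() {
    // Raw byte windows: each item is a &[u8] of length 2.
    for w in "abcd".as_bytes().windows(2) {
        println!("{:?}", w);
    }

    // The same windows viewed as string slices again.
    println!("{:?}", ascii_windows("abcd", 2)); // ["ab", "bc", "cd"]
}
```

If the windows are only compared or matched on, working with the &[u8] windows directly avoids the conversion back to &str.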

Makes sense. Thanks. I am always baffled by the complexity of strings.

May I suggest the unicode_segmentation crate?
It will do what you want for ASCII, with the added benefit that if you ever extend to non-ASCII it'll just keep working.

Thanks for the suggestion!
