How to slice a `str` properly?


#1

Hi! I am having a hard time understanding how to work with str at the character level. Specifically, I want to slice the portion of a string between two characters (inclusive), say between П and т:

"Привет, Мир!" -> "Привет"

I used .char_indices to get the byte index of П (start) and т (end). Am I correct that simply doing

"Привет, Мир"[start..(end + 1)]

won’t work because there is no guarantee that end + 1 is a char boundary? What should I use instead? Perhaps this:

"Привет, Мир"[start..(end + 'т'.len_utf8())]

?


#2

To elaborate on this a bit: it seems I can almost use the offset of the next character from char_indices as the upper bound. Unfortunately, this does not work for the very last character, because char_indices has no "offset of the character after the last".


#3

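Once the iterator has yielded the end character, you can call char_indices_iterator.next().map_or(string.len(), |(i, _)| i), which falls back to the string length when there is no next character. I agree it’s kind of ugly, but I don’t think std has an API that would make the code prettier. I think I’d just use the regex crate here.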
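For reference, here is a minimal sketch of that map_or approach. The helper name slice_inclusive and its signature are made up for illustration and are not from std. It assumes the slice should run from the first occurrence of the start character to the next occurrence of the end character after it.

```rust
/// Hypothetical helper: returns the part of `s` from the first `first`
/// through the next `last` after it, both included.
fn slice_inclusive(s: &str, first: char, last: char) -> Option<&str> {
    let mut iter = s.char_indices();
    // Byte offset of the first occurrence of `first`.
    let (start, _) = iter.find(|&(_, c)| c == first)?;
    // Keep advancing the same iterator until it has yielded `last`.
    iter.find(|&(_, c)| c == last)?;
    // The iterator now sits just past `last`. The next item's offset is the
    // exclusive upper bound; if `last` was the final character, use s.len().
    let end = iter.next().map_or(s.len(), |(i, _)| i);
    Some(&s[start..end])
}
```

With this sketch, slice_inclusive("Привет, Мир!", 'П', 'т') gives Some("Привет"), and the fallback to s.len() covers the case where the end character is the last one in the string. Adding the end character's len_utf8() to its byte offset should give the same upper bound.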


#4

Yep, this is what I’ve ended up doing. Thanks for the map_or!


#5

Yeah, I would also look for the index of the next character, since that is the exclusive upper bound of the slice. That next character is often some kind of whitespace or delimiter, which is easy to find.
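A rough sketch of the delimiter idea, assuming the wanted text ends at the first whitespace or comma after the start character. The helper name slice_until_delimiter and the choice of delimiters are assumptions for this example only.

```rust
/// Hypothetical helper: returns the part of `s` starting at the first
/// `first` and ending just before the next whitespace or comma.
fn slice_until_delimiter(s: &str, first: char) -> Option<&str> {
    // str::find returns the byte offset where the match starts,
    // which is always a char boundary.
    let start = s.find(first)?;
    let rest = &s[start..];
    // The delimiter's offset is the exclusive upper bound; with no
    // delimiter left, take everything up to the end of the string.
    let len = rest
        .find(|c: char| c.is_whitespace() || c == ',')
        .unwrap_or(rest.len());
    Some(&rest[..len])
}
```

Here slice_until_delimiter("Привет, Мир!", 'П') gives Some("Привет"). Because the upper bound is the delimiter's own offset, nothing has to be added to it to land on a char boundary.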