Can slice but can't index an str

tably · April 22, 2021, 9:03am

Hi,
I was writing some code in the playground to experiment with strs when I found something strange: a str can't be indexed but can be sliced (with a range). Here's an example:

let a = "Some cool stuff";
let b = "Another awesome str";

let index: usize = 4;

// let a = &a[index]; // Error: a str cannot be indexed by a single usize
let b = &b[index..index+3];

println!("a = {}, b = {}", a, b);

(Rust Playground)

I read some issues about why a str can't be indexed, but I don't understand why it can be sliced. Also, what's the most efficient way to index a string? The only way I found is

let a = a.chars().skip(index-1).next().unwrap();

and it's a little long for a simple operation.

Thanks for reading and sorry if it has already been posted

alice · April 22, 2021, 9:11am

What operation do you want to do? Do you want the nth character? Do you want the character starting after n bytes? Something else?

tably · April 22, 2021, 9:12am

The nth character, like indexing a string in python

alice · April 22, 2021, 9:15am

Then you need to use the iterator chain. The operation is that long to highlight that you are doing something expensive here.
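For reference, here is a minimal sketch of that iterator chain using `Iterator::nth`, which behaves like `.skip(n).next()`. The helper name `nth_char` is made up for illustration, and the index is 0-based as in Python.

```rust
// Hypothetical helper (not part of std): the nth char, 0-based,
// or None if the string has fewer than n + 1 chars.
fn nth_char(s: &str, n: usize) -> Option<char> {
    // `nth` decodes the UTF-8 data from the start of the string,
    // so the cost grows with n.
    s.chars().nth(n)
}

fn main() {
    let a = "aæb";
    println!("{:?}", nth_char(a, 1)); // Some('æ')
    println!("{:?}", nth_char(a, 3)); // None
}
```

Returning an `Option` instead of calling `unwrap()` leaves it up to the caller to decide what happens when the index is past the end.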

alice · April 22, 2021, 9:19am

What Python 3 really does is a bit different, but that's because it automatically converts any string containing non-ASCII data to something like a Vec<char>. This makes indexing a much cheaper operation, but makes it take up to four bytes per character.

If you want cheap by-character indexing into a non-ASCII string, you can do the same and convert it into a Vec<char>. It can be indexed like any other vector.
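A rough sketch of that conversion, assuming you are fine with paying the one-time cost of collecting and the extra memory. The function name `to_chars` is just for illustration.

```rust
// Hypothetical helper: collect a str into a Vec<char> once,
// so that later lookups by character position are plain vector indexing.
fn to_chars(s: &str) -> Vec<char> {
    s.chars().collect()
}

fn main() {
    let chars = to_chars("aæb");
    println!("{}", chars[1]); // æ
    println!("{}", chars.len()); // 3 chars, although the str is 4 bytes long

    // Every char is stored in 4 bytes, whatever its UTF-8 length was.
    println!("{}", std::mem::size_of::<char>()); // 4

    // Collecting the chars again gives back a String.
    let back: String = chars.iter().collect();
    println!("{}", back); // aæb
}
```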

tably · April 22, 2021, 9:20am

Ok, thanks.
I just thought, what about

let a = &a[index..index+1];

Is there a difference from the iterator chain? If so, at what cost?

alice · April 22, 2021, 9:26am

That is much cheaper than the iterator chain. The chain will have to look through the entire string up to index, whereas indexing takes the same amount of time no matter how large the index is.

The problem is that non-ASCII characters such as æ take up multiple bytes. When you write &a[x..y] in Rust, this uses byte indexing, which is what makes it a cheap operation. For example:

fn main() {
    let a = "aæb";
    // The ranges are byte offsets: 'a' is 1 byte, 'æ' is 2 bytes, 'b' is 1 byte.
    println!("{}", &a[0..1]);
    println!("{}", &a[1..3]);
    println!("{}", &a[3..4]);
}

a
æ
b

If you tried to do &a[1..2], this would panic because 2 is not on a character boundary. If you don't know how long the character is, you could do this:

fn main() {
    let a = "aæb";
    println!("{}", a[0..].chars().next().unwrap());
    println!("{}", a[1..].chars().next().unwrap());
    println!("{}", a[3..].chars().next().unwrap());
}

If this is too verbose, you can define a helper method through an extension trait:

trait StrExt {
    fn char_at(&self, i: usize) -> char;
}
impl StrExt for str {
    fn char_at(&self, i: usize) -> char {
        self[i..].chars().next().unwrap()
    }
}


fn main() {
    let a = "aæb";
    println!("{}", a.char_at(0));
    println!("{}", a.char_at(1));
    println!("{}", a.char_at(3));
}

a
æ
b
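Both the slicing syntax and `char_at` above panic when the byte index is out of range or falls inside a multi-byte character. Here is a sketch of non-panicking variants built on `str::get`, `str::is_char_boundary` and `str::char_indices`. The names `char_at_checked` and `nth_char_slice` are made up for this example.

```rust
// Hypothetical helper: the char starting at byte offset `i`, or None if `i`
// is out of range, inside a multi-byte char, or at the very end of the string.
fn char_at_checked(s: &str, i: usize) -> Option<char> {
    // `get` returns None where `&s[i..]` would panic.
    s.get(i..)?.chars().next()
}

// Hypothetical helper: the nth char (0-based) as a &str slice.
// Finding the byte offset still walks the string from the start.
fn nth_char_slice(s: &str, n: usize) -> Option<&str> {
    let (start, c) = s.char_indices().nth(n)?;
    s.get(start..start + c.len_utf8())
}

fn main() {
    let a = "aæb";
    println!("{}", a.is_char_boundary(2)); // false: byte 2 is in the middle of 'æ'
    println!("{:?}", a.get(1..2)); // None instead of a panic
    println!("{:?}", char_at_checked(a, 1)); // Some('æ')
    println!("{:?}", char_at_checked(a, 2)); // None
    println!("{:?}", nth_char_slice(a, 1)); // Some("æ")
}
```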

tably · April 22, 2021, 9:28am

Ok thanks a lot !

jjpe · April 22, 2021, 10:02am

Note that the .chars() method on string slices may not be what you want.

From the docs:

It's important to remember that char represents a 
Unicode Scalar Value, and may not match your idea 
of what a 'character' is. Iteration over grapheme 
clusters may be what you actually want. This 
functionality is not provided by Rust's standard 
library, check crates.io instead.

In particular, depending on what exactly it is you want, you may want to check out the unicode-segmentation crate if you want access to graphemes rather than characters.

(on a side note: for something so commonly used, the unicode-segmentation crate has a name that is not nearly as easy to remember as it should be. I need to look it up every time)
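To illustrate the difference using only the standard library: the sketch below takes an "é" written as a plain 'e' followed by the combining acute accent U+0301. It is displayed as one character, but .chars() yields two. The helper name `counts` is made up for this example. A grapheme-aware crate such as unicode-segmentation would be needed to treat it as a single unit.

```rust
// Hypothetical helper: (length in bytes, number of chars) of a str.
fn counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT.
    let s = "e\u{301}";
    let (bytes, chars) = counts(s);
    println!("{} bytes, {} chars", bytes, chars); // 3 bytes, 2 chars

    // The first char is only the base letter, without the accent.
    println!("{:?}", s.chars().next()); // Some('e')
}
```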

H2CO3 · April 22, 2021, 11:10am
