str::split() and a "slice of `char`s"

According to str - Rust,

If the pattern is a slice of chars, split on each occurrence of any of the characters:

let v: Vec<&str> = "2020-11-03 23:59".split(&['-', ' ', ':', '@'][..]).collect();
assert_eq!(v, ["2020", "11", "03", "23", "59"]);

&['-', ' ', ':', '@'][..] looks very strange to me... how does one parse that?
"- :@" is a UTF-8 encoded &str type.
Is ['-', ' ', ':', '@'] equivalent to "- :@"? If not, why couldn't split() have been written to accept that?
How does &['-', ' ', ':', '@'][..] get parsed to mean "a slice of chars"?

I'm just curious... I stuck this in my code, decided it looked too strange for me to parse, and replaced it with (the equivalent of):

let pattern = ['-', ' ', ':', '@'];
let x = some_str.split(&pattern[..]);

and I still think it looks strange and hard to parse. Oh well.

It is not equivalent -- as you say, a &str is UTF-8 encoded, whereas each char is a u32-sized (4-byte) Unicode scalar value.

Indexing a range (..) on an array ([char; 4]) gets you a slice ([char]), which is dynamically sized (and thus can't be passed as an argument, say); the & at the front turns this into a slice reference (&[char]) that you can pass to the method.
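To make the array / slice / reference steps concrete, here is a minimal sketch. The helper name `split_any` is made up for illustration and is not part of the standard library.

```rust
use std::mem::size_of;

/// Splits `s` on any of the given separator characters.
/// (`split_any` is a hypothetical helper used only for this sketch.)
fn split_any<'a>(s: &'a str, seps: &[char]) -> Vec<&'a str> {
    s.split(seps).collect()
}

fn main() {
    // An array: its length (4) is part of the type.
    let array: [char; 4] = ['-', ' ', ':', '@'];

    // `array[..]` has the unsized type `[char]`, so it can only be used
    // behind a reference; `&array[..]` is a `&[char]`.
    let slice: &[char] = &array[..];
    assert_eq!(slice.len(), 4);

    // Each `char` is 4 bytes, while these same characters encoded as
    // UTF-8 in a `&str` take 1 byte each.
    assert_eq!(size_of::<char>(), 4);
    assert_eq!(size_of::<[char; 4]>(), 16);
    assert_eq!("- :@".len(), 4);

    let v = split_any("2020-11-03 23:59", slice);
    assert_eq!(v, ["2020", "11", "03", "23", "59"]);
}
```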


Sometimes the coercion from &[T; N] to &[T] can be automatic, but if it can't be for whatever reason, &var[..] can be used. I think I've run into this more with Vec than arrays myself, though.
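As a rough illustration of the automatic case: when the parameter type is the concrete `&[char]`, both `&array` and `&vec` coerce without any `[..]`. The function `count_seps` below is a hypothetical helper written just for this sketch.

```rust
/// Counts how many characters of `s` appear in `seps`.
/// (`count_seps` is a hypothetical helper for this sketch.)
fn count_seps(s: &str, seps: &[char]) -> usize {
    s.chars().filter(|c| seps.contains(c)).count()
}

fn main() {
    let array = ['-', ' ', ':', '@'];
    let vec = vec!['-', ' ', ':', '@'];

    // The parameter is the concrete type `&[char]`, so `&[char; 4]`
    // and `&Vec<char>` are both coerced automatically.
    assert_eq!(count_seps("2020-11-03 23:59", &array), 4);
    assert_eq!(count_seps("2020-11-03 23:59", &vec), 4);

    // The explicit form always works too.
    assert_eq!(count_seps("2020-11-03 23:59", &array[..]), 4);
    assert_eq!(count_seps("2020-11-03 23:59", &vec[..]), 4);
}
```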

In this particular case, &[char] and &[char; N] and [char; N] are all Patterns, so you can do any of

  • .split(&['-', ' ', ':', '@'][..])
  • .split(&['-', ' ', ':', '@'])
  • .split( ['-', ' ', ':', '@'])

In any case, having them in a separate variable, or at least formatting the code differently, is more legible to me.

    let v: Vec<&str> = "2020-11-03 23:59"
        .split(['-', ' ', ':', '@'])
        .collect();
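A quick sanity check that the three spellings behave the same. This assumes a toolchain recent enough to have the array impls; `split_three_ways` is a made-up name for this sketch only.

```rust
/// Splits `input` using the slice, array-reference, and array forms
/// of the same separator set. (Hypothetical helper for this sketch.)
fn split_three_ways(input: &str) -> [Vec<&str>; 3] {
    let by_slice: Vec<&str> = input.split(&['-', ' ', ':', '@'][..]).collect();
    let by_array_ref: Vec<&str> = input.split(&['-', ' ', ':', '@']).collect();
    let by_array: Vec<&str> = input.split(['-', ' ', ':', '@']).collect();
    [by_slice, by_array_ref, by_array]
}

fn main() {
    let [a, b, c] = split_three_ways("2020-11-03 23:59");
    let expected = ["2020", "11", "03", "23", "59"];

    // All three pattern forms split on the same set of characters.
    assert_eq!(a, expected);
    assert_eq!(b, expected);
    assert_eq!(c, expected);
}
```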

['-', ' ', ':', '@'] is an array of char

The [..] indexes it with the full range, giving a slice that covers the whole array from beginning to end.

The & takes a reference to that slice (because slices are unsized, you can't hold them by value, only behind a reference or other pointer).


I didn't see that it had been implemented for [char; N]. I guess that's recent, since const generics are fairly recent. That's a great ergonomic improvement :)


That was PR 86336, which landed in 1.58 less than two weeks ago -- quite recent indeed.

That PR closed issue 39511, which is basically this thread. As explained there, coercion can't kick in due to the generics involved.
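Here is a small sketch of why generics get in the way, using a stand-in trait rather than the real `Pattern`. The trait `Separators` and the function `count_seps` are hypothetical names, and the trait is deliberately implemented for `&[char]` only, mirroring the situation before the array impls existed.

```rust
/// A stand-in for `Pattern` (hypothetical), implemented for char slices only.
trait Separators {
    fn is_sep(&self, c: char) -> bool;
}

impl Separators for &[char] {
    fn is_sep(&self, c: char) -> bool {
        self.contains(&c)
    }
}

/// Generic over the separator type, like `str::split` is over its pattern.
fn count_seps<P: Separators>(s: &str, seps: P) -> usize {
    s.chars().filter(|&c| seps.is_sep(c)).count()
}

fn main() {
    let array = ['-', ' ', ':', '@'];

    // Does not compile: `P` is inferred as `&[char; 4]`, which has no impl,
    // and no unsizing coercion is attempted just to satisfy the bound.
    // count_seps("2020-11-03 23:59", &array);

    // Slicing explicitly makes the argument a `&[char]`.
    assert_eq!(count_seps("2020-11-03 23:59", &array[..]), 4);

    // So does coercing at a `let` with an explicit type.
    let slice: &[char] = &array;
    assert_eq!(count_seps("2020-11-03 23:59", slice), 4);
}
```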


Because strings are also patterns. Passing "- :@" to split() means "consider the separator to be literally "- :@"". That's not what the slice of chars means.
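A short sketch of the difference, with made-up helper names (`split_on_literal`, `split_on_any`). The last part shows one way to treat a string as a set of characters, via a closure pattern, if that is what you actually want.

```rust
/// Splits on the whole string `sep` as one literal separator.
/// (Hypothetical helper for this sketch.)
fn split_on_literal<'a>(s: &'a str, sep: &str) -> Vec<&'a str> {
    s.split(sep).collect()
}

/// Splits on any single character found in `seps`.
/// (Hypothetical helper for this sketch.)
fn split_on_any<'a>(s: &'a str, seps: &[char]) -> Vec<&'a str> {
    s.split(seps).collect()
}

fn main() {
    let seps = ['-', ' ', ':', '@'];

    // The literal "- :@" never occurs here, so nothing is split.
    assert_eq!(
        split_on_literal("2020-11-03 23:59", "- :@"),
        ["2020-11-03 23:59"]
    );
    assert_eq!(
        split_on_any("2020-11-03 23:59", &seps),
        ["2020", "11", "03", "23", "59"]
    );

    // When the literal does occur, it counts as ONE separator,
    // while the char slice sees four adjacent separators.
    assert_eq!(split_on_literal("left- :@right", "- :@"), ["left", "right"]);
    assert_eq!(
        split_on_any("left- :@right", &seps),
        ["left", "", "", "", "right"]
    );

    // A closure is also a pattern, so a string can serve as a char set.
    let v: Vec<&str> = "2020-11-03 23:59"
        .split(|c: char| "- :@".contains(c))
        .collect();
    assert_eq!(v, ["2020", "11", "03", "23", "59"]);
}
```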


Thanks for the explanation.

--wpd
