str::split() and "slice of `char`s"

According to str - Rust,

If the pattern is a slice of chars, split on each occurrence of any of the characters:

let v: Vec<&str> = "2020-11-03 23:59".split(&['-', ' ', ':', '@'][..]).collect();
assert_eq!(v, ["2020", "11", "03", "23", "59"]);

&['-', ' ', ':', '@'][..] looks very strange to me... how does one parse that?
"- :@" is a UTF-8 encoded &str type.
Is ['-', ' ', ':', '@'] equivalent to "- :@"? If not, why couldn't split() have been written to accept that?
How does &['-', ' ', ':', '@'][..] get parsed to mean "a slice of chars"?

I'm just curious... I stuck this in my code, decided it looked too strange for me to be able to parse, replaced it with (the equivalent of):

let pattern = ['-', ' ', ':', '@'];
let x = some_str.split(&pattern[..]);

and I still think it looks strange and hard to parse. Oh well.

It is not equivalent -- as you say, a &str is UTF-8. chars are u32-sized Unicode scalar values.
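To make the size difference concrete, here is a minimal sketch. The helper name `utf8_len` is made up for illustration; it only wraps `str::len`, which reports the length in bytes.

```rust
use std::mem::size_of;

/// Hypothetical helper: number of bytes in a string's UTF-8 encoding.
fn utf8_len(s: &str) -> usize {
    s.len()
}

fn main() {
    // Every `char` occupies 4 bytes, whatever the code point.
    assert_eq!(size_of::<char>(), 4);
    // So an array of four chars is 16 bytes...
    assert_eq!(size_of::<[char; 4]>(), 16);
    // ...while the same four ASCII characters as UTF-8 take 4 bytes.
    assert_eq!(utf8_len("- :@"), 4);
    // U+00E9 (e with acute accent) needs 2 bytes in UTF-8.
    assert_eq!(utf8_len("\u{e9}"), 2);
}
```

So the two values differ in memory layout even when they hold the same characters.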

Indexing a range (..) on an array ([char; 4]) gets you a slice ([char]), which is dynamically sized (and thus can't be passed as an argument, say); the & at the front turns this into a slice reference (&[char]) that you can pass to the method.
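Spelled out one step at a time with type annotations, that reads roughly like the sketch below. The function name `split_fields` is just for illustration.

```rust
/// Hypothetical helper: split a date/time string on any of the separator chars.
fn split_fields(input: &str) -> Vec<&str> {
    // An array of four chars; its length is part of the type.
    let array: [char; 4] = ['-', ' ', ':', '@'];
    // Indexing binds tighter than `&`, so this is `&(array[..])`:
    // `array[..]` is the unsized `[char]`, and `&` makes it a `&[char]`.
    let slice: &[char] = &array[..];
    input.split(slice).collect()
}

fn main() {
    let v = split_fields("2020-11-03 23:59");
    assert_eq!(v, ["2020", "11", "03", "23", "59"]);
}
```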

Sometimes the coercion from &[T; N] to &[T] can be automatic, but if it can't be for whatever reason, &var[..] can be used. I think I've run into this more with Vec than with arrays myself, though.

In this particular case, &[char] and &[char; N] and [char; N] are all Patterns, so you can do any of

  • .split(&['-', ' ', ':', '@'][..])
  • .split(&['-', ' ', ':', '@'])
  • .split( ['-', ' ', ':', '@'])
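A quick way to check that the three spellings behave the same is the sketch below. It assumes a toolchain recent enough to include the array impls; on older compilers only the first form compiles. The function name `split_three_ways` is made up for illustration.

```rust
/// Hypothetical helper: run the same split with each of the three pattern forms.
fn split_three_ways(input: &str) -> [Vec<&str>; 3] {
    [
        // Slice reference: &[char]
        input.split(&['-', ' ', ':', '@'][..]).collect(),
        // Array reference: &[char; 4]
        input.split(&['-', ' ', ':', '@']).collect(),
        // Array by value: [char; 4]
        input.split(['-', ' ', ':', '@']).collect(),
    ]
}

fn main() {
    let [a, b, c] = split_three_ways("2020-11-03 23:59");
    assert_eq!(a, ["2020", "11", "03", "23", "59"]);
    assert_eq!(a, b);
    assert_eq!(b, c);
}
```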

In any case, having them in a separate variable, or at least formatting the code differently, is more legible to me.

    let v: Vec<&str> = "2020-11-03 23:59"
        .split(['-', ' ', ':', '@'])
        .collect();

['-', ' ', ':', '@'] is an array of char

The [..] indexes that array with the full range, which yields a slice covering every element from the first to the last.

The & is a reference to the slice (because slices are unsized you can't refer to them directly).


I hadn't seen that it had been implemented for [char; N]. I guess that's recent, since const generics are fairly recent. That's a great ergonomic improvement.


PR 86336 that landed in 1.58 less than two weeks ago -- quite recent indeed.

That PR closed issue 39511, which is basically this thread. As explained there, coercion can't kick in due to the generics involved.
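The "coercion can't kick in due to the generics" part can be reproduced without split() at all. In the sketch below, `Separators`, `count_generic` and `count_concrete` are hypothetical names; the trait is a stand-in that, like Pattern before the array impls, is implemented for `&[char]` only.

```rust
/// Hypothetical stand-in for a pattern-like trait, implemented
/// only for slice references.
trait Separators {
    fn contains_char(&self, c: char) -> bool;
}

impl Separators for &[char] {
    fn contains_char(&self, c: char) -> bool {
        self.contains(&c)
    }
}

/// Generic, like `str::split`: `S` is inferred from the argument's type.
fn count_generic<S: Separators>(text: &str, seps: S) -> usize {
    text.chars().filter(|&c| seps.contains_char(c)).count()
}

/// Concrete parameter type: a `&[char; N]` argument coerces to `&[char]`.
fn count_concrete(text: &str, seps: &[char]) -> usize {
    text.chars().filter(|c| seps.contains(c)).count()
}

fn main() {
    let seps = ['-', ' ', ':'];
    // The parameter type is known to be `&[char]`, so coercion applies.
    assert_eq!(count_concrete("2020-11-03 23:59", &seps), 4);
    // With the generic function, `S` would be inferred as `&[char; 3]`, which
    // does not implement `Separators`, so `count_generic(text, &seps)` is
    // rejected. Slicing explicitly gives a `&[char]` and compiles:
    assert_eq!(count_generic("2020-11-03 23:59", &seps[..]), 4);
}
```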


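Because strings are also patterns. Passing "- :@" to split() means "treat the whole string "- :@" as the separator", so it only splits where those four characters appear together, in that order. That's not what the slice of chars means: a slice of chars splits on any one of them.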
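A small sketch of the difference, using the input from the docs example. The two function names are made up for illustration.

```rust
/// Hypothetical helper: the separator is the whole string "- :@".
fn split_on_string(input: &str) -> Vec<&str> {
    input.split("- :@").collect()
}

/// Hypothetical helper: any one of the four chars is a separator.
fn split_on_chars(input: &str) -> Vec<&str> {
    input.split(&['-', ' ', ':', '@'][..]).collect()
}

fn main() {
    let input = "2020-11-03 23:59";
    // "- :@" never occurs as a substring, so nothing is split.
    assert_eq!(split_on_string(input), ["2020-11-03 23:59"]);
    // Each individual separator char splits the input.
    assert_eq!(split_on_chars(input), ["2020", "11", "03", "23", "59"]);
    // The string pattern only matches the exact four-character sequence.
    assert_eq!(split_on_string("a- :@b"), ["a", "b"]);
}
```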


Thanks for the explanation.

