No no no, it's UB. Please don't do that.
Let's start with WHY it's bad. Try running your code with some real-world input.
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=a2ade5743019641d323abbeca3e7a9cb
It crashes with the message below. What happened?
Execution operation failed: Output was not valid UTF-8: invalid utf-8 sequence of 1 bytes from index 0
String slices are always valid UTF-8. UTF-8 is a variable-width character encoding, which means each code point is represented by one or more bytes in the encoded text. For example, the string "안" is represented by the three bytes [236, 149, 136], and reversing those bytes produces an invalid UTF-8 sequence, because the result starts with a continuation byte.
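Here is a minimal sketch of that check in safe Rust. The helper name reversed_bytes_are_valid_utf8 is made up for illustration; it only relies on std::str::from_utf8, which validates a byte slice instead of assuming it is valid.

```rust
/// Hypothetical helper: reverses the raw bytes of `s` and reports
/// whether the reversed bytes still form valid UTF-8.
fn reversed_bytes_are_valid_utf8(s: &str) -> bool {
    // Copy the bytes so the original string is never touched.
    let mut bytes = s.as_bytes().to_vec();
    bytes.reverse();
    // `from_utf8` checks the bytes and returns `Err` on an invalid sequence.
    std::str::from_utf8(&bytes).is_ok()
}
```

Calling it on "\u{C548}" (the three-byte string above) should return false, while a pure ASCII string such as "abc" survives byte reversal because every ASCII character is a single byte.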
Remember the "string slices are always valid UTF-8" guarantee? In safe Rust, guarantees like this are enforced for you by the compiler and the safe APIs. But inside an unsafe {} block, it's you who is responsible for upholding every guarantee defined by the language and the libraries. Otherwise it's UB, which means you may observe a crash at best, or your memory may get silently corrupted so that totally unrelated parts of your code behave incorrectly.
In conclusion, try your absolute best to avoid writing any unsafe {} block by hand. Its main purpose is to build safe abstractions out of low-level building blocks, in heavily audited codebases like the stdlib, so everyone can safely use types like Vec<T> and HashMap<K, V>. Sometimes you may need to write some of it yourself, for example when interacting with C over FFI. In that case, try your best to write your logic in totally safe Rust and minimize the reach of the unsafe-ness.
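As a rough sketch of what "minimizing unsafe" can look like, the function below keeps the unsafe part to a single expression and checks the invariant it depends on first. The name reverse_ascii_in_place is hypothetical, and this is only an illustration of the pattern, not something you need for the leetcode problem.

```rust
/// Hypothetical example: reverses `s` in place if it is pure ASCII.
/// Returns `false` and leaves `s` untouched otherwise.
fn reverse_ascii_in_place(s: &mut str) -> bool {
    // Check the invariant in safe code before touching raw bytes.
    if !s.is_ascii() {
        return false;
    }
    // SAFETY: every byte is ASCII, i.e. a complete one-byte code point,
    // so any reordering of the bytes is still valid UTF-8.
    unsafe { s.as_bytes_mut() }.reverse();
    true
}
```

The caller only ever sees a safe function, and the SAFETY comment records why the str guarantee still holds after the call.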
Bonus: this is a totally safe and correct version of your function.
Note that this code only reverses the code points within each word, so characters made of multiple code points, like this emoji 👨‍👩‍👧‍👧, produce some weird results. But that's a problem of the leetcode question itself. Blame leetcode for serving pre-Unicode-era questions!
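    impl Solution {
        pub fn reverse_words(s: String) -> String {
            // Split on single spaces so the spacing survives, reverse each
            // word by code point, then put the spaces back with join.
            s.split(' ')
                .map(|word| word.chars().rev().collect::<String>())
                .collect::<Vec<_>>()
                .join(" ")
        }
    }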
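To make the multi-code-point caveat concrete, here is a small sketch using a made-up helper, reverse_code_points, that does the same per-char reversal on a single word. It is only meant to show the effect, under the assumption that the input uses a combining character.

```rust
/// Hypothetical helper: reverses a string one code point (`char`) at a time.
fn reverse_code_points(s: &str) -> String {
    s.chars().rev().collect()
}
```

For the input "e\u{301}" (the letter e followed by U+0301 COMBINING ACUTE ACCENT, displayed as one character), the result is "\u{301}e": still valid UTF-8, but the accent no longer sits on the e.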